regressions.lists.linux.dev archive mirror
* Re: rtw_8822ce wifi regression after kernel update from 5.15 to 5.16
       [not found] <CAO_iuKEL8tHnovpGiQGUxg7JUpZFxHpxhOHbqAMgbt5R4Eftgg@mail.gmail.com>
@ 2022-02-15  8:25 ` Thorsten Leemhuis
  2022-02-28 14:30   ` Thorsten Leemhuis
  0 siblings, 1 reply; 10+ messages in thread
From: Thorsten Leemhuis @ 2022-02-15  8:25 UTC (permalink / raw)
  To: Nico Sneck, linux-wireless, regressions


[TLDR: I'm adding the regression report below to regzbot, the Linux
kernel regression tracking bot; all text you find below is compiled from
a few template paragraphs you might have encountered already in
similar mails.]

Hi, this is your Linux kernel regression tracker speaking.

CCing the regression mailing list, as it should be in the loop for all
regressions, as explained here:
https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html

To be sure this issue doesn't fall through the cracks unnoticed, I'm
adding it to regzbot, my Linux kernel regression tracking bot:

#regzbot ^introduced v5.15..v5.16
#regzbot title net: wireless: rtw_8822ce: Wifi connection doesn't really
work anymore
#regzbot ignore-activity

Reminder for developers: when fixing the issue, please add a 'Link:'
tag pointing to the report (the mail quoted below) using
lore.kernel.org/r/, as explained in
'Documentation/process/submitting-patches.rst' and
'Documentation/process/5.Posting.rst'. This allows the bot to connect
the report with any patches posted or committed to fix the issue; this
again allows the bot to show the current status of regressions and
automatically resolve the issue when the fix hits the right tree.

I'm sending this to everyone that got the initial report, to make them
aware of the tracking. I also hope that messages like this motivate
people to directly get at least the regression mailing list and ideally
even regzbot involved when dealing with regressions, as messages like
this wouldn't be needed then.

Don't worry, I'll send further messages wrt this regression just to
the lists (with a tag in the subject so people can filter them away), if
they are relevant just for regzbot. With a bit of luck no such messages
will be needed anyway.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I'm getting a lot of
reports on my table. I can only look briefly into most of them and lack
knowledge about most of the areas they concern. I thus unfortunately
will sometimes get things wrong or miss something important. I hope
that's not the case here; if you think it is, don't hesitate to tell me
in a public reply, it's in everyone's interest to set the public record
straight.



On 14.02.22 20:25, Nico Sneck wrote:
> Hi,
> 
> I'm running Fedora 35 on a Huawei Matestation S (HUAWEI PUM-WDX9), AMD
> Renoir with Realtek rtw_8822ce handling wifi stuff.
> 
> Ever since the kernel update from 5.15.13-200.fc35 to 5.16.8-200.fc35
> (which I performed Feb 12th), I noticed that my Wifi connection
> doesn't really work anymore. I'm connecting to a Zyxel VMG3927-B50A,
> and it always appears to use the 5 GHz band. I also tested 5.17-rc4,
> which suffers from this issue as well.
> 
> The issue is that even trying to ping my router's gateway address will
> result in connection timeouts, and ping times are in the thousands to
> tens of thousands of milliseconds (normally peak ping times are ~3-6
> ms), making wireless unusable with 5.16+.
> I can also see that in dmesg logs there are two types of rtw_8822ce
> driver warnings flooding the logs, which I didn't see with 5.15:
> 
> "helmi 13 18:20:03 fedora kernel: rtw_8822ce 0000:06:00.0: timed out
> to flush queue {1,2}"
> "helmi 13 18:16:23 fedora kernel: rtw_8822ce 0000:06:00.0: failed to
> get tx report from firmware"
> 
> Some stats:
> On kernel 5.15.13-200.fc35 running for 29 days:
> [nico@fedora ~]$ journalctl -k -b -18 | grep 'timed out to flush queue' | wc -l
> 0
> 
> [nico@fedora ~]$ journalctl -k -b -18 | grep 'failed to get tx report
> from firmware' | wc -l
> 0
> 
> On kernel 5.16.8-200.fc35 running for 4 hours:
> [nico@fedora ~]$ journalctl -k -b -17 | grep 'timed out to flush queue' | wc -l
> 45370
> 
> [nico@fedora ~]$ journalctl -k -b -17 | grep 'failed to get tx report
> from firmware' | wc -l
> 502
> 
> I tried bisecting which commit introduced this regression, but after
> some 12 hours of recompiling and testing, it seems like I failed
> somehow. I tried a bisect with first known good revision as
> 8bb7eca972ad (5.15 release commit), and first known bad revision as
> df0cc57e057f (5.16 release commit). I managed to identify that
> revision
> fc02cb2b37fe Merge tag 'net-next-for-5.16' of
> git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
> is bad, but then all other revisions were good apart from
> 8a33dcc2f6d5 (refs/bisect/bad) Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
> which was also bad.
> But here's the baffling part, commit 6b278c0cb378 was good, and it's
> the last commit in the merge (8a33dcc2f6d5) which appeared bad.
> Now I retested with 8a33dcc2f6d5, and I don't see the issues anymore,
> so I guess I tested a wrong kernel version at that point or something.
> shrug.
> 
> So I can only assume that the regression came in one of the commits inside
> fc02cb2b37fe Merge tag 'net-next-for-5.16' of
> git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
> but it'll take me a while to try bisecting the commits in that merge again.
> 
> If anyone has any idea about what could cause these issues I'm seeing,
> I can try out patches / test different things. But I'll try
> rebisecting this again soon.
> 
> - Nico
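
The per-boot counts in the report above were gathered with one grep pipeline per message. As a sketch only (the helper name `count_rtw_warnings` is made up for illustration and is not part of any tool mentioned in this thread), both messages could be tallied in a single pass over the kernel log:

```shell
# Hypothetical helper: tally the two rtw88 warnings quoted in the report.
# Reads kernel log text on stdin and prints both counts on one line.
count_rtw_warnings() {
    awk '
        /timed out to flush queue/              { flush++ }
        /failed to get tx report from firmware/ { txrep++ }
        END {
            # Uninitialised awk variables print as 0 with %d.
            printf "flush_timeouts=%d tx_report_failures=%d\n", flush, txrep
        }'
}

# Example use (boot offset -17 is the one from the report):
#   journalctl -k -b -17 | count_rtw_warnings
```

Running the same helper against each boot tested during a bisection would give a consistent good/bad criterion, which may help avoid a mislabelled revision like the one described above.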

-- 
Additional information about regzbot:

If you want to know more about regzbot, check out its web-interface, the
getting started guide, and the reference documentation:

https://linux-regtracking.leemhuis.info/regzbot/
https://gitlab.com/knurd42/regzbot/-/blob/main/docs/getting_started.md
https://gitlab.com/knurd42/regzbot/-/blob/main/docs/reference.md

The last two documents will explain how you can interact with regzbot
yourself if you want to.

Hint for reporters: when reporting a regression it's in your interest to
CC the regression list and tell regzbot about the issue, as that ensures
the regression makes it onto the radar of the Linux kernel's regression
tracker -- that's in your interest, as it ensures your report won't fall
through the cracks unnoticed.

Hint for developers: you normally don't need to care about regzbot once
it's involved. Fix the issue as you normally would, just remember to
include 'Link:' tags in the patch description pointing to all reports
about the issue. This has been expected from developers even before
regzbot showed up for reasons explained in
'Documentation/process/submitting-patches.rst' and
'Documentation/process/5.Posting.rst'.


* Re: rtw_8822ce wifi regression after kernel update from 5.15 to 5.16
  2022-02-15  8:25 ` rtw_8822ce wifi regression after kernel update from 5.15 to 5.16 Thorsten Leemhuis
@ 2022-02-28 14:30   ` Thorsten Leemhuis
  2022-02-28 22:07     ` Larry Finger
  0 siblings, 1 reply; 10+ messages in thread
From: Thorsten Leemhuis @ 2022-02-28 14:30 UTC (permalink / raw)
  To: Nico Sneck, Yan-Hsuan Chuang; +Cc: regressions, linux-wireless

Hi, this is your Linux kernel regression tracker. Top-posting for once,
to make this easily accessible to everyone.

Yan-Hsuan Chuang, sorry, I failed to notice that you didn't get the
regression report below. Could you take a look at what's wrong there?

BTW: Nico, did you try another bisection? And is the problem still
happening? Did you maybe give 5.17-rc a shot to check if the problem
still happens there?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

#regzbot poke

On 15.02.22 09:25, Thorsten Leemhuis wrote:
> [...]


* Re: rtw_8822ce wifi regression after kernel update from 5.15 to 5.16
  2022-02-28 14:30   ` Thorsten Leemhuis
@ 2022-02-28 22:07     ` Larry Finger
  2022-03-04  6:33       ` Thorsten Leemhuis
  0 siblings, 1 reply; 10+ messages in thread
From: Larry Finger @ 2022-02-28 22:07 UTC (permalink / raw)
  To: Thorsten Leemhuis, Nico Sneck, Yan-Hsuan Chuang
  Cc: regressions, linux-wireless

On 2/28/22 08:30, Thorsten Leemhuis wrote:
> Hi, this is your Linux kernel regression tracker. Top-posting for once,
> to make this easily accessible to everyone.
> 
> Yan-Hsuan Chuang, sorry, I failed to notice that you didn't get the
> regression report below. Could you take a look what's wrong there?
> 
> BTW: Nico, did you try another bisection? And is the problem still
> happening? Did you maybe give 5.17-rc a shot to check if the problem
> still happens there?
> 
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> 
> P.S.: As the Linux kernel's regression tracker I'm getting a lot of
> reports on my table. I can only look briefly into most of them and lack
> knowledge about most of the areas they concern. I thus unfortunately
> will sometimes get things wrong or miss something important. I hope
> that's not the case here; if you think it is, don't hesitate to tell me
> in a public reply, it's in everyone's interest to set the public record
> straight.
> 
> #regzbot poke
> 
> On 15.02.22 09:25, Thorsten Leemhuis wrote:
>> [...]

Nico,

Your use of rtw_8822ce in the title finally registered with me. If that driver
is in use, you are using my GitHub repo; newer kernels have the driver built
in, but under names such as rtw88_8822ce. The difference in the name is
deliberate. If you want to use the GitHub version, you must blacklist the
ones from the kernel.

To check this, run 'lsmod | grep 88'. If you see a mixture of rtw_xxx and 
rtw88_xxx, then this is your problem.
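
The check described above can be sketched as a small filter over `lsmod` output. This is only an illustration: the helper name `classify_rtw_modules` and its output labels are invented here, and it takes the naming convention stated in this mail (rtw88_* for the in-kernel modules, rtw_* for the GitHub ones) at face value.

```shell
# Hypothetical helper: classify rtw module names found in lsmod output.
# Reads lsmod text on stdin; prints mixed, rtw88-only, rtw-only or none.
classify_rtw_modules() {
    awk '
        $1 ~ /^rtw88_/ { inkernel++ }   # names used by the in-kernel modules
        $1 ~ /^rtw_/   { other++ }      # names attributed to the GitHub repo
        END {
            if (inkernel && other) print "mixed"
            else if (inkernel)     print "rtw88-only"
            else if (other)        print "rtw-only"
            else                   print "none"
        }'
}

# Example use:
#   lsmod | classify_rtw_modules
```

Only a "mixed" result would point at the conflict described above. As a further cross-check, `modinfo -n <module>` prints the path of the module file, which shows where on disk a loaded module came from.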

Larry



* Re: rtw_8822ce wifi regression after kernel update from 5.15 to 5.16
  2022-02-28 22:07     ` Larry Finger
@ 2022-03-04  6:33       ` Thorsten Leemhuis
  2022-03-04 14:45         ` Nico Sneck
  0 siblings, 1 reply; 10+ messages in thread
From: Thorsten Leemhuis @ 2022-03-04  6:33 UTC (permalink / raw)
  To: Larry Finger, Nico Sneck, Yan-Hsuan Chuang; +Cc: regressions, linux-wireless

Hi, this is your Linux kernel regression tracker.

On 28.02.22 23:07, Larry Finger wrote:
> On 2/28/22 08:30, Thorsten Leemhuis wrote:
> [...]
>
> Your use of rtw_8822ce in the title finally registered on me. With that
> driver in use, that means that you are using my GitHub repo; however,
> newer kernels have the driver built in, but with names such as
> rtw88_8822ce. The difference in the name is deliberate.

Many thx for this, the names already had made me a bit suspicious, but
I wasn't aware of this!

> If you want to
> use the GitHub version, you must blacklist the ones from the kernel.
> 
> To check this, run 'lsmod | grep 88'. If you see a mixture of rtw_xxx
> and rtw88_xxx, then this is your problem.

Nico didn't reply to your mail, so I'll assume for now that this is not
a kernel regression:

#regzbot invalid: seems it's a regression in an external module of
similar name

@Yan-Hsuan Chuang: sorry for the noise!

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)


* Re: rtw_8822ce wifi regression after kernel update from 5.15 to 5.16
  2022-03-04  6:33       ` Thorsten Leemhuis
@ 2022-03-04 14:45         ` Nico Sneck
  2022-03-07  7:39           ` Kalle Valo
  0 siblings, 1 reply; 10+ messages in thread
From: Nico Sneck @ 2022-03-04 14:45 UTC (permalink / raw)
  To: Thorsten Leemhuis, Larry Finger
  Cc: Yan-Hsuan Chuang, regressions, linux-wireless

Hi Larry, Thorsten.

Sorry I'm a bit late, been really busy with work lately. Haven't had
time to continue bisecting, hopefully I can find some time this
Sunday.

I still think this is a kernel regression - I don't believe I'm using
the driver from Larry's repo. This is a stock Fedora 35 installation,
I've not installed the driver from Larry's repo, and I don't believe
Fedora packages it by default.

The reason I used "rtw_8822ce" in the title is because "rtw_8822ce" is
printed by my kernel. However, thinking about it a bit, I think that
string may not be the driver name, but rather, maybe the device name
printed by the kernel?

See here: https://github.com/torvalds/linux/blob/38f80f42147ff658aff218edb0a88c37e58bf44f/drivers/net/wireless/realtek/rtw88/mac.c#L968
>rtw_warn(rtwdev, "timed out to flush queue %d\n", prio_queue);
It prints the rtwdev variable, which in my case is "rtw_8822ce".

And looking some more, I can see this:
https://github.com/torvalds/linux/blob/38f80f42147ff658aff218edb0a88c37e58bf44f/drivers/net/wireless/realtek/rtw88/rtw8822ce.c#L24
>.name = "rtw_8822ce",
So at least some name inside the driver itself is "rtw_" instead of "rtw88_".

Furthermore, here's what lsmod has to say:
>[nico@fedora linux]$ lsmod | grep 88
>rtw88_8822ce           16384  0
>rtw88_8822c           487424  1 rtw88_8822ce
>rtw88_pci              28672  1 rtw88_8822ce
>rtw88_core            167936  2 rtw88_pci,rtw88_8822c
>mac80211             1179648  2 rtw88_pci,rtw88_core
>kvm                  1036288  1 kvm_amd
>cfg80211             1024000  2 rtw88_core,mac80211
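
The distinction drawn above, between the name printed in the log prefix and the module names listed by lsmod, can also be read from sysfs. The following is a sketch under assumptions: the helper name `pci_driver_and_module` is invented, and it relies on the usual sysfs layout in which a PCI device directory has a `driver` symlink and a modular driver's directory has a `module` symlink.

```shell
# Hypothetical helper: show which driver is bound to a PCI device and
# which kernel module provides that driver.
#   $1 = PCI address, e.g. 0000:06:00.0 (taken from the log lines)
#   $2 = sysfs devices directory (optional, defaults to the real one)
pci_driver_and_module() {
    dev_dir="${2:-/sys/bus/pci/devices}/$1"
    # Both entries are symlinks; the last path component is the name.
    drv=$(basename "$(readlink -f "$dev_dir/driver")")
    mod=$(basename "$(readlink -f "$dev_dir/driver/module")")
    printf 'driver=%s module=%s\n' "$drv" "$mod"
}

# Example use:
#   pci_driver_and_module 0000:06:00.0
```

If the reasoning above is right, the driver name would differ from the module name on the affected machine, which would explain why the log prefix says rtw_8822ce even though only rtw88_* modules are loaded.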

So I still believe this is a regression somewhere in the net stack -
maybe in wireless/ or wireless/rtw88/, but could be elsewhere as well,
as there are quite a few moving pieces involved with this machinery. I
tried to trace the logic leading to the function
"__rtw_mac_flush_prio_queue", but it becomes pretty difficult
considering all the places where the struct "ieee80211_ops rtw_ops"
member ".flush" in mac80211.c is handled. Add to that my poor
understanding of C, and the difficulty of bisecting this, it's not
easy to pinpoint where the regression came from. I'll try latest -RC,
and if it doesn't work, I'll try bisecting the net merge commits as
soon as I have time.

p.s. sorry for top posting, writing this in a hurry in gmail web client.



* Re: rtw_8822ce wifi regression after kernel update from 5.15 to 5.16
  2022-03-04 14:45         ` Nico Sneck
@ 2022-03-07  7:39           ` Kalle Valo
  2022-03-16 10:14             ` Thorsten Leemhuis
  0 siblings, 1 reply; 10+ messages in thread
From: Kalle Valo @ 2022-03-07  7:39 UTC (permalink / raw)
  To: Nico Sneck
  Cc: Thorsten Leemhuis, Larry Finger, Yan-Hsuan Chuang, regressions,
	linux-wireless

Nico Sneck <snecknico@gmail.com> writes:

> Sorry I'm a bit late, been really busy with work lately. Haven't had
> time to continue bisecting, hopefully I can find some time this
> Sunday.
>
> I still think this is a kernel regression - I don't believe I'm using
> the driver from Larry's repo. This is a stock Fedora 35 installation,
> I've not installed the driver from Larry's repo, and I don't believe
> Fedora packages it by default.

It's not clear to me if you are using a vanilla release from
kernel.org. But _if_ you are using a Fedora kernel you should report
your problem to Fedora. We have no knowledge what changes distros do to
their kernels.

-- 
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


* Re: rtw_8822ce wifi regression after kernel update from 5.15 to 5.16
  2022-03-07  7:39           ` Kalle Valo
@ 2022-03-16 10:14             ` Thorsten Leemhuis
  2022-03-16 17:50               ` Nico Sneck
  0 siblings, 1 reply; 10+ messages in thread
From: Thorsten Leemhuis @ 2022-03-16 10:14 UTC (permalink / raw)
  To: Kalle Valo, Nico Sneck
  Cc: Larry Finger, Yan-Hsuan Chuang, regressions, linux-wireless

On 07.03.22 08:39, Kalle Valo wrote:
> Nico Sneck <snecknico@gmail.com> writes:
> 
>> Sorry I'm a bit late, been really busy with work lately. Haven't had
>> time to continue bisecting, hopefully I can find some time this
>> Sunday.
>>
>> I still think this is a kernel regression - I don't believe I'm using
>> the driver from Larry's repo. This is a stock Fedora 35 installation,
>> I've not installed the driver from Larry's repo, and I don't believe
>> Fedora packages it by default.
> 
> It's not clear to me if you are using a vanilla release from
> kernel.org.

FWIW, in the initial report Nico mentioned that he tried to bisect the
issue and listed a few git commit IDs that are from mainline, so it seems
he tested mainline. But maybe there is some DKMS or akmod package that is
interfering.

Nico, it would help a lot if you could clarify the situation and maybe
try another shot at a bisection.

Anyway, this seems to be one of those issues that progress slowly, so
I'm marking it as "on back-burner" in regzbot to reduce the noise in the
reports and the UI:

#regzbot back-burner: root-cause still not found, reporter slow to respond
#regzbot poke

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)


* Re: rtw_8822ce wifi regression after kernel update from 5.15 to 5.16
  2022-03-16 10:14             ` Thorsten Leemhuis
@ 2022-03-16 17:50               ` Nico Sneck
  0 siblings, 0 replies; 10+ messages in thread
From: Nico Sneck @ 2022-03-16 17:50 UTC (permalink / raw)
  To: Thorsten Leemhuis
  Cc: Kalle Valo, Larry Finger, Yan-Hsuan Chuang, regressions, linux-wireless

Hi all,

On Wed, Mar 16, 2022 at 12:14 PM Thorsten Leemhuis
<regressions@leemhuis.info> wrote:
> FWIW, in the initial report Nico mentioned he tried to bisect the issue
> and there mentioned a few git commit id's which are from mainline, so it
> seems he tried mainline. But maybe there is some DKMS or akmod package
> that is interfering.

Indeed, I've reproduced this issue on mainline kernels, and also bisected
using the mainline kernel.
No DKMS or anything like that.

> Nico, it would help a lot if you could clarify the situation and maybe
> try another shot at a bisection.

Yeah, sorry for being slow to respond. These days I tend to run my personal
computer just on the weekends, thus this is a bit slow. Also, this issue is
hard to reproduce, usually it happens within a couple minutes after booting,
but I've actually now been running a vanilla 5.17-rc7 (commit: ea4424be1688)
for a day, and the issue just popped up.

So I can confirm, this issue is still present as of ea4424be1688
(soon-to-be 5.17).

This is how it appears when pinging my router's gateway address:
> 64 bytes from 192.168.10.1: icmp_seq=20846 ttl=64 time=29780 ms
> 64 bytes from 192.168.10.1: icmp_seq=20847 ttl=64 time=28771 ms
> 64 bytes from 192.168.10.1: icmp_seq=20848 ttl=64 time=27768 ms
> 64 bytes from 192.168.10.1: icmp_seq=20849 ttl=64 time=26763 ms
> 64 bytes from 192.168.10.1: icmp_seq=20850 ttl=64 time=25757 ms
> 64 bytes from 192.168.10.1: icmp_seq=20851 ttl=64 time=24752 ms
> 64 bytes from 192.168.10.1: icmp_seq=20852 ttl=64 time=23747 ms
> 64 bytes from 192.168.10.1: icmp_seq=20853 ttl=64 time=22742 ms
> 64 bytes from 192.168.10.1: icmp_seq=20854 ttl=64 time=21737 ms
> 64 bytes from 192.168.10.1: icmp_seq=20855 ttl=64 time=20734 ms
> 64 bytes from 192.168.10.1: icmp_seq=20874 ttl=64 time=1298 ms
> 64 bytes from 192.168.10.1: icmp_seq=20875 ttl=64 time=455 ms
> 64 bytes from 192.168.10.1: icmp_seq=20876 ttl=64 time=3128 ms
[...]
> From 192.168.10.107 icmp_seq=20925 Destination Host Unreachable
> From 192.168.10.107 icmp_seq=20926 Destination Host Unreachable
[...]
> ping: sendmsg: No buffer space available
> ping: sendmsg: No buffer space available
> ping: sendmsg: No buffer space available
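
To put numbers on output like the above, the round-trip times can be summarised with a short filter. This is a sketch, not something used in this thread: the helper name `ping_rtt_summary` and the 1000 ms "slow" threshold are arbitrary choices made for illustration.

```shell
# Hypothetical helper: summarise round-trip times from ping output.
# Reads ping output on stdin; lines without "time=" are ignored.
ping_rtt_summary() {
    awk -F'time=' '
        /time=/ {
            split($2, parts, " ")       # parts[1] is the number before "ms"
            rtt = parts[1] + 0
            replies++
            if (rtt > max)  max = rtt
            if (rtt > 1000) slow++      # arbitrary threshold: over 1 second
        }
        END {
            # max_ms is truncated to a whole number of milliseconds.
            printf "replies=%d slow=%d max_ms=%d\n", replies, slow, max
        }'
}

# Example use (gateway address as in the output above):
#   ping -c 100 192.168.10.1 | ping_rtt_summary
```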

At the same time dmesg is flooded with "timed out to flush queue"
and "failed to get tx report from firmware".

And just to reiterate; this does not happen on 5.15. Only 5.16 and up.

I'll kick off another round of bisecting now. Will probably take me at
least a week to follow up though. A lot depends on how fast this
issue pops up with each round of bisecting.


* Re: rtw_8822ce wifi regression after kernel update from 5.15 to 5.16
       [not found]   ` <CAFPFaMKpwmGqc_Cm1fv4psR6m+waax6YZO2ugOPhgmnG4mGJ4A@mail.gmail.com>
@ 2022-04-05 15:02     ` Larry Finger
  0 siblings, 0 replies; 10+ messages in thread
From: Larry Finger @ 2022-04-05 15:02 UTC (permalink / raw)
  To: G. P. B.
  Cc: snecknico, kvalo, linux-wireless, regressions, regressions, tony0620emma

On 4/5/22 06:39, G. P. B. wrote:
> On Mon, 4 Apr 2022 at 15:49, Larry Finger <Larry.Finger@lwfinger.net> wrote:
> 
>     George,
> 
>     I do not know of any regression in 5.16 with regard to the driver for
>     RTL8822CE. Certainly, I saw no regressions in my testing of that driver
>     from before it was in the kernel up to the present. That said, I can only
>     comment on the user-space part of openSUSE Tumbleweed, which is probably
>     not your distro of choice.
> 
>     Are you using the drivers at https://GitHub.com/lwfinger/rtw88.git rather
>     than the ones in the kernel? Your posted errors that refer to rtw_8822ce
>     indicate that to be true. If the drivers came from the kernel, the
>     reference would be to rtw88_8822ce! If so, do a 'git pull' to get the
>     drivers updated to match the code in kernel 5.18. A lot of things have
>     been fixed.
> 
>     In your system, please do a 'lsmod | grep rtw'. If any items refer to rtw88_*,
>     you have mixed drivers loaded. In that case, you should blacklist the rtw88_*
>     driver.
> 
>     Larry
> 
> 
> I haven't had time to roll back to 5.15 to check if this fixes the issue, but I
> have the following command line output:
> [girgias@fedora ~]$ lsmod | grep rtw
> rtw88_8822ce           16384  0
> rtw88_8822c           483328  1 rtw88_8822ce
> rtw88_pci              28672  1 rtw88_8822ce
> rtw88_core            167936  2 rtw88_pci,rtw88_8822c
> mac80211             1175552  2 rtw88_pci,rtw88_core
> cfg80211             1036288  2 rtw88_core,mac80211
> 
> Which if I understand your email correctly means I have mixed drivers?
> I personally did not start to use the drivers you provide on GitHub as I just do 
> dnf update to update my packages.
> Therefore, does this imply there is an issue with how Fedora is packaging the
> WiFi drivers?
> 
> If I need to blacklist drivers, I imagine I need to do this at the package 
> manager level?

George,

No, you have the in-kernel version - the rtw drivers all start with "rtw88".

Your "regression" between 5.15 and 5.16 is that you switched from the GitHub 
repo to the in-kernel drivers. There have been many improvements in the kernel 
version since 5.16. Those are included in the GitHub version. One or more of 
them helped your system. I am not that familiar with Fedora, but to get the 
kernel versions of the drivers, it is not necessary to "dnf" anything other than 
the kernel itself. All the in-kernel drivers come along with it.

Blacklisting is done by creating (as root) a blacklist file in /etc/modprobe.d/, 
not in the package manager. It is at a much lower level. On my system, that 
file is named /etc/modprobe.d/60-blacklist-rtw8822c.conf and contains:
blacklist rtw88_8822ce
blacklist rtw88_8822c
blacklist rtw88_pci
blacklist rtw88_core

With this file, I am assured that only the drivers from GitHub will be loaded.
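
As a rough illustration of the file layout above, here is a minimal Python
sketch that builds the same blacklist lines from a list of module names. The
helper name blacklist_conf is hypothetical, and the sketch only prints the
text; it deliberately does not write anything under /etc/modprobe.d/, which
would need root.

```python
# Hypothetical helper: build modprobe.d blacklist text, one directive per module.
def blacklist_conf(modules):
    return "".join(f"blacklist {name}\n" for name in modules)


# Module names taken from the blacklist file quoted above.
IN_KERNEL_RTW88 = ["rtw88_8822ce", "rtw88_8822c", "rtw88_pci", "rtw88_core"]

if __name__ == "__main__":
    # Print the text; saving it under /etc/modprobe.d/ is left to the admin.
    print(blacklist_conf(IN_KERNEL_RTW88), end="")
```

The output can be compared against an existing .conf file before anything is
changed on the system.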

Larry


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: rtw_8822ce wifi regression after kernel update from 5.15 to 5.16
       [not found] <CAFPFaMLHXhHMhuAuvXWHb3c-tX_9qRxsquEUHXY0fMxh_VsKtw@mail.gmail.com>
@ 2022-04-04 14:49 ` Larry Finger
       [not found]   ` <CAFPFaMKpwmGqc_Cm1fv4psR6m+waax6YZO2ugOPhgmnG4mGJ4A@mail.gmail.com>
  0 siblings, 1 reply; 10+ messages in thread
From: Larry Finger @ 2022-04-04 14:49 UTC (permalink / raw)
  To: G. P. B., snecknico
  Cc: kvalo, linux-wireless, regressions, regressions, tony0620emma

On 4/3/22 13:11, G. P. B. wrote:
> Dear all,
> 
> Hopefully this email gets added to the thread correctly as I came here from 
> https://lore.kernel.org/linux-wireless/CAO_iuKG0gE=5fEKMF2A+iWUhsxtnPOQtTQTkBRo2vH5CmKu7iA@mail.gmail.com/ 
> and using the mailto link with Gmail.
> 
> I'm also hitting this issue but I'm not sure if this is a regression in 5.16. 
> I've been struggling with weird random disconnects for a while but I blamed it 
> on the known bad router that I usually connect to at my university (at least 
> since October 2021, when I got this laptop brand new).
> 
> The laptop is a HP Pavilion Laptop 15-eh0014na running Fedora 34:
> Linux fedora 5.16.18-100.fc34.x86_64 #1 SMP PREEMPT Mon Mar 28 14:46:06 UTC 2022 
> x86_64 x86_64 x86_64 GNU/Linux
> 
> Network driver:
> 02:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8822CE 802.11ac PCIe Wireless Network Adapter
>         DeviceName: Realtek Wireless LAN + BT
>         Subsystem: Hewlett-Packard Company Device 85f7
>         Kernel driver in use: rtw_8822ce
>         Kernel modules: rtw88_8822ce
> 
> A sample of the output of dmesg -w, similar to Nico's:
> [ 915.489081] rtw_8822ce 0000:02:00.0: timed out to flush queue 1
> [ 915.599086] rtw_8822ce 0000:02:00.0: timed out to flush queue 2
> [ 915.711096] rtw_8822ce 0000:02:00.0: timed out to flush queue 1
> [ 915.822106] rtw_8822ce 0000:02:00.0: timed out to flush queue 2
> [ 916.265097] rtw_8822ce 0000:02:00.0: timed out to flush queue 0
> [ 916.376085] rtw_8822ce 0000:02:00.0: timed out to flush queue 1
> [ 916.449083] rtw_8822ce 0000:02:00.0: failed to get tx report from firmware
> 
> I'm not very proficient at debugging Linux so not sure how much more I can help 
> to narrow down the issue.
> But maybe a description of my experience will help: the WiFi icon always shows 
> the laptop as connected to the router with a perfect signal.
> Sometimes enabling and immediately disabling Airplane mode fixes the issue 
> (probably due to a restart of the module?), and the issue is more likely to come 
> up after waking up from sleep.
> 
> I will try to roll back the kernel to 5.15, see if that fixes the issue, and 
> report back.
> 
> If I can be of any other assistance please let me know.

George,

I do not know of any regression in 5.16 with regard to the driver for RTL8822CE. 
Certainly, I saw no regressions in my testing of that driver from before it was 
in the kernel up to the present. That said, I can only comment on the user-space 
part of openSUSE Tumbleweed, which is probably not your distro of choice.

Are you using the drivers at https://GitHub.com/lwfinger/rtw88.git rather than 
the ones in the kernel? Your posted errors that refer to rtw_8822ce indicate 
that to be true. If the drivers came from the kernel, the reference would be to 
rtw88_8822ce! If so, do a 'git pull' to get the drivers updated to match the 
code in kernel 5.18. A lot of things have been fixed.

In your system, please do a 'lsmod | grep rtw'. If any items refer to rtw88_*, 
you have mixed drivers loaded. In that case, you should blacklist the rtw88_* 
driver.
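
One way to read the lsmod output for this check is sketched below in Python.
It assumes the module name is the first column of each lsmod line and simply
groups names by prefix (rtw88_ versus rtw_). The function name and the sample
text are illustrative only; the sample is not output from any real system.

```python
def split_rtw_modules(lsmod_text):
    """Return (rtw88_names, other_rtw_names) from lsmod-style text."""
    rtw88, other = [], []
    for line in lsmod_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        name = fields[0]  # first column is the module name
        if name.startswith("rtw88_"):
            rtw88.append(name)
        elif name.startswith("rtw_"):
            other.append(name)
    return rtw88, other


# Hypothetical sample for illustration, not real output from any system.
SAMPLE = """\
Module                  Size  Used by
rtw88_pci              28672  0
rtw_pci                28672  0
mac80211             1175552  0
"""

rtw88, other = split_rtw_modules(SAMPLE)
print("rtw88_* modules:", rtw88)
print("other rtw_* modules:", other)
print("both families loaded:", bool(rtw88 and other))
```

If both lists come back non-empty, modules from two different sources are
loaded at the same time.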

Larry

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-04-05 15:03 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CAO_iuKEL8tHnovpGiQGUxg7JUpZFxHpxhOHbqAMgbt5R4Eftgg@mail.gmail.com>
2022-02-15  8:25 ` rtw_8822ce wifi regression after kernel update from 5.15 to 5.16 Thorsten Leemhuis
2022-02-28 14:30   ` Thorsten Leemhuis
2022-02-28 22:07     ` Larry Finger
2022-03-04  6:33       ` Thorsten Leemhuis
2022-03-04 14:45         ` Nico Sneck
2022-03-07  7:39           ` Kalle Valo
2022-03-16 10:14             ` Thorsten Leemhuis
2022-03-16 17:50               ` Nico Sneck
     [not found] <CAFPFaMLHXhHMhuAuvXWHb3c-tX_9qRxsquEUHXY0fMxh_VsKtw@mail.gmail.com>
2022-04-04 14:49 ` Larry Finger
     [not found]   ` <CAFPFaMKpwmGqc_Cm1fv4psR6m+waax6YZO2ugOPhgmnG4mGJ4A@mail.gmail.com>
2022-04-05 15:02     ` Larry Finger
