linux-kernel.vger.kernel.org archive mirror
From: Alexander Wetzel <alexander@wetzel-home.de>
To: Thorsten Leemhuis <regressions@leemhuis.info>,
	Johannes Berg <johannes@sipsolutions.net>
Cc: "regressions@lists.linux.dev" <regressions@lists.linux.dev>,
	"linux-wireless@vger.kernel.org" <linux-wireless@vger.kernel.org>,
	netdev <netdev@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	misac1987@gmail.com
Subject: Re: [Regression] Bug 216672 - soft lockup in ieee80211_select_queue -- system freezing random time on msi laptop
Date: Mon, 14 Nov 2022 22:38:29 +0100
Message-ID: <00e8e836-7a5e-3c65-b09b-b1e71d79a6c6@wetzel-home.de>
In-Reply-To: <83b28e2d-7af7-f91a-7e67-7f224bcf0557@leemhuis.info>

On 13.11.22 09:22, Thorsten Leemhuis wrote:
> Hi, this is your Linux kernel regression tracker speaking.
> 
> I noticed a slightly vague regression report in bugzilla.kernel.org. As
> many (most?) kernel developers don't keep an eye on it, I decided to
> forward it by mail. Quoting from
> https://bugzilla.kernel.org/show_bug.cgi?id=216672 :
> 

I've tried to extrapolate from the info in the mail and the ticket to get
something we can work with. But the result is insane: The CPU can't get
stuck where the trace claims it does. Not without some really strange and
unlikely HW defect.

Based on the loaded modules, the issue must be with the rtl8723ae card
and - according to the bug content - affects at least kernels 5.19
and 6.0.6. (The driver does not support wake_tx_queue in 6.0.6.)

The core error message from a 6.0.6 (Ubuntu?) kernel is:
   watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [ksoftirqd/1:23]
   RIP: 0010:ieee80211_select_queue+0x1b/0x110 [mac80211]
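
To compare such lines across several traces, the RIP line can be split
into symbol, offset, function size and module. A minimal sketch, assuming
the layout shown above; struct rip_info and parse_rip() are hypothetical
names for illustration, not part of any kernel tool:

```c
#include <stdio.h>

/* Hypothetical helper type: the pieces of an oops "RIP:" line. */
struct rip_info {
	char symbol[64];
	char module[32];
	unsigned long offset;	/* offset of the instruction in the function */
	unsigned long size;	/* total size of the function */
};

/*
 * Returns 0 on success, -1 if the line does not match the assumed
 * "RIP: <seg>:<symbol>+0x<off>/0x<size> [<module>]" layout.
 */
static int parse_rip(const char *line, struct rip_info *out)
{
	unsigned int seg;
	int n = sscanf(line, " RIP: %x:%63[^+]+%lx/%lx [%31[^]]]",
		       &seg, out->symbol, &out->offset, &out->size,
		       out->module);

	return n == 5 ? 0 : -1;
}
```

For the line above this yields offset 0x1b in a function of size 0x110,
so the reported instruction sits 27 bytes into a 272 byte function, i.e.
very close to its start.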

According to the trace history and the identified driver, the problematic
softirq should be a scheduled run of _rtl_pci_irq_tasklet().
It looks like an RX packet triggered a TCP RST reply, which then
triggered the issue.

I then checked the reference to ieee80211_select_queue+0x1b against a
Gentoo 6.0.6 mac80211 module.

At least in my build, that's the local->ops->wake_tx_queue *check* in
ieee80211_select_queue(). Which of course does not make any sense, unless
some fundamental assumption is wrong...

        struct sta_info *sta = NULL;
        const u8 *ra = NULL;
        u16 ret;

        /* when using iTXQ, we can do this later */
        if (local->ops->wake_tx_queue)
                return 0;


Now, my module surely differs from the original build, but
ieee80211_select_queue() looks pretty harmless:
There is no obvious way we can get stuck in there...

CPU broken? Strange compiler bug?
Some stupid error on my side reading the trace?

Do all the traces look the same? Any other strange errors on the system?

And can you verify that the error is indeed a regression by going back
to a kernel "known" to be unaffected in the past?

The other extreme would be to try the wireless development kernel
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-testing.git
and hope that it shows a more sane problem.
(ieee80211_select_queue() has been dropped there, changing the TX flow
drastically compared to 6.0.6.)

In short, I'm also stuck on what this could be. We can try some different
angles and hope to hit something.


Alexander

