From: <paul.gortmaker@windriver.com>
To: LKML <linux-kernel@vger.kernel.org>,
	linux-rt-users <linux-rt-users@vger.kernel.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>,
	Wen Yang <wenyang.linux@foxmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	Frederic Weisbecker <frederic@kernel.org>
Subject: [PATCH] tick/rcu: fix false positive "softirq work is pending" messages on RT
Date: Fri, 18 Aug 2023 16:07:57 -0400
Message-ID: <20230818200757.1808398-1-paul.gortmaker@windriver.com>

From: Paul Gortmaker <paul.gortmaker@windriver.com>

In commit 0345691b24c0 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle")
the new function report_idle_softirq() was created by breaking code out
of the existing can_stop_idle_tick() for kernels v5.18 and newer.

In doing so, the code essentially went from a single conditional:

	if (a && b && c)
		warn();

to three separate conditionals:

	if (!a)
		return;
	if (!b)
		return;
	if (!c)
		return;
	warn();

However, it seems one of the conditionals kept a "!" that should have
been removed when its test was inverted into an early return.
Compare the instance of local_bh_blocked() in the old code:

-               if (ratelimit < 10 && !local_bh_blocked() &&
-                   (local_softirq_pending() & SOFTIRQ_STOP_IDLE_MASK)) {
-                       pr_warn("NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #%02x!!!\n",
-                               (unsigned int) local_softirq_pending());
-                       ratelimit++;
-               }

...to the usage in the new (5.18+) code:

+       /* On RT, softirqs handling may be waiting on some lock */
+       if (!local_bh_blocked())
+               return false;

It seems apparent that the "!" should be removed from the new code.

This issue lay dormant until another fixup for the same commit was added
in commit a7e282c77785 ("tick/rcu: Fix bogus ratelimit condition").
That commit recognized that the ratelimit was effectively set to zero
instead of ten, and hence *no* softirq pending messages would ever be
issued.

Once the ratelimit fix was backported via linux-stable, both the v6.1
and v6.4 preempt-rt kernels started printing 10 instances of this at
boot:

  NOHZ tick-stop error: local softirq work is pending, handler #80!!!

Just to double-check my understanding of things, I confirmed that
v5.18-rt did print the pending-80 messages with a cherry-pick of the
ratelimit fix, and then confirmed no pending softirq messages were
printed with a revert of mainline's 0345691b24c0 on a v5.18-rt baseline.

Finally, I confirmed that this patch fixes the issue on v6.1-rt and
v6.4-rt, and that it doesn't break anything on a defconfig build of
today's mainline master.

Fixes: 0345691b24c0 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle")
Cc: Wen Yang <wenyang.linux@foxmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 2b865cb77feb..b52e1861b913 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -1050,7 +1050,7 @@ static bool report_idle_softirq(void)
 		return false;
 
 	/* On RT, softirqs handling may be waiting on some lock */
-	if (!local_bh_blocked())
+	if (local_bh_blocked())
 		return false;
 
 	pr_warn("NOHZ tick-stop error: local softirq work is pending, handler #%02x!!!\n",
-- 
2.40.0


Thread overview: 8+ messages
2023-08-18 20:07 paul.gortmaker [this message]
2023-08-20 17:23 ` [PATCH] tick/rcu: fix false positive "softirq work is pending" messages on RT Wen Yang
2023-08-21 22:03   ` Paul E. McKenney
2023-08-28 15:03     ` Frederic Weisbecker
2023-08-31 13:32       ` Sebastian Andrzej Siewior
2023-09-01  9:56         ` Thomas Gleixner
2023-08-24 16:00 ` Ahmad Fatoum
2023-08-30 10:30 ` [tip: timers/urgent] tick/rcu: Fix false positive "softirq work is pending" messages tip-bot2 for Paul Gortmaker
