linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "tip-bot2 for Thomas Gleixner" <tip-bot2@linutronix.de>
To: linux-tip-commits@vger.kernel.org
Cc: "Paul E. McKenney" <paulmck@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Andy Lutomirski <luto@kernel.org>,
	Frederic Weisbecker <frederic@kernel.org>, x86 <x86@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: [tip: x86/entry] x86/entry: Force rcu_irq_enter() when in idle task
Date: Fri, 12 Jun 2020 19:50:07 -0000	[thread overview]
Message-ID: <159199140796.16989.8322013716817906984.tip-bot2@tip-bot2> (raw)
In-Reply-To: <87wo4cxubv.fsf@nanos.tec.linutronix.de>

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     0bf3924bfabd13ba21aa702344fc00b3b3263e5a
Gitweb:        https://git.kernel.org/tip/0bf3924bfabd13ba21aa702344fc00b3b3263e5a
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Fri, 12 Jun 2020 15:55:00 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 12 Jun 2020 21:36:33 +02:00

x86/entry: Force rcu_irq_enter() when in idle task

The idea of conditionally calling into rcu_irq_enter() only when RCU is
not watching turned out to be not completely thought through.

Paul noticed occasional premature end of grace periods in RCU torture
testing. Bisection led to the commit which made the invocation of
rcu_irq_enter() conditional on !rcu_is_watching().

It turned out that this conditional breaks RCU assumptions about the idle
task when the scheduler tick happens to be a nested interrupt. Nested
interrupts can happen when the first interrupt invokes softirq processing
on return which enables interrupts.

If that nested tick interrupt does not invoke rcu_irq_enter() then the
RCU's irq-nesting checks will believe that this interrupt came directly
from idle, which will cause RCU to report a quiescent state.  Because this
interrupt instead came from a softirq handler which might have been
executing an RCU read-side critical section, this can cause the grace
period to end prematurely.

Change the condition from !rcu_is_watching() to is_idle_task(current) which
enforces that interrupts in the idle task unconditionally invoke
rcu_irq_enter() independent of the RCU state.

This is also correct vs. user mode entries in NOHZ full scenarios because
user mode entries bring RCU out of EQS and force the RCU irq nesting state
accounting to nested. As only the first interrupt can enter from user mode
a nested tick interrupt will enter from kernel mode and as the nesting
state accounting is forced to nesting it will not do anything stupid even
if rcu_irq_enter() has not been invoked.

Fixes: 3eeec3858488 ("x86/entry: Provide idtentry_entry/exit_cond_rcu()")
Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: "Paul E. McKenney" <paulmck@kernel.org>
Reviewed-by: "Paul E. McKenney" <paulmck@kernel.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lkml.kernel.org/r/87wo4cxubv.fsf@nanos.tec.linutronix.de

---
 arch/x86/entry/common.c | 35 ++++++++++++++++++++++++++++-------
 1 file changed, 28 insertions(+), 7 deletions(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index f4d5778..bd3f141 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -557,14 +557,34 @@ bool noinstr idtentry_enter_cond_rcu(struct pt_regs *regs)
 		return false;
 	}
 
-	if (!__rcu_is_watching()) {
+	/*
+	 * If this entry hit the idle task invoke rcu_irq_enter() whether
+	 * RCU is watching or not.
+	 *
+	 * Interupts can nest when the first interrupt invokes softirq
+	 * processing on return which enables interrupts.
+	 *
+	 * Scheduler ticks in the idle task can mark quiescent state and
+	 * terminate a grace period, if and only if the timer interrupt is
+	 * not nested into another interrupt.
+	 *
+	 * Checking for __rcu_is_watching() here would prevent the nesting
+	 * interrupt to invoke rcu_irq_enter(). If that nested interrupt is
+	 * the tick then rcu_flavor_sched_clock_irq() would wrongfully
+	 * assume that it is the first interupt and eventually claim
+	 * quiescient state and end grace periods prematurely.
+	 *
+	 * Unconditionally invoke rcu_irq_enter() so RCU state stays
+	 * consistent.
+	 *
+	 * TINY_RCU does not support EQS, so let the compiler eliminate
+	 * this part when enabled.
+	 */
+	if (!IS_ENABLED(CONFIG_TINY_RCU) && is_idle_task(current)) {
 		/*
 		 * If RCU is not watching then the same careful
 		 * sequence vs. lockdep and tracing is required
 		 * as in enter_from_user_mode().
-		 *
-		 * This only happens for IRQs that hit the idle
-		 * loop, i.e. if idle is not using MWAIT.
 		 */
 		lockdep_hardirqs_off(CALLER_ADDR0);
 		rcu_irq_enter();
@@ -576,9 +596,10 @@ bool noinstr idtentry_enter_cond_rcu(struct pt_regs *regs)
 	}
 
 	/*
-	 * If RCU is watching then RCU only wants to check
-	 * whether it needs to restart the tick in NOHZ
-	 * mode.
+	 * If RCU is watching then RCU only wants to check whether it needs
+	 * to restart the tick in NOHZ mode. rcu_irq_enter_check_tick()
+	 * already contains a warning when RCU is not watching, so no point
+	 * in having another one here.
 	 */
 	instrumentation_begin();
 	rcu_irq_enter_check_tick();

  parent reply	other threads:[~2020-06-12 19:50 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-11 23:53 [PATCH RFC] x86/entry: Ask RCU if it needs rcu_irq_{enter,exit}() Paul E. McKenney
2020-06-11 23:54 ` Paul E. McKenney
2020-06-12  5:30 ` Andy Lutomirski
2020-06-12 12:40   ` Thomas Gleixner
2020-06-12 13:55     ` [PATCH x86/entry: Force rcu_irq_enter() when in idle task Thomas Gleixner
2020-06-12 14:26       ` Frederic Weisbecker
2020-06-12 14:47         ` Thomas Gleixner
2020-06-12 15:32       ` Andy Lutomirski
2020-06-12 17:49       ` Paul E. McKenney
2020-06-12 19:19         ` Paul E. McKenney
2020-06-12 19:25           ` Thomas Gleixner
2020-06-12 19:28             ` Andy Lutomirski
2020-06-12 19:34             ` Thomas Gleixner
2020-06-12 21:56               ` Paul E. McKenney
2020-06-12 19:50       ` tip-bot2 for Thomas Gleixner [this message]
2020-06-15 20:16       ` Joel Fernandes
2020-06-16  8:40         ` Thomas Gleixner
2020-06-16 14:30           ` Joel Fernandes
2020-06-16 16:52             ` Andy Lutomirski
2020-06-12  9:27 ` [PATCH RFC] x86/entry: Ask RCU if it needs rcu_irq_{enter,exit}() Thomas Gleixner
2020-06-12 13:57   ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=159199140796.16989.8322013716817906984.tip-bot2@tip-bot2 \
    --to=tip-bot2@linutronix.de \
    --cc=frederic@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tip-commits@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=paulmck@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).