From: "Paul E. McKenney" <paulmck@kernel.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: LKML <linux-kernel@vger.kernel.org>,
rcu@vger.kernel.org, Andrew Lutomirski <luto@kernel.org>,
X86 ML <x86@kernel.org>,
Frederic Weisbecker <frederic@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
Joel Fernandes <joel@joelfernandes.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Will Deacon <will@kernel.org>,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH x86/entry: Force rcu_irq_enter() when in idle task
Date: Fri, 12 Jun 2020 12:19:21 -0700
Message-ID: <20200612191921.GA18255@paulmck-ThinkPad-P72>
In-Reply-To: <20200612174953.GA19188@paulmck-ThinkPad-P72>

On Fri, Jun 12, 2020 at 10:49:53AM -0700, Paul E. McKenney wrote:
> On Fri, Jun 12, 2020 at 03:55:00PM +0200, Thomas Gleixner wrote:
> > The idea of conditionally calling into rcu_irq_enter() only when RCU is
> > not watching turned out to be not completely thought through.
> >
> > Paul noticed occasional premature end of grace periods in RCU torture
> > testing. Bisection led to the commit which made the invocation of
> > rcu_irq_enter() conditional on !rcu_is_watching().
> >
> > It turned out that this conditional breaks RCU assumptions about the idle
> > task when the scheduler tick happens to be a nested interrupt. Nested
> > interrupts can happen when the first interrupt invokes softirq processing
> > on return which enables interrupts. If that nested tick interrupt does not
> > invoke rcu_irq_enter() then the nest accounting in RCU claims that this is
> > the first interrupt which might mark a quiescent state and end grace
> > periods prematurely.
>
> For this last sentence, how about the following?
>
> If that nested tick interrupt does not invoke rcu_irq_enter() then
> RCU's irq-nesting checks will believe that this interrupt came directly
> from idle, which will cause RCU to report a quiescent state. Because
> this interrupt instead came from a softirq handler which might have
> been executing an RCU read-side critical section, this can cause the
> grace period to end prematurely.
>
> > Change the condition from !rcu_is_watching() to is_idle_task(current) which
> > enforces that interrupts in the idle task unconditionally invoke
> > rcu_irq_enter() independent of the RCU state.
> >
> > This is also correct vs. user mode entries in NOHZ full scenarios because
> > user mode entries bring RCU out of EQS and force the RCU irq nesting state
> > accounting to nested. As only the first interrupt can enter from user mode
> > a nested tick interrupt will enter from kernel mode and as the nesting
> > state accounting is forced to nesting it will not do anything stupid even
> > if rcu_irq_enter() has not been invoked.
>
> On the testing front, just like with my busted patch yesterday, this
> patch breaks the TASKS03 rcutorture scenario by preventing the Tasks
> RCU grace periods from ever completing. However, this is an unusual
> configuration with NO_HZ_FULL and one CPU actually being nohz_full.
> The more conventional TASKS01 and TASKS02 scenarios do just fine.
>
> I will therefore address this issue in a follow-on patch.

I should add that -your- patch from yesterday did -not- cause this
problem, in case that is of interest.

Thanx, Paul

> > Fixes: 3eeec3858488 ("x86/entry: Provide idtentry_entry/exit_cond_rcu()")
> > Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>
> Reviewed-by: "Paul E. McKenney" <paulmck@kernel.org>
> Tested-by: "Paul E. McKenney" <paulmck@kernel.org>
>
> > ---
> > arch/x86/entry/common.c | 35 ++++++++++++++++++++++++++++-------
> > 1 file changed, 28 insertions(+), 7 deletions(-)
> > --- a/arch/x86/entry/common.c
> > +++ b/arch/x86/entry/common.c
> > @@ -557,14 +557,34 @@ bool noinstr idtentry_enter_cond_rcu(str
> > return false;
> > }
> >
> > - if (!__rcu_is_watching()) {
> > + /*
> > + * If this entry hit the idle task invoke rcu_irq_enter() whether
> > + * RCU is watching or not.
> > + *
> > + * Interrupts can nest when the first interrupt invokes softirq
> > + * processing on return which enables interrupts.
> > + *
> > + * Scheduler ticks in the idle task can mark quiescent state and
> > + * terminate a grace period, if and only if the timer interrupt is
> > + * not nested into another interrupt.
> > + *
> > + * Checking for __rcu_is_watching() here would prevent the nested
> > + * interrupt from invoking rcu_irq_enter(). If that nested interrupt is
> > + * the tick then rcu_flavor_sched_clock_irq() would wrongfully
> > + * assume that it is the first interrupt and eventually claim
> > + * quiescent state and end grace periods prematurely.
> > + *
> > + * Unconditionally invoke rcu_irq_enter() so RCU state stays
> > + * consistent.
> > + *
> > + * TINY_RCU does not support EQS, so let the compiler eliminate
> > + * this part when enabled.
> > + */
> > + if (!IS_ENABLED(CONFIG_TINY_RCU) && is_idle_task(current)) {
> > /*
> > * If RCU is not watching then the same careful
> > * sequence vs. lockdep and tracing is required
> > * as in enter_from_user_mode().
> > - *
> > - * This only happens for IRQs that hit the idle
> > - * loop, i.e. if idle is not using MWAIT.
> > */
> > lockdep_hardirqs_off(CALLER_ADDR0);
> > rcu_irq_enter();
> > @@ -576,9 +596,10 @@ bool noinstr idtentry_enter_cond_rcu(str
> > }
> >
> > /*
> > - * If RCU is watching then RCU only wants to check
> > - * whether it needs to restart the tick in NOHZ
> > - * mode.
> > + * If RCU is watching then RCU only wants to check whether it needs
> > + * to restart the tick in NOHZ mode. rcu_irq_enter_check_tick()
> > + * already contains a warning when RCU is not watching, so no point
> > + * in having another one here.
> > */
> > instrumentation_begin();
> > rcu_irq_enter_check_tick();
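
For readers following the nesting argument in the changelog, here is a
minimal user-space sketch of the failure mode. It is a toy model, not
kernel code: every name in it (toy_rcu, toy_irq_entry() and so on) is
hypothetical, and the single counter incremented by one is a deliberate
simplification of RCU's real nesting accounting. It only assumes what the
changelog states: the tick treats itself as having interrupted idle when
RCU's accounting says it is the first interrupt.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-CPU state; a simplification, not the kernel's rcu_data. */
struct toy_rcu {
	int irq_nesting;	/* interrupt levels RCU has been told about */
	bool watching;		/* false while in an extended quiescent state */
};

enum entry_policy {
	ENTER_IF_NOT_WATCHING,	/* condition before the patch */
	ENTER_IF_IDLE_TASK,	/* condition after the patch */
};

/* Stand-in for rcu_irq_enter(): leave EQS and count one more level. */
static void toy_rcu_irq_enter(struct toy_rcu *r)
{
	r->watching = true;
	r->irq_nesting++;
}

/* Stand-in for the interrupt entry path under a given policy. */
static void toy_irq_entry(struct toy_rcu *r, enum entry_policy p,
			  bool idle_task)
{
	bool call_enter = (p == ENTER_IF_NOT_WATCHING) ? !r->watching
						       : idle_task;

	if (call_enter)
		toy_rcu_irq_enter(r);
}

/* The tick believes it interrupted idle only at nesting level one. */
static bool toy_tick_thinks_from_idle(const struct toy_rcu *r)
{
	return r->irq_nesting == 1;
}

/*
 * Idle task takes a first interrupt, softirq processing on return
 * re-enables interrupts, then the tick arrives nested. Returns what
 * the nested tick concludes about where it came from.
 */
static bool toy_nested_tick_from_idle(enum entry_policy p)
{
	struct toy_rcu r = { .irq_nesting = 0, .watching = false };

	toy_irq_entry(&r, p, true);	/* first interrupt hits idle task */
	assert(r.irq_nesting == 1 && r.watching);
	toy_irq_entry(&r, p, true);	/* nested tick during softirq */
	return toy_tick_thinks_from_idle(&r);
}
```

In this model, toy_nested_tick_from_idle(ENTER_IF_NOT_WATCHING) returns
true, because the nested tick skips the enter call and the counter stays
at one, which is the premature quiescent state described above.
toy_nested_tick_from_idle(ENTER_IF_IDLE_TASK) returns false, because the
nested entry is counted.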