linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: linux-kernel@vger.kernel.org, rcu@vger.kernel.org
Cc: tglx@linutronix.de, luto@kernel.org, x86@kernel.org,
	frederic@kernel.org, rostedt@goodmis.org, joel@joelfernandes.org,
	mathieu.desnoyers@efficios.com, will@kernel.org,
	peterz@infradead.org
Subject: Re: [PATCH RFC] x86/entry: Ask RCU if it needs rcu_irq_{enter,exit}()
Date: Thu, 11 Jun 2020 16:54:23 -0700	[thread overview]
Message-ID: <20200611235423.GA32454@paulmck-ThinkPad-P72> (raw)
In-Reply-To: <20200611235305.GA32342@paulmck-ThinkPad-P72>

On Thu, Jun 11, 2020 at 04:53:05PM -0700, Paul E. McKenney wrote:
> RCU needs to detect when one if its interrupt handlers interrupted an idle
> state, where an idle state is either the idle loop itself or nohz_full
> userspace execution.  When a CPU has been interrupted from one of these
> idle states, RCU can report a quiescent state, helping the current grace
> period to come to an end.
> 
> However, there are optimized kernel-entry paths where handlers that
> would normally be executed in interrupt context are instead executed
> directly from the base process context, but with interrupts disabled.
> When such a directly-executed handler is followed by softirq processing
> (which re-enables interrupts), it is possible that one of RCU's interrupt
> handlers will interrupt this softirq processing.  This situation can cause
> RCU to incorrectly assume that the CPU has passed through a quiescent
> state, when in fact the CPU is instead in the midst of softirq processing,
> and might well be within an RCU read-side critical section.  In that case,
> reporting a quiescent state can result in too-short RCU grace periods,
> leading to arbitrary memory corruption and a sharp degradation in the
> actuarial statistics of your kernel.
> 
> The fix is for the optimized kernel-entry paths to replace the current
> call to __rcu_is_watching() with a call to a new rcu_needs_irq_enter()
> function, which returns true iff RCU needs explicit calls to
> rcu_irq_enter() and rcu_irq_exit() surrounding the optimized invocation
> of the handler.  These explicit calls are never required in Tiny RCU,
> and in Tree RCU are required only if the CPU might have interrupted
> nohz_full userspace execution or the idle loop.  There is the usual
> precision/overhead tradeoff, with the current implementation majoring
> in low common-case overhead.
> 
> While in the area, the commit also returns rcu_is_cpu_rrupt_from_idle()
> to its original semantics.
> 
> This has been subjected only to very light rcutorture testing, so use
> appropriate caution.  The placement of the new rcu_needs_irq_enter()
> is not ideal, but the more natural include/linux/hardirq.h location has
> #include issues.

And if you want more details on how I got to this patch, please see below.

							Thanx, Paul

------------------------------------------------------------------------

Thomas supplied a patch and suggested that there be an RCU-supplied
rcu_needs_irq_enter() function that says whether the full
rcu_irq_enter()/rcu_irq_exit() dance is required.  The function needing
the dance is rcu_is_cpu_rrupt_from_idle().

Assumptions:

o	It would be simpler for idtentry_enter_cond_rcu() to check a
	new rcu_needs_irq_enter() function than to do a fragile check
	of "!__rcu_is_watching() || is_idle_task(current)".  Note also
	that this check does not account for expedited grace periods
	interacting with softirq handlers having interrupted nohz_full
	userspace execution.

	This assumption seems likely to hold.

o	If CONFIG_CONTEXT_TRACKING=y, assume that rcu_user_enter()
	and rcu_user_exit() might be invoked on any CPU.  Query in to
	Frederic on whether this can be more selective.

Functions of interest:

o	rcu_is_cpu_rrupt_from_idle().  See below.

o	__rcu_is_watching().  The only call to this will likely
	be eliminated.  If so, it can be removed.

o	idtentry_enter_cond_rcu().  Replace __rcu_is_watching()
	check with a check of rcu_needs_irq_enter().

o	idtentry_exit_cond_rcu().  No change.

o	rcu_irq_enter_check_tick().  Turn on tick for nohz_full
	CPUs when required.

o	rcu_irq_exit_check_preempt().  Straight lockdep validation.

Use cases for rcu_is_cpu_rrupt_from_idle():

o	rcu_sched_clock_irq(): If not interrupted from idle, need to
	ask the scheduler for help for ->urgent_qs request.

o	rcu_pending(): If a nohz_full CPU is interrupted from idle,
	don't raise_softirq() it.  Instead, let the grace-period
	kthread detect the quiescent state.

o	rcu_exp_handler() for PREEMPT_NONE kernels:  Directly report
	a quiescent state if interrupted from idle.

o	rcu_flavor_sched_clock_irq for PREEMPT kernels:  Report a
	voluntary context switch if interrupted from idle.  Here "idle"
	includes still in the kernel but on the way to/from nohz_full
	userspace execution.

o	rcu_flavor_sched_clock_irq for PREEMPT_NONE kernels:  Report a
	quiescent state if interrupted from idle.
	
In all of the above cases for NO_HZ_FULL kernels, "idle" includes still
in the kernel but on the way to/from nohz_full userspace execution.

Information available to rcu_needs_irq_enter():

o	IS_ENABLED(CONFIG_CONTEXT_TRACKING), which indicates that
	nohz_full userspace execution is possible, and that some
	CPUs might be invoking rcu_idle_enter() and rcu_idle_exit().

	There is also tick_nohz_full_cpu(), but it is not clear that if
	this returns false that the corresponding CPU is guaranteed not
	to be invoking rcu_idle_enter() and rcu_idle_exit().

o	is_idle_task(current), which returns true if the current task
	is an idle task.  Such tasks always need to execute
	rcu_irq_enter() and rcu_irq_exit().  Or, if instrumentation
	is prohibited, rcu_nmi_enter() and rcu_nmi_exit().

o	rdp->dynticks_nesting:	If non-zero, we are in process context,
	and don't need rcu_irq_enter() or rcu_irq_exit() regardless
	of other state.  But this requires that rcu_needs_irq_enter()
	be defined within Tree RCU, so it is not necessarily a win.

->	The simple approach is for rcu_needs_irq_enter() to return:

	!IS_ENABLED(CONFIG_TINY_RCU) &&
	(IS_ENABLED(CONFIG_CONTEXT_TRACKING) || is_idle_task(current))

	Except that Frederic points out context_tracking_enabled_cpu():

	!IS_ENABLED(CONFIG_TINY_RCU) &&
	(context_tracking_enabled_cpu(smp_processor_id()) || is_idle_task(current))

o	As a result, rcu_needs_irq_enter() might need to be defined
	outside of RCU to allow inlining and to avoid #include hell.
	One candidate location is include/linux/hardirq.h, the same
	place that rcu_irq_enter_check_tick() is defined.

  reply	other threads:[~2020-06-11 23:54 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-11 23:53 [PATCH RFC] x86/entry: Ask RCU if it needs rcu_irq_{enter,exit}() Paul E. McKenney
2020-06-11 23:54 ` Paul E. McKenney [this message]
2020-06-12  5:30 ` Andy Lutomirski
2020-06-12 12:40   ` Thomas Gleixner
2020-06-12 13:55     ` [PATCH x86/entry: Force rcu_irq_enter() when in idle task Thomas Gleixner
2020-06-12 14:26       ` Frederic Weisbecker
2020-06-12 14:47         ` Thomas Gleixner
2020-06-12 15:32       ` Andy Lutomirski
2020-06-12 17:49       ` Paul E. McKenney
2020-06-12 19:19         ` Paul E. McKenney
2020-06-12 19:25           ` Thomas Gleixner
2020-06-12 19:28             ` Andy Lutomirski
2020-06-12 19:34             ` Thomas Gleixner
2020-06-12 21:56               ` Paul E. McKenney
2020-06-12 19:50       ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
2020-06-15 20:16       ` [PATCH " Joel Fernandes
2020-06-16  8:40         ` Thomas Gleixner
2020-06-16 14:30           ` Joel Fernandes
2020-06-16 16:52             ` Andy Lutomirski
2020-06-12  9:27 ` [PATCH RFC] x86/entry: Ask RCU if it needs rcu_irq_{enter,exit}() Thomas Gleixner
2020-06-12 13:57   ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200611235423.GA32454@paulmck-ThinkPad-P72 \
    --to=paulmck@kernel.org \
    --cc=frederic@kernel.org \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=peterz@infradead.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).