All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: linux-kernel@vger.kernel.org, rcu@vger.kernel.org
Cc: tglx@linutronix.de, luto@kernel.org, x86@kernel.org,
	frederic@kernel.org, rostedt@goodmis.org, joel@joelfernandes.org,
	mathieu.desnoyers@efficios.com, will@kernel.org,
	peterz@infradead.org
Subject: Re: [PATCH RFC] x86/entry: Ask RCU if it needs rcu_irq_{enter,exit}()
Date: Thu, 11 Jun 2020 16:54:23 -0700	[thread overview]
Message-ID: <20200611235423.GA32454@paulmck-ThinkPad-P72> (raw)
In-Reply-To: <20200611235305.GA32342@paulmck-ThinkPad-P72>

On Thu, Jun 11, 2020 at 04:53:05PM -0700, Paul E. McKenney wrote:
> RCU needs to detect when one if its interrupt handlers interrupted an idle
> state, where an idle state is either the idle loop itself or nohz_full
> userspace execution.  When a CPU has been interrupted from one of these
> idle states, RCU can report a quiescent state, helping the current grace
> period to come to an end.
> 
> However, there are optimized kernel-entry paths where handlers that
> would normally be executed in interrupt context are instead executed
> directly from the base process context, but with interrupts disabled.
> When such a directly-executed handler is followed by softirq processing
> (which re-enables interrupts), it is possible that one of RCU's interrupt
> handlers will interrupt this softirq processing.  This situation can cause
> RCU to incorrectly assume that the CPU has passed through a quiescent
> state, when in fact the CPU is instead in the midst of softirq processing,
> and might well be within an RCU read-side critical section.  In that case,
> reporting a quiescent state can result in too-short RCU grace periods,
> leading to arbitrary memory corruption and a sharp degradation in the
> actuarial statistics of your kernel.
> 
> The fix is for the optimized kernel-entry paths to replace the current
> call to __rcu_is_watching() with a call to a new rcu_needs_irq_enter()
> function, which returns true iff RCU needs explicit calls to
> rcu_irq_enter() and rcu_irq_exit() surrounding the optimized invocation
> of the handler.  These explicit calls are never required in Tiny RCU,
> and in Tree RCU are required only if the CPU might have interrupted
> nohz_full userspace execution or the idle loop.  There is the usual
> precision/overhead tradeoff, with the current implementation majoring
> in low common-case overhead.
> 
> While in the area, the commit also returns rcu_is_cpu_rrupt_from_idle()
> to its original semantics.
> 
> This has been subjected only to very light rcutorture testing, so use
> appropriate caution.  The placement of the new rcu_needs_irq_enter()
> is not ideal, but the more natural include/linux/hardirq.h location has
> #include issues.

And if you want more details on how I got to this patch, please see below.

							Thanx, Paul

------------------------------------------------------------------------

Thomas supplied a patch and suggested that there be an RCU-supplied
rcu_needs_irq_enter() function that says whether the full
rcu_irq_enter()/rcu_irq_exit() dance is required.  The function needing
the dance is rcu_is_cpu_rrupt_from_idle().

Assumptions:

o	It would be simpler for idtentry_enter_cond_rcu() to check a
	new rcu_needs_irq_enter() function than to do a fragile check
	of "!__rcu_is_watching() || is_idle_task(current)".  Note also
	that this check does not account for expedited grace periods
	interacting with softirq handlers having interrupted nohz_full
	userspace execution.

	This assumption seems likely to hold.

o	If CONFIG_CONTEXT_TRACKING=y, assume that rcu_user_enter()
	and rcu_user_exit() might be invoked on any CPU.  Query in to
	Frederic on whether this can be more selective.

Functions of interest:

o	rcu_is_cpu_rrupt_from_idle().  See below.

o	__rcu_is_watching().  The only call to this will likely
	be eliminated.  If so, it can be removed.

o	idtentry_enter_cond_rcu().  Replace __rcu_is_watching()
	check with a check of rcu_needs_irq_enter().

o	idtentry_exit_cond_rcu().  No change.

o	rcu_irq_enter_check_tick().  Turn on tick for nohz_full
	CPUs when required.

o	rcu_irq_exit_check_preempt().  Straight lockdep validation.

Use cases for rcu_is_cpu_rrupt_from_idle():

o	rcu_sched_clock_irq(): If not interrupted from idle, need to
	ask the scheduler for help for ->urgent_qs request.

o	rcu_pending(): If a nohz_full CPU is interrupted from idle,
	don't raise_softirq() it.  Instead, let the grace-period
	kthread detect the quiescent state.

o	rcu_exp_handler() for PREEMPT_NONE kernels:  Directly report
	a quiescent state if interrupted from idle.

o	rcu_flavor_sched_clock_irq for PREEMPT kernels:  Report a
	voluntary context switch if interrupted from idle.  Here "idle"
	includes still in the kernel but on the way to/from nohz_full
	userspace execution.

o	rcu_flavor_sched_clock_irq for PREEMPT_NONE kernels:  Report a
	quiescent state if interrupted from idle.
	
In all of the above cases for NO_HZ_FULL kernels, "idle" includes still
in the kernel but on the way to/from nohz_full userspace execution.

Information available to rcu_needs_irq_enter():

o	IS_ENABLED(CONFIG_CONTEXT_TRACKING), which indicates that
	nohz_full userspace execution is possible, and that some
	CPUs might be invoking rcu_idle_enter() and rcu_idle_exit().

	There is also tick_nohz_full_cpu(), but it is not clear that if
	this returns false that the corresponding CPU is guaranteed not
	to be invoking rcu_idle_enter() and rcu_idle_exit().

o	is_idle_task(current), which returns true if the current task
	is an idle task.  Such tasks always need to execute
	rcu_irq_enter() and rcu_irq_exit().  Or, if instrumentation
	is prohibited, rcu_nmi_enter() and rcu_nmi_exit().

o	rdp->dynticks_nesting:	If non-zero, we are in process context,
	and don't need rcu_irq_enter() or rcu_irq_exit() regardless
	of other state.  But this requires that rcu_needs_irq_enter()
	be defined within Tree RCU, so it is not necessarily a win.

->	The simple approach is for rcu_needs_irq_enter() to return:

	!IS_ENABLED(CONFIG_TINY_RCU) &&
	(IS_ENABLED(CONFIG_CONTEXT_TRACKING) || is_idle_task(current))

	Except that Frederic points out context_tracking_enabled_cpu():

	!IS_ENABLED(CONFIG_TINY_RCU) &&
	(context_tracking_enabled_cpu(smp_processor_id()) || is_idle_task(current))

o	As a result, rcu_needs_irq_enter() might need to be defined
	outside of RCU to allow inlining and to avoid #include hell.
	One candidate location is include/linux/hardirq.h, the same
	place that rcu_irq_enter_check_tick() is defined.

  reply	other threads:[~2020-06-11 23:54 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-11 23:53 [PATCH RFC] x86/entry: Ask RCU if it needs rcu_irq_{enter,exit}() Paul E. McKenney
2020-06-11 23:54 ` Paul E. McKenney [this message]
2020-06-12  5:30 ` Andy Lutomirski
2020-06-12 12:40   ` Thomas Gleixner
2020-06-12 13:55     ` [PATCH x86/entry: Force rcu_irq_enter() when in idle task Thomas Gleixner
2020-06-12 14:26       ` Frederic Weisbecker
2020-06-12 14:47         ` Thomas Gleixner
2020-06-12 15:32       ` Andy Lutomirski
2020-06-12 17:49       ` Paul E. McKenney
2020-06-12 19:19         ` Paul E. McKenney
2020-06-12 19:25           ` Thomas Gleixner
2020-06-12 19:28             ` Andy Lutomirski
2020-06-12 19:34             ` Thomas Gleixner
2020-06-12 21:56               ` Paul E. McKenney
2020-06-12 19:50       ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
2020-06-15 20:16       ` [PATCH " Joel Fernandes
2020-06-16  8:40         ` Thomas Gleixner
2020-06-16 14:30           ` Joel Fernandes
2020-06-16 16:52             ` Andy Lutomirski
2020-06-12  9:27 ` [PATCH RFC] x86/entry: Ask RCU if it needs rcu_irq_{enter,exit}() Thomas Gleixner
2020-06-12 13:57   ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200611235423.GA32454@paulmck-ThinkPad-P72 \
    --to=paulmck@kernel.org \
    --cc=frederic@kernel.org \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=peterz@infradead.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.