linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: tglx@linutronix.de, frederic@kernel.org,
	linux-kernel@vger.kernel.org, x86@kernel.org, cai@lca.pw,
	mgorman@techsingularity.net, joel@joelfernandes.org,
	valentin.schneider@arm.com
Subject: Re: [RFC][PATCH 4/7] smp: Optimize send_call_function_single_ipi()
Date: Thu, 21 Jan 2021 16:20:12 -0800	[thread overview]
Message-ID: <20210122002012.GB2743@paulmck-ThinkPad-P72> (raw)
In-Reply-To: <YAmyVW1r0xQOwneB@hirez.programming.kicks-ass.net>

On Thu, Jan 21, 2021 at 05:56:53PM +0100, Peter Zijlstra wrote:
> On Wed, May 27, 2020 at 07:12:36PM +0200, Peter Zijlstra wrote:
> > Subject: rcu: Allow for smp_call_function() running callbacks from idle
> > 
> > Current RCU hard relies on smp_call_function() callbacks running from
> > interrupt context. A pending optimization is going to break that, it
> > will allow idle CPUs to run the callbacks from the idle loop. This
> > avoids raising the IPI on the requesting CPU and avoids handling an
> > exception on the receiving CPU.
> > 
> > Change rcu_is_cpu_rrupt_from_idle() to also accept task context,
> > provided it is the idle task.
> > 
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > ---
> >  kernel/rcu/tree.c   | 25 +++++++++++++++++++------
> >  kernel/sched/idle.c |  4 ++++
> >  2 files changed, 23 insertions(+), 6 deletions(-)
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index d8e9dbbefcfa..c716eadc7617 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -418,16 +418,23 @@ void rcu_momentary_dyntick_idle(void)
> >  EXPORT_SYMBOL_GPL(rcu_momentary_dyntick_idle);
> >  
> >  /**
> > - * rcu_is_cpu_rrupt_from_idle - see if interrupted from idle
> > + * rcu_is_cpu_rrupt_from_idle - see if 'interrupted' from idle
> >   *
> >   * If the current CPU is idle and running at a first-level (not nested)
> > - * interrupt from idle, return true.  The caller must have at least
> > - * disabled preemption.
> > + * interrupt, or directly, from idle, return true.
> > + *
> > + * The caller must have at least disabled IRQs.
> >   */
> >  static int rcu_is_cpu_rrupt_from_idle(void)
> >  {
> > -	/* Called only from within the scheduling-clock interrupt */
> > -	lockdep_assert_in_irq();
> > +	long nesting;
> > +
> > +	/*
> > +	 * Usually called from the tick; but also used from smp_function_call()
> > +	 * for expedited grace periods. This latter can result in running from
> > +	 * the idle task, instead of an actual IPI.
> > +	 */
> > +	lockdep_assert_irqs_disabled();
> >  
> >  	/* Check for counter underflows */
> >  	RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nesting) < 0,
> > @@ -436,9 +443,15 @@ static int rcu_is_cpu_rrupt_from_idle(void)
> >  			 "RCU dynticks_nmi_nesting counter underflow/zero!");
> >  
> >  	/* Are we at first interrupt nesting level? */
> > -	if (__this_cpu_read(rcu_data.dynticks_nmi_nesting) != 1)
> > +	nesting = __this_cpu_read(rcu_data.dynticks_nmi_nesting);
> > +	if (nesting > 1)
> >  		return false;
> >  
> > +	/*
> > +	 * If we're not in an interrupt, we must be in the idle task!
> > +	 */
> > +	WARN_ON_ONCE(!nesting && !is_idle_task(current));
> > +
> >  	/* Does CPU appear to be idle from an RCU standpoint? */
> >  	return __this_cpu_read(rcu_data.dynticks_nesting) == 0;
> >  }
> 
> Let me revive this thread after yesterdays IRC conversation.
> 
> As said; it might be _extremely_ unlikely, but somewhat possible for us
> to send the IPI concurrent with hot-unplug, not yet observing
> rcutree_offline_cpu() or thereabout.
> 
> Then have the IPI 'delayed' enough to not happen until smpcfd_dying()
> and getting ran there.
> 
> This would then run the function from the stopper thread instead of the
> idle thread and trigger the warning, even though we're not holding
> rcu_read_lock() (which, IIRC, was the only constraint).
> 
> So would something like the below be acceptable?
> 
> ---
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 368749008ae8..2c8d4c3e341e 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -445,7 +445,7 @@ static int rcu_is_cpu_rrupt_from_idle(void)
>  	/*
>  	 * Usually called from the tick; but also used from smp_function_call()
>  	 * for expedited grace periods. This latter can result in running from
> -	 * the idle task, instead of an actual IPI.
> +	 * a (usually the idle) task, instead of an actual IPI.

The story is growing enough hair that we should tell it only once.
So here just where it is called from:

	/*
	 * Usually called from the tick; but also used from smp_function_call()
	 * for expedited grace periods.
	 */

>  	lockdep_assert_irqs_disabled();
>  
> @@ -461,9 +461,14 @@ static int rcu_is_cpu_rrupt_from_idle(void)
>  		return false;
>  
>  	/*
> -	 * If we're not in an interrupt, we must be in the idle task!
> +	 * If we're not in an interrupt, we must be in task context.
> +	 *
> +	 * This will typically be the idle task through:
> +	 *   flush_smp_call_function_from_idle(),
> +	 *
> +	 * but can also be in CPU HotPlug through smpcfd_dying().
>  	 */

Good, but how about like this?

	/*
	 * If we are not in an interrupt handler, we must be in
	 * smp_call_function() handler.
	 *
	 * Normally, smp_call_function() handlers are invoked from
	 * the idle task via flush_smp_call_function_from_idle().
	 * However, they can also be invoked from CPU hotplug
	 * operations via smpcfd_dying().
	 */

> -	WARN_ON_ONCE(!nesting && !is_idle_task(current));
> +	WARN_ON_ONCE(!nesting && !in_task(current));

This is used in time-critical contexts, so why not RCU_LOCKDEP_WARN()?
That should also allow checking more closely.  Would something like the
following work?

	RCU_LOCKDEP_WARN(!nesting && !is_idle_task(current) && (!in_task(current) || !lockdep_cpus_write_held()));

Where lockdep_cpus_write_held is defined in kernel/cpu.c:

void lockdep_cpus_write_held(void)
{
#ifdef CONFIG_PROVE_LOCKING
	if (system_state < SYSTEM_RUNNING)
		return false;
	return lockdep_is_held_type(&cpu_hotplug_lock, 0);
#else
	return false;
#endif
}

Seem reasonable?

							Thanx, Paul

>  	/* Does CPU appear to be idle from an RCU standpoint? */
>  	return __this_cpu_read(rcu_data.dynticks_nesting) == 0;

  reply	other threads:[~2021-01-22  0:21 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-26 16:10 [RFC][PATCH 0/7] Fix the scheduler-IPI mess Peter Zijlstra
2020-05-26 16:10 ` [RFC][PATCH 1/7] sched: Fix smp_call_function_single_async() usage for ILB Peter Zijlstra
2020-05-26 23:56   ` Frederic Weisbecker
2020-05-27 10:23   ` Vincent Guittot
2020-05-27 11:28     ` Frederic Weisbecker
2020-05-27 12:07       ` Vincent Guittot
2020-05-29 15:26   ` Valentin Schneider
2020-06-01  9:52   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2020-06-01 11:40     ` Frederic Weisbecker
2020-05-26 16:10 ` [RFC][PATCH 2/7] smp: Optimize flush_smp_call_function_queue() Peter Zijlstra
2020-05-28 12:28   ` Frederic Weisbecker
2020-06-01  9:52   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2020-05-26 16:11 ` [RFC][PATCH 3/7] smp: Move irq_work_run() out of flush_smp_call_function_queue() Peter Zijlstra
2020-05-29 13:04   ` Frederic Weisbecker
2020-06-01  9:52   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2020-05-26 16:11 ` [RFC][PATCH 4/7] smp: Optimize send_call_function_single_ipi() Peter Zijlstra
2020-05-27  9:56   ` Peter Zijlstra
2020-05-27 10:15     ` Peter Zijlstra
2020-05-27 15:56       ` Paul E. McKenney
2020-05-27 16:35         ` Peter Zijlstra
2020-05-27 17:12           ` Peter Zijlstra
2020-05-27 19:39             ` Paul E. McKenney
2020-05-28  1:35               ` Joel Fernandes
2020-05-28  8:59             ` [tip: core/rcu] rcu: Allow for smp_call_function() running callbacks from idle tip-bot2 for Peter Zijlstra
2021-01-21 16:56             ` [RFC][PATCH 4/7] smp: Optimize send_call_function_single_ipi() Peter Zijlstra
2021-01-22  0:20               ` Paul E. McKenney [this message]
2021-01-22  8:31                 ` Peter Zijlstra
2021-01-22 15:35                   ` Paul E. McKenney
2020-05-29 13:01   ` Frederic Weisbecker
2020-06-01  9:52   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2020-05-26 16:11 ` [RFC][PATCH 5/7] irq_work, smp: Allow irq_work on call_single_queue Peter Zijlstra
2020-05-28 23:40   ` Frederic Weisbecker
2020-05-29 13:36     ` Peter Zijlstra
2020-06-05  9:37       ` Peter Zijlstra
2020-06-05 15:02         ` Frederic Weisbecker
2020-06-05 16:17           ` Peter Zijlstra
2020-06-05 15:24         ` Kees Cook
2020-06-10 13:24         ` Frederic Weisbecker
2020-06-01  9:52   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2020-05-26 16:11 ` [RFC][PATCH 6/7] sched: Add rq::ttwu_pending Peter Zijlstra
2020-06-01  9:52   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2020-05-26 16:11 ` [RFC][PATCH 7/7] sched: Replace rq::wake_list Peter Zijlstra
2020-05-29 15:10   ` Valdis Klētnieks
2020-06-01  9:52   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2020-06-02 15:16     ` Frederic Weisbecker
2020-06-04 14:18   ` [RFC][PATCH 7/7] " Guenter Roeck
2020-06-05  0:24     ` Eric Biggers
2020-06-05  7:41       ` Peter Zijlstra
2020-06-05 16:15         ` Eric Biggers
2020-06-06 23:13           ` Guenter Roeck
2020-06-09 20:21             ` Eric Biggers
2020-06-09 21:25               ` Guenter Roeck
2020-06-09 21:38                 ` Eric Biggers
2020-06-09 22:06                   ` Peter Zijlstra
2020-06-09 23:03                     ` Guenter Roeck
2020-06-10  9:09                       ` Peter Zijlstra
2020-06-18 17:57                 ` Steven Rostedt
2020-06-18 19:06                   ` Guenter Roeck
2020-06-09 22:07               ` Peter Zijlstra
2020-06-05  8:10     ` Peter Zijlstra
2020-06-05 13:33       ` Guenter Roeck
2020-06-05 14:09         ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210122002012.GB2743@paulmck-ThinkPad-P72 \
    --to=paulmck@kernel.org \
    --cc=cai@lca.pw \
    --cc=frederic@kernel.org \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=valentin.schneider@arm.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).