linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Joel Fernandes <joel@joelfernandes.org>,
	Josh Triplett <josh@joshtriplett.org>
Subject: Re: [PATCH 08/10] rcu: Allow to deactivate nocb on a CPU
Date: Wed, 13 May 2020 11:38:31 -0700	[thread overview]
Message-ID: <20200513183831.GV2869@paulmck-ThinkPad-P72> (raw)
In-Reply-To: <20200513164714.22557-9-frederic@kernel.org>

On Wed, May 13, 2020 at 06:47:12PM +0200, Frederic Weisbecker wrote:
> Allow a CPU's rdp to quit the callback offlined mode.
> The switch happens on the target with IRQs disabled and rdp->nocb_lock
> held to avoid races between local callbacks handling and kthread
> offloaded callbacks handling.
> 
> nocb_cb kthread is first parked to avoid any future race with
> concurrent rcu_do_batch() executions. Then the cblist is set to offloaded
> so that the nocb_gp kthread ignores this rdp.
> 
> Inspired-by: Paul E. McKenney <paulmck@kernel.org>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Josh Triplett <josh@joshtriplett.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> ---
>  include/linux/rcupdate.h   |  2 ++
>  kernel/rcu/rcu_segcblist.c |  4 +--
>  kernel/rcu/rcu_segcblist.h |  2 +-
>  kernel/rcu/tree_plugin.h   | 50 +++++++++++++++++++++++++++++++++++++-
>  4 files changed, 54 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 2678a37c3169..1d3a4c37c3c1 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -96,8 +96,10 @@ static inline void rcu_user_exit(void) { }
>  
>  #ifdef CONFIG_RCU_NOCB_CPU
>  void rcu_init_nohz(void);
> +void rcu_nocb_cpu_deoffload(int cpu);
>  #else /* #ifdef CONFIG_RCU_NOCB_CPU */
>  static inline void rcu_init_nohz(void) { }
> +static inline void rcu_nocb_cpu_deoffload(int cpu) { }
>  #endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */
>  
>  /**
> diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> index d8ea2bef5574..4bed48da7702 100644
> --- a/kernel/rcu/rcu_segcblist.c
> +++ b/kernel/rcu/rcu_segcblist.c
> @@ -171,9 +171,9 @@ void rcu_segcblist_disable(struct rcu_segcblist *rsclp)
>   * Mark the specified rcu_segcblist structure as offloaded.  This
>   * structure must be empty.
>   */
> -void rcu_segcblist_offload(struct rcu_segcblist *rsclp)
> +void rcu_segcblist_offload(struct rcu_segcblist *rsclp, bool offload)
>  {
> -	rsclp->offloaded = 1;
> +	rsclp->offloaded = offload;
>  }
>  #endif
>  
> diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
> index 4c1503a82492..8f7c6c34cb1b 100644
> --- a/kernel/rcu/rcu_segcblist.h
> +++ b/kernel/rcu/rcu_segcblist.h
> @@ -83,7 +83,7 @@ void rcu_segcblist_inc_len(struct rcu_segcblist *rsclp);
>  void rcu_segcblist_init(struct rcu_segcblist *rsclp);
>  void rcu_segcblist_disable(struct rcu_segcblist *rsclp);
>  #ifdef CONFIG_RCU_NOCB_CPU
> -void rcu_segcblist_offload(struct rcu_segcblist *rsclp);
> +void rcu_segcblist_offload(struct rcu_segcblist *rsclp, bool offload);
>  #endif
>  bool rcu_segcblist_ready_cbs(struct rcu_segcblist *rsclp);
>  bool rcu_segcblist_pend_cbs(struct rcu_segcblist *rsclp);
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index f19e81e0c691..c74a4df8d5f2 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -1943,6 +1943,10 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
>  	for (rdp = my_rdp; rdp; rdp = rdp->nocb_next_cb_rdp) {
>  		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Check"));
>  		raw_spin_lock_irqsave(&rdp->nocb_lock, flags);
> +		if (!rcu_segcblist_is_offloaded(&rdp->cblist)) {
> +			raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
> +			continue;
> +		}
>  		bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
>  		if (bypass_ncbs &&
>  		    (time_after(j, READ_ONCE(rdp->nocb_bypass_first) + 1) ||
> @@ -2176,6 +2180,50 @@ static void do_nocb_deferred_wakeup(struct rcu_data *rdp)
>  		do_nocb_deferred_wakeup_common(rdp);
>  }
>  
> +static void __rcu_nocb_rdp_deoffload(struct rcu_data *rdp)
> +{
> +	unsigned long flags;
> +	struct rcu_node *rnp = rdp->mynode;
> +
> +	printk("De-offloading %d\n", rdp->cpu);
> +	kthread_park(rdp->nocb_cb_kthread);

I am a bit concerned about this, because from this point until the
end of this function, no RCU callbacks can be invoked for this CPU.
This could be a problem if there is a callback flood in progress, and
such callback floods are in fact one reason that the sysadm might want
to be switching from offloaded to non-offloaded.  Is it possible to
move this kthread_park() to the end of this function?  Yes, this can
result in concurrent invocation of different callbacks for this CPU,
but because we have excluded rcu_barrier(), that should be OK.

Or is there some failure mode that I am failing to see?  (Wouldn't
be the first time...)

> +	raw_spin_lock_irqsave(&rdp->nocb_lock, flags);
> +	rcu_nocb_flush_bypass(rdp, NULL, jiffies);
> +	raw_spin_lock_rcu_node(rnp);
> +	rcu_segcblist_offload(&rdp->cblist, false);
> +	raw_spin_unlock_rcu_node(rnp);
> +	raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
> +}
> +
> +static long rcu_nocb_rdp_deoffload(void *arg)
> +{
> +	struct rcu_data *rdp = arg;
> +
> +	WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
> +	__rcu_nocb_rdp_deoffload(rdp);
> +
> +	return 0;
> +}

For example, is the problem caused by invocations of this
rcu_nocb_rdp_deoffload() function?  But if so, do we really need to
acquire rcu_state.barrier_mutex?

That aside, if it is possible to do the switch without interrupting
callback invocation?  Or is your idea that because we are always executing
on the CPU being deoffloaded, that CPU has been prevented from posting
callbacks in any case?  If the latter, does that mean that it is not
possible to deoffload offlined CPUs?  (Not sure whether this restriction
is a real problem, but figured that I should ask.)

							Thanx, Paul

> +void rcu_nocb_cpu_deoffload(int cpu)
> +{
> +	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
> +
> +	mutex_lock(&rcu_state.barrier_mutex);
> +	cpus_read_lock();
> +	if (rcu_segcblist_is_offloaded(&rdp->cblist)) {
> +		if (cpu_online(cpu)) {
> +			work_on_cpu(cpu, rcu_nocb_rdp_deoffload, rdp);
> +		} else {
> +			__rcu_nocb_rdp_deoffload(rdp);
> +		}
> +		cpumask_clear_cpu(cpu, rcu_nocb_mask);
> +	}
> +	cpus_read_unlock();
> +	mutex_unlock(&rcu_state.barrier_mutex);
> +}
> +
>  void __init rcu_init_nohz(void)
>  {
>  	int cpu;
> @@ -2218,7 +2266,7 @@ void __init rcu_init_nohz(void)
>  		rdp = per_cpu_ptr(&rcu_data, cpu);
>  		if (rcu_segcblist_empty(&rdp->cblist))
>  			rcu_segcblist_init(&rdp->cblist);
> -		rcu_segcblist_offload(&rdp->cblist);
> +		rcu_segcblist_offload(&rdp->cblist, true);
>  	}
>  	rcu_organize_nocb_kthreads();
>  }
> -- 
> 2.25.0
> 

  reply	other threads:[~2020-05-13 18:38 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-13 16:47 [PATCH 00/10] rcu: Allow a CPU to leave and reenter NOCB state Frederic Weisbecker
2020-05-13 16:47 ` [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code entrypoints Frederic Weisbecker
2020-05-20 12:29   ` Joel Fernandes
2020-05-22 17:57     ` Paul E. McKenney
2020-05-26 15:21       ` Joel Fernandes
2020-05-26 16:29         ` Paul E. McKenney
2020-05-26 20:18           ` Joel Fernandes
2020-05-26 21:09             ` Paul E. McKenney
2020-05-26 21:27               ` Joel Fernandes
2020-05-26 22:29                 ` Paul E. McKenney
2020-05-27  0:45                   ` Joel Fernandes
2020-05-27  0:58                     ` Paul E. McKenney
2020-06-04 11:41       ` Frederic Weisbecker
2020-06-04 16:36         ` Paul E. McKenney
2020-06-08 12:57           ` Frederic Weisbecker
2020-06-09 18:02             ` Paul E. McKenney
2020-06-10 13:12               ` Frederic Weisbecker
2020-06-10 14:02                 ` Paul E. McKenney
2020-06-10 22:12                   ` Frederic Weisbecker
2020-06-10 23:21                     ` Paul E. McKenney
2020-06-11  1:32                       ` Joel Fernandes
2020-05-13 16:47 ` [PATCH 02/10] rcu: Use direct rdp->nocb_lock operations on local calls Frederic Weisbecker
2020-05-13 16:47 ` [PATCH 03/10] rcu: Make locking explicit in do_nocb_deferred_wakeup_common() Frederic Weisbecker
2020-05-26 19:54   ` Joel Fernandes
2020-05-26 19:59   ` Joel Fernandes
2020-05-13 16:47 ` [PATCH 04/10] rcu: Implement rcu_segcblist_is_offloaded() config dependent Frederic Weisbecker
2020-05-13 18:20   ` Paul E. McKenney
2020-05-13 23:03     ` Frederic Weisbecker
2020-05-14 15:47       ` Paul E. McKenney
2020-05-13 16:47 ` [PATCH 05/10] rcu: Remove useless conditional nocb unlock Frederic Weisbecker
2020-05-13 16:47 ` [PATCH 06/10] rcu: Make nocb_cb kthread parkable Frederic Weisbecker
2020-06-11  1:34   ` Joel Fernandes
2020-05-13 16:47 ` [PATCH 07/10] rcu: Temporarily assume that nohz full CPUs might not be NOCB Frederic Weisbecker
2020-05-13 18:25   ` Paul E. McKenney
2020-05-13 23:08     ` Frederic Weisbecker
2020-05-14 15:50       ` Paul E. McKenney
2020-05-14 22:49         ` Frederic Weisbecker
2020-05-13 16:47 ` [PATCH 08/10] rcu: Allow to deactivate nocb on a CPU Frederic Weisbecker
2020-05-13 18:38   ` Paul E. McKenney [this message]
2020-05-13 22:45     ` Frederic Weisbecker
2020-05-14 15:47       ` Paul E. McKenney
2020-05-14 22:30         ` Frederic Weisbecker
2020-05-14 22:47           ` Paul E. McKenney
2020-05-14 22:55             ` Frederic Weisbecker
2020-05-26 21:20   ` Joel Fernandes
2020-05-26 22:49     ` Joel Fernandes
2020-06-04 13:10       ` Frederic Weisbecker
2020-06-11  1:32         ` Joel Fernandes
2020-06-11 17:03           ` Paul E. McKenney
2020-06-04 13:14     ` Frederic Weisbecker
2020-05-13 16:47 ` [PATCH 09/10] rcu: Allow to re-offload a CPU that used to be nocb Frederic Weisbecker
2020-05-13 18:41   ` Paul E. McKenney
2020-05-13 16:47 ` [PATCH 10/10] rcu: Nocb (de)activate through sysfs Frederic Weisbecker
2020-05-13 18:42   ` Paul E. McKenney
2020-05-13 23:23     ` Frederic Weisbecker
2020-05-14 15:51       ` Paul E. McKenney
2020-05-13 18:15 ` [PATCH 00/10] rcu: Allow a CPU to leave and reenter NOCB state Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200513183831.GV2869@paulmck-ThinkPad-P72 \
    --to=paulmck@kernel.org \
    --cc=frederic@kernel.org \
    --cc=jiangshanlai@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=josh@joshtriplett.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).