linux-kernel.vger.kernel.org archive mirror
* [PATCH tip/core/rcu 0/3] CPU-hotplug changes
@ 2012-08-30 19:02 Paul E. McKenney
  2012-08-30 19:03 ` [PATCH tip/core/rcu 1/3] rcu: Remove _rcu_barrier() dependency on __stop_machine() Paul E. McKenney
  0 siblings, 1 reply; 7+ messages in thread
From: Paul E. McKenney @ 2012-08-30 19:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, fweisbec, sbw, patches

Hello!

This patch series contains fixes and improvements related to CPU hotplug:

1.	Remove _rcu_barrier() dependency on __stop_machine().
2.	Disallow callback registry on offline CPUs.
3.	Fix load avg vs cpu-hotplug (with Peter Zijlstra).

							Thanx, Paul

------------------------------------------------------------------------

 b/kernel/rcutree.c       |   83 ++++++-----------------------------------------
 b/kernel/rcutree.h       |    3 -
 b/kernel/rcutree_trace.c |    4 +-
 b/kernel/sched/core.c    |   41 +++++++++++------------
 kernel/rcutree.c         |   10 +++++
 5 files changed, 43 insertions(+), 98 deletions(-)


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH tip/core/rcu 1/3] rcu: Remove _rcu_barrier() dependency on __stop_machine()
  2012-08-30 19:02 [PATCH tip/core/rcu 0/3] CPU-hotplug changes Paul E. McKenney
@ 2012-08-30 19:03 ` Paul E. McKenney
  2012-08-30 19:03   ` [PATCH tip/core/rcu 2/3] rcu: Disallow callback registry on offline CPUs Paul E. McKenney
                     ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Paul E. McKenney @ 2012-08-30 19:03 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, fweisbec, sbw, patches, Paul E. McKenney,
	Paul E. McKenney

From: "Paul E. McKenney" <paul.mckenney@linaro.org>

Currently, _rcu_barrier() relies on preempt_disable() to prevent
any CPU from going offline, which in turn depends on CPU hotplug's
use of __stop_machine().

This patch therefore makes _rcu_barrier() use get_online_cpus() to
block CPU-hotplug operations.  This has the added benefit of removing
the need for _rcu_barrier() to adopt callbacks:  Because CPU-hotplug
operations are excluded, there can be no callbacks to adopt.  This
commit simplifies the code accordingly.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c       |   83 ++++++-----------------------------------------
 kernel/rcutree.h       |    3 --
 kernel/rcutree_trace.c |    4 +-
 3 files changed, 13 insertions(+), 77 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index f280e54..9854a00 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1390,17 +1390,6 @@ static void rcu_adopt_orphan_cbs(struct rcu_state *rsp)
 	int i;
 	struct rcu_data *rdp = __this_cpu_ptr(rsp->rda);
 
-	/*
-	 * If there is an rcu_barrier() operation in progress, then
-	 * only the task doing that operation is permitted to adopt
-	 * callbacks.  To do otherwise breaks rcu_barrier() and friends
-	 * by causing them to fail to wait for the callbacks in the
-	 * orphanage.
-	 */
-	if (rsp->rcu_barrier_in_progress &&
-	    rsp->rcu_barrier_in_progress != current)
-		return;
-
 	/* Do the accounting first. */
 	rdp->qlen_lazy += rsp->qlen_lazy;
 	rdp->qlen += rsp->qlen;
@@ -1455,9 +1444,8 @@ static void rcu_cleanup_dying_cpu(struct rcu_state *rsp)
  * The CPU has been completely removed, and some other CPU is reporting
  * this fact from process context.  Do the remainder of the cleanup,
  * including orphaning the outgoing CPU's RCU callbacks, and also
- * adopting them, if there is no _rcu_barrier() instance running.
- * There can only be one CPU hotplug operation at a time, so no other
- * CPU can be attempting to update rcu_cpu_kthread_task.
+ * adopting them.  There can only be one CPU hotplug operation at a time,
+ * so no other CPU can be attempting to update rcu_cpu_kthread_task.
  */
 static void rcu_cleanup_dead_cpu(int cpu, struct rcu_state *rsp)
 {
@@ -1519,10 +1507,6 @@ static void rcu_cleanup_dead_cpu(int cpu, struct rcu_state *rsp)
 
 #else /* #ifdef CONFIG_HOTPLUG_CPU */
 
-static void rcu_adopt_orphan_cbs(struct rcu_state *rsp)
-{
-}
-
 static void rcu_cleanup_dying_cpu(struct rcu_state *rsp)
 {
 }
@@ -2326,13 +2310,10 @@ static void rcu_barrier_func(void *type)
 static void _rcu_barrier(struct rcu_state *rsp)
 {
 	int cpu;
-	unsigned long flags;
 	struct rcu_data *rdp;
-	struct rcu_data rd;
 	unsigned long snap = ACCESS_ONCE(rsp->n_barrier_done);
 	unsigned long snap_done;
 
-	init_rcu_head_on_stack(&rd.barrier_head);
 	_rcu_barrier_trace(rsp, "Begin", -1, snap);
 
 	/* Take mutex to serialize concurrent rcu_barrier() requests. */
@@ -2372,70 +2353,30 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	/*
 	 * Initialize the count to one rather than to zero in order to
 	 * avoid a too-soon return to zero in case of a short grace period
-	 * (or preemption of this task).  Also flag this task as doing
-	 * an rcu_barrier().  This will prevent anyone else from adopting
-	 * orphaned callbacks, which could cause otherwise failure if a
-	 * CPU went offline and quickly came back online.  To see this,
-	 * consider the following sequence of events:
-	 *
-	 * 1.	We cause CPU 0 to post an rcu_barrier_callback() callback.
-	 * 2.	CPU 1 goes offline, orphaning its callbacks.
-	 * 3.	CPU 0 adopts CPU 1's orphaned callbacks.
-	 * 4.	CPU 1 comes back online.
-	 * 5.	We cause CPU 1 to post an rcu_barrier_callback() callback.
-	 * 6.	Both rcu_barrier_callback() callbacks are invoked, awakening
-	 *	us -- but before CPU 1's orphaned callbacks are invoked!!!
+	 * (or preemption of this task).  Exclude CPU-hotplug operations
+	 * to ensure that no offline CPU has callbacks queued.
 	 */
 	init_completion(&rsp->barrier_completion);
 	atomic_set(&rsp->barrier_cpu_count, 1);
-	raw_spin_lock_irqsave(&rsp->onofflock, flags);
-	rsp->rcu_barrier_in_progress = current;
-	raw_spin_unlock_irqrestore(&rsp->onofflock, flags);
+	get_online_cpus();
 
 	/*
-	 * Force every CPU with callbacks to register a new callback
-	 * that will tell us when all the preceding callbacks have
-	 * been invoked.  If an offline CPU has callbacks, wait for
-	 * it to either come back online or to finish orphaning those
-	 * callbacks.
+	 * Force each CPU with callbacks to register a new callback.
+	 * When that callback is invoked, we will know that all of the
+	 * corresponding CPU's preceding callbacks have been invoked.
 	 */
-	for_each_possible_cpu(cpu) {
-		preempt_disable();
+	for_each_online_cpu(cpu) {
 		rdp = per_cpu_ptr(rsp->rda, cpu);
-		if (cpu_is_offline(cpu)) {
-			_rcu_barrier_trace(rsp, "Offline", cpu,
-					   rsp->n_barrier_done);
-			preempt_enable();
-			while (cpu_is_offline(cpu) && ACCESS_ONCE(rdp->qlen))
-				schedule_timeout_interruptible(1);
-		} else if (ACCESS_ONCE(rdp->qlen)) {
+		if (ACCESS_ONCE(rdp->qlen)) {
 			_rcu_barrier_trace(rsp, "OnlineQ", cpu,
 					   rsp->n_barrier_done);
 			smp_call_function_single(cpu, rcu_barrier_func, rsp, 1);
-			preempt_enable();
 		} else {
 			_rcu_barrier_trace(rsp, "OnlineNQ", cpu,
 					   rsp->n_barrier_done);
-			preempt_enable();
 		}
 	}
-
-	/*
-	 * Now that all online CPUs have rcu_barrier_callback() callbacks
-	 * posted, we can adopt all of the orphaned callbacks and place
-	 * an rcu_barrier_callback() callback after them.  When that is done,
-	 * we are guaranteed to have an rcu_barrier_callback() callback
-	 * following every callback that could possibly have been
-	 * registered before _rcu_barrier() was called.
-	 */
-	raw_spin_lock_irqsave(&rsp->onofflock, flags);
-	rcu_adopt_orphan_cbs(rsp);
-	rsp->rcu_barrier_in_progress = NULL;
-	raw_spin_unlock_irqrestore(&rsp->onofflock, flags);
-	atomic_inc(&rsp->barrier_cpu_count);
-	smp_mb__after_atomic_inc(); /* Ensure atomic_inc() before callback. */
-	rd.rsp = rsp;
-	rsp->call(&rd.barrier_head, rcu_barrier_callback);
+	put_online_cpus();
 
 	/*
 	 * Now that we have an rcu_barrier_callback() callback on each
@@ -2456,8 +2397,6 @@ static void _rcu_barrier(struct rcu_state *rsp)
 
 	/* Other rcu_barrier() invocations can now safely proceed. */
 	mutex_unlock(&rsp->barrier_mutex);
-
-	destroy_rcu_head_on_stack(&rd.barrier_head);
 }
 
 /**
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index 4d29169..94dfdf1 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -398,9 +398,6 @@ struct rcu_state {
 	struct rcu_head **orphan_donetail;	/* Tail of above. */
 	long qlen_lazy;				/* Number of lazy callbacks. */
 	long qlen;				/* Total number of callbacks. */
-	struct task_struct *rcu_barrier_in_progress;
-						/* Task doing rcu_barrier(), */
-						/*  or NULL if no barrier. */
 	struct mutex barrier_mutex;		/* Guards barrier fields. */
 	atomic_t barrier_cpu_count;		/* # CPUs waiting on. */
 	struct completion barrier_completion;	/* Wake at barrier end. */
diff --git a/kernel/rcutree_trace.c b/kernel/rcutree_trace.c
index abffb48..6a2e52a 100644
--- a/kernel/rcutree_trace.c
+++ b/kernel/rcutree_trace.c
@@ -51,8 +51,8 @@ static int show_rcubarrier(struct seq_file *m, void *unused)
 	struct rcu_state *rsp;
 
 	for_each_rcu_flavor(rsp)
-		seq_printf(m, "%s: %c bcc: %d nbd: %lu\n",
-			   rsp->name, rsp->rcu_barrier_in_progress ? 'B' : '.',
+		seq_printf(m, "%s: bcc: %d nbd: %lu\n",
+			   rsp->name,
 			   atomic_read(&rsp->barrier_cpu_count),
 			   rsp->n_barrier_done);
 	return 0;
-- 
1.7.8



* [PATCH tip/core/rcu 2/3] rcu: Disallow callback registry on offline CPUs
  2012-08-30 19:03 ` [PATCH tip/core/rcu 1/3] rcu: Remove _rcu_barrier() dependency on __stop_machine() Paul E. McKenney
@ 2012-08-30 19:03   ` Paul E. McKenney
  2012-08-31 16:21     ` Josh Triplett
  2012-08-30 19:03   ` [PATCH tip/core/rcu 3/3] sched: Fix load avg vs cpu-hotplug Paul E. McKenney
  2012-08-31 16:09   ` [PATCH tip/core/rcu 1/3] rcu: Remove _rcu_barrier() dependency on __stop_machine() Josh Triplett
  2 siblings, 1 reply; 7+ messages in thread
From: Paul E. McKenney @ 2012-08-30 19:03 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, fweisbec, sbw, patches, Paul E. McKenney,
	Paul E. McKenney

From: "Paul E. McKenney" <paul.mckenney@linaro.org>

Posting a callback after the CPU_DEAD notifier effectively leaks
that callback unless/until that CPU comes back online.  Silence is
unhelpful when attempting to track down such leaks, so this commit emits
a WARN_ON_ONCE() and unconditionally leaks the callback when an offline
CPU attempts to register a callback.  The rdp->nxttail[RCU_NEXT_TAIL] is
set to NULL in the CPU_DEAD notifier and restored in the CPU_UP_PREPARE
notifier, allowing __call_rcu() to determine exactly when posting callbacks
is illegal.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 9854a00..5f8c4dd 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1503,6 +1503,9 @@ static void rcu_cleanup_dead_cpu(int cpu, struct rcu_state *rsp)
 	WARN_ONCE(rdp->qlen != 0 || rdp->nxtlist != NULL,
 		  "rcu_cleanup_dead_cpu: Callbacks on offline CPU %d: qlen=%lu, nxtlist=%p\n",
 		  cpu, rdp->qlen, rdp->nxtlist);
+	init_callback_list(rdp);
+	/* Disallow further callbacks on this CPU. */
+	rdp->nxttail[RCU_NEXT_TAIL] = NULL;
 }
 
 #else /* #ifdef CONFIG_HOTPLUG_CPU */
@@ -1925,6 +1928,12 @@ __call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu),
 	rdp = this_cpu_ptr(rsp->rda);
 
 	/* Add the callback to our list. */
+	if (unlikely(rdp->nxttail[RCU_NEXT_TAIL] == NULL)) {
+		/* _call_rcu() is illegal on offline CPU; leak the callback. */
+		WARN_ON_ONCE(1);
+		local_irq_restore(flags);
+		return;
+	}
 	ACCESS_ONCE(rdp->qlen)++;
 	if (lazy)
 		rdp->qlen_lazy++;
@@ -2462,6 +2471,7 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
 	rdp->qlen_last_fqs_check = 0;
 	rdp->n_force_qs_snap = rsp->n_force_qs;
 	rdp->blimit = blimit;
+	init_callback_list(rdp);  /* Re-enable callbacks on this CPU. */
 	rdp->dynticks->dynticks_nesting = DYNTICK_TASK_EXIT_IDLE;
 	atomic_set(&rdp->dynticks->dynticks,
 		   (atomic_read(&rdp->dynticks->dynticks) & ~0x1) + 1);
-- 
1.7.8



* [PATCH tip/core/rcu 3/3] sched: Fix load avg vs cpu-hotplug
  2012-08-30 19:03 ` [PATCH tip/core/rcu 1/3] rcu: Remove _rcu_barrier() dependency on __stop_machine() Paul E. McKenney
  2012-08-30 19:03   ` [PATCH tip/core/rcu 2/3] rcu: Disallow callback registry on offline CPUs Paul E. McKenney
@ 2012-08-30 19:03   ` Paul E. McKenney
  2012-08-31 16:09   ` [PATCH tip/core/rcu 1/3] rcu: Remove _rcu_barrier() dependency on __stop_machine() Josh Triplett
  2 siblings, 0 replies; 7+ messages in thread
From: Paul E. McKenney @ 2012-08-30 19:03 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, Valdis.Kletnieks, dhowells, eric.dumazet,
	darren, fweisbec, sbw, patches, Peter Zijlstra, Paul E. McKenney

From: Peter Zijlstra <peterz@infradead.org>

Rakib and Paul reported two different issues related to the same few
lines of code.

Rakib's issue is that the nr_uninterruptible migration code is wrong:
he sees accounting artifacts caused by it (Rakib, please expand in more
detail).

Paul's issue is that this code as it stands relies on our use of
stop_machine() for unplug; we would all like to remove this assumption
so that eventually the stop_machine() usage can be removed altogether.

The only reason we would have to migrate nr_uninterruptible is so that
we could use for_each_online_cpu() loops instead of
for_each_possible_cpu() loops; however, since nr_uninterruptible() is
the only such loop and it already uses for_each_possible_cpu(), let's
not bother at all.

The problem Rakib sees is (probably) caused by the fact that migrating
nr_uninterruptible corrupts rq->calc_load_active for both runqueues
involved.

So don't bother with fancy migration schemes (meaning we now have to
keep using for_each_possible_cpu()); instead, fold any nr_active delta
after we migrate all tasks away, to make sure we don't have any skewed
nr_active accounting.

[ paulmck: Move call to calc_load_migrate() to CPU_DEAD to avoid
miscounting noted by Rakib. ]

Reported-by: Rakib Mullick <rakib.mullick@gmail.com>
Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
---
 kernel/sched/core.c |   41 ++++++++++++++++++++---------------------
 1 files changed, 20 insertions(+), 21 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d325c4b..06051e0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5300,27 +5300,17 @@ void idle_task_exit(void)
 }
 
 /*
- * While a dead CPU has no uninterruptible tasks queued at this point,
- * it might still have a nonzero ->nr_uninterruptible counter, because
- * for performance reasons the counter is not stricly tracking tasks to
- * their home CPUs. So we just add the counter to another CPU's counter,
- * to keep the global sum constant after CPU-down:
- */
-static void migrate_nr_uninterruptible(struct rq *rq_src)
-{
-	struct rq *rq_dest = cpu_rq(cpumask_any(cpu_active_mask));
-
-	rq_dest->nr_uninterruptible += rq_src->nr_uninterruptible;
-	rq_src->nr_uninterruptible = 0;
-}
-
-/*
- * remove the tasks which were accounted by rq from calc_load_tasks.
+ * Since this CPU is going 'away' for a while, fold any nr_active delta
+ * we might have. Assumes we're called after migrate_tasks() so that the
+ * nr_active count is stable.
+ *
+ * Also see the comment "Global load-average calculations".
  */
-static void calc_global_load_remove(struct rq *rq)
+static void calc_load_migrate(struct rq *rq)
 {
-	atomic_long_sub(rq->calc_load_active, &calc_load_tasks);
-	rq->calc_load_active = 0;
+	long delta = calc_load_fold_active(rq);
+	if (delta)
+		atomic_long_add(delta, &calc_load_tasks);
 }
 
 /*
@@ -5613,9 +5603,18 @@ migration_call(struct notifier_block *nfb, unsigned long action, void *hcpu)
 		migrate_tasks(cpu);
 		BUG_ON(rq->nr_running != 1); /* the migration thread */
 		raw_spin_unlock_irqrestore(&rq->lock, flags);
+		break;
 
-		migrate_nr_uninterruptible(rq);
-		calc_global_load_remove(rq);
+	case CPU_DEAD:
+		{
+			struct rq *dest_rq;
+
+			local_irq_save(flags);
+			dest_rq = cpu_rq(smp_processor_id());
+			raw_spin_lock(&dest_rq->lock);
+			calc_load_migrate(rq);
+			raw_spin_unlock_irqrestore(&dest_rq->lock, flags);
+		}
 		break;
 #endif
 	}
-- 
1.7.8



* Re: [PATCH tip/core/rcu 1/3] rcu: Remove _rcu_barrier() dependency on __stop_machine()
  2012-08-30 19:03 ` [PATCH tip/core/rcu 1/3] rcu: Remove _rcu_barrier() dependency on __stop_machine() Paul E. McKenney
  2012-08-30 19:03   ` [PATCH tip/core/rcu 2/3] rcu: Disallow callback registry on offline CPUs Paul E. McKenney
  2012-08-30 19:03   ` [PATCH tip/core/rcu 3/3] sched: Fix load avg vs cpu-hotplug Paul E. McKenney
@ 2012-08-31 16:09   ` Josh Triplett
  2 siblings, 0 replies; 7+ messages in thread
From: Josh Triplett @ 2012-08-31 16:09 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, sbw, patches, Paul E. McKenney

On Thu, Aug 30, 2012 at 12:03:01PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paul.mckenney@linaro.org>
> 
> Currently, _rcu_barrier() relies on preempt_disable() to prevent
> any CPU from going offline, which in turn depends on CPU hotplug's
> use of __stop_machine().
> 
> This patch therefore makes _rcu_barrier() use get_online_cpus() to
> block CPU-hotplug operations.  This has the added benefit of removing
> the need for _rcu_barrier() to adopt callbacks:  Because CPU-hotplug
> operations are excluded, there can be no callbacks to adopt.  This
> commit simplifies the code accordingly.
> 
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Impressive simplification!

Reviewed-by: Josh Triplett <josh@joshtriplett.org>

> ---
>  kernel/rcutree.c       |   83 ++++++-----------------------------------------
>  kernel/rcutree.h       |    3 --
>  kernel/rcutree_trace.c |    4 +-
>  3 files changed, 13 insertions(+), 77 deletions(-)
> 
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index f280e54..9854a00 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -1390,17 +1390,6 @@ static void rcu_adopt_orphan_cbs(struct rcu_state *rsp)
>  	int i;
>  	struct rcu_data *rdp = __this_cpu_ptr(rsp->rda);
>  
> -	/*
> -	 * If there is an rcu_barrier() operation in progress, then
> -	 * only the task doing that operation is permitted to adopt
> -	 * callbacks.  To do otherwise breaks rcu_barrier() and friends
> -	 * by causing them to fail to wait for the callbacks in the
> -	 * orphanage.
> -	 */
> -	if (rsp->rcu_barrier_in_progress &&
> -	    rsp->rcu_barrier_in_progress != current)
> -		return;
> -
>  	/* Do the accounting first. */
>  	rdp->qlen_lazy += rsp->qlen_lazy;
>  	rdp->qlen += rsp->qlen;
> @@ -1455,9 +1444,8 @@ static void rcu_cleanup_dying_cpu(struct rcu_state *rsp)
>   * The CPU has been completely removed, and some other CPU is reporting
>   * this fact from process context.  Do the remainder of the cleanup,
>   * including orphaning the outgoing CPU's RCU callbacks, and also
> - * adopting them, if there is no _rcu_barrier() instance running.
> - * There can only be one CPU hotplug operation at a time, so no other
> - * CPU can be attempting to update rcu_cpu_kthread_task.
> + * adopting them.  There can only be one CPU hotplug operation at a time,
> + * so no other CPU can be attempting to update rcu_cpu_kthread_task.
>   */
>  static void rcu_cleanup_dead_cpu(int cpu, struct rcu_state *rsp)
>  {
> @@ -1519,10 +1507,6 @@ static void rcu_cleanup_dead_cpu(int cpu, struct rcu_state *rsp)
>  
>  #else /* #ifdef CONFIG_HOTPLUG_CPU */
>  
> -static void rcu_adopt_orphan_cbs(struct rcu_state *rsp)
> -{
> -}
> -
>  static void rcu_cleanup_dying_cpu(struct rcu_state *rsp)
>  {
>  }
> @@ -2326,13 +2310,10 @@ static void rcu_barrier_func(void *type)
>  static void _rcu_barrier(struct rcu_state *rsp)
>  {
>  	int cpu;
> -	unsigned long flags;
>  	struct rcu_data *rdp;
> -	struct rcu_data rd;
>  	unsigned long snap = ACCESS_ONCE(rsp->n_barrier_done);
>  	unsigned long snap_done;
>  
> -	init_rcu_head_on_stack(&rd.barrier_head);
>  	_rcu_barrier_trace(rsp, "Begin", -1, snap);
>  
>  	/* Take mutex to serialize concurrent rcu_barrier() requests. */
> @@ -2372,70 +2353,30 @@ static void _rcu_barrier(struct rcu_state *rsp)
>  	/*
>  	 * Initialize the count to one rather than to zero in order to
>  	 * avoid a too-soon return to zero in case of a short grace period
> -	 * (or preemption of this task).  Also flag this task as doing
> -	 * an rcu_barrier().  This will prevent anyone else from adopting
> -	 * orphaned callbacks, which could cause otherwise failure if a
> -	 * CPU went offline and quickly came back online.  To see this,
> -	 * consider the following sequence of events:
> -	 *
> -	 * 1.	We cause CPU 0 to post an rcu_barrier_callback() callback.
> -	 * 2.	CPU 1 goes offline, orphaning its callbacks.
> -	 * 3.	CPU 0 adopts CPU 1's orphaned callbacks.
> -	 * 4.	CPU 1 comes back online.
> -	 * 5.	We cause CPU 1 to post an rcu_barrier_callback() callback.
> -	 * 6.	Both rcu_barrier_callback() callbacks are invoked, awakening
> -	 *	us -- but before CPU 1's orphaned callbacks are invoked!!!
> +	 * (or preemption of this task).  Exclude CPU-hotplug operations
> +	 * to ensure that no offline CPU has callbacks queued.
>  	 */
>  	init_completion(&rsp->barrier_completion);
>  	atomic_set(&rsp->barrier_cpu_count, 1);
> -	raw_spin_lock_irqsave(&rsp->onofflock, flags);
> -	rsp->rcu_barrier_in_progress = current;
> -	raw_spin_unlock_irqrestore(&rsp->onofflock, flags);
> +	get_online_cpus();
>  
>  	/*
> -	 * Force every CPU with callbacks to register a new callback
> -	 * that will tell us when all the preceding callbacks have
> -	 * been invoked.  If an offline CPU has callbacks, wait for
> -	 * it to either come back online or to finish orphaning those
> -	 * callbacks.
> +	 * Force each CPU with callbacks to register a new callback.
> +	 * When that callback is invoked, we will know that all of the
> +	 * corresponding CPU's preceding callbacks have been invoked.
>  	 */
> -	for_each_possible_cpu(cpu) {
> -		preempt_disable();
> +	for_each_online_cpu(cpu) {
>  		rdp = per_cpu_ptr(rsp->rda, cpu);
> -		if (cpu_is_offline(cpu)) {
> -			_rcu_barrier_trace(rsp, "Offline", cpu,
> -					   rsp->n_barrier_done);
> -			preempt_enable();
> -			while (cpu_is_offline(cpu) && ACCESS_ONCE(rdp->qlen))
> -				schedule_timeout_interruptible(1);
> -		} else if (ACCESS_ONCE(rdp->qlen)) {
> +		if (ACCESS_ONCE(rdp->qlen)) {
>  			_rcu_barrier_trace(rsp, "OnlineQ", cpu,
>  					   rsp->n_barrier_done);
>  			smp_call_function_single(cpu, rcu_barrier_func, rsp, 1);
> -			preempt_enable();
>  		} else {
>  			_rcu_barrier_trace(rsp, "OnlineNQ", cpu,
>  					   rsp->n_barrier_done);
> -			preempt_enable();
>  		}
>  	}
> -
> -	/*
> -	 * Now that all online CPUs have rcu_barrier_callback() callbacks
> -	 * posted, we can adopt all of the orphaned callbacks and place
> -	 * an rcu_barrier_callback() callback after them.  When that is done,
> -	 * we are guaranteed to have an rcu_barrier_callback() callback
> -	 * following every callback that could possibly have been
> -	 * registered before _rcu_barrier() was called.
> -	 */
> -	raw_spin_lock_irqsave(&rsp->onofflock, flags);
> -	rcu_adopt_orphan_cbs(rsp);
> -	rsp->rcu_barrier_in_progress = NULL;
> -	raw_spin_unlock_irqrestore(&rsp->onofflock, flags);
> -	atomic_inc(&rsp->barrier_cpu_count);
> -	smp_mb__after_atomic_inc(); /* Ensure atomic_inc() before callback. */
> -	rd.rsp = rsp;
> -	rsp->call(&rd.barrier_head, rcu_barrier_callback);
> +	put_online_cpus();
>  
>  	/*
>  	 * Now that we have an rcu_barrier_callback() callback on each
> @@ -2456,8 +2397,6 @@ static void _rcu_barrier(struct rcu_state *rsp)
>  
>  	/* Other rcu_barrier() invocations can now safely proceed. */
>  	mutex_unlock(&rsp->barrier_mutex);
> -
> -	destroy_rcu_head_on_stack(&rd.barrier_head);
>  }
>  
>  /**
> diff --git a/kernel/rcutree.h b/kernel/rcutree.h
> index 4d29169..94dfdf1 100644
> --- a/kernel/rcutree.h
> +++ b/kernel/rcutree.h
> @@ -398,9 +398,6 @@ struct rcu_state {
>  	struct rcu_head **orphan_donetail;	/* Tail of above. */
>  	long qlen_lazy;				/* Number of lazy callbacks. */
>  	long qlen;				/* Total number of callbacks. */
> -	struct task_struct *rcu_barrier_in_progress;
> -						/* Task doing rcu_barrier(), */
> -						/*  or NULL if no barrier. */
>  	struct mutex barrier_mutex;		/* Guards barrier fields. */
>  	atomic_t barrier_cpu_count;		/* # CPUs waiting on. */
>  	struct completion barrier_completion;	/* Wake at barrier end. */
> diff --git a/kernel/rcutree_trace.c b/kernel/rcutree_trace.c
> index abffb48..6a2e52a 100644
> --- a/kernel/rcutree_trace.c
> +++ b/kernel/rcutree_trace.c
> @@ -51,8 +51,8 @@ static int show_rcubarrier(struct seq_file *m, void *unused)
>  	struct rcu_state *rsp;
>  
>  	for_each_rcu_flavor(rsp)
> -		seq_printf(m, "%s: %c bcc: %d nbd: %lu\n",
> -			   rsp->name, rsp->rcu_barrier_in_progress ? 'B' : '.',
> +		seq_printf(m, "%s: bcc: %d nbd: %lu\n",
> +			   rsp->name,
>  			   atomic_read(&rsp->barrier_cpu_count),
>  			   rsp->n_barrier_done);
>  	return 0;
> -- 
> 1.7.8
> 


* Re: [PATCH tip/core/rcu 2/3] rcu: Disallow callback registry on offline CPUs
  2012-08-30 19:03   ` [PATCH tip/core/rcu 2/3] rcu: Disallow callback registry on offline CPUs Paul E. McKenney
@ 2012-08-31 16:21     ` Josh Triplett
  2012-09-04 23:53       ` Paul E. McKenney
  0 siblings, 1 reply; 7+ messages in thread
From: Josh Triplett @ 2012-08-31 16:21 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, sbw, patches, Paul E. McKenney

On Thu, Aug 30, 2012 at 12:03:02PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paul.mckenney@linaro.org>
> 
> Posting a callback after the CPU_DEAD notifier effectively leaks
> that callback unless/until that CPU comes back online.  Silence is
> unhelpful when attempting to track down such leaks, so this commit emits
> a WARN_ON_ONCE() and unconditionally leaks the callback when an offline
> CPU attempts to register a callback.  The rdp->nxttail[RCU_NEXT_TAIL] is
> set to NULL in the CPU_DEAD notifier and restored in the CPU_UP_PREPARE
> notifier, allowing __call_rcu() to determine exactly when posting callbacks
> is illegal.
> 
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

One suggestion below; with or without that change:

Reviewed-by: Josh Triplett <josh@joshtriplett.org>

> ---
>  kernel/rcutree.c |   10 ++++++++++
>  1 files changed, 10 insertions(+), 0 deletions(-)
> 
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 9854a00..5f8c4dd 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -1503,6 +1503,9 @@ static void rcu_cleanup_dead_cpu(int cpu, struct rcu_state *rsp)
>  	WARN_ONCE(rdp->qlen != 0 || rdp->nxtlist != NULL,
>  		  "rcu_cleanup_dead_cpu: Callbacks on offline CPU %d: qlen=%lu, nxtlist=%p\n",
>  		  cpu, rdp->qlen, rdp->nxtlist);
> +	init_callback_list(rdp);
> +	/* Disallow further callbacks on this CPU. */
> +	rdp->nxttail[RCU_NEXT_TAIL] = NULL;
>  }
>  
>  #else /* #ifdef CONFIG_HOTPLUG_CPU */
> @@ -1925,6 +1928,12 @@ __call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu),
>  	rdp = this_cpu_ptr(rsp->rda);
>  
>  	/* Add the callback to our list. */
> +	if (unlikely(rdp->nxttail[RCU_NEXT_TAIL] == NULL)) {
> +		/* _call_rcu() is illegal on offline CPU; leak the callback. */
> +		WARN_ON_ONCE(1);

You can write this as:

if (WARN_ON_ONCE(rdp->nxttail[RCU_NEXT_TAIL] == NULL))

WARN_ON_ONCE also has a built-in unlikely() already.

> +		local_irq_restore(flags);
> +		return;
> +	}
>  	ACCESS_ONCE(rdp->qlen)++;
>  	if (lazy)
>  		rdp->qlen_lazy++;
> @@ -2462,6 +2471,7 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
>  	rdp->qlen_last_fqs_check = 0;
>  	rdp->n_force_qs_snap = rsp->n_force_qs;
>  	rdp->blimit = blimit;
> +	init_callback_list(rdp);  /* Re-enable callbacks on this CPU. */
>  	rdp->dynticks->dynticks_nesting = DYNTICK_TASK_EXIT_IDLE;
>  	atomic_set(&rdp->dynticks->dynticks,
>  		   (atomic_read(&rdp->dynticks->dynticks) & ~0x1) + 1);
> -- 
> 1.7.8
> 


* Re: [PATCH tip/core/rcu 2/3] rcu: Disallow callback registry on offline CPUs
  2012-08-31 16:21     ` Josh Triplett
@ 2012-09-04 23:53       ` Paul E. McKenney
  0 siblings, 0 replies; 7+ messages in thread
From: Paul E. McKenney @ 2012-09-04 23:53 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, dhowells,
	eric.dumazet, darren, fweisbec, sbw, patches, Paul E. McKenney

On Fri, Aug 31, 2012 at 09:21:30AM -0700, Josh Triplett wrote:
> On Thu, Aug 30, 2012 at 12:03:02PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" <paul.mckenney@linaro.org>
> > 
> > Posting a callback after the CPU_DEAD notifier effectively leaks
> > that callback unless/until that CPU comes back online.  Silence is
> > unhelpful when attempting to track down such leaks, so this commit emits
> > a WARN_ON_ONCE() and unconditionally leaks the callback when an offline
> > CPU attempts to register a callback.  The rdp->nxttail[RCU_NEXT_TAIL] is
> > set to NULL in the CPU_DEAD notifier and restored in the CPU_UP_PREPARE
> > notifier, allowing __call_rcu() to determine exactly when posting callbacks
> > is illegal.
> > 
> > Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> One suggestion below; with or without that change:
> 
> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
> 
> > ---
> >  kernel/rcutree.c |   10 ++++++++++
> >  1 files changed, 10 insertions(+), 0 deletions(-)
> > 
> > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > index 9854a00..5f8c4dd 100644
> > --- a/kernel/rcutree.c
> > +++ b/kernel/rcutree.c
> > @@ -1503,6 +1503,9 @@ static void rcu_cleanup_dead_cpu(int cpu, struct rcu_state *rsp)
> >  	WARN_ONCE(rdp->qlen != 0 || rdp->nxtlist != NULL,
> >  		  "rcu_cleanup_dead_cpu: Callbacks on offline CPU %d: qlen=%lu, nxtlist=%p\n",
> >  		  cpu, rdp->qlen, rdp->nxtlist);
> > +	init_callback_list(rdp);
> > +	/* Disallow further callbacks on this CPU. */
> > +	rdp->nxttail[RCU_NEXT_TAIL] = NULL;
> >  }
> >  
> >  #else /* #ifdef CONFIG_HOTPLUG_CPU */
> > @@ -1925,6 +1928,12 @@ __call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu),
> >  	rdp = this_cpu_ptr(rsp->rda);
> >  
> >  	/* Add the callback to our list. */
> > +	if (unlikely(rdp->nxttail[RCU_NEXT_TAIL] == NULL)) {
> > +		/* _call_rcu() is illegal on offline CPU; leak the callback. */
> > +		WARN_ON_ONCE(1);
> 
> You can write this as:
> 
> if (WARN_ON_ONCE(rdp->nxttail[RCU_NEXT_TAIL] == NULL))
> 
> WARN_ON_ONCE also has a built-in unlikely() already.

Indeed, I missed the fact that WARN_ON_ONCE() returns its argument.
I believe that there are a number of other places in RCU where I could
bury a WARN_ON_ONCE(1) into the preceding "if" statement, so I added a
todo-list item to check all of them and convert as appropriate.

							Thanx, Paul

> > +		local_irq_restore(flags);
> > +		return;
> > +	}
> >  	ACCESS_ONCE(rdp->qlen)++;
> >  	if (lazy)
> >  		rdp->qlen_lazy++;
> > @@ -2462,6 +2471,7 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
> >  	rdp->qlen_last_fqs_check = 0;
> >  	rdp->n_force_qs_snap = rsp->n_force_qs;
> >  	rdp->blimit = blimit;
> > +	init_callback_list(rdp);  /* Re-enable callbacks on this CPU. */
> >  	rdp->dynticks->dynticks_nesting = DYNTICK_TASK_EXIT_IDLE;
> >  	atomic_set(&rdp->dynticks->dynticks,
> >  		   (atomic_read(&rdp->dynticks->dynticks) & ~0x1) + 1);
> > -- 
> > 1.7.8
> > 
> 

