From: James Hogan <james.hogan@imgtec.com>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	<linux-kernel@vger.kernel.org>
Cc: <mingo@kernel.org>, <laijs@cn.fujitsu.com>, <dipankar@in.ibm.com>,
	<akpm@linux-foundation.org>, <mathieu.desnoyers@efficios.com>,
	<josh@joshtriplett.org>, <tglx@linutronix.de>,
	<peterz@infradead.org>, <rostedt@goodmis.org>,
	<dhowells@redhat.com>, <edumazet@google.com>,
	<dvhart@linux.intel.com>, <fweisbec@gmail.com>, <oleg@redhat.com>,
	<bobby.prani@gmail.com>, <linux-metag@vger.kernel.org>
Subject: Re: [PATCH tip/core/rcu 04/20] metag: Use common outgoing-CPU-notification code
Date: Tue, 10 Mar 2015 15:30:42 +0000
Message-ID: <54FF0E22.3010904@imgtec.com>
In-Reply-To: <1425404595-17816-4-git-send-email-paulmck@linux.vnet.ibm.com>

Hi Paul,

On 03/03/15 17:42, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> 
> This commit removes the open-coded CPU-offline notification with new
> common code.  This change avoids calling scheduler code using RCU from
> an offline CPU that RCU is ignoring.  This commit is compatible with
> the existing code in not checking for timeout during a prior offline
> for a given CPU.
> 
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: James Hogan <james.hogan@imgtec.com>
> Cc: <linux-metag@vger.kernel.org>

I gave this a try via linux-next, but unfortunately it causes the
following warning every time a CPU goes down:
META213-Thread0 DSP [LogF] CPU1: unable to kill

If I add printks, I see that the state on entry to both cpu_wait_death
and cpu_report_death is already CPU_POST_DEAD, suggesting that it hasn't
changed from its initial value.

Should arches other than x86 now be calling cpu_set_state_online()? The
patchlet below seems to resolve it for Meta (not sure if that is the
best place in the startup sequence to do it, perhaps it doesn't matter).

diff --git a/arch/metag/kernel/smp.c b/arch/metag/kernel/smp.c
index ac3a199e33e7..430e379ec71f 100644
--- a/arch/metag/kernel/smp.c
+++ b/arch/metag/kernel/smp.c
@@ -383,6 +383,7 @@ asmlinkage void secondary_start_kernel(void)
 	 * OK, now it's safe to let the boot CPU continue
 	 */
 	set_cpu_online(cpu, true);
+	cpu_set_state_online(cpu);
 	complete(&cpu_running);
 
 	/*

Looking at the comment before cpu_set_state_online:
> /*
>  * Mark the specified CPU online.
>  *
>  * Note that it is permissible to omit this call entirely, as is
>  * done in architectures that do no CPU-hotplug error checking.
>  */

That suggests it wasn't wrong to omit the call before your patches came
along.
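
To make the symptom concrete, here is a small userspace model of the
behaviour I am seeing. It is only a sketch and not the kernel code: the
model_* names, the state values and in particular the rule that only an
online CPU can report a clean death are my assumptions, chosen to
reproduce the printk observations above. There is no real waiting or
atomicity in it either.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical per-CPU hotplug states for this model only. The names
 * echo the ones mentioned in this thread; the values are arbitrary.
 */
enum model_state {
	MODEL_ONLINE,
	MODEL_DEAD,
	MODEL_POST_DEAD,
	MODEL_BROKEN,
};

#define MODEL_NR_CPUS 2

/* Every CPU starts out in POST_DEAD, as observed with the printks. */
static enum model_state model_cpu_state[MODEL_NR_CPUS] = {
	MODEL_POST_DEAD,
	MODEL_POST_DEAD,
};

/* Incoming CPU marks itself online (the call the patchlet adds). */
static void model_set_state_online(unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	model_cpu_state[cpu] = MODEL_ONLINE;
}

/*
 * Outgoing CPU reports its own death.
 * ASSUMPTION: only a CPU that was marked online may move to DEAD.
 */
static bool model_report_death(unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	if (model_cpu_state[cpu] != MODEL_ONLINE)
		return false;
	model_cpu_state[cpu] = MODEL_DEAD;
	return true;
}

/*
 * Surviving CPU checks for the death report. A real implementation
 * would poll with a timeout; this model checks exactly once.
 */
static bool model_wait_death(unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	if (model_cpu_state[cpu] != MODEL_DEAD) {
		model_cpu_state[cpu] = MODEL_BROKEN;
		return false;	/* caller prints "unable to kill" */
	}
	model_cpu_state[cpu] = MODEL_POST_DEAD;
	return true;
}
```

In this model, calling model_report_death() and then model_wait_death()
for a CPU that never called model_set_state_online() fails both times,
whereas the same sequence succeeds once the online call has been made.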

Cheers
James


> ---
>  arch/metag/kernel/smp.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/metag/kernel/smp.c b/arch/metag/kernel/smp.c
> index f006d2276f40..ac3a199e33e7 100644
> --- a/arch/metag/kernel/smp.c
> +++ b/arch/metag/kernel/smp.c
> @@ -261,7 +261,6 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
>  }
>  
>  #ifdef CONFIG_HOTPLUG_CPU
> -static DECLARE_COMPLETION(cpu_killed);
>  
>  /*
>   * __cpu_disable runs on the processor to be shutdown.
> @@ -299,7 +298,7 @@ int __cpu_disable(void)
>   */
>  void __cpu_die(unsigned int cpu)
>  {
> -	if (!wait_for_completion_timeout(&cpu_killed, msecs_to_jiffies(1)))
> +	if (!cpu_wait_death(cpu, 1))
>  		pr_err("CPU%u: unable to kill\n", cpu);
>  }
>  
> @@ -314,7 +313,7 @@ void cpu_die(void)
>  	local_irq_disable();
>  	idle_task_exit();
>  
> -	complete(&cpu_killed);
> +	(void)cpu_report_death();
>  
>  	asm ("XOR	TXENABLE, D0Re0,D0Re0\n");
>  }
> 

