linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] ARM64: smp: Fix cpu_up() racing with sys_reboot
@ 2018-07-19 22:18 Venkata Narendra Kumar Gutta
  2018-07-19 22:55 ` Russell King - ARM Linux
  2018-07-20  6:23 ` Mark Rutland
  0 siblings, 2 replies; 3+ messages in thread
From: Venkata Narendra Kumar Gutta @ 2018-07-19 22:18 UTC (permalink / raw)
  To: linux-arm-kernel, tsoni, ckadabi, rishabhb, linux-kernel, robh,
	hoeun.ryu, adobriyan, zhizhouzhang, suzuki.poulose, mark.rutland,
	james.morse, will.deacon, catalin.marinas
  Cc: Abhimanyu Kapur, Venkata Narendra Kumar Gutta

Nothing stops a process from hotplugging in a CPU concurrently
with a sys_reboot() call. In such a situation we could have
ipi_cpu_stop() mark a cpu as 'offline' and _cpu_up() ignore the
fact that the CPU is not really offline and call the
CPU_UP_PREPARE notifier. When this happens stop_machine code will
complain that the cpu thread already exists and BUG_ON().

CPU0                      CPU1

 sys_reboot()
 kernel_restart()
 machine_restart()
 machine_shutdown()
 smp_send_stop()
 ...                   ipi_cpu_stop()
		       set_cpu_online(1, false)
                       local_irq_disable()
                       while(1)
        <PREEMPT>
 cpu_up()
 _cpu_up()
 if (!cpu_online(1))
 __cpu_notify(CPU_UP_PREPARE...)

 cpu_stop_cpu_callback()
 BUG_ON(stopper->thread)

This is easily reproducible by hotplugging in and out in a tight
loop while also rebooting.

Since the CPU is not really offline and hasn't gone through the
proper steps to be marked as such, let's mark the CPU as inactive.
This is just as easily testable as online and avoids any possibility
of _cpu_up() trying to bring the CPU back online when it never was
offline to begin with. Based on the similar patchset by for arm
targets 040c163( "ARM: smp: Fix cpu_up() racing with sys_reboot)"

Signed-off-by: Abhimanyu Kapur <abhimany@codeaurora.org>
Signed-off-by: Venkata Narendra Kumar Gutta <vnkgutta@codeaurora.org>
---
 arch/arm64/kernel/smp.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 2faa986..adee4d3 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -790,7 +790,7 @@ void arch_irq_work_raise(void)
  */
 static void ipi_cpu_stop(unsigned int cpu)
 {
-	set_cpu_online(cpu, false);
+	set_cpu_active(cpu, false);
 
 	local_daif_mask();
 	sdei_mask_local_cpu();
@@ -925,10 +925,10 @@ void smp_send_stop(void)
 
 	/* Wait up to one second for other CPUs to stop */
 	timeout = USEC_PER_SEC;
-	while (num_online_cpus() > 1 && timeout--)
+	while (num_active_cpus() > 1 && timeout--)
 		udelay(1);
 
-	if (num_online_cpus() > 1)
+	if (num_active_cpus() > 1)
 		pr_warning("SMP: failed to stop secondary CPUs %*pbl\n",
 			   cpumask_pr_args(cpu_online_mask));
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] ARM64: smp: Fix cpu_up() racing with sys_reboot
  2018-07-19 22:18 [PATCH] ARM64: smp: Fix cpu_up() racing with sys_reboot Venkata Narendra Kumar Gutta
@ 2018-07-19 22:55 ` Russell King - ARM Linux
  2018-07-20  6:23 ` Mark Rutland
  1 sibling, 0 replies; 3+ messages in thread
From: Russell King - ARM Linux @ 2018-07-19 22:55 UTC (permalink / raw)
  To: Venkata Narendra Kumar Gutta
  Cc: linux-arm-kernel, tsoni, ckadabi, rishabhb, linux-kernel, robh,
	hoeun.ryu, adobriyan, zhizhouzhang, suzuki.poulose, mark.rutland,
	james.morse, will.deacon, catalin.marinas, Abhimanyu Kapur

On Thu, Jul 19, 2018 at 03:18:46PM -0700, Venkata Narendra Kumar Gutta wrote:
> Nothing stops a process from hotplugging in a CPU concurrently
> with a sys_reboot() call. In such a situation we could have
> ipi_cpu_stop() mark a cpu as 'offline' and _cpu_up() ignore the
> fact that the CPU is not really offline and call the
> CPU_UP_PREPARE notifier. When this happens stop_machine code will
> complain that the cpu thread already exists and BUG_ON().
> 
> CPU0                      CPU1
> 
>  sys_reboot()
>  kernel_restart()
>  machine_restart()
>  machine_shutdown()
>  smp_send_stop()
>  ...                   ipi_cpu_stop()
> 		       set_cpu_online(1, false)
>                        local_irq_disable()
>                        while(1)
>         <PREEMPT>
>  cpu_up()
>  _cpu_up()
>  if (!cpu_online(1))
>  __cpu_notify(CPU_UP_PREPARE...)
> 
>  cpu_stop_cpu_callback()
>  BUG_ON(stopper->thread)
> 
> This is easily reproducible by hotplugging in and out in a tight
> loop while also rebooting.
> 
> Since the CPU is not really offline and hasn't gone through the
> proper steps to be marked as such, let's mark the CPU as inactive.
> This is just as easily testable as online and avoids any possibility
> of _cpu_up() trying to bring the CPU back online when it never was
> offline to begin with. Based on the similar patchset by for arm
> targets 040c163( "ARM: smp: Fix cpu_up() racing with sys_reboot)"

You really need to use 12 digit commit IDs to refer to commits to
avoid:

$ git log 040c163
fatal: ambiguous argument '040c163': unknown revision or path not in the working tree.

because 7 digits are just too short and given the size of the kernel,
result in conflicting references.

I can find no trace of "040c163" touching arch/arm/kernel/smp.c,
nor can I find a commit with "racing with sys_reboot" touching the
same file.

The commit you reference doesn't seem to be in mainline.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up
According to speedtest.net: 13Mbps down 490kbps up

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] ARM64: smp: Fix cpu_up() racing with sys_reboot
  2018-07-19 22:18 [PATCH] ARM64: smp: Fix cpu_up() racing with sys_reboot Venkata Narendra Kumar Gutta
  2018-07-19 22:55 ` Russell King - ARM Linux
@ 2018-07-20  6:23 ` Mark Rutland
  1 sibling, 0 replies; 3+ messages in thread
From: Mark Rutland @ 2018-07-20  6:23 UTC (permalink / raw)
  To: Venkata Narendra Kumar Gutta
  Cc: linux-arm-kernel, tsoni, ckadabi, rishabhb, linux-kernel, robh,
	hoeun.ryu, adobriyan, zhizhouzhang, suzuki.poulose, james.morse,
	will.deacon, catalin.marinas, Abhimanyu Kapur

On Thu, Jul 19, 2018 at 03:18:46PM -0700, Venkata Narendra Kumar Gutta wrote:
> Nothing stops a process from hotplugging in a CPU concurrently
> with a sys_reboot() call. In such a situation we could have
> ipi_cpu_stop() mark a cpu as 'offline' and _cpu_up() ignore the
> fact that the CPU is not really offline and call the
> CPU_UP_PREPARE notifier. When this happens stop_machine code will
> complain that the cpu thread already exists and BUG_ON().
> 
> CPU0                      CPU1
> 
>  sys_reboot()
>  kernel_restart()
>  machine_restart()
>  machine_shutdown()
>  smp_send_stop()

From a quick look arm64's machine_restart() disables IRQs, calls
smp_send_stop(), then calls one of:

* efi_reboot()
* arm_pm_restart()
* do_kernel_restart()

... and I can't see any of these calling machine_shutdown() directly.

How does machine_shutdown() get called? Is that somewhere in the restart_list
associated with do_kernel_restart?

>  ...                   ipi_cpu_stop()
> 		       set_cpu_online(1, false)
>                        local_irq_disable()
>                        while(1)
>         <PREEMPT>

Given IRQs are disabled in machine_restart(), I don't understand how this
preemption can occur. Where do they get re-enabled?

>  cpu_up()
>  _cpu_up()
>  if (!cpu_online(1))
>  __cpu_notify(CPU_UP_PREPARE...)
> 
>  cpu_stop_cpu_callback()
>  BUG_ON(stopper->thread)
> 
> This is easily reproducible by hotplugging in and out in a tight
> loop while also rebooting.

Is this reproducible with mainline? e.g. v4.18-rc5?

I've packed away my HW in preparation for an office move, but I might be able
to give this a go in a VM.

> Since the CPU is not really offline and hasn't gone through the
> proper steps to be marked as such, let's mark the CPU as inactive.
> This is just as easily testable as online and avoids any possibility
> of _cpu_up() trying to bring the CPU back online when it never was
> offline to begin with. Based on the similar patchset by for arm
> targets 040c163( "ARM: smp: Fix cpu_up() racing with sys_reboot)"

Is this in mainline?

Looking at v4.18-rc5 I don't see arm's ipi_cpu_stop touching the active mask.

Thanks,
Mark.

> 
> Signed-off-by: Abhimanyu Kapur <abhimany@codeaurora.org>
> Signed-off-by: Venkata Narendra Kumar Gutta <vnkgutta@codeaurora.org>
> ---
>  arch/arm64/kernel/smp.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 2faa986..adee4d3 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -790,7 +790,7 @@ void arch_irq_work_raise(void)
>   */
>  static void ipi_cpu_stop(unsigned int cpu)
>  {
> -	set_cpu_online(cpu, false);
> +	set_cpu_active(cpu, false);
>  
>  	local_daif_mask();
>  	sdei_mask_local_cpu();
> @@ -925,10 +925,10 @@ void smp_send_stop(void)
>  
>  	/* Wait up to one second for other CPUs to stop */
>  	timeout = USEC_PER_SEC;
> -	while (num_online_cpus() > 1 && timeout--)
> +	while (num_active_cpus() > 1 && timeout--)
>  		udelay(1);
>  
> -	if (num_online_cpus() > 1)
> +	if (num_active_cpus() > 1)
>  		pr_warning("SMP: failed to stop secondary CPUs %*pbl\n",
>  			   cpumask_pr_args(cpu_online_mask));
>  
> -- 
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
> 

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2018-07-20  6:23 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-19 22:18 [PATCH] ARM64: smp: Fix cpu_up() racing with sys_reboot Venkata Narendra Kumar Gutta
2018-07-19 22:55 ` Russell King - ARM Linux
2018-07-20  6:23 ` Mark Rutland

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).