linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHv2] kexec: disable cpu hotplug until the rebooting cpu is stable
@ 2022-01-27  9:02 Pingfan Liu
  2022-01-27  9:41 ` Baoquan He
  2022-02-14  2:38 ` Pingfan Liu
  0 siblings, 2 replies; 7+ messages in thread
From: Pingfan Liu @ 2022-01-27  9:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Pingfan Liu, Eric Biederman, Peter Zijlstra, Thomas Gleixner,
	Valentin Schneider, Vincent Donnefort, Ingo Molnar, Mark Rutland,
	YueHaibing, Baokun Li, Randy Dunlap, kexec

The following identical code piece appears in both
migrate_to_reboot_cpu() and smp_shutdown_nonboot_cpus():

	if (!cpu_online(primary_cpu))
		primary_cpu = cpumask_first(cpu_online_mask);

This is due to a breakage like the following:
   migrate_to_reboot_cpu();
   cpu_hotplug_enable();
                          --> comes a cpu_down(this_cpu) on other cpu
   machine_shutdown();

Although the kexec-reboot task can get through a cpu_down() on its cpu,
this code looks a little confusing.

Make things straight forward by keeping cpu hotplug disabled until
smp_shutdown_nonboot_cpus() holds cpu_add_remove_lock. By this way, the
breakage is squashed out and the rebooting cpu can keep unchanged.

Note: this patch only affects the kexec-reboot on arches, which rely on
cpu hotplug mechanism.

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Vincent Donnefort <vincent.donnefort@arm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: YueHaibing <yuehaibing@huawei.com>
Cc: Baokun Li <libaokun1@huawei.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: kexec@lists.infradead.org
To: linux-kernel@vger.kernel.org
---
v1 -> v2:
 improve commit log

 kernel/cpu.c        | 16 ++++++++++------
 kernel/kexec_core.c | 10 ++++------
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 9c92147f0812..87bdf21de950 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1240,20 +1240,24 @@ int remove_cpu(unsigned int cpu)
 }
 EXPORT_SYMBOL_GPL(remove_cpu);
 
+/* primary_cpu keeps unchanged after migrate_to_reboot_cpu() */
 void smp_shutdown_nonboot_cpus(unsigned int primary_cpu)
 {
 	unsigned int cpu;
 	int error;
 
+	/*
+	 * Block other cpu hotplug event, so primary_cpu is always online if
+	 * it is not touched by us
+	 */
 	cpu_maps_update_begin();
-
 	/*
-	 * Make certain the cpu I'm about to reboot on is online.
-	 *
-	 * This is inline to what migrate_to_reboot_cpu() already do.
+	 * migrate_to_reboot_cpu() disables CPU hotplug assuming that
+	 * no further code needs to use CPU hotplug (which is true in
+	 * the reboot case). However, the kexec path depends on using
+	 * CPU hotplug again; so re-enable it here.
 	 */
-	if (!cpu_online(primary_cpu))
-		primary_cpu = cpumask_first(cpu_online_mask);
+	__cpu_hotplug_enable();
 
 	for_each_online_cpu(cpu) {
 		if (cpu == primary_cpu)
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 68480f731192..db4fa6b174e3 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1168,14 +1168,12 @@ int kernel_kexec(void)
 		kexec_in_progress = true;
 		kernel_restart_prepare("kexec reboot");
 		migrate_to_reboot_cpu();
-
 		/*
-		 * migrate_to_reboot_cpu() disables CPU hotplug assuming that
-		 * no further code needs to use CPU hotplug (which is true in
-		 * the reboot case). However, the kexec path depends on using
-		 * CPU hotplug again; so re-enable it here.
+		 * migrate_to_reboot_cpu() disables CPU hotplug. If an arch
+		 * relies on the cpu teardown to achieve reboot, it needs to
+		 * re-enable CPU hotplug there.
 		 */
-		cpu_hotplug_enable();
+
 		pr_notice("Starting new kernel\n");
 		machine_shutdown();
 	}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCHv2] kexec: disable cpu hotplug until the rebooting cpu is stable
  2022-01-27  9:02 [PATCHv2] kexec: disable cpu hotplug until the rebooting cpu is stable Pingfan Liu
@ 2022-01-27  9:41 ` Baoquan He
  2022-01-28  7:41   ` Pingfan Liu
  2022-02-14  2:38 ` Pingfan Liu
  1 sibling, 1 reply; 7+ messages in thread
From: Baoquan He @ 2022-01-27  9:41 UTC (permalink / raw)
  To: Pingfan Liu
  Cc: linux-kernel, Eric Biederman, Peter Zijlstra, Thomas Gleixner,
	Valentin Schneider, Vincent Donnefort, Ingo Molnar, Mark Rutland,
	YueHaibing, Baokun Li, Randy Dunlap, kexec

Hi Pingfan,

On 01/27/22 at 05:02pm, Pingfan Liu wrote:
> The following identical code piece appears in both
> migrate_to_reboot_cpu() and smp_shutdown_nonboot_cpus():
> 
> 	if (!cpu_online(primary_cpu))
> 		primary_cpu = cpumask_first(cpu_online_mask);
> 
> This is due to a breakage like the following:
>    migrate_to_reboot_cpu();
>    cpu_hotplug_enable();
>                           --> comes a cpu_down(this_cpu) on other cpu
>    machine_shutdown();
> 
> Although the kexec-reboot task can get through a cpu_down() on its cpu,
> this code looks a little confusing.
> 
> Make things straight forward by keeping cpu hotplug disabled until
> smp_shutdown_nonboot_cpus() holds cpu_add_remove_lock. By this way, the
> breakage is squashed out and the rebooting cpu can keep unchanged.

If I didn't go through code wrongly, you may miss the x86 case.
Several ARCHes do call smp_shutdown_nonboot_cpus() in machine_shutdown()
in kexec reboot code path, while x86 doesn't. If I am right, you may
need reconsider if this patch is needed or need be adjustd.

Are you optimizing code path, or you meet a real problem? I haven't
checked v1, but I also didn't see it's told in patch log which case it
is.


> 
> Note: this patch only affects the kexec-reboot on arches, which rely on
> cpu hotplug mechanism.
> 
> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> Cc: Eric Biederman <ebiederm@xmission.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Valentin Schneider <valentin.schneider@arm.com>
> Cc: Vincent Donnefort <vincent.donnefort@arm.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: YueHaibing <yuehaibing@huawei.com>
> Cc: Baokun Li <libaokun1@huawei.com>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Valentin Schneider <valentin.schneider@arm.com>
> Cc: kexec@lists.infradead.org
> To: linux-kernel@vger.kernel.org
> ---
> v1 -> v2:
>  improve commit log
> 
>  kernel/cpu.c        | 16 ++++++++++------
>  kernel/kexec_core.c | 10 ++++------
>  2 files changed, 14 insertions(+), 12 deletions(-)
> 
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 9c92147f0812..87bdf21de950 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -1240,20 +1240,24 @@ int remove_cpu(unsigned int cpu)
>  }
>  EXPORT_SYMBOL_GPL(remove_cpu);
>  
> +/* primary_cpu keeps unchanged after migrate_to_reboot_cpu() */
>  void smp_shutdown_nonboot_cpus(unsigned int primary_cpu)
>  {
>  	unsigned int cpu;
>  	int error;
>  
> +	/*
> +	 * Block other cpu hotplug event, so primary_cpu is always online if
> +	 * it is not touched by us
> +	 */
>  	cpu_maps_update_begin();
> -
>  	/*
> -	 * Make certain the cpu I'm about to reboot on is online.
> -	 *
> -	 * This is inline to what migrate_to_reboot_cpu() already do.
> +	 * migrate_to_reboot_cpu() disables CPU hotplug assuming that
> +	 * no further code needs to use CPU hotplug (which is true in
> +	 * the reboot case). However, the kexec path depends on using
> +	 * CPU hotplug again; so re-enable it here.
>  	 */
> -	if (!cpu_online(primary_cpu))
> -		primary_cpu = cpumask_first(cpu_online_mask);
> +	__cpu_hotplug_enable();
>  
>  	for_each_online_cpu(cpu) {
>  		if (cpu == primary_cpu)
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index 68480f731192..db4fa6b174e3 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -1168,14 +1168,12 @@ int kernel_kexec(void)
>  		kexec_in_progress = true;
>  		kernel_restart_prepare("kexec reboot");
>  		migrate_to_reboot_cpu();
> -
>  		/*
> -		 * migrate_to_reboot_cpu() disables CPU hotplug assuming that
> -		 * no further code needs to use CPU hotplug (which is true in
> -		 * the reboot case). However, the kexec path depends on using
> -		 * CPU hotplug again; so re-enable it here.
> +		 * migrate_to_reboot_cpu() disables CPU hotplug. If an arch
> +		 * relies on the cpu teardown to achieve reboot, it needs to
> +		 * re-enable CPU hotplug there.
>  		 */
> -		cpu_hotplug_enable();
> +
>  		pr_notice("Starting new kernel\n");
>  		machine_shutdown();
>  	}
> -- 
> 2.31.1
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCHv2] kexec: disable cpu hotplug until the rebooting cpu is stable
  2022-01-27  9:41 ` Baoquan He
@ 2022-01-28  7:41   ` Pingfan Liu
  2022-02-08  9:33     ` Baoquan He
  0 siblings, 1 reply; 7+ messages in thread
From: Pingfan Liu @ 2022-01-28  7:41 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, Eric Biederman, Peter Zijlstra, Thomas Gleixner,
	Valentin Schneider, Vincent Donnefort, Ingo Molnar, Mark Rutland,
	YueHaibing, Baokun Li, Randy Dunlap, kexec

On Thu, Jan 27, 2022 at 05:41:44PM +0800, Baoquan He wrote:
Hi Baoquan,

Thanks for reviewing, please see comment inlined
> Hi Pingfan,
> 
> On 01/27/22 at 05:02pm, Pingfan Liu wrote:
> > The following identical code piece appears in both
> > migrate_to_reboot_cpu() and smp_shutdown_nonboot_cpus():
> > 
> > 	if (!cpu_online(primary_cpu))
> > 		primary_cpu = cpumask_first(cpu_online_mask);
> > 
> > This is due to a breakage like the following:
> >    migrate_to_reboot_cpu();
> >    cpu_hotplug_enable();
> >                           --> comes a cpu_down(this_cpu) on other cpu
> >    machine_shutdown();
> > 
> > Although the kexec-reboot task can get through a cpu_down() on its cpu,
> > this code looks a little confusing.
> > 
> > Make things straight forward by keeping cpu hotplug disabled until
> > smp_shutdown_nonboot_cpus() holds cpu_add_remove_lock. By this way, the
> > breakage is squashed out and the rebooting cpu can keep unchanged.
> 
> If I didn't go through code wrongly, you may miss the x86 case.
> Several ARCHes do call smp_shutdown_nonboot_cpus() in machine_shutdown()
> in kexec reboot code path, while x86 doesn't. If I am right, you may
> need reconsider if this patch is needed or need be adjustd.
> 
Citing the code piece in kernel_kexec()

                migrate_to_reboot_cpu();

                /*
                 * migrate_to_reboot_cpu() disables CPU hotplug assuming that
                 * no further code needs to use CPU hotplug (which is true in
                 * the reboot case). However, the kexec path depends on using
                 * CPU hotplug again; so re-enable it here.
                 */
                cpu_hotplug_enable();
                pr_notice("Starting new kernel\n");
                machine_shutdown();

So maybe it can be considered in such way: "cpu_hotplug_enable()" is not
needed by x86 and ppc, so this patch removes it, while re-displace it in
a more appropriate place for arm64/riscv ...

> Are you optimizing code path, or you meet a real problem? I haven't
> checked v1, but I also didn't see it's told in patch log which case it
> is.
> 
Simplify the code path and make the logic look straight forward.

And sorry for bad expression. I had thought I expressed it by (citing
git log)

|| The following identical code piece appears in both
                 ^^^^^^^^
|| migrate_to_reboot_cpu() and smp_shutdown_nonboot_cpus():
|| 
|| 	if (!cpu_online(primary_cpu))
|| 		primary_cpu = cpumask_first(cpu_online_mask);
|| 
|| This is due to a breakage like the following:
                    ^^^^^^^^
||    migrate_to_reboot_cpu();
||    cpu_hotplug_enable();
||                           --> comes a cpu_down(this_cpu) on other cpu
||    machine_shutdown();
|| 
|| Although the kexec-reboot task can get through a cpu_down() on its cpu,
                                      ^^^^^^^^^^^
|| this code looks a little confusing.

Should I rephrase it?

Thanks,

	Pingfan

> > 
> > Note: this patch only affects the kexec-reboot on arches, which rely on
> > cpu hotplug mechanism.
> > 
> > Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> > Cc: Eric Biederman <ebiederm@xmission.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Valentin Schneider <valentin.schneider@arm.com>
> > Cc: Vincent Donnefort <vincent.donnefort@arm.com>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: YueHaibing <yuehaibing@huawei.com>
> > Cc: Baokun Li <libaokun1@huawei.com>
> > Cc: Randy Dunlap <rdunlap@infradead.org>
> > Cc: Valentin Schneider <valentin.schneider@arm.com>
> > Cc: kexec@lists.infradead.org
> > To: linux-kernel@vger.kernel.org
> > ---
> > v1 -> v2:
> >  improve commit log
> > 
> >  kernel/cpu.c        | 16 ++++++++++------
> >  kernel/kexec_core.c | 10 ++++------
> >  2 files changed, 14 insertions(+), 12 deletions(-)
> > 
> > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > index 9c92147f0812..87bdf21de950 100644
> > --- a/kernel/cpu.c
> > +++ b/kernel/cpu.c
> > @@ -1240,20 +1240,24 @@ int remove_cpu(unsigned int cpu)
> >  }
> >  EXPORT_SYMBOL_GPL(remove_cpu);
> >  
> > +/* primary_cpu keeps unchanged after migrate_to_reboot_cpu() */
> >  void smp_shutdown_nonboot_cpus(unsigned int primary_cpu)
> >  {
> >  	unsigned int cpu;
> >  	int error;
> >  
> > +	/*
> > +	 * Block other cpu hotplug event, so primary_cpu is always online if
> > +	 * it is not touched by us
> > +	 */
> >  	cpu_maps_update_begin();
> > -
> >  	/*
> > -	 * Make certain the cpu I'm about to reboot on is online.
> > -	 *
> > -	 * This is inline to what migrate_to_reboot_cpu() already do.
> > +	 * migrate_to_reboot_cpu() disables CPU hotplug assuming that
> > +	 * no further code needs to use CPU hotplug (which is true in
> > +	 * the reboot case). However, the kexec path depends on using
> > +	 * CPU hotplug again; so re-enable it here.
> >  	 */
> > -	if (!cpu_online(primary_cpu))
> > -		primary_cpu = cpumask_first(cpu_online_mask);
> > +	__cpu_hotplug_enable();
> >  
> >  	for_each_online_cpu(cpu) {
> >  		if (cpu == primary_cpu)
> > diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> > index 68480f731192..db4fa6b174e3 100644
> > --- a/kernel/kexec_core.c
> > +++ b/kernel/kexec_core.c
> > @@ -1168,14 +1168,12 @@ int kernel_kexec(void)
> >  		kexec_in_progress = true;
> >  		kernel_restart_prepare("kexec reboot");
> >  		migrate_to_reboot_cpu();
> > -
> >  		/*
> > -		 * migrate_to_reboot_cpu() disables CPU hotplug assuming that
> > -		 * no further code needs to use CPU hotplug (which is true in
> > -		 * the reboot case). However, the kexec path depends on using
> > -		 * CPU hotplug again; so re-enable it here.
> > +		 * migrate_to_reboot_cpu() disables CPU hotplug. If an arch
> > +		 * relies on the cpu teardown to achieve reboot, it needs to
> > +		 * re-enable CPU hotplug there.
> >  		 */
> > -		cpu_hotplug_enable();
> > +
> >  		pr_notice("Starting new kernel\n");
> >  		machine_shutdown();
> >  	}
> > -- 
> > 2.31.1
> > 
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCHv2] kexec: disable cpu hotplug until the rebooting cpu is stable
  2022-01-28  7:41   ` Pingfan Liu
@ 2022-02-08  9:33     ` Baoquan He
  2022-02-09  7:31       ` Pingfan Liu
  0 siblings, 1 reply; 7+ messages in thread
From: Baoquan He @ 2022-02-08  9:33 UTC (permalink / raw)
  To: Pingfan Liu
  Cc: linux-kernel, Eric Biederman, Peter Zijlstra, Thomas Gleixner,
	Valentin Schneider, Vincent Donnefort, Ingo Molnar, Mark Rutland,
	YueHaibing, Baokun Li, Randy Dunlap, kexec

On 01/28/22 at 03:41pm, Pingfan Liu wrote:
> On Thu, Jan 27, 2022 at 05:41:44PM +0800, Baoquan He wrote:
> Hi Baoquan,
> 
> Thanks for reviewing, please see comment inlined
> > Hi Pingfan,
> > 
> > On 01/27/22 at 05:02pm, Pingfan Liu wrote:
> > > The following identical code piece appears in both
> > > migrate_to_reboot_cpu() and smp_shutdown_nonboot_cpus():
> > > 
> > > 	if (!cpu_online(primary_cpu))
> > > 		primary_cpu = cpumask_first(cpu_online_mask);
> > > 
> > > This is due to a breakage like the following:
> > >    migrate_to_reboot_cpu();
> > >    cpu_hotplug_enable();
> > >                           --> comes a cpu_down(this_cpu) on other cpu
> > >    machine_shutdown();
> > > 
> > > Although the kexec-reboot task can get through a cpu_down() on its cpu,
> > > this code looks a little confusing.
> > > 
> > > Make things straight forward by keeping cpu hotplug disabled until
> > > smp_shutdown_nonboot_cpus() holds cpu_add_remove_lock. By this way, the
> > > breakage is squashed out and the rebooting cpu can keep unchanged.
> > 
> > If I didn't go through code wrongly, you may miss the x86 case.
> > Several ARCHes do call smp_shutdown_nonboot_cpus() in machine_shutdown()
> > in kexec reboot code path, while x86 doesn't. If I am right, you may
> > need reconsider if this patch is needed or need be adjustd.
> > 
> Citing the code piece in kernel_kexec()
> 
>                 migrate_to_reboot_cpu();
> 
>                 /*
>                  * migrate_to_reboot_cpu() disables CPU hotplug assuming that
>                  * no further code needs to use CPU hotplug (which is true in
>                  * the reboot case). However, the kexec path depends on using
>                  * CPU hotplug again; so re-enable it here.
>                  */
>                 cpu_hotplug_enable();
>                 pr_notice("Starting new kernel\n");
>                 machine_shutdown();
> 
> So maybe it can be considered in such way: "cpu_hotplug_enable()" is not
> needed by x86 and ppc, so this patch removes it, while re-displace it in
> a more appropriate place for arm64/riscv ...

OK, so the thing is:

==
In the current code of kexec, we disable cpu hotplug and check reboot cpu
validity in migrate_to_reboot_cpu(), then enable cpu hotplug. Then in
machine_shutdown()->smp_shutdown_nonboot_cpus(), check the reboot cpu and
disable cpu hotplug again.

In this patch, it disables cpu hotplug in migrate_to_reboot_cpu() and
keep it till entering into smp_shutdown_nonboot_cpus() to shutdown all
other cpu with hotplug mechanism, then disable it again. With this
change, it won't need to double check the reboot cpu validity. 

This change only affect ARCHes relying on hotplug to shutdown cpu before
kexec reboot, e.g arm64, risc-v. Other ARCH like x86 is not affected.
==

Do I got it right?


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCHv2] kexec: disable cpu hotplug until the rebooting cpu is stable
  2022-02-08  9:33     ` Baoquan He
@ 2022-02-09  7:31       ` Pingfan Liu
  2022-02-09  8:44         ` Baoquan He
  0 siblings, 1 reply; 7+ messages in thread
From: Pingfan Liu @ 2022-02-09  7:31 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, Eric Biederman, Peter Zijlstra, Thomas Gleixner,
	Valentin Schneider, Vincent Donnefort, Ingo Molnar, Mark Rutland,
	YueHaibing, Baokun Li, Randy Dunlap, kexec

On Tue, Feb 08, 2022 at 05:33:52PM +0800, Baoquan He wrote:
> On 01/28/22 at 03:41pm, Pingfan Liu wrote:
> > On Thu, Jan 27, 2022 at 05:41:44PM +0800, Baoquan He wrote:
> > Hi Baoquan,
> > 
> > Thanks for reviewing, please see comment inlined
> > > Hi Pingfan,
> > > 
> > > On 01/27/22 at 05:02pm, Pingfan Liu wrote:
> > > > The following identical code piece appears in both
> > > > migrate_to_reboot_cpu() and smp_shutdown_nonboot_cpus():
> > > > 
> > > > 	if (!cpu_online(primary_cpu))
> > > > 		primary_cpu = cpumask_first(cpu_online_mask);
> > > > 
> > > > This is due to a breakage like the following:
> > > >    migrate_to_reboot_cpu();
> > > >    cpu_hotplug_enable();
> > > >                           --> comes a cpu_down(this_cpu) on other cpu
> > > >    machine_shutdown();
> > > > 
> > > > Although the kexec-reboot task can get through a cpu_down() on its cpu,
> > > > this code looks a little confusing.
> > > > 
> > > > Make things straight forward by keeping cpu hotplug disabled until
> > > > smp_shutdown_nonboot_cpus() holds cpu_add_remove_lock. By this way, the
> > > > breakage is squashed out and the rebooting cpu can keep unchanged.
> > > 
> > > If I didn't go through code wrongly, you may miss the x86 case.
> > > Several ARCHes do call smp_shutdown_nonboot_cpus() in machine_shutdown()
> > > in kexec reboot code path, while x86 doesn't. If I am right, you may
> > > need reconsider if this patch is needed or need be adjustd.
> > > 
> > Citing the code piece in kernel_kexec()
> > 
> >                 migrate_to_reboot_cpu();
> > 
> >                 /*
> >                  * migrate_to_reboot_cpu() disables CPU hotplug assuming that
> >                  * no further code needs to use CPU hotplug (which is true in
> >                  * the reboot case). However, the kexec path depends on using
> >                  * CPU hotplug again; so re-enable it here.
> >                  */
> >                 cpu_hotplug_enable();
> >                 pr_notice("Starting new kernel\n");
> >                 machine_shutdown();
> > 
> > So maybe it can be considered in such way: "cpu_hotplug_enable()" is not
> > needed by x86 and ppc, so this patch removes it, while re-displace it in
> > a more appropriate place for arm64/riscv ...
> 
> OK, so the thing is:
> 
> ==
> In the current code of kexec, we disable cpu hotplug and check reboot cpu
> validity in migrate_to_reboot_cpu(), then enable cpu hotplug. Then in
> machine_shutdown()->smp_shutdown_nonboot_cpus(), check the reboot cpu and
> disable cpu hotplug again.
  ^^^
No, there is no any new call to cpu_hotplug_disable() after
migrate_to_reboot_cpu(). smp_shutdown_nonboot_cpus() just leaves
cpu_hotplug_disabled==0 as it is.
> 
> In this patch, it disables cpu hotplug in migrate_to_reboot_cpu() and
> keep it till entering into smp_shutdown_nonboot_cpus() to shutdown all
> other cpu with hotplug mechanism, then disable it again. With this
                                         ^^^
Here is enable, i.e. smp_shutdown_nonboot_cpus() makes
cpu_hotplug_disabled switch from 1 to 0, so the following fn can work
||	static int cpu_down_maps_locked(unsigned int cpu, enum cpuhp_state target)
||	{
||		if (cpu_hotplug_disabled)
||			return -EBUSY;
||		return _cpu_down(cpu, 0, target);
||	}

> change, it won't need to double check the reboot cpu validity. 
> 
> This change only affect ARCHes relying on hotplug to shutdown cpu before
> kexec reboot, e.g arm64, risc-v. Other ARCH like x86 is not affected.

Yes.
> ==
> 
> Do I got it right?
> 
Here, neither the original code nor this patch has another
cpu_hotplug_disable(), the only change is the shift of cpu_hotplug_enable() to an onwards place.

And I think you should get my idea except this.

Thanks,

	Pingfan

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCHv2] kexec: disable cpu hotplug until the rebooting cpu is stable
  2022-02-09  7:31       ` Pingfan Liu
@ 2022-02-09  8:44         ` Baoquan He
  0 siblings, 0 replies; 7+ messages in thread
From: Baoquan He @ 2022-02-09  8:44 UTC (permalink / raw)
  To: Pingfan Liu
  Cc: linux-kernel, Eric Biederman, Peter Zijlstra, Thomas Gleixner,
	Valentin Schneider, Vincent Donnefort, Ingo Molnar, Mark Rutland,
	YueHaibing, Baokun Li, Randy Dunlap, kexec

On 02/09/22 at 03:31pm, Pingfan Liu wrote:
> On Tue, Feb 08, 2022 at 05:33:52PM +0800, Baoquan He wrote:
> > On 01/28/22 at 03:41pm, Pingfan Liu wrote:
> > > On Thu, Jan 27, 2022 at 05:41:44PM +0800, Baoquan He wrote:
> > > Hi Baoquan,
> > > 
> > > Thanks for reviewing, please see comment inlined
> > > > Hi Pingfan,
> > > > 
> > > > On 01/27/22 at 05:02pm, Pingfan Liu wrote:
> > > > > The following identical code piece appears in both
> > > > > migrate_to_reboot_cpu() and smp_shutdown_nonboot_cpus():
> > > > > 
> > > > > 	if (!cpu_online(primary_cpu))
> > > > > 		primary_cpu = cpumask_first(cpu_online_mask);
> > > > > 
> > > > > This is due to a breakage like the following:
> > > > >    migrate_to_reboot_cpu();
> > > > >    cpu_hotplug_enable();
> > > > >                           --> comes a cpu_down(this_cpu) on other cpu
> > > > >    machine_shutdown();
> > > > > 
> > > > > Although the kexec-reboot task can get through a cpu_down() on its cpu,
> > > > > this code looks a little confusing.
> > > > > 
> > > > > Make things straight forward by keeping cpu hotplug disabled until
> > > > > smp_shutdown_nonboot_cpus() holds cpu_add_remove_lock. By this way, the
> > > > > breakage is squashed out and the rebooting cpu can keep unchanged.
> > > > 
> > > > If I didn't go through code wrongly, you may miss the x86 case.
> > > > Several ARCHes do call smp_shutdown_nonboot_cpus() in machine_shutdown()
> > > > in kexec reboot code path, while x86 doesn't. If I am right, you may
> > > > need reconsider if this patch is needed or need be adjustd.
> > > > 
> > > Citing the code piece in kernel_kexec()
> > > 
> > >                 migrate_to_reboot_cpu();
> > > 
> > >                 /*
> > >                  * migrate_to_reboot_cpu() disables CPU hotplug assuming that
> > >                  * no further code needs to use CPU hotplug (which is true in
> > >                  * the reboot case). However, the kexec path depends on using
> > >                  * CPU hotplug again; so re-enable it here.
> > >                  */
> > >                 cpu_hotplug_enable();
> > >                 pr_notice("Starting new kernel\n");
> > >                 machine_shutdown();
> > > 
> > > So maybe it can be considered in such way: "cpu_hotplug_enable()" is not
> > > needed by x86 and ppc, so this patch removes it, while re-displace it in
> > > a more appropriate place for arm64/riscv ...
> > 
> > OK, so the thing is:
> > 
> > ==
> > In the current code of kexec, we disable cpu hotplug and check reboot cpu
> > validity in migrate_to_reboot_cpu(), then enable cpu hotplug. Then in
> > machine_shutdown()->smp_shutdown_nonboot_cpus(), check the reboot cpu and
> > disable cpu hotplug again.
>   ^^^
> No, there is no any new call to cpu_hotplug_disable() after
> migrate_to_reboot_cpu(). smp_shutdown_nonboot_cpus() just leaves
> cpu_hotplug_disabled==0 as it is.

Hmm, there's 'cpu_hotplug_disabled++;' at the end of
smp_shutdown_nonboot_cpus(). It's disabling cpu hotplug again, I think.

> > 
> > In this patch, it disables cpu hotplug in migrate_to_reboot_cpu() and
> > keep it till entering into smp_shutdown_nonboot_cpus() to shutdown all
> > other cpu with hotplug mechanism, then disable it again. With this
>                                          ^^^
> Here is enable, i.e. smp_shutdown_nonboot_cpus() makes
> cpu_hotplug_disabled switch from 1 to 0, so the following fn can work

Right, you keep the cpu hotplug disabled till in
smp_shutdown_nonboot_cpus(), then enable it to shut down all other cpus.

> ||	static int cpu_down_maps_locked(unsigned int cpu, enum cpuhp_state target)
> ||	{
> ||		if (cpu_hotplug_disabled)
> ||			return -EBUSY;
> ||		return _cpu_down(cpu, 0, target);
> ||	}
> 
> > change, it won't need to double check the reboot cpu validity. 
> > 
> > This change only affect ARCHes relying on hotplug to shutdown cpu before
> > kexec reboot, e.g arm64, risc-v. Other ARCH like x86 is not affected.
> 
> Yes.
> > ==
> > 
> > Do I got it right?
> > 
> Here, neither the original code nor this patch has another
> cpu_hotplug_disable(), the only change is the shift of cpu_hotplug_enable() to an onwards place.
> 
> And I think you should get my idea except this.
> 
> Thanks,
> 
> 	Pingfan
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCHv2] kexec: disable cpu hotplug until the rebooting cpu is stable
  2022-01-27  9:02 [PATCHv2] kexec: disable cpu hotplug until the rebooting cpu is stable Pingfan Liu
  2022-01-27  9:41 ` Baoquan He
@ 2022-02-14  2:38 ` Pingfan Liu
  1 sibling, 0 replies; 7+ messages in thread
From: Pingfan Liu @ 2022-02-14  2:38 UTC (permalink / raw)
  To: LKML
  Cc: Eric Biederman, Peter Zijlstra, Thomas Gleixner,
	Valentin Schneider, Vincent Donnefort, Ingo Molnar, Mark Rutland,
	YueHaibing, Baokun Li, Randy Dunlap, Kexec Mailing List

Gently ping, maintainers, could you share your opinions?


Thanks

On Thu, Jan 27, 2022 at 5:02 PM Pingfan Liu <kernelfans@gmail.com> wrote:
>
> The following identical code piece appears in both
> migrate_to_reboot_cpu() and smp_shutdown_nonboot_cpus():
>
>         if (!cpu_online(primary_cpu))
>                 primary_cpu = cpumask_first(cpu_online_mask);
>
> This is due to a breakage like the following:
>    migrate_to_reboot_cpu();
>    cpu_hotplug_enable();
>                           --> comes a cpu_down(this_cpu) on other cpu
>    machine_shutdown();
>
> Although the kexec-reboot task can get through a cpu_down() on its cpu,
> this code looks a little confusing.
>
> Make things straight forward by keeping cpu hotplug disabled until
> smp_shutdown_nonboot_cpus() holds cpu_add_remove_lock. By this way, the
> breakage is squashed out and the rebooting cpu can keep unchanged.
>
> Note: this patch only affects the kexec-reboot on arches, which rely on
> cpu hotplug mechanism.
>
> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> Cc: Eric Biederman <ebiederm@xmission.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Valentin Schneider <valentin.schneider@arm.com>
> Cc: Vincent Donnefort <vincent.donnefort@arm.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: YueHaibing <yuehaibing@huawei.com>
> Cc: Baokun Li <libaokun1@huawei.com>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Valentin Schneider <valentin.schneider@arm.com>
> Cc: kexec@lists.infradead.org
> To: linux-kernel@vger.kernel.org
> ---
> v1 -> v2:
>  improve commit log
>
>  kernel/cpu.c        | 16 ++++++++++------
>  kernel/kexec_core.c | 10 ++++------
>  2 files changed, 14 insertions(+), 12 deletions(-)
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 9c92147f0812..87bdf21de950 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -1240,20 +1240,24 @@ int remove_cpu(unsigned int cpu)
>  }
>  EXPORT_SYMBOL_GPL(remove_cpu);
>
> +/* primary_cpu keeps unchanged after migrate_to_reboot_cpu() */
>  void smp_shutdown_nonboot_cpus(unsigned int primary_cpu)
>  {
>         unsigned int cpu;
>         int error;
>
> +       /*
> +        * Block other cpu hotplug event, so primary_cpu is always online if
> +        * it is not touched by us
> +        */
>         cpu_maps_update_begin();
> -
>         /*
> -        * Make certain the cpu I'm about to reboot on is online.
> -        *
> -        * This is inline to what migrate_to_reboot_cpu() already do.
> +        * migrate_to_reboot_cpu() disables CPU hotplug assuming that
> +        * no further code needs to use CPU hotplug (which is true in
> +        * the reboot case). However, the kexec path depends on using
> +        * CPU hotplug again; so re-enable it here.
>          */
> -       if (!cpu_online(primary_cpu))
> -               primary_cpu = cpumask_first(cpu_online_mask);
> +       __cpu_hotplug_enable();
>
>         for_each_online_cpu(cpu) {
>                 if (cpu == primary_cpu)
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index 68480f731192..db4fa6b174e3 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -1168,14 +1168,12 @@ int kernel_kexec(void)
>                 kexec_in_progress = true;
>                 kernel_restart_prepare("kexec reboot");
>                 migrate_to_reboot_cpu();
> -
>                 /*
> -                * migrate_to_reboot_cpu() disables CPU hotplug assuming that
> -                * no further code needs to use CPU hotplug (which is true in
> -                * the reboot case). However, the kexec path depends on using
> -                * CPU hotplug again; so re-enable it here.
> +                * migrate_to_reboot_cpu() disables CPU hotplug. If an arch
> +                * relies on the cpu teardown to achieve reboot, it needs to
> +                * re-enable CPU hotplug there.
>                  */
> -               cpu_hotplug_enable();
> +
>                 pr_notice("Starting new kernel\n");
>                 machine_shutdown();
>         }
> --
> 2.31.1
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-02-14  2:38 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-27  9:02 [PATCHv2] kexec: disable cpu hotplug until the rebooting cpu is stable Pingfan Liu
2022-01-27  9:41 ` Baoquan He
2022-01-28  7:41   ` Pingfan Liu
2022-02-08  9:33     ` Baoquan He
2022-02-09  7:31       ` Pingfan Liu
2022-02-09  8:44         ` Baoquan He
2022-02-14  2:38 ` Pingfan Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).