linux-kernel.vger.kernel.org archive mirror
* [PATCH] ARM:kexec:offline panic_smp_self_stop CPU
@ 2018-11-01 11:20 Wang Yufen
  2018-11-01 11:34 ` Russell King - ARM Linux
  0 siblings, 1 reply; 5+ messages in thread
From: Wang Yufen @ 2018-11-01 11:20 UTC
  To: linux
  Cc: linux-arm-kernel, linux-kernel, akpm, kstewart, rppt, gregkh,
	tglx, pombredanne, weiyongjun1, huawei.libin, Yufen Wang

From: Yufen Wang <wangyufen@huawei.com>

kdump can fail when panic() is called at the same time on two different
CPUs.
For example:
CPU 0:
  panic()
     __crash_kexec
       machine_crash_shutdown
         crash_smp_send_stop
       machine_kexec
         BUG_ON(num_online_cpus() > 1);

CPU 1:
  panic()
    local_irq_disable
    panic_smp_self_stop

If CPU 1 calls panic_smp_self_stop() before CPU 0 calls
crash_smp_send_stop(), kdump fails: CPU 1 has interrupts disabled, so it
never receives the stop IPI and stays marked online.
Changing the BUG_ON to a WARN in the kexec crash path, as arm64 does,
does not help: kdump still fails because, with num_online_cpus() > 1,
the L2 cache cannot be disabled in _soft_restart.
To fix this, add an ARM implementation of panic_smp_self_stop() that
calls set_cpu_online(smp_processor_id(), false) before spinning.

Signed-off-by: Yufen Wang <wangyufen@huawei.com>
---
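The race in the commit message can be illustrated with a small user-space
model. This is only a sketch, not kernel code: struct cpu_model and the
model_*/simulate_panic_race names are hypothetical, and it assumes that the
stop IPI handler is what normally clears a CPU's online bit and that it
cannot run while that CPU has interrupts disabled.

```c
#include <stdbool.h>

#define NR_CPUS 2

/* Toy per-CPU state; this is NOT the kernel's cpumask API. */
struct cpu_model {
	bool online;        /* stands in for the CPU's bit in cpu_online_mask */
	bool irqs_disabled; /* set by local_irq_disable() in panic() */
};

/* The CPU that loses the panic race: IRQs off, then it spins forever. */
static void model_self_stop(struct cpu_model *cpu, bool mark_offline)
{
	cpu->irqs_disabled = true;
	if (mark_offline)
		cpu->online = false; /* the step this patch adds */
}

/* Assumed behaviour: the stop IPI handler only runs with IRQs enabled. */
static void model_send_stop_ipi(struct cpu_model *cpu)
{
	if (!cpu->irqs_disabled)
		cpu->online = false;
}

static int model_num_online(const struct cpu_model *cpus, int n)
{
	int i, count = 0;

	for (i = 0; i < n; i++)
		if (cpus[i].online)
			count++;
	return count;
}

/*
 * Returns the online count CPU 0 would see at the machine_kexec() check
 * when CPU 1 has already self-stopped before the stop IPI is sent.
 */
int simulate_panic_race(bool mark_offline)
{
	struct cpu_model cpus[NR_CPUS] = {
		{ .online = true, .irqs_disabled = false },
		{ .online = true, .irqs_disabled = false },
	};

	model_self_stop(&cpus[1], mark_offline); /* CPU 1 loses the race */
	model_send_stop_ipi(&cpus[1]);           /* CPU 0 tries to stop it */
	return model_num_online(cpus, NR_CPUS);
}
```

In this model simulate_panic_race(false) leaves two CPUs online, which is
the case that trips BUG_ON(num_online_cpus() > 1), while
simulate_panic_race(true) leaves only the crashing CPU online.
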
 arch/arm/kernel/setup.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 31940bd..151861f 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -602,6 +602,16 @@ static void __init smp_build_mpidr_hash(void)
 }
 #endif
 
+void panic_smp_self_stop(void)
+{
+	printk(KERN_DEBUG "CPU %u will stop doing anything useful since another CPU has panicked\n",
+			smp_processor_id());
+	set_cpu_online(smp_processor_id(), false);
+	while (1)
+		cpu_relax();
+
+}
+
 static void __init setup_processor(void)
 {
 	struct proc_info_list *list;
-- 
2.7.4




* Re: [PATCH] ARM:kexec:offline panic_smp_self_stop CPU
  2018-11-01 11:20 [PATCH] ARM:kexec:offline panic_smp_self_stop CPU Wang Yufen
@ 2018-11-01 11:34 ` Russell King - ARM Linux
  2018-11-02  1:17   ` wangyufen
  2018-11-02  2:31   ` [PATCH v2] " wangyufen
  0 siblings, 2 replies; 5+ messages in thread
From: Russell King - ARM Linux @ 2018-11-01 11:34 UTC
  To: Wang Yufen
  Cc: linux-arm-kernel, linux-kernel, akpm, kstewart, rppt, gregkh,
	tglx, pombredanne, weiyongjun1, huawei.libin

On Thu, Nov 01, 2018 at 07:20:49PM +0800, Wang Yufen wrote:
> From: Yufen Wang <wangyufen@huawei.com>
> 
> In case panic() and panic() called at the same time on different CPUS.
> For example:
> CPU 0:
>   panic()
>      __crash_kexec
>        machine_crash_shutdown
>          crash_smp_send_stop
>        machine_kexec
>          BUG_ON(num_online_cpus() > 1);
> 
> CPU 1:
>   panic()
>     local_irq_disable
>     panic_smp_self_stop
> 
> If CPU 1 calls panic_smp_self_stop() before crash_smp_send_stop(), kdump
> fails. CPU1 can't receive the ipi irq, CPU1 will be always online.
> I changed BUG_ON to WARN in kexec crash as arm64 does, kdump also fails.
> Because num_online_cpus() > 1, can't disable the L2 in _soft_restart.
> To fix this problem, this patch split out the panic_smp_self_stop()
> and add set_cpu_online(smp_processor_id(), false).

Thanks.

I think this may as well go into arch/arm/kernel/smp.c - it won't be
required for single-CPU systems, since there aren't "other" CPUs.

It's probably also worth a comment above the function as to why we
have this.

> 
> Signed-off-by: Yufen Wang <wangyufen@huawei.com>
> ---
>  arch/arm/kernel/setup.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
> index 31940bd..151861f 100644
> --- a/arch/arm/kernel/setup.c
> +++ b/arch/arm/kernel/setup.c
> @@ -602,6 +602,16 @@ static void __init smp_build_mpidr_hash(void)
>  }
>  #endif
>  
> +void panic_smp_self_stop(void)
> +{
> +	printk(KERN_DEBUG "CPU %u will stop doing anything useful since another CPU has paniced\n",
> +			smp_processor_id());
> +	set_cpu_online(smp_processor_id(), false);
> +	while (1)
> +		cpu_relax();
> +
> +}
> +
>  static void __init setup_processor(void)
>  {
>  	struct proc_info_list *list;
> -- 
> 2.7.4
> 
> 

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


* Re: [PATCH] ARM:kexec:offline panic_smp_self_stop CPU
  2018-11-01 11:34 ` Russell King - ARM Linux
@ 2018-11-02  1:17   ` wangyufen
  2018-11-02  2:31   ` [PATCH v2] " wangyufen
  1 sibling, 0 replies; 5+ messages in thread
From: wangyufen @ 2018-11-02  1:17 UTC
  To: Russell King - ARM Linux
  Cc: linux-arm-kernel, linux-kernel, akpm, kstewart, rppt, gregkh,
	tglx, pombredanne, weiyongjun1, huawei.libin

On 2018/11/1 19:34, Russell King - ARM Linux wrote:
> On Thu, Nov 01, 2018 at 07:20:49PM +0800, Wang Yufen wrote:
>> From: Yufen Wang <wangyufen@huawei.com>
>>
>> In case panic() and panic() called at the same time on different CPUS.
>> For example:
>> CPU 0:
>>   panic()
>>      __crash_kexec
>>        machine_crash_shutdown
>>          crash_smp_send_stop
>>        machine_kexec
>>          BUG_ON(num_online_cpus() > 1);
>>
>> CPU 1:
>>   panic()
>>     local_irq_disable
>>     panic_smp_self_stop
>>
>> If CPU 1 calls panic_smp_self_stop() before crash_smp_send_stop(), kdump
>> fails. CPU1 can't receive the ipi irq, CPU1 will be always online.
>> I changed BUG_ON to WARN in kexec crash as arm64 does, kdump also fails.
>> Because num_online_cpus() > 1, can't disable the L2 in _soft_restart.
>> To fix this problem, this patch split out the panic_smp_self_stop()
>> and add set_cpu_online(smp_processor_id(), false).
> Thanks.
>
> I think this may as well go into arch/arm/kernel/smp.c - it won't be
> required for single-CPU systems, since there aren't "other" CPUs.
>
> It's probably also worth a comment above the function as to why we
> have this.

Thanks.

I will send v2.

>> Signed-off-by: Yufen Wang <wangyufen@huawei.com>
>> ---
>>  arch/arm/kernel/setup.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
>> index 31940bd..151861f 100644
>> --- a/arch/arm/kernel/setup.c
>> +++ b/arch/arm/kernel/setup.c
>> @@ -602,6 +602,16 @@ static void __init smp_build_mpidr_hash(void)
>>  }
>>  #endif
>>  
>> +void panic_smp_self_stop(void)
>> +{
>> +	printk(KERN_DEBUG "CPU %u will stop doing anything useful since another CPU has paniced\n",
>> +			smp_processor_id());
>> +	set_cpu_online(smp_processor_id(), false);
>> +	while (1)
>> +		cpu_relax();
>> +
>> +}
>> +
>>  static void __init setup_processor(void)
>>  {
>>  	struct proc_info_list *list;
>> -- 
>> 2.7.4
>>
>>




* [PATCH v2] ARM:kexec:offline panic_smp_self_stop CPU
  2018-11-01 11:34 ` Russell King - ARM Linux
  2018-11-02  1:17   ` wangyufen
@ 2018-11-02  2:31   ` wangyufen
  2018-11-02  9:55     ` Russell King - ARM Linux
  1 sibling, 1 reply; 5+ messages in thread
From: wangyufen @ 2018-11-02  2:31 UTC
  To: Russell King - ARM Linux
  Cc: linux-arm-kernel, linux-kernel, akpm, kstewart, rppt, gregkh,
	tglx, pombredanne, weiyongjun1, huawei.libin, Wangyufen

kdump can fail when panic() is called at the same time on two different
CPUs.
For example:
CPU 0:
  panic()
     __crash_kexec
       machine_crash_shutdown
         crash_smp_send_stop
       machine_kexec
         BUG_ON(num_online_cpus() > 1);

CPU 1:
  panic()
    local_irq_disable
    panic_smp_self_stop

If CPU 1 calls panic_smp_self_stop() before CPU 0 calls
crash_smp_send_stop(), kdump fails: CPU 1 has interrupts disabled, so it
never receives the stop IPI and stays marked online.
To fix this, add an ARM implementation of panic_smp_self_stop() that
calls set_cpu_online(smp_processor_id(), false) before spinning.

Signed-off-by: Yufen Wang <wangyufen@huawei.com>
---
 arch/arm/kernel/smp.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 9000d8b..d7b86e4 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -682,6 +682,21 @@ void smp_send_stop(void)
 		pr_warn("SMP: failed to stop secondary CPUs\n");
 }
 
+/* If panic() runs at the same time on two CPUs, the CPU that loses the
+ * race calls panic_smp_self_stop() with IRQs disabled. It then cannot
+ * receive the stop IPI sent by crash_smp_send_stop(), so it stays
+ * online and kdump fails. Mark the CPU offline here before spinning,
+ * overriding the generic panic_smp_self_stop().
+ */
+void panic_smp_self_stop(void)
+{
+	pr_debug("CPU %u will stop doing anything useful since another CPU has panicked\n",
+	         smp_processor_id());
+	set_cpu_online(smp_processor_id(), false);
+	while (1)
+		cpu_relax();
+}
+
 /*
  * not supported here
  */
-- 
2.7.4




* Re: [PATCH v2] ARM:kexec:offline panic_smp_self_stop CPU
  2018-11-02  2:31   ` [PATCH v2] " wangyufen
@ 2018-11-02  9:55     ` Russell King - ARM Linux
  0 siblings, 0 replies; 5+ messages in thread
From: Russell King - ARM Linux @ 2018-11-02  9:55 UTC
  To: wangyufen
  Cc: linux-arm-kernel, linux-kernel, akpm, kstewart, rppt, gregkh,
	tglx, pombredanne, weiyongjun1, huawei.libin

On Fri, Nov 02, 2018 at 10:31:27AM +0800, wangyufen wrote:
> In case panic() and panic() called at the same time on different CPUS.
> For example:
> CPU 0:
>   panic()
>      __crash_kexec
>        machine_crash_shutdown
>          crash_smp_send_stop
>        machine_kexec
>          BUG_ON(num_online_cpus() > 1);
> 
> CPU 1:
>   panic()
>     local_irq_disable
>     panic_smp_self_stop
> 
> If CPU 1 calls panic_smp_self_stop() before crash_smp_send_stop(), kdump
> fails. CPU1 can't receive the ipi irq, CPU1 will be always online.
> To fix this problem, this patch split out the panic_smp_self_stop()
> and add set_cpu_online(smp_processor_id(), false).

Looks fine now, please send it to the patch system (details in my
signature.)  Thanks.

> 
> Signed-off-by: Yufen Wang <wangyufen@huawei.com>
> ---
>  arch/arm/kernel/smp.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index 9000d8b..d7b86e4 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -682,6 +682,21 @@ void smp_send_stop(void)
>  		pr_warn("SMP: failed to stop secondary CPUs\n");
>  }
>  
> +/* In case panic() and panic() called at the same time on CPU1 and CPU2,
> + * and CPU 1 calls panic_smp_self_stop() before crash_smp_send_stop()
> + * CPU1 can't receive the ipi irqs from CPU2, CPU1 will be always online,
> + * kdump fails. So split out the panic_smp_self_stop() and add
> + * set_cpu_online(smp_processor_id(), false).
> + */
> +void panic_smp_self_stop(void)
> +{
> +	pr_debug("CPU %u will stop doing anything useful since another CPU has paniced\n",
> +	         smp_processor_id());
> +	set_cpu_online(smp_processor_id(), false);
> +	while (1)
> +		cpu_relax();
> +}
> +
>  /*
>   * not supported here
>   */
> -- 
> 2.7.4
> 
> 

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

