linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH 0/2] powerpc prevents deadlock in the watchdog path
@ 2021-10-26 16:27 Laurent Dufour
  2021-10-26 16:27 ` [PATCH 1/2] powerpc/watchdog: prevent printk and send IPI while holding the wd lock Laurent Dufour
  2021-10-26 16:27 ` [PATCH 2/2] powerpc/watchdog: ensure watchdog data accesses are protected Laurent Dufour
  0 siblings, 2 replies; 9+ messages in thread
From: Laurent Dufour @ 2021-10-26 16:27 UTC (permalink / raw)
  To: mpe, benh, paulus; +Cc: linuxppc-dev, linux-kernel, npiggin

While doing LPM (Live Partition Mobility) on a large system (for instance
a Brazos system with 1024 CPUs and 12TB of memory) under heavy load (I ran
'stress-ng --futex 500 -vm 5'), watchdog hard lockups are seen when the
hypervisor takes too much time handling the page tables to track page
changes.

When this happens, the system may hang with a deadlock between the
watchdog lock and the console owner lock.

The first patch of this series prevents that deadlock by not calling
printk while holding the watchdog lock, and by not sending an IPI (and
waiting up to 1s for the CPU's answer) while holding the watchdog lock.

The second patch ensures that the watchdog's data are accessed under the
protection of the watchdog lock.

Laurent Dufour (2):
  powerpc/watchdog: prevent printk and send IPI while holding the wd
    lock
  powerpc/watchdog: ensure watchdog data accesses are protected

 arch/powerpc/kernel/watchdog.c | 45 +++++++++++++++++++---------------
 1 file changed, 25 insertions(+), 20 deletions(-)

-- 
2.33.1



* [PATCH 1/2] powerpc/watchdog: prevent printk and send IPI while holding the wd lock
  2021-10-26 16:27 [PATCH 0/2] powerpc prevents deadlock in the watchdog path Laurent Dufour
@ 2021-10-26 16:27 ` Laurent Dufour
  2021-10-27  3:29   ` Nicholas Piggin
  2021-10-26 16:27 ` [PATCH 2/2] powerpc/watchdog: ensure watchdog data accesses are protected Laurent Dufour
  1 sibling, 1 reply; 9+ messages in thread
From: Laurent Dufour @ 2021-10-26 16:27 UTC (permalink / raw)
  To: mpe, benh, paulus; +Cc: linuxppc-dev, linux-kernel, npiggin

When handling the watchdog interrupt, long processing should not be done
while holding the __wd_smp_lock. This prevents the other CPUs from
grabbing it and from processing watchdog timer interrupts. Furthermore,
this could lead to the following situation:

CPU x detects a lockup on CPU y and grabs the __wd_smp_lock
      in watchdog_smp_panic()
CPU y catches the watchdog interrupt and tries to grab the __wd_smp_lock
      in soft_nmi_interrupt()
CPU x waits for CPU y to catch the IPI for 1s in __smp_send_nmi_ipi()
CPU x times out and so has spent 1s waiting while holding the
      __wd_smp_lock.

A deadlock may also happen between the __wd_smp_lock and the console_owner
'lock' this way:
CPU x grabs the console_owner 'lock'
CPU y grabs the __wd_smp_lock
CPU x catches the watchdog timer interrupt and needs to grab __wd_smp_lock
CPU y wants to print something and waits for console_owner
-> deadlock

Doing all the long processing without holding the __wd_smp_lock prevents
these situations.

Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
---
 arch/powerpc/kernel/watchdog.c | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index f9ea0e5357f9..bc7411327066 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -149,6 +149,8 @@ static void set_cpu_stuck(int cpu, u64 tb)
 
 static void watchdog_smp_panic(int cpu, u64 tb)
 {
+	cpumask_t cpus_pending_copy;
+	u64 last_reset_tb_copy;
 	unsigned long flags;
 	int c;
 
@@ -161,29 +163,32 @@ static void watchdog_smp_panic(int cpu, u64 tb)
 	if (cpumask_weight(&wd_smp_cpus_pending) == 0)
 		goto out;
 
+	cpumask_copy(&cpus_pending_copy, &wd_smp_cpus_pending);
+	last_reset_tb_copy = wd_smp_last_reset_tb;
+
+	/* Take the stuck CPUs out of the watch group */
+	set_cpumask_stuck(&wd_smp_cpus_pending, tb);
+
+	wd_smp_unlock(&flags);
+
 	pr_emerg("CPU %d detected hard LOCKUP on other CPUs %*pbl\n",
-		 cpu, cpumask_pr_args(&wd_smp_cpus_pending));
+		 cpu, cpumask_pr_args(&cpus_pending_copy));
 	pr_emerg("CPU %d TB:%lld, last SMP heartbeat TB:%lld (%lldms ago)\n",
-		 cpu, tb, wd_smp_last_reset_tb,
-		 tb_to_ns(tb - wd_smp_last_reset_tb) / 1000000);
+		 cpu, tb, last_reset_tb_copy,
+		 tb_to_ns(tb - last_reset_tb_copy) / 1000000);
 
 	if (!sysctl_hardlockup_all_cpu_backtrace) {
 		/*
 		 * Try to trigger the stuck CPUs, unless we are going to
 		 * get a backtrace on all of them anyway.
 		 */
-		for_each_cpu(c, &wd_smp_cpus_pending) {
+		for_each_cpu(c, &cpus_pending_copy) {
 			if (c == cpu)
 				continue;
 			smp_send_nmi_ipi(c, wd_lockup_ipi, 1000000);
 		}
 	}
 
-	/* Take the stuck CPUs out of the watch group */
-	set_cpumask_stuck(&wd_smp_cpus_pending, tb);
-
-	wd_smp_unlock(&flags);
-
 	if (sysctl_hardlockup_all_cpu_backtrace)
 		trigger_allbutself_cpu_backtrace();
 
@@ -204,6 +209,8 @@ static void wd_smp_clear_cpu_pending(int cpu, u64 tb)
 			unsigned long flags;
 
 			wd_smp_lock(&flags);
+			cpumask_clear_cpu(cpu, &wd_smp_cpus_stuck);
+			wd_smp_unlock(&flags);
 
 			pr_emerg("CPU %d became unstuck TB:%lld\n",
 				 cpu, tb);
@@ -212,9 +219,6 @@ static void wd_smp_clear_cpu_pending(int cpu, u64 tb)
 				show_regs(regs);
 			else
 				dump_stack();
-
-			cpumask_clear_cpu(cpu, &wd_smp_cpus_stuck);
-			wd_smp_unlock(&flags);
 		}
 		return;
 	}
@@ -267,6 +271,7 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
 			return 0;
 		}
 		set_cpu_stuck(cpu, tb);
+		wd_smp_unlock(&flags);
 
 		pr_emerg("CPU %d self-detected hard LOCKUP @ %pS\n",
 			 cpu, (void *)regs->nip);
@@ -277,8 +282,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
 		print_irqtrace_events(current);
 		show_regs(regs);
 
-		wd_smp_unlock(&flags);
-
 		if (sysctl_hardlockup_all_cpu_backtrace)
 			trigger_allbutself_cpu_backtrace();
 
-- 
2.33.1



* [PATCH 2/2] powerpc/watchdog: ensure watchdog data accesses are protected
  2021-10-26 16:27 [PATCH 0/2] powerpc prevents deadlock in the watchdog path Laurent Dufour
  2021-10-26 16:27 ` [PATCH 1/2] powerpc/watchdog: prevent printk and send IPI while holding the wd lock Laurent Dufour
@ 2021-10-26 16:27 ` Laurent Dufour
  2021-10-27  3:48   ` Nicholas Piggin
  1 sibling, 1 reply; 9+ messages in thread
From: Laurent Dufour @ 2021-10-26 16:27 UTC (permalink / raw)
  To: mpe, benh, paulus; +Cc: linuxppc-dev, linux-kernel, npiggin

The wd_smp_cpus_pending CPU mask should be accessed under the protection of
the __wd_smp_lock.

This prevents false alarms from being raised when the system is under
heavy stress. This has been seen while doing LPM on a large system with a
big workload.

Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
---
 arch/powerpc/kernel/watchdog.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index bc7411327066..8d7a1a86187e 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -203,12 +203,13 @@ static void watchdog_smp_panic(int cpu, u64 tb)
 
 static void wd_smp_clear_cpu_pending(int cpu, u64 tb)
 {
+	unsigned long flags;
+
+	wd_smp_lock(&flags);
 	if (!cpumask_test_cpu(cpu, &wd_smp_cpus_pending)) {
 		if (unlikely(cpumask_test_cpu(cpu, &wd_smp_cpus_stuck))) {
 			struct pt_regs *regs = get_irq_regs();
-			unsigned long flags;
 
-			wd_smp_lock(&flags);
 			cpumask_clear_cpu(cpu, &wd_smp_cpus_stuck);
 			wd_smp_unlock(&flags);
 
@@ -219,22 +220,23 @@ static void wd_smp_clear_cpu_pending(int cpu, u64 tb)
 				show_regs(regs);
 			else
 				dump_stack();
+			return;
 		}
+
+		wd_smp_unlock(&flags);
 		return;
 	}
+
 	cpumask_clear_cpu(cpu, &wd_smp_cpus_pending);
 	if (cpumask_empty(&wd_smp_cpus_pending)) {
-		unsigned long flags;
-
-		wd_smp_lock(&flags);
 		if (cpumask_empty(&wd_smp_cpus_pending)) {
 			wd_smp_last_reset_tb = tb;
 			cpumask_andnot(&wd_smp_cpus_pending,
 					&wd_cpus_enabled,
 					&wd_smp_cpus_stuck);
 		}
-		wd_smp_unlock(&flags);
 	}
+	wd_smp_unlock(&flags);
 }
 
 static void watchdog_timer_interrupt(int cpu)
-- 
2.33.1



* Re: [PATCH 1/2] powerpc/watchdog: prevent printk and send IPI while holding the wd lock
  2021-10-26 16:27 ` [PATCH 1/2] powerpc/watchdog: prevent printk and send IPI while holding the wd lock Laurent Dufour
@ 2021-10-27  3:29   ` Nicholas Piggin
  2021-10-27  8:14     ` Laurent Dufour
  2021-10-27  9:49     ` Nicholas Piggin
  0 siblings, 2 replies; 9+ messages in thread
From: Nicholas Piggin @ 2021-10-27  3:29 UTC (permalink / raw)
  To: benh, Laurent Dufour, mpe, paulus; +Cc: linuxppc-dev, linux-kernel

Excerpts from Laurent Dufour's message of October 27, 2021 2:27 am:
> When handling the watchdog interrupt, long processing should not be done
> while holding the __wd_smp_lock. This prevents the other CPUs from
> grabbing it and from processing watchdog timer interrupts. Furthermore,
> this could lead to the following situation:
> 
> CPU x detects a lockup on CPU y and grabs the __wd_smp_lock
>       in watchdog_smp_panic()
> CPU y catches the watchdog interrupt and tries to grab the __wd_smp_lock
>       in soft_nmi_interrupt()
> CPU x waits for CPU y to catch the IPI for 1s in __smp_send_nmi_ipi()

CPU y should get the IPI here if it's an NMI IPI (which will be true
for >= POWER9 64s).

That said, not all platforms support it and the console lock problem
seems real, so okay.

> CPU x times out and so has spent 1s waiting while holding the
>       __wd_smp_lock.
> 
> A deadlock may also happen between the __wd_smp_lock and the console_owner
> 'lock' this way:
> CPU x grabs the console_owner 'lock'
> CPU y grabs the __wd_smp_lock
> CPU x catches the watchdog timer interrupt and needs to grab __wd_smp_lock
> CPU y wants to print something and waits for console_owner
> -> deadlock
> 
> Doing all the long processing without holding the __wd_smp_lock prevents
> these situations.

The intention was to avoid logs getting garbled e.g., if multiple 
different CPUs fire at once.

I wonder if instead we could deal with that by protecting the IPI 
sending and printing stuff with a trylock, and if you don't get the
trylock then just return, and you'll come back with the next timer
interrupt.
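
For instance, an untested sketch of the trylock idea; wd_smp_trylock() is
a hypothetical helper, assuming the existing test_and_set_bit_lock() based
lock word and the wd_smp_lock()/wd_smp_unlock() conventions:

static inline bool wd_smp_trylock(unsigned long *flags)
{
	raw_local_irq_save(*flags);
	hard_irq_disable(); /* Make it soft-NMI safe */
	/* test_and_set_bit_lock() returns the old bit: non-zero means contended */
	if (test_and_set_bit_lock(0, &__wd_smp_lock)) {
		raw_local_irq_restore(*flags);
		return false;
	}
	return true;
}

watchdog_smp_panic() and soft_nmi_interrupt() would then bail out when
wd_smp_trylock() fails, and report on a later timer interrupt instead.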

Thanks,
Nick

> 
> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
> ---
>  arch/powerpc/kernel/watchdog.c | 31 +++++++++++++++++--------------
>  1 file changed, 17 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
> index f9ea0e5357f9..bc7411327066 100644
> --- a/arch/powerpc/kernel/watchdog.c
> +++ b/arch/powerpc/kernel/watchdog.c
> @@ -149,6 +149,8 @@ static void set_cpu_stuck(int cpu, u64 tb)
>  
>  static void watchdog_smp_panic(int cpu, u64 tb)
>  {
> +	cpumask_t cpus_pending_copy;
> +	u64 last_reset_tb_copy;
>  	unsigned long flags;
>  	int c;
>  
> @@ -161,29 +163,32 @@ static void watchdog_smp_panic(int cpu, u64 tb)
>  	if (cpumask_weight(&wd_smp_cpus_pending) == 0)
>  		goto out;
>  
> +	cpumask_copy(&cpus_pending_copy, &wd_smp_cpus_pending);
> +	last_reset_tb_copy = wd_smp_last_reset_tb;
> +
> +	/* Take the stuck CPUs out of the watch group */
> +	set_cpumask_stuck(&wd_smp_cpus_pending, tb);
> +
> +	wd_smp_unlock(&flags);
> +
>  	pr_emerg("CPU %d detected hard LOCKUP on other CPUs %*pbl\n",
> -		 cpu, cpumask_pr_args(&wd_smp_cpus_pending));
> +		 cpu, cpumask_pr_args(&cpus_pending_copy));
>  	pr_emerg("CPU %d TB:%lld, last SMP heartbeat TB:%lld (%lldms ago)\n",
> -		 cpu, tb, wd_smp_last_reset_tb,
> -		 tb_to_ns(tb - wd_smp_last_reset_tb) / 1000000);
> +		 cpu, tb, last_reset_tb_copy,
> +		 tb_to_ns(tb - last_reset_tb_copy) / 1000000);
>  
>  	if (!sysctl_hardlockup_all_cpu_backtrace) {
>  		/*
>  		 * Try to trigger the stuck CPUs, unless we are going to
>  		 * get a backtrace on all of them anyway.
>  		 */
> -		for_each_cpu(c, &wd_smp_cpus_pending) {
> +		for_each_cpu(c, &cpus_pending_copy) {
>  			if (c == cpu)
>  				continue;
>  			smp_send_nmi_ipi(c, wd_lockup_ipi, 1000000);
>  		}
>  	}
>  
> -	/* Take the stuck CPUs out of the watch group */
> -	set_cpumask_stuck(&wd_smp_cpus_pending, tb);
> -
> -	wd_smp_unlock(&flags);
> -
>  	if (sysctl_hardlockup_all_cpu_backtrace)
>  		trigger_allbutself_cpu_backtrace();
>  
> @@ -204,6 +209,8 @@ static void wd_smp_clear_cpu_pending(int cpu, u64 tb)
>  			unsigned long flags;
>  
>  			wd_smp_lock(&flags);
> +			cpumask_clear_cpu(cpu, &wd_smp_cpus_stuck);
> +			wd_smp_unlock(&flags);
>  
>  			pr_emerg("CPU %d became unstuck TB:%lld\n",
>  				 cpu, tb);
> @@ -212,9 +219,6 @@ static void wd_smp_clear_cpu_pending(int cpu, u64 tb)
>  				show_regs(regs);
>  			else
>  				dump_stack();
> -
> -			cpumask_clear_cpu(cpu, &wd_smp_cpus_stuck);
> -			wd_smp_unlock(&flags);
>  		}
>  		return;
>  	}
> @@ -267,6 +271,7 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
>  			return 0;
>  		}
>  		set_cpu_stuck(cpu, tb);
> +		wd_smp_unlock(&flags);
>  
>  		pr_emerg("CPU %d self-detected hard LOCKUP @ %pS\n",
>  			 cpu, (void *)regs->nip);
> @@ -277,8 +282,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
>  		print_irqtrace_events(current);
>  		show_regs(regs);
>  
> -		wd_smp_unlock(&flags);
> -
>  		if (sysctl_hardlockup_all_cpu_backtrace)
>  			trigger_allbutself_cpu_backtrace();
>  
> -- 
> 2.33.1
> 
> 


* Re: [PATCH 2/2] powerpc/watchdog: ensure watchdog data accesses are protected
  2021-10-26 16:27 ` [PATCH 2/2] powerpc/watchdog: ensure watchdog data accesses are protected Laurent Dufour
@ 2021-10-27  3:48   ` Nicholas Piggin
  0 siblings, 0 replies; 9+ messages in thread
From: Nicholas Piggin @ 2021-10-27  3:48 UTC (permalink / raw)
  To: benh, Laurent Dufour, mpe, paulus; +Cc: linuxppc-dev, linux-kernel

Excerpts from Laurent Dufour's message of October 27, 2021 2:27 am:
> The wd_smp_cpus_pending CPU mask should be accessed under the protection of
> the __wd_smp_lock.
> 
> This prevents false alarms from being raised when the system is under
> heavy stress. This has been seen while doing LPM on a large system with
> a big workload.
> 
> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
> ---
>  arch/powerpc/kernel/watchdog.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
> index bc7411327066..8d7a1a86187e 100644
> --- a/arch/powerpc/kernel/watchdog.c
> +++ b/arch/powerpc/kernel/watchdog.c
> @@ -203,12 +203,13 @@ static void watchdog_smp_panic(int cpu, u64 tb)
>  
>  static void wd_smp_clear_cpu_pending(int cpu, u64 tb)
>  {
> +	unsigned long flags;
> +
> +	wd_smp_lock(&flags);
>  	if (!cpumask_test_cpu(cpu, &wd_smp_cpus_pending)) {
>  		if (unlikely(cpumask_test_cpu(cpu, &wd_smp_cpus_stuck))) {
>  			struct pt_regs *regs = get_irq_regs();
> -			unsigned long flags;
>  
> -			wd_smp_lock(&flags);
>  			cpumask_clear_cpu(cpu, &wd_smp_cpus_stuck);
>  			wd_smp_unlock(&flags);
>  
> @@ -219,22 +220,23 @@ static void wd_smp_clear_cpu_pending(int cpu, u64 tb)
>  				show_regs(regs);
>  			else
>  				dump_stack();
> +			return;
>  		}
> +
> +		wd_smp_unlock(&flags);
>  		return;
>  	}
> +
>  	cpumask_clear_cpu(cpu, &wd_smp_cpus_pending);
>  	if (cpumask_empty(&wd_smp_cpus_pending)) {
> -		unsigned long flags;
> -
> -		wd_smp_lock(&flags);
>  		if (cpumask_empty(&wd_smp_cpus_pending)) {
>  			wd_smp_last_reset_tb = tb;
>  			cpumask_andnot(&wd_smp_cpus_pending,
>  					&wd_cpus_enabled,
>  					&wd_smp_cpus_stuck);
>  		}
> -		wd_smp_unlock(&flags);
>  	}
> +	wd_smp_unlock(&flags);

Hmm. I wanted to avoid the lock here because all CPUs will do it on 
every heartbeat timer.

Although maybe when you look at the cacheline transfers required it 
doesn't matter much (and the timer is only once every few seconds).

I guess it's always better to avoid lock-free craziness unless it's
required... so where is the race coming from? I guess 2 CPUs can
clear wd_smp_cpus_pending but not see one another's update so they
both miss cpumask_empty == true! Good catch.

We shouldn't strictly need the wd_smp_lock for the first test, but
that should be an uncommon case, so okay.

Can we clear wd_smp_cpus_pending with a non-atomic operation now?
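
(Untested thought: something like the non-atomic variant, assuming every
writer of the mask now holds __wd_smp_lock:)

	/* All writers of wd_smp_cpus_pending now serialize on
	 * __wd_smp_lock, so the non-atomic RMW should be safe. */
	__cpumask_clear_cpu(cpu, &wd_smp_cpus_pending);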

Thanks,
Nick


* Re: [PATCH 1/2] powerpc/watchdog: prevent printk and send IPI while holding the wd lock
  2021-10-27  3:29   ` Nicholas Piggin
@ 2021-10-27  8:14     ` Laurent Dufour
  2021-10-27  8:51       ` Nicholas Piggin
  2021-10-27  9:49     ` Nicholas Piggin
  1 sibling, 1 reply; 9+ messages in thread
From: Laurent Dufour @ 2021-10-27  8:14 UTC (permalink / raw)
  To: Nicholas Piggin, benh, mpe, paulus; +Cc: linuxppc-dev, linux-kernel

On 27/10/2021 at 05:29, Nicholas Piggin wrote:
> Excerpts from Laurent Dufour's message of October 27, 2021 2:27 am:
>> When handling the watchdog interrupt, long processing should not be done
>> while holding the __wd_smp_lock. This prevents the other CPUs from
>> grabbing it and from processing watchdog timer interrupts. Furthermore,
>> this could lead to the following situation:
>>
>> CPU x detects a lockup on CPU y and grabs the __wd_smp_lock
>>        in watchdog_smp_panic()
>> CPU y catches the watchdog interrupt and tries to grab the __wd_smp_lock
>>        in soft_nmi_interrupt()
>> CPU x waits for CPU y to catch the IPI for 1s in __smp_send_nmi_ipi()
> 
> CPU y should get the IPI here if it's an NMI IPI (which will be true
> for >= POWER9 64s).
> 
> That said, not all platforms support it and the console lock problem
> seems real, so okay.
> 
>> CPU x times out and so has spent 1s waiting while holding the
>>        __wd_smp_lock.
>>
>> A deadlock may also happen between the __wd_smp_lock and the console_owner
>> 'lock' this way:
>> CPU x grabs the console_owner 'lock'
>> CPU y grabs the __wd_smp_lock
>> CPU x catches the watchdog timer interrupt and needs to grab __wd_smp_lock
>> CPU y wants to print something and waits for console_owner
>> -> deadlock
>>
>> Doing all the long processing without holding the __wd_smp_lock prevents
>> these situations.
> 
> The intention was to avoid logs getting garbled e.g., if multiple
> different CPUs fire at once.
> 
> I wonder if instead we could deal with that by protecting the IPI
> sending and printing stuff with a trylock, and if you don't get the
> trylock then just return, and you'll come back with the next timer
> interrupt.

That sounds a bit risky to me, especially on a large system: when the
system goes wrong, all the CPUs may try the lock here.
Furthermore, the operations now done under the lock protection are quite
fast; there is no more spinning like the delay loop done when sending an
IPI.

Protecting the IPI sending is a nightmare if the target CPU later plays
with the lock we are taking during the IPI processing; furthermore, if
the target CPU is not responding, the sending CPU waits for 1s, which
slows the whole system down due to the lock being held.
Since I do a copy of the pending CPU mask and clear it under the lock 
protection, the IPI sending is safe while done without holding the lock.

Regarding the interleaved traces, I don't think this has to be managed down 
here, but rather in the printk/console path.

Cheers,
Laurent.

> 
> Thanks,
> Nick
> 
>>
>> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
>> ---
>>   arch/powerpc/kernel/watchdog.c | 31 +++++++++++++++++--------------
>>   1 file changed, 17 insertions(+), 14 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
>> index f9ea0e5357f9..bc7411327066 100644
>> --- a/arch/powerpc/kernel/watchdog.c
>> +++ b/arch/powerpc/kernel/watchdog.c
>> @@ -149,6 +149,8 @@ static void set_cpu_stuck(int cpu, u64 tb)
>>   
>>   static void watchdog_smp_panic(int cpu, u64 tb)
>>   {
>> +	cpumask_t cpus_pending_copy;
>> +	u64 last_reset_tb_copy;
>>   	unsigned long flags;
>>   	int c;
>>   
>> @@ -161,29 +163,32 @@ static void watchdog_smp_panic(int cpu, u64 tb)
>>   	if (cpumask_weight(&wd_smp_cpus_pending) == 0)
>>   		goto out;
>>   
>> +	cpumask_copy(&cpus_pending_copy, &wd_smp_cpus_pending);
>> +	last_reset_tb_copy = wd_smp_last_reset_tb;
>> +
>> +	/* Take the stuck CPUs out of the watch group */
>> +	set_cpumask_stuck(&wd_smp_cpus_pending, tb);
>> +
>> +	wd_smp_unlock(&flags);
>> +
>>   	pr_emerg("CPU %d detected hard LOCKUP on other CPUs %*pbl\n",
>> -		 cpu, cpumask_pr_args(&wd_smp_cpus_pending));
>> +		 cpu, cpumask_pr_args(&cpus_pending_copy));
>>   	pr_emerg("CPU %d TB:%lld, last SMP heartbeat TB:%lld (%lldms ago)\n",
>> -		 cpu, tb, wd_smp_last_reset_tb,
>> -		 tb_to_ns(tb - wd_smp_last_reset_tb) / 1000000);
>> +		 cpu, tb, last_reset_tb_copy,
>> +		 tb_to_ns(tb - last_reset_tb_copy) / 1000000);
>>   
>>   	if (!sysctl_hardlockup_all_cpu_backtrace) {
>>   		/*
>>   		 * Try to trigger the stuck CPUs, unless we are going to
>>   		 * get a backtrace on all of them anyway.
>>   		 */
>> -		for_each_cpu(c, &wd_smp_cpus_pending) {
>> +		for_each_cpu(c, &cpus_pending_copy) {
>>   			if (c == cpu)
>>   				continue;
>>   			smp_send_nmi_ipi(c, wd_lockup_ipi, 1000000);
>>   		}
>>   	}
>>   
>> -	/* Take the stuck CPUs out of the watch group */
>> -	set_cpumask_stuck(&wd_smp_cpus_pending, tb);
>> -
>> -	wd_smp_unlock(&flags);
>> -
>>   	if (sysctl_hardlockup_all_cpu_backtrace)
>>   		trigger_allbutself_cpu_backtrace();
>>   
>> @@ -204,6 +209,8 @@ static void wd_smp_clear_cpu_pending(int cpu, u64 tb)
>>   			unsigned long flags;
>>   
>>   			wd_smp_lock(&flags);
>> +			cpumask_clear_cpu(cpu, &wd_smp_cpus_stuck);
>> +			wd_smp_unlock(&flags);
>>   
>>   			pr_emerg("CPU %d became unstuck TB:%lld\n",
>>   				 cpu, tb);
>> @@ -212,9 +219,6 @@ static void wd_smp_clear_cpu_pending(int cpu, u64 tb)
>>   				show_regs(regs);
>>   			else
>>   				dump_stack();
>> -
>> -			cpumask_clear_cpu(cpu, &wd_smp_cpus_stuck);
>> -			wd_smp_unlock(&flags);
>>   		}
>>   		return;
>>   	}
>> @@ -267,6 +271,7 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
>>   			return 0;
>>   		}
>>   		set_cpu_stuck(cpu, tb);
>> +		wd_smp_unlock(&flags);
>>   
>>   		pr_emerg("CPU %d self-detected hard LOCKUP @ %pS\n",
>>   			 cpu, (void *)regs->nip);
>> @@ -277,8 +282,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
>>   		print_irqtrace_events(current);
>>   		show_regs(regs);
>>   
>> -		wd_smp_unlock(&flags);
>> -
>>   		if (sysctl_hardlockup_all_cpu_backtrace)
>>   			trigger_allbutself_cpu_backtrace();
>>   
>> -- 
>> 2.33.1
>>
>>



* Re: [PATCH 1/2] powerpc/watchdog: prevent printk and send IPI while holding the wd lock
  2021-10-27  8:14     ` Laurent Dufour
@ 2021-10-27  8:51       ` Nicholas Piggin
  0 siblings, 0 replies; 9+ messages in thread
From: Nicholas Piggin @ 2021-10-27  8:51 UTC (permalink / raw)
  To: benh, Laurent Dufour, mpe, paulus; +Cc: linuxppc-dev, linux-kernel

Excerpts from Laurent Dufour's message of October 27, 2021 6:14 pm:
> On 27/10/2021 at 05:29, Nicholas Piggin wrote:
>> Excerpts from Laurent Dufour's message of October 27, 2021 2:27 am:
>>> When handling the watchdog interrupt, long processing should not be done
>>> while holding the __wd_smp_lock. This prevents the other CPUs from
>>> grabbing it and from processing watchdog timer interrupts. Furthermore,
>>> this could lead to the following situation:
>>>
>>> CPU x detects a lockup on CPU y and grabs the __wd_smp_lock
>>>        in watchdog_smp_panic()
>>> CPU y catches the watchdog interrupt and tries to grab the __wd_smp_lock
>>>        in soft_nmi_interrupt()
>>> CPU x waits for CPU y to catch the IPI for 1s in __smp_send_nmi_ipi()
>> 
>> CPU y should get the IPI here if it's an NMI IPI (which will be true
>> for >= POWER9 64s).
>> 
>> That said, not all platforms support it and the console lock problem
>> seems real, so okay.
>> 
>>> CPU x times out and so has spent 1s waiting while holding the
>>>        __wd_smp_lock.
>>>
>>> A deadlock may also happen between the __wd_smp_lock and the console_owner
>>> 'lock' this way:
>>> CPU x grabs the console_owner 'lock'
>>> CPU y grabs the __wd_smp_lock
>>> CPU x catches the watchdog timer interrupt and needs to grab __wd_smp_lock
>>> CPU y wants to print something and waits for console_owner
>>> -> deadlock
>>>
>>> Doing all the long processing without holding the __wd_smp_lock prevents
>>> these situations.
>> 
>> The intention was to avoid logs getting garbled e.g., if multiple
>> different CPUs fire at once.
>> 
>> I wonder if instead we could deal with that by protecting the IPI
>> sending and printing stuff with a trylock, and if you don't get the
>> trylock then just return, and you'll come back with the next timer
>> interrupt.
> 
> That sounds a bit risky to me, especially on a large system: when the
> system goes wrong, all the CPUs may try the lock here.

That should be okay though, one will get through and the others will 
skip.

> Furthermore, the operations now done under the lock protection are quite
> fast; there is no more spinning like the delay loop done when sending an
> IPI.
> 
> Protecting the IPI sending is a nightmare if the target CPU later plays
> with the lock we are taking during the IPI processing; furthermore, if
> the target CPU is not responding, the sending CPU waits for 1s, which
> slows the whole system down due to the lock being held.
> Since I do a copy of the pending CPU mask and clear it under the lock 
> protection, the IPI sending is safe while done without holding the lock.

Protecting IPI sending basically has all the same issues in the NMI
IPI layer.

> 
> Regarding the interleaved traces, I don't think this has to be managed down 
> here, but rather in the printk/console path.

It can't necessarily be, because part of the problem is actually that an
NMI handler can be interrupted by another NMI IPI: the caller can return
once handlers start running, rather than when they complete.

I don't think it would be an additional nightmare to trylock.

Thanks,
Nick


* Re: [PATCH 1/2] powerpc/watchdog: prevent printk and send IPI while holding the wd lock
  2021-10-27  3:29   ` Nicholas Piggin
  2021-10-27  8:14     ` Laurent Dufour
@ 2021-10-27  9:49     ` Nicholas Piggin
  2021-10-28 15:45       ` Laurent Dufour
  1 sibling, 1 reply; 9+ messages in thread
From: Nicholas Piggin @ 2021-10-27  9:49 UTC (permalink / raw)
  To: benh, Laurent Dufour, mpe, paulus; +Cc: linuxppc-dev, linux-kernel

Excerpts from Nicholas Piggin's message of October 27, 2021 1:29 pm:
> Excerpts from Laurent Dufour's message of October 27, 2021 2:27 am:
>> When handling the watchdog interrupt, long processing should not be done
>> while holding the __wd_smp_lock. This prevents the other CPUs from
>> grabbing it and from processing watchdog timer interrupts. Furthermore,
>> this could lead to the following situation:
>> 
>> CPU x detects a lockup on CPU y and grabs the __wd_smp_lock
>>       in watchdog_smp_panic()
>> CPU y catches the watchdog interrupt and tries to grab the __wd_smp_lock
>>       in soft_nmi_interrupt()
>> CPU x waits for CPU y to catch the IPI for 1s in __smp_send_nmi_ipi()
> 
> CPU y should get the IPI here if it's an NMI IPI (which will be true
> for >= POWER9 64s).
> 
> That said, not all platforms support it and the console lock problem
> seems real, so okay.
> 
>> CPU x times out and so has spent 1s waiting while holding the
>>       __wd_smp_lock.
>> 
>> A deadlock may also happen between the __wd_smp_lock and the console_owner
>> 'lock' this way:
>> CPU x grabs the console_owner 'lock'
>> CPU y grabs the __wd_smp_lock
>> CPU x catches the watchdog timer interrupt and needs to grab __wd_smp_lock
>> CPU y wants to print something and waits for console_owner
>> -> deadlock
>> 
>> Doing all the long processing without holding the __wd_smp_lock prevents
>> these situations.
> 
> The intention was to avoid logs getting garbled e.g., if multiple 
> different CPUs fire at once.
> 
> I wonder if instead we could deal with that by protecting the IPI 
> sending and printing stuff with a trylock, and if you don't get the
> trylock then just return, and you'll come back with the next timer
> interrupt.

Something like this (untested) should basically hold off concurrency on
watchdog panics. It does not serialize output from IPI targets but it
should prevent multiple CPUs trying to send IPIs at once, without 
holding the lock.

---

diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index 2ffeb3f8b5a7..3a0db577da56 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -85,6 +85,7 @@ static DEFINE_PER_CPU(u64, wd_timer_tb);
 
 /* SMP checker bits */
 static unsigned long __wd_smp_lock;
+static unsigned long __wd_printing;
 static cpumask_t wd_smp_cpus_pending;
 static cpumask_t wd_smp_cpus_stuck;
 static u64 wd_smp_last_reset_tb;
@@ -131,10 +132,10 @@ static void wd_lockup_ipi(struct pt_regs *regs)
 	/* Do not panic from here because that can recurse into NMI IPI layer */
 }
 
-static void set_cpumask_stuck(const struct cpumask *cpumask, u64 tb)
+static void set_cpu_stuck(int cpu, u64 tb)
 {
-	cpumask_or(&wd_smp_cpus_stuck, &wd_smp_cpus_stuck, cpumask);
-	cpumask_andnot(&wd_smp_cpus_pending, &wd_smp_cpus_pending, cpumask);
+	cpumask_set_cpu(cpu, &wd_smp_cpus_stuck);
+	cpumask_clear_cpu(cpu, &wd_smp_cpus_pending);
 	if (cpumask_empty(&wd_smp_cpus_pending)) {
 		wd_smp_last_reset_tb = tb;
 		cpumask_andnot(&wd_smp_cpus_pending,
@@ -142,10 +143,6 @@ static void set_cpumask_stuck(const struct cpumask *cpumask, u64 tb)
 				&wd_smp_cpus_stuck);
 	}
 }
-static void set_cpu_stuck(int cpu, u64 tb)
-{
-	set_cpumask_stuck(cpumask_of(cpu), tb);
-}
 
 static void watchdog_smp_panic(int cpu, u64 tb)
 {
@@ -160,6 +157,10 @@ static void watchdog_smp_panic(int cpu, u64 tb)
 		goto out;
 	if (cpumask_weight(&wd_smp_cpus_pending) == 0)
 		goto out;
+	if (__wd_printing)
+		goto out;
+	__wd_printing = 1;
+	wd_smp_unlock(&flags);
 
 	pr_emerg("CPU %d detected hard LOCKUP on other CPUs %*pbl\n",
 		 cpu, cpumask_pr_args(&wd_smp_cpus_pending));
@@ -172,24 +173,31 @@ static void watchdog_smp_panic(int cpu, u64 tb)
 		 * Try to trigger the stuck CPUs, unless we are going to
 		 * get a backtrace on all of them anyway.
 		 */
-		for_each_cpu(c, &wd_smp_cpus_pending) {
+		for_each_online_cpu(c) {
 			if (c == cpu)
 				continue;
+			if (!cpumask_test_cpu(c, &wd_smp_cpus_pending))
+				continue;
+			wd_smp_lock(&flags);
+			if (!cpumask_test_cpu(c, &wd_smp_cpus_pending)) {
+				wd_smp_unlock(&flags);
+				continue;
+			}
+			/* Take the stuck CPU out of the watch group */
+			set_cpu_stuck(c, tb);
+			wd_smp_unlock(&flags);
+
 			smp_send_nmi_ipi(c, wd_lockup_ipi, 1000000);
 		}
 	}
 
-	/* Take the stuck CPUs out of the watch group */
-	set_cpumask_stuck(&wd_smp_cpus_pending, tb);
-
-	wd_smp_unlock(&flags);
-
 	if (sysctl_hardlockup_all_cpu_backtrace)
 		trigger_allbutself_cpu_backtrace();
 
 	if (hardlockup_panic)
 		nmi_panic(NULL, "Hard LOCKUP");
 
+	__wd_printing = 0;
 	return;
 
 out:
@@ -204,8 +212,6 @@ static void wd_smp_clear_cpu_pending(int cpu, u64 tb)
 		if (unlikely(cpumask_test_cpu(cpu, &wd_smp_cpus_stuck))) {
 			struct pt_regs *regs = get_irq_regs();
 
-			wd_smp_lock(&flags);
-
 			pr_emerg("CPU %d became unstuck TB:%lld\n",
 				 cpu, tb);
 			print_irqtrace_events(current);
@@ -214,6 +220,7 @@ static void wd_smp_clear_cpu_pending(int cpu, u64 tb)
 			else
 				dump_stack();
 
+			wd_smp_lock(&flags);
 			cpumask_clear_cpu(cpu, &wd_smp_cpus_stuck);
 			wd_smp_unlock(&flags);
 		}
@@ -263,8 +270,16 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
 			wd_smp_unlock(&flags);
 			return 0;
 		}
+		if (__wd_printing) {
+			wd_smp_unlock(&flags);
+			return 0;
+		}
+		__wd_printing = 1;
+
 		set_cpu_stuck(cpu, tb);
 
+		wd_smp_unlock(&flags);
+
 		pr_emerg("CPU %d self-detected hard LOCKUP @ %pS\n",
 			 cpu, (void *)regs->nip);
 		pr_emerg("CPU %d TB:%lld, last heartbeat TB:%lld (%lldms ago)\n",
@@ -274,13 +289,13 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
 		print_irqtrace_events(current);
 		show_regs(regs);
 
-		wd_smp_unlock(&flags);
-
 		if (sysctl_hardlockup_all_cpu_backtrace)
 			trigger_allbutself_cpu_backtrace();
 
 		if (hardlockup_panic)
 			nmi_panic(regs, "Hard LOCKUP");
+
+		__wd_printing = 0;
 	}
 	if (wd_panic_timeout_tb < 0x7fffffff)
 		mtspr(SPRN_DEC, wd_panic_timeout_tb);


* Re: [PATCH 1/2] powerpc/watchdog: prevent printk and send IPI while holding the wd lock
  2021-10-27  9:49     ` Nicholas Piggin
@ 2021-10-28 15:45       ` Laurent Dufour
  0 siblings, 0 replies; 9+ messages in thread
From: Laurent Dufour @ 2021-10-28 15:45 UTC (permalink / raw)
  To: Nicholas Piggin, benh, mpe, paulus; +Cc: linuxppc-dev, linux-kernel

On 27/10/2021 at 11:49, Nicholas Piggin wrote:
> Excerpts from Nicholas Piggin's message of October 27, 2021 1:29 pm:
>> Excerpts from Laurent Dufour's message of October 27, 2021 2:27 am:
>>> When handling the watchdog interrupt, long processing should not be done
>>> while holding the __wd_smp_lock. This prevents the other CPUs from
>>> grabbing it and from processing watchdog timer interrupts. Furthermore,
>>> this could lead to the following situation:
>>>
>>> CPU x detects a lockup on CPU y and grabs the __wd_smp_lock
>>>        in watchdog_smp_panic()
>>> CPU y catches the watchdog interrupt and tries to grab the __wd_smp_lock
>>>        in soft_nmi_interrupt()
>>> CPU x waits for CPU y to catch the IPI for 1s in __smp_send_nmi_ipi()
>>
>> CPU y should get the IPI here if it's an NMI IPI (which will be true
>> for >= POWER9 64s).
>>
>> That said, not all platforms support it and the console lock problem
>> seems real, so okay.
>>
>>> CPU x times out and so has spent 1s waiting while holding the
>>>        __wd_smp_lock.
>>>
>>> A deadlock may also happen between the __wd_smp_lock and the console_owner
>>> 'lock' this way:
>>> CPU x grabs the console_owner 'lock'
>>> CPU y grabs the __wd_smp_lock
>>> CPU x catches the watchdog timer interrupt and needs to grab __wd_smp_lock
>>> CPU y wants to print something and waits for console_owner
>>> -> deadlock
>>>
>>> Doing all the long processing without holding the __wd_smp_lock prevents
>>> these situations.
>>
>> The intention was to avoid logs getting garbled e.g., if multiple
>> different CPUs fire at once.
>>
>> I wonder if instead we could deal with that by protecting the IPI
>> sending and printing stuff with a trylock, and if you don't get the
>> trylock then just return, and you'll come back with the next timer
>> interrupt.
> 
> Something like this (untested) should basically hold off concurrency on
> watchdog panics. It does not serialize output from IPI targets but it
> should prevent multiple CPUs trying to send IPIs at once, without
> holding the lock.

Got it, I'll rework the patch this way, modulo the minor comments below.

> 
> ---
> 
> diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
> index 2ffeb3f8b5a7..3a0db577da56 100644
> --- a/arch/powerpc/kernel/watchdog.c
> +++ b/arch/powerpc/kernel/watchdog.c
> @@ -85,6 +85,7 @@ static DEFINE_PER_CPU(u64, wd_timer_tb);
>   
>   /* SMP checker bits */
>   static unsigned long __wd_smp_lock;
> +static unsigned long __wd_printing;
>   static cpumask_t wd_smp_cpus_pending;
>   static cpumask_t wd_smp_cpus_stuck;
>   static u64 wd_smp_last_reset_tb;
> @@ -131,10 +132,10 @@ static void wd_lockup_ipi(struct pt_regs *regs)
>   	/* Do not panic from here because that can recurse into NMI IPI layer */
>   }
>   
> -static void set_cpumask_stuck(const struct cpumask *cpumask, u64 tb)
> +static void set_cpu_stuck(int cpu, u64 tb)
>   {
> -	cpumask_or(&wd_smp_cpus_stuck, &wd_smp_cpus_stuck, cpumask);
> -	cpumask_andnot(&wd_smp_cpus_pending, &wd_smp_cpus_pending, cpumask);
> +	cpumask_set_cpu(cpu, &wd_smp_cpus_stuck);
> +	cpumask_clear_cpu(cpu, &wd_smp_cpus_pending);
>   	if (cpumask_empty(&wd_smp_cpus_pending)) {
>   		wd_smp_last_reset_tb = tb;
>   		cpumask_andnot(&wd_smp_cpus_pending,
> @@ -142,10 +143,6 @@ static void set_cpumask_stuck(const struct cpumask *cpumask, u64 tb)
>   				&wd_smp_cpus_stuck);
>   	}
>   }
> -static void set_cpu_stuck(int cpu, u64 tb)
> -{
> -	set_cpumask_stuck(cpumask_of(cpu), tb);
> -}
>   
>   static void watchdog_smp_panic(int cpu, u64 tb)
>   {
> @@ -160,6 +157,10 @@ static void watchdog_smp_panic(int cpu, u64 tb)
>   		goto out;
>   	if (cpumask_weight(&wd_smp_cpus_pending) == 0)
>   		goto out;
> +	if (__wd_printing)
> +		goto out;
> +	__wd_printing = 1;
> +	wd_smp_unlock(&flags);
>   
>   	pr_emerg("CPU %d detected hard LOCKUP on other CPUs %*pbl\n",
>   		 cpu, cpumask_pr_args(&wd_smp_cpus_pending));
> @@ -172,24 +173,31 @@ static void watchdog_smp_panic(int cpu, u64 tb)
>   		 * Try to trigger the stuck CPUs, unless we are going to
>   		 * get a backtrace on all of them anyway.
>   		 */
> -		for_each_cpu(c, &wd_smp_cpus_pending) {
> +		for_each_online_cpu(c) {
>   			if (c == cpu)
>   				continue;
> +			if (!cpumask_test_cpu(c, &wd_smp_cpus_pending))
> +				continue;
> +			wd_smp_lock(&flags);
> +			if (!cpumask_test_cpu(c, &wd_smp_cpus_pending)) {
> +				wd_smp_unlock(&flags);
> +				continue;
> +			}
> +			/* Take the stuck CPU out of the watch group */
> +			set_cpu_stuck(c, tb);
> +			wd_smp_unlock(&flags);
> +
>   			smp_send_nmi_ipi(c, wd_lockup_ipi, 1000000);
>   		}
>   	}
>   
> -	/* Take the stuck CPUs out of the watch group */
> -	set_cpumask_stuck(&wd_smp_cpus_pending, tb);
> -
> -	wd_smp_unlock(&flags);
> -
>   	if (sysctl_hardlockup_all_cpu_backtrace)
>   		trigger_allbutself_cpu_backtrace();
>   
>   	if (hardlockup_panic)
>   		nmi_panic(NULL, "Hard LOCKUP");
>   
> +	__wd_printing = 0;

I think WRITE_ONCE() is required here, to prevent any compiler trick.
Also, I think it might be safer (I don't have a clear example in mind,
just a bad feeling) to do so before the check of hardlockup_panic and the
call to panic.
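
Something like this (an untested sketch of the placement I mean):

	/* Hypothetical reordering: release __wd_printing before the panic
	 * paths; WRITE_ONCE() keeps the compiler from tearing or deferring
	 * the store. */
	WRITE_ONCE(__wd_printing, 0);

	if (sysctl_hardlockup_all_cpu_backtrace)
		trigger_allbutself_cpu_backtrace();

	if (hardlockup_panic)
		nmi_panic(NULL, "Hard LOCKUP");

	return;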

>   	return;
>   
>   out:
> @@ -204,8 +212,6 @@ static void wd_smp_clear_cpu_pending(int cpu, u64 tb)
>   		if (unlikely(cpumask_test_cpu(cpu, &wd_smp_cpus_stuck))) {
>   			struct pt_regs *regs = get_irq_regs();
>   
> -			wd_smp_lock(&flags);
> -
>   			pr_emerg("CPU %d became unstuck TB:%lld\n",
>   				 cpu, tb);
>   			print_irqtrace_events(current);
> @@ -214,6 +220,7 @@ static void wd_smp_clear_cpu_pending(int cpu, u64 tb)
>   			else
>   				dump_stack();
>   
> +			wd_smp_lock(&flags);
>   			cpumask_clear_cpu(cpu, &wd_smp_cpus_stuck);
>   			wd_smp_unlock(&flags);
>   		}
> @@ -263,8 +270,16 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
>   			wd_smp_unlock(&flags);
>   			return 0;
>   		}
> +		if (__wd_printing) {
> +			wd_smp_unlock(&flags);
> +			return 0;

Here the CPU is not flagged as stuck and nothing is printed. We have to
wait for the next soft_nmi_interrupt() to happen for this CPU to detect
itself stuck and provide the backtrace. Depending on the number of CPUs
and the stress on the system, this CPU may not see __wd_printing == 0 when
entering soft_nmi_interrupt() for a long period of time.
I think we should introduce a per-CPU counter: when that counter reaches a
limit (let's say 10), the CPU stops caring about __wd_printing and
continues its processing, as sketched below.
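
Something like this untested sketch (the counter name and the limit of 10
are made up):

static DEFINE_PER_CPU(unsigned int, wd_printing_skipped);

		/* In soft_nmi_interrupt(), instead of always bailing out: */
		if (__wd_printing &&
		    this_cpu_inc_return(wd_printing_skipped) < 10) {
			wd_smp_unlock(&flags);
			return 0;
		}
		this_cpu_write(wd_printing_skipped, 0);
		__wd_printing = 1;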

> +		}
> +		__wd_printing = 1;
> +
>   		set_cpu_stuck(cpu, tb);
>   
> +		wd_smp_unlock(&flags);
> +
>   		pr_emerg("CPU %d self-detected hard LOCKUP @ %pS\n",
>   			 cpu, (void *)regs->nip);
>   		pr_emerg("CPU %d TB:%lld, last heartbeat TB:%lld (%lldms ago)\n",
> @@ -274,13 +289,13 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
>   		print_irqtrace_events(current);
>   		show_regs(regs);
>   
> -		wd_smp_unlock(&flags);
> -
>   		if (sysctl_hardlockup_all_cpu_backtrace)
>   			trigger_allbutself_cpu_backtrace();
>   
>   		if (hardlockup_panic)
>   			nmi_panic(regs, "Hard LOCKUP");
> +
> +		__wd_printing = 0;

I think WRITE_ONCE() is required here, to prevent any compiler trick.

>   	}
>   	if (wd_panic_timeout_tb < 0x7fffffff)
>   		mtspr(SPRN_DEC, wd_panic_timeout_tb);
> 



