All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] powerpc/stacktrace: fix raise_backtrace_ipi() logic
@ 2021-05-27  1:15 Nathan Lynch
  2021-06-02 11:08 ` Michael Ellerman
  0 siblings, 1 reply; 5+ messages in thread
From: Nathan Lynch @ 2021-05-27  1:15 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: npiggin, paulus, clg, ldufour

When smp_send_safe_nmi_ipi() indicates that the target CPU has
responded to the IPI, skip the remote paca inspection
fallback. Otherwise both the sending and target CPUs attempt the
backtrace, usually creating a misleading ("didn't respond to backtrace
IPI" is wrong) and interleaved mess:

[ 1658.929157][    C7] rcu: Stack dump where RCU GP kthread last ran:
[ 1658.929223][    C7] Sending NMI from CPU 7 to CPUs 1:
[ 1658.929303][    C1] NMI backtrace for cpu 1
[ 1658.929303][    C7] CPU 1 didn't respond to backtrace IPI, inspecting paca.
[ 1658.929362][    C1] CPU: 1 PID: 325 Comm: kworker/1:1H Tainted: G        W   E     5.13.0-rc2+ #46
[ 1658.929405][    C7] irq_soft_mask: 0x01 in_mce: 0 in_nmi: 0 current: 325 (kworker/1:1H)
[ 1658.929465][    C1] Workqueue: events_highpri test_work_fn [test_lockup]
[ 1658.929549][    C7] Back trace of paca->saved_r1 (0xc0000000057fb400) (possibly stale):
[ 1658.929592][    C1] NIP:  c00000000002cf50 LR: c008000000820178 CTR: c00000000002cfa0

Verified using the test_lockup module, e.g.

$ echo 5 > /sys/module/rcupdate/parameters/rcu_cpu_stall_timeout
$ insmod test_lockup.ko time_secs=1 iterations=10 state=R lock_rcu \
  touch_softlockup all_cpus

Fixes: 5cc05910f26e ("powerpc/64s: Wire up arch_trigger_cpumask_backtrace()")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
 arch/powerpc/kernel/stacktrace.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
index 1deb1bf331dd..e0ccc5a46d7e 100644
--- a/arch/powerpc/kernel/stacktrace.c
+++ b/arch/powerpc/kernel/stacktrace.c
@@ -174,11 +174,14 @@ static void raise_backtrace_ipi(cpumask_t *mask)
 {
 	unsigned int cpu;
 
+	if (cpumask_test_cpu(smp_processor_id(), mask)) {
+		handle_backtrace_ipi(NULL);
+		cpumask_clear_cpu(smp_processor_id(), mask);
+	}
+
 	for_each_cpu(cpu, mask) {
-		if (cpu == smp_processor_id())
-			handle_backtrace_ipi(NULL);
-		else
-			smp_send_safe_nmi_ipi(cpu, handle_backtrace_ipi, 5 * USEC_PER_SEC);
+		if (smp_send_safe_nmi_ipi(cpu, handle_backtrace_ipi, 5 * USEC_PER_SEC))
+			cpumask_clear_cpu(cpu, mask);
 	}
 
 	for_each_cpu(cpu, mask) {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] powerpc/stacktrace: fix raise_backtrace_ipi() logic
  2021-05-27  1:15 [PATCH] powerpc/stacktrace: fix raise_backtrace_ipi() logic Nathan Lynch
@ 2021-06-02 11:08 ` Michael Ellerman
  2021-06-02 19:44   ` Nathan Lynch
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Ellerman @ 2021-06-02 11:08 UTC (permalink / raw)
  To: Nathan Lynch, linuxppc-dev; +Cc: ldufour, paulus, clg, npiggin

Nathan Lynch <nathanl@linux.ibm.com> writes:
> When smp_send_safe_nmi_ipi() indicates that the target CPU has
> responded to the IPI, skip the remote paca inspection
> fallback. Otherwise both the sending and target CPUs attempt the
> backtrace, usually creating a misleading ("didn't respond to backtrace
> IPI" is wrong) and interleaved mess:

Thanks for fixing my bugs for me :)

> [ 1658.929157][    C7] rcu: Stack dump where RCU GP kthread last ran:
> [ 1658.929223][    C7] Sending NMI from CPU 7 to CPUs 1:
> [ 1658.929303][    C1] NMI backtrace for cpu 1
> [ 1658.929303][    C7] CPU 1 didn't respond to backtrace IPI, inspecting paca.
> [ 1658.929362][    C1] CPU: 1 PID: 325 Comm: kworker/1:1H Tainted: G        W   E     5.13.0-rc2+ #46
> [ 1658.929405][    C7] irq_soft_mask: 0x01 in_mce: 0 in_nmi: 0 current: 325 (kworker/1:1H)
> [ 1658.929465][    C1] Workqueue: events_highpri test_work_fn [test_lockup]
> [ 1658.929549][    C7] Back trace of paca->saved_r1 (0xc0000000057fb400) (possibly stale):
> [ 1658.929592][    C1] NIP:  c00000000002cf50 LR: c008000000820178 CTR: c00000000002cfa0
>
> Verified using the test_lockup module, e.g.
>
> $ echo 5 > /sys/module/rcupdate/parameters/rcu_cpu_stall_timeout
> $ insmod test_lockup.ko time_secs=1 iterations=10 state=R lock_rcu \
>   touch_softlockup all_cpus
>
> Fixes: 5cc05910f26e ("powerpc/64s: Wire up arch_trigger_cpumask_backtrace()")
> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
> ---
>  arch/powerpc/kernel/stacktrace.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
> index 1deb1bf331dd..e0ccc5a46d7e 100644
> --- a/arch/powerpc/kernel/stacktrace.c
> +++ b/arch/powerpc/kernel/stacktrace.c
> @@ -174,11 +174,14 @@ static void raise_backtrace_ipi(cpumask_t *mask)
>  {
>  	unsigned int cpu;
>  
> +	if (cpumask_test_cpu(smp_processor_id(), mask)) {
> +		handle_backtrace_ipi(NULL);
> +		cpumask_clear_cpu(smp_processor_id(), mask);
> +	}
> +
>  	for_each_cpu(cpu, mask) {
> -		if (cpu == smp_processor_id())
> -			handle_backtrace_ipi(NULL);
> -		else
> -			smp_send_safe_nmi_ipi(cpu, handle_backtrace_ipi, 5 * USEC_PER_SEC);
> +		if (smp_send_safe_nmi_ipi(cpu, handle_backtrace_ipi, 5 * USEC_PER_SEC))
> +			cpumask_clear_cpu(cpu, mask);

I think there's still a race here, but instead of causing us to emit a
spurious "didn't respond" trace, it could lead to us failing to emit a
proper trace when we should.

It's hard to follow this code, but mask above is backtrace_mask from
lib/nmi_backtrace.c, because of:

void nmi_trigger_cpumask_backtrace(const cpumask_t *mask,
				   bool exclude_self,
				   void (*raise)(cpumask_t *mask))
{
	int i, this_cpu = get_cpu();

	cpumask_copy(to_cpumask(backtrace_mask), mask);
                                ^^^^^^^^^^^^^^
	...

	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
		pr_info("Sending NMI from CPU %d to CPUs %*pbl:\n",
			this_cpu, nr_cpumask_bits, to_cpumask(backtrace_mask));
		raise(to_cpumask(backtrace_mask));
                                 ^^^^^^^^^^^^^^

And raise there is raise_backtrace_ipi() (the function we're patching).

On the receiving CPU we end up executing:

int smp_handle_nmi_ipi(struct pt_regs *regs)
{
	...
	nmi_ipi_lock_start(&flags);
	if (cpumask_test_cpu(me, &nmi_ipi_pending_mask)) {
		cpumask_clear_cpu(me, &nmi_ipi_pending_mask);
		fn = READ_ONCE(nmi_ipi_function);
                ...
	}
	nmi_ipi_unlock_end(&flags);

	if (fn)
		fn(regs);


The key detail being that we drop the nmi lock before calling fn, which
means the calling CPU can return back to raise_backtrace_ipi() before fn
is called.

In our case fn is handle_backtrace_ipi() which just calls nmi_cpu_backtrace().

Which does:

bool nmi_cpu_backtrace(struct pt_regs *regs)
{
	int cpu = smp_processor_id();

	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
        	...
		pr_warn("NMI backtrace for cpu %d\n", cpu);
		if (regs)
			show_regs(regs);
		else
			dump_stack();


ie. if the CPU has been cleared from backtrace_mask it doesn't emit a
stack trace.

So we could end up with the following interleaving:


CPU0								CPU1
====								====
if (smp_send_safe_nmi_ipi(cpu, handle_backtrace_ipi, ...
   								// smp_handle_nmi_ipi()
								fn = READ_ONCE(nmi_ipi_function);
								...
								nmi_ipi_unlock_end(&flags);
// in smp_send_safe_nmi_ipi()
nmi_ipi_lock();
while (!cpumask_empty(&nmi_ipi_pending_mask)) {
...
nmi_ipi_unlock_end(&flags);
return ret;

	cpumask_clear_cpu(cpu, mask);
			       					fn(regs)
								// -> nmi_cpu_backtrace()
								if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {


But like I said, it's not easy to follow, so maybe I missed something
along the way.


To solve it I think we want to avoid clearing a CPU from the mask unless
we know that the IPI failed for that CPU. That way there's no risk of
suppressing a trace from a CPU that successfully handles the IPI, and we
know we've waited 5 seconds for CPUs that fail to handle the IPI.

I don't think we want to allocate a whole new cpumask to track which
CPUs have failed to respond, but I don't think we need to. We can just
synchronously handle them.

Something like below.

cheers


diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
index 1deb1bf331dd..980e87f7ae7a 100644
--- a/arch/powerpc/kernel/stacktrace.c
+++ b/arch/powerpc/kernel/stacktrace.c
@@ -172,17 +172,19 @@ static void handle_backtrace_ipi(struct pt_regs *regs)
 
 static void raise_backtrace_ipi(cpumask_t *mask)
 {
+	struct paca_struct *p;
 	unsigned int cpu;
 
 	for_each_cpu(cpu, mask) {
-		if (cpu == smp_processor_id())
+		if (cpu == smp_processor_id()) {
 			handle_backtrace_ipi(NULL);
-		else
-			smp_send_safe_nmi_ipi(cpu, handle_backtrace_ipi, 5 * USEC_PER_SEC);
-	}
+			continue;
+		}
 
-	for_each_cpu(cpu, mask) {
-		struct paca_struct *p = paca_ptrs[cpu];
+		if (smp_send_safe_nmi_ipi(cpu, handle_backtrace_ipi, 5 * USEC_PER_SEC))
+			continue;
+
+		p = paca_ptrs[cpu];
 
 		cpumask_clear_cpu(cpu, mask);
 

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] powerpc/stacktrace: fix raise_backtrace_ipi() logic
  2021-06-02 11:08 ` Michael Ellerman
@ 2021-06-02 19:44   ` Nathan Lynch
  2021-06-03  5:34     ` Michael Ellerman
  0 siblings, 1 reply; 5+ messages in thread
From: Nathan Lynch @ 2021-06-02 19:44 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: npiggin, paulus, clg, ldufour, linuxppc-dev

Hi Michael,

Michael Ellerman <mpe@ellerman.id.au> writes:
> Nathan Lynch <nathanl@linux.ibm.com> writes:
>> When smp_send_safe_nmi_ipi() indicates that the target CPU has
>> responded to the IPI, skip the remote paca inspection
>> fallback. Otherwise both the sending and target CPUs attempt the
>> backtrace, usually creating a misleading ("didn't respond to backtrace
>> IPI" is wrong) and interleaved mess:
>
> Thanks for fixing my bugs for me :)
>

Thanks for your review! I was beginning to think I had missed some
subtletly here, thanks for illustrating it.

I'll run with your proposed change below for the problem I'm working.


> To solve it I think we want to avoid clearing a CPU from the mask unless
> we know that the IPI failed for that CPU. That way there's no risk of
> suppressing a trace from a CPU that successfully handles the IPI, and we
> know we've waited 5 seconds for CPUs that fail to handle the IPI.
>
> I don't think we want to allocate a whole new cpumask to track which
> CPUs have failed to respond, but I don't think we need to. We can just
> synchronously handle them.
>
> diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
> index 1deb1bf331dd..980e87f7ae7a 100644
> --- a/arch/powerpc/kernel/stacktrace.c
> +++ b/arch/powerpc/kernel/stacktrace.c
> @@ -172,17 +172,19 @@ static void handle_backtrace_ipi(struct pt_regs *regs)
>  
>  static void raise_backtrace_ipi(cpumask_t *mask)
>  {
> +	struct paca_struct *p;
>  	unsigned int cpu;
>  
>  	for_each_cpu(cpu, mask) {
> -		if (cpu == smp_processor_id())
> +		if (cpu == smp_processor_id()) {
>  			handle_backtrace_ipi(NULL);
> -		else
> -			smp_send_safe_nmi_ipi(cpu, handle_backtrace_ipi, 5 * USEC_PER_SEC);
> -	}
> +			continue;
> +		}
>  
> -	for_each_cpu(cpu, mask) {
> -		struct paca_struct *p = paca_ptrs[cpu];
> +		if (smp_send_safe_nmi_ipi(cpu, handle_backtrace_ipi, 5 * USEC_PER_SEC))
> +			continue;
> +
> +		p = paca_ptrs[cpu];
>  
>  		cpumask_clear_cpu(cpu, mask);
>  

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] powerpc/stacktrace: fix raise_backtrace_ipi() logic
  2021-06-02 19:44   ` Nathan Lynch
@ 2021-06-03  5:34     ` Michael Ellerman
  2021-06-07 21:50       ` Nathan Lynch
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Ellerman @ 2021-06-03  5:34 UTC (permalink / raw)
  To: Nathan Lynch; +Cc: npiggin, paulus, clg, ldufour, linuxppc-dev

Nathan Lynch <nathanl@linux.ibm.com> writes:
> Hi Michael,
>
> Michael Ellerman <mpe@ellerman.id.au> writes:
>> Nathan Lynch <nathanl@linux.ibm.com> writes:
>>> When smp_send_safe_nmi_ipi() indicates that the target CPU has
>>> responded to the IPI, skip the remote paca inspection
>>> fallback. Otherwise both the sending and target CPUs attempt the
>>> backtrace, usually creating a misleading ("didn't respond to backtrace
>>> IPI" is wrong) and interleaved mess:
>>
>> Thanks for fixing my bugs for me :)
>>
>
> Thanks for your review! I was beginning to think I had missed some
> subtletly here, thanks for illustrating it.
>
> I'll run with your proposed change below for the problem I'm working.

Thanks. I did test it a bit with the test_lockup module, but some real
world testing would be good too.

cheers

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] powerpc/stacktrace: fix raise_backtrace_ipi() logic
  2021-06-03  5:34     ` Michael Ellerman
@ 2021-06-07 21:50       ` Nathan Lynch
  0 siblings, 0 replies; 5+ messages in thread
From: Nathan Lynch @ 2021-06-07 21:50 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: npiggin, paulus, clg, ldufour, linuxppc-dev

Michael Ellerman <mpe@ellerman.id.au> writes:
> Nathan Lynch <nathanl@linux.ibm.com> writes:
>> Hi Michael,
>>
>> Michael Ellerman <mpe@ellerman.id.au> writes:
>>> Nathan Lynch <nathanl@linux.ibm.com> writes:
>>>> When smp_send_safe_nmi_ipi() indicates that the target CPU has
>>>> responded to the IPI, skip the remote paca inspection
>>>> fallback. Otherwise both the sending and target CPUs attempt the
>>>> backtrace, usually creating a misleading ("didn't respond to backtrace
>>>> IPI" is wrong) and interleaved mess:
>>>
>>> Thanks for fixing my bugs for me :)
>>>
>>
>> Thanks for your review! I was beginning to think I had missed some
>> subtletly here, thanks for illustrating it.
>>
>> I'll run with your proposed change below for the problem I'm working.
>
> Thanks. I did test it a bit with the test_lockup module, but some real
> world testing would be good too.

Been running with this to work some LPM issues and can confirm it waits
the intended amount of time before falling back to a remote stack walk,
avoiding interleaved traces from source and target CPUs. You can add my
tested-by, thanks.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-06-07 21:51 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-27  1:15 [PATCH] powerpc/stacktrace: fix raise_backtrace_ipi() logic Nathan Lynch
2021-06-02 11:08 ` Michael Ellerman
2021-06-02 19:44   ` Nathan Lynch
2021-06-03  5:34     ` Michael Ellerman
2021-06-07 21:50       ` Nathan Lynch

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.