linux-kernel.vger.kernel.org archive mirror
* [PATCH v4 0/2] rcu: Display registers of self-detected stall as far as possible
@ 2022-08-04  2:34 Zhen Lei
  2022-08-04  2:34 ` [PATCH v4 1/2] sched/debug: Try trigger_single_cpu_backtrace(cpu) in dump_cpu_task() Zhen Lei
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Zhen Lei @ 2022-08-04  2:34 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, linux-kernel,
	Paul E . McKenney, Frederic Weisbecker, Neeraj Upadhyay,
	Josh Triplett, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes,
	rcu
  Cc: Zhen Lei

v3 --> v4:
1. To avoid undo/redo, merge patches 1-2 of v3 into one.

v2 --> v3:
1. Patch 1: add trigger_single_cpu_backtrace(cpu) in synchronize_rcu_expedited_wait().
   After that, all callers of dump_cpu_task() try
   trigger_single_cpu_backtrace() first, so Patch 2 does the cleanup.
2. Patch 3: at Paul E. McKenney's suggestion, push the code into dump_cpu_task().

For newcomers:
Currently, dump_cpu_task() is mainly used by RCU to dump the stack
trace of the current task on the specified CPU when an RCU stall is
detected.

On architectures that do not support NMIs, registers are not printed
when an RCU stall is self-detected. This patch series improves that.


v2:
https://lkml.org/lkml/2022/7/27/1800

Zhen Lei (2):
  sched/debug: Try trigger_single_cpu_backtrace(cpu) in dump_cpu_task()
  sched/debug: Show the registers of 'current' in dump_cpu_task()

 kernel/rcu/tree_stall.h |  8 +++-----
 kernel/sched/core.c     | 14 ++++++++++++++
 kernel/smp.c            |  3 +--
 3 files changed, 18 insertions(+), 7 deletions(-)

-- 
2.25.1



* [PATCH v4 1/2] sched/debug: Try trigger_single_cpu_backtrace(cpu) in dump_cpu_task()
  2022-08-04  2:34 [PATCH v4 0/2] rcu: Display registers of self-detected stall as far as possible Zhen Lei
@ 2022-08-04  2:34 ` Zhen Lei
  2022-08-04  2:34 ` [PATCH v4 2/2] sched/debug: Show the registers of 'current' " Zhen Lei
  2022-08-04 18:09 ` [PATCH v4 0/2] rcu: Display registers of self-detected stall as far as possible Paul E. McKenney
  2 siblings, 0 replies; 6+ messages in thread
From: Zhen Lei @ 2022-08-04  2:34 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, linux-kernel,
	Paul E . McKenney, Frederic Weisbecker, Neeraj Upadhyay,
	Josh Triplett, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes,
	rcu
  Cc: Zhen Lei

The trigger_single_cpu_backtrace() function uses an NMI to dump the stack
trace of another CPU, so it should really be one of the ways to implement
dump_cpu_task(). Try it first in dump_cpu_task(). At the same time, this
eliminates the duplicated code in the upper-layer callers.

There is also a call to dump_cpu_task() in
synchronize_rcu_expedited_wait(), which should also try to use an NMI to
dump the stack trace first. It gets that behavior as a result of this
change, so the call site itself is left unchanged.
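
The refactoring described above can be modeled in user space. The sketch
below is not kernel code: the helpers are hypothetical stand-ins that only
count which path ran, and arch_has_nmi_backtrace is an assumed switch for
"this architecture can send a backtrace NMI to a single CPU". Under those
assumptions it checks that the old call-site pattern and the new
dump_cpu_task() choose the same path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; they only count events. */
static bool arch_has_nmi_backtrace;	/* assumed arch capability switch */
static int nmi_backtraces;		/* NMI backtraces requested */
static int task_dumps;			/* sched_show_task() fallbacks taken */

/* Models the kernel helper: true means the backtrace was triggered. */
static bool trigger_single_cpu_backtrace(int cpu)
{
	(void)cpu;
	if (!arch_has_nmi_backtrace)
		return false;
	nmi_backtraces++;
	return true;
}

/* Models the "Task dump for CPU %d" fallback. */
static void sched_show_task_stub(int cpu)
{
	(void)cpu;
	task_dumps++;
}

/* Old shape: every caller open-codes the NMI attempt. */
static void old_call_site(int cpu)
{
	if (!trigger_single_cpu_backtrace(cpu))
		sched_show_task_stub(cpu);	/* old dump_cpu_task() body */
}

/* New shape: the NMI attempt lives inside dump_cpu_task(). */
static void new_dump_cpu_task(int cpu)
{
	if (trigger_single_cpu_backtrace(cpu))
		return;
	sched_show_task_stub(cpu);
}

/* Runs one variant and reports the path: 'N' for NMI, 'T' for task dump. */
static char path_taken(void (*dump)(int), bool has_nmi)
{
	arch_has_nmi_backtrace = has_nmi;
	nmi_backtraces = 0;
	task_dumps = 0;
	dump(1);
	assert(nmi_backtraces + task_dumps == 1);	/* exactly one path ran */
	return nmi_backtraces ? 'N' : 'T';
}

/* Both shapes must agree with and without NMI support. */
static void check_call_sites_agree(void)
{
	assert(path_taken(old_call_site, false) == path_taken(new_dump_cpu_task, false));
	assert(path_taken(old_call_site, true) == path_taken(new_dump_cpu_task, true));
}
```

Calling check_call_sites_agree() from a test main() exercises both cases.
A caller that previously invoked dump_cpu_task() directly, without the NMI
attempt, would behave like new_dump_cpu_task() after the change.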

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 kernel/rcu/tree_stall.h | 8 +++-----
 kernel/sched/core.c     | 3 +++
 kernel/smp.c            | 3 +--
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index a001e1e7a99269c..80749d257ac2f78 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -368,7 +368,7 @@ static void rcu_dump_cpu_stacks(void)
 			if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) {
 				if (cpu_is_offline(cpu))
 					pr_err("Offline CPU %d blocking current GP.\n", cpu);
-				else if (!trigger_single_cpu_backtrace(cpu))
+				else
 					dump_cpu_task(cpu);
 			}
 		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
@@ -486,8 +486,7 @@ static void rcuc_kthread_dump(struct rcu_data *rdp)
 
 	pr_err("%s kthread starved for %ld jiffies\n", rcuc->comm, j);
 	sched_show_task(rcuc);
-	if (!trigger_single_cpu_backtrace(cpu))
-		dump_cpu_task(cpu);
+	dump_cpu_task(cpu);
 }
 
 /* Complain about starvation of grace-period kthread.  */
@@ -515,8 +514,7 @@ static void rcu_check_gp_kthread_starvation(void)
 					pr_err("RCU GP kthread last ran on offline CPU %d.\n", cpu);
 				} else  {
 					pr_err("Stack dump where RCU GP kthread last ran:\n");
-					if (!trigger_single_cpu_backtrace(cpu))
-						dump_cpu_task(cpu);
+					dump_cpu_task(cpu);
 				}
 			}
 			wake_up_process(gpk);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index df8fe433642fa30..0e82073020bf0d1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -11145,6 +11145,9 @@ struct cgroup_subsys cpu_cgrp_subsys = {
 
 void dump_cpu_task(int cpu)
 {
+	if (trigger_single_cpu_backtrace(cpu))
+		return;
+
 	pr_info("Task dump for CPU %d:\n", cpu);
 	sched_show_task(cpu_curr(cpu));
 }
diff --git a/kernel/smp.c b/kernel/smp.c
index dd215f439426449..56ca958364aebeb 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -370,8 +370,7 @@ static bool csd_lock_wait_toolong(struct __call_single_data *csd, u64 ts0, u64 *
 	if (cpu >= 0) {
 		if (static_branch_unlikely(&csdlock_debug_extended))
 			csd_lock_print_extended(csd, cpu);
-		if (!trigger_single_cpu_backtrace(cpu))
-			dump_cpu_task(cpu);
+		dump_cpu_task(cpu);
 		if (!cpu_cur_csd) {
 			pr_alert("csd: Re-sending CSD lock (#%d) IPI from CPU#%02d to CPU#%02d\n", *bug_id, raw_smp_processor_id(), cpu);
 			arch_send_call_function_single_ipi(cpu);
-- 
2.25.1



* [PATCH v4 2/2] sched/debug: Show the registers of 'current' in dump_cpu_task()
  2022-08-04  2:34 [PATCH v4 0/2] rcu: Display registers of self-detected stall as far as possible Zhen Lei
  2022-08-04  2:34 ` [PATCH v4 1/2] sched/debug: Try trigger_single_cpu_backtrace(cpu) in dump_cpu_task() Zhen Lei
@ 2022-08-04  2:34 ` Zhen Lei
  2022-08-04 18:09 ` [PATCH v4 0/2] rcu: Display registers of self-detected stall as far as possible Paul E. McKenney
  2 siblings, 0 replies; 6+ messages in thread
From: Zhen Lei @ 2022-08-04  2:34 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, linux-kernel,
	Paul E . McKenney, Frederic Weisbecker, Neeraj Upadhyay,
	Josh Triplett, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes,
	rcu
  Cc: Zhen Lei

On architectures that do not support NMI, registers are not printed.
However, this information is useful for analyzing the root cause of the
fault. Fortunately, when the stack trace of 'current' is dumped from an
interrupt handler, we can obtain the registers through get_irq_regs() and
display them through show_regs(). Further, show_regs() unwinds the call
trace based on 'regs', so the worthless call trace associated with
interrupt handling is omitted, which helps us focus on the problem. By
the way, on architectures that support NMI, this also avoids generating
an unnecessary NMI in this case.
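
The resulting order of attempts in dump_cpu_task() can be written down as
a small user-space decision function. This is only a sketch: struct
dump_ctx, enum dump_method and pick_dump_method() are hypothetical names
invented for illustration, and each field merely models the kernel call
named in its comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state dump_cpu_task() consults. */
struct dump_ctx {
	int this_cpu;		/* models smp_processor_id() */
	bool in_hardirq;	/* models in_hardirq() */
	const void *irq_regs;	/* models get_irq_regs(); NULL if no frame saved */
	bool nmi_backtrace_ok;	/* models trigger_single_cpu_backtrace() succeeding */
};

enum dump_method {
	DUMP_SHOW_REGS,		/* show_regs(get_irq_regs()) */
	DUMP_NMI_BACKTRACE,	/* trigger_single_cpu_backtrace(cpu) */
	DUMP_SCHED_SHOW_TASK,	/* sched_show_task(cpu_curr(cpu)) */
};

/* Mirrors the order of checks in the patched dump_cpu_task(). */
static enum dump_method pick_dump_method(const struct dump_ctx *ctx, int cpu)
{
	/* Self-detected case: the interrupted registers are already at hand. */
	if (cpu == ctx->this_cpu && ctx->in_hardirq && ctx->irq_regs != NULL)
		return DUMP_SHOW_REGS;

	/* Otherwise try an NMI backtrace on the target CPU. */
	if (ctx->nmi_backtrace_ok)
		return DUMP_NMI_BACKTRACE;

	/* Last resort: dump the task from the detecting CPU. */
	return DUMP_SCHED_SHOW_TASK;
}

/* Example: a self-detected stall on CPU 0 of an NMI-less system. */
static void self_check(void)
{
	int frame = 0;
	struct dump_ctx ctx = {
		.this_cpu = 0,
		.in_hardirq = true,
		.irq_regs = &frame,
		.nmi_backtrace_ok = false,
	};

	assert(pick_dump_method(&ctx, 0) == DUMP_SHOW_REGS);
	assert(pick_dump_method(&ctx, 1) == DUMP_SCHED_SHOW_TASK);
}
```

In this model the register path is taken before the NMI path, which
matches the remark that no NMI is generated for the self-detected case.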

This is an example of rcu self-detected stall on arm64:
[   27.501721] rcu: INFO: rcu_preempt self-detected stall on CPU
[   27.502238] rcu:     0-....: (1250 ticks this GP) idle=4f7/1/0x4000000000000000 softirq=2594/2594 fqs=619
[   27.502632]  (t=1251 jiffies g=2989 q=29 ncpus=4)
[   27.503845] CPU: 0 PID: 306 Comm: test0 Not tainted 5.19.0-rc7-00009-g1c1a6c29ff99-dirty #46
[   27.504732] Hardware name: linux,dummy-virt (DT)
[   27.504947] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   27.504998] pc : arch_counter_read+0x18/0x24
[   27.505301] lr : arch_counter_read+0x18/0x24
[   27.505328] sp : ffff80000b29bdf0
[   27.505345] x29: ffff80000b29bdf0 x28: 0000000000000000 x27: 0000000000000000
[   27.505475] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[   27.505553] x23: 0000000000001f40 x22: ffff800009849c48 x21: 000000065f871ae0
[   27.505627] x20: 00000000000025ec x19: ffff80000a6eb300 x18: ffffffffffffffff
[   27.505654] x17: 0000000000000001 x16: 0000000000000000 x15: ffff80000a6d0296
[   27.505681] x14: ffffffffffffffff x13: ffff80000a29bc18 x12: 0000000000000426
[   27.505709] x11: 0000000000000162 x10: ffff80000a2f3c18 x9 : ffff80000a29bc18
[   27.505736] x8 : 00000000ffffefff x7 : ffff80000a2f3c18 x6 : 00000000759bd013
[   27.505761] x5 : 01ffffffffffffff x4 : 0002dc6c00000000 x3 : 0000000000000017
[   27.505787] x2 : 00000000000025ec x1 : ffff80000b29bdf0 x0 : 0000000075a30653
[   27.505937] Call trace:
[   27.506002]  arch_counter_read+0x18/0x24
[   27.506171]  ktime_get+0x48/0xa0
[   27.506207]  test_task+0x70/0xf0
[   27.506227]  kthread+0x10c/0x110
[   27.506243]  ret_from_fork+0x10/0x20

The old output is as follows:
[   27.944550] rcu: INFO: rcu_preempt self-detected stall on CPU
[   27.944980] rcu:     0-....: (1249 ticks this GP) idle=cbb/1/0x4000000000000000 softirq=2610/2610 fqs=614
[   27.945407]  (t=1251 jiffies g=2681 q=28 ncpus=4)
[   27.945731] Task dump for CPU 0:
[   27.945844] task:test0           state:R  running task     stack:    0 pid:  306 ppid:     2 flags:0x0000000a
[   27.946073] Call trace:
[   27.946151]  dump_backtrace.part.0+0xc8/0xd4
[   27.946378]  show_stack+0x18/0x70
[   27.946405]  sched_show_task+0x150/0x180
[   27.946427]  dump_cpu_task+0x44/0x54
[   27.947193]  rcu_dump_cpu_stacks+0xec/0x130
[   27.947212]  rcu_sched_clock_irq+0xb18/0xef0
[   27.947231]  update_process_times+0x68/0xac
[   27.947248]  tick_sched_handle+0x34/0x60
[   27.947266]  tick_sched_timer+0x4c/0xa4
[   27.947281]  __hrtimer_run_queues+0x178/0x360
[   27.947295]  hrtimer_interrupt+0xe8/0x244
[   27.947309]  arch_timer_handler_virt+0x38/0x4c
[   27.947326]  handle_percpu_devid_irq+0x88/0x230
[   27.947342]  generic_handle_domain_irq+0x2c/0x44
[   27.947357]  gic_handle_irq+0x44/0xc4
[   27.947376]  call_on_irq_stack+0x2c/0x54
[   27.947415]  do_interrupt_handler+0x80/0x94
[   27.947431]  el1_interrupt+0x34/0x70
[   27.947447]  el1h_64_irq_handler+0x18/0x24
[   27.947462]  el1h_64_irq+0x64/0x68                       <--- the above backtrace is worthless
[   27.947474]  arch_counter_read+0x18/0x24
[   27.947487]  ktime_get+0x48/0xa0
[   27.947501]  test_task+0x70/0xf0
[   27.947520]  kthread+0x10c/0x110
[   27.947538]  ret_from_fork+0x10/0x20

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 kernel/sched/core.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0e82073020bf0d1..6a5af8af2e69954 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -73,6 +73,7 @@
 
 #include <uapi/linux/sched/types.h>
 
+#include <asm/irq_regs.h>
 #include <asm/switch_to.h>
 #include <asm/tlb.h>
 
@@ -11145,6 +11146,16 @@ struct cgroup_subsys cpu_cgrp_subsys = {
 
 void dump_cpu_task(int cpu)
 {
+	if (cpu == smp_processor_id() && in_hardirq()) {
+		struct pt_regs *regs;
+
+		regs = get_irq_regs();
+		if (regs) {
+			show_regs(regs);
+			return;
+		}
+	}
+
 	if (trigger_single_cpu_backtrace(cpu))
 		return;
 
-- 
2.25.1



* Re: [PATCH v4 0/2] rcu: Display registers of self-detected stall as far as possible
  2022-08-04  2:34 [PATCH v4 0/2] rcu: Display registers of self-detected stall as far as possible Zhen Lei
  2022-08-04  2:34 ` [PATCH v4 1/2] sched/debug: Try trigger_single_cpu_backtrace(cpu) in dump_cpu_task() Zhen Lei
  2022-08-04  2:34 ` [PATCH v4 2/2] sched/debug: Show the registers of 'current' " Zhen Lei
@ 2022-08-04 18:09 ` Paul E. McKenney
  2022-08-05  8:09   ` Leizhen (ThunderTown)
  2 siblings, 1 reply; 6+ messages in thread
From: Paul E. McKenney @ 2022-08-04 18:09 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, linux-kernel,
	Frederic Weisbecker, Neeraj Upadhyay, Josh Triplett,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, rcu

On Thu, Aug 04, 2022 at 10:34:18AM +0800, Zhen Lei wrote:
> v3 --> v4:
> 1. To avoid undo/redo, merge patch 1-2 in v3 into one.
> 
> v2 --> v3:
> 1. Patch 1 Add trigger_single_cpu_backtrace(cpu) in synchronize_rcu_expedited_wait()
>    Subsequently, we can see that all callers of dump_cpu_task() try
>    trigger_single_cpu_backtrace() first. Then I do the cleanup in Patch 2.
> 2. Patch 3, as Paul E. McKenney's suggestion, push the code into dump_cpu_task().
> 
> For newcomers:
> Currently, dump_cpu_task() is mainly used by RCU, in order to dump the
> stack traces of the current task of the specified CPU when a rcu stall
> is detected.
> 
> For architectures that do not support NMI interrupts, registers is not
> printed when rcu stall is self-detected. This patch series improve it.

Thank you!  I have queued both for further testing and review.  I had
to rebase them to the -rcu tree's "dev" branch.  There was one trivial
conflict, but could you please check the resulting commits, both for
my wordsmithing and to make sure that your changes still work in your
environment?  (I do not have access to that sort of hardware.)

In the future, could you please send your patches against the -rcu
tree's "dev" branch?

							Thanx, Paul

> v2:
> https://lkml.org/lkml/2022/7/27/1800
> 
> Zhen Lei (2):
>   sched/debug: Try trigger_single_cpu_backtrace(cpu) in dump_cpu_task()
>   sched/debug: Show the registers of 'current' in dump_cpu_task()
> 
>  kernel/rcu/tree_stall.h |  8 +++-----
>  kernel/sched/core.c     | 14 ++++++++++++++
>  kernel/smp.c            |  3 +--
>  3 files changed, 18 insertions(+), 7 deletions(-)
> 
> -- 
> 2.25.1
> 


* Re: [PATCH v4 0/2] rcu: Display registers of self-detected stall as far as possible
  2022-08-04 18:09 ` [PATCH v4 0/2] rcu: Display registers of self-detected stall as far as possible Paul E. McKenney
@ 2022-08-05  8:09   ` Leizhen (ThunderTown)
  2022-08-06  7:19     ` Leizhen (ThunderTown)
  0 siblings, 1 reply; 6+ messages in thread
From: Leizhen (ThunderTown) @ 2022-08-05  8:09 UTC (permalink / raw)
  To: paulmck
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, linux-kernel,
	Frederic Weisbecker, Neeraj Upadhyay, Josh Triplett,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, rcu



On 2022/8/5 2:09, Paul E. McKenney wrote:
> On Thu, Aug 04, 2022 at 10:34:18AM +0800, Zhen Lei wrote:
>> v3 --> v4:
>> 1. To avoid undo/redo, merge patch 1-2 in v3 into one.
>>
>> v2 --> v3:
>> 1. Patch 1 Add trigger_single_cpu_backtrace(cpu) in synchronize_rcu_expedited_wait()
>>    Subsequently, we can see that all callers of dump_cpu_task() try
>>    trigger_single_cpu_backtrace() first. Then I do the cleanup in Patch 2.
>> 2. Patch 3, as Paul E. McKenney's suggestion, push the code into dump_cpu_task().
>>
>> For newcomers:
>> Currently, dump_cpu_task() is mainly used by RCU, in order to dump the
>> stack traces of the current task of the specified CPU when a rcu stall
>> is detected.
>>
>> For architectures that do not support NMI interrupts, registers is not
>> printed when rcu stall is self-detected. This patch series improve it.
> 
> Thank you!  I have queued both for further testing and review.  I had
> to rebase them to the -rcu tree's "dev" branch.  There was one trivial
> conflict, but could you please check the resulting commits, both for
> my wordsmithing and to make sure that your changes still work in your
> environment?  (I do not have access to that sort of hardware.)

Your description is much clearer than mine, thanks. I will test it
tomorrow; the international network speed is too slow during the day.

> 
> In the future, could you please send your patches against the -rcu
> tree's "dev" branch?

Okay, no problem. I'll do it next time.

> 
> 							Thanx, Paul
> 
>> v2:
>> https://lkml.org/lkml/2022/7/27/1800
>>
>> Zhen Lei (2):
>>   sched/debug: Try trigger_single_cpu_backtrace(cpu) in dump_cpu_task()
>>   sched/debug: Show the registers of 'current' in dump_cpu_task()
>>
>>  kernel/rcu/tree_stall.h |  8 +++-----
>>  kernel/sched/core.c     | 14 ++++++++++++++
>>  kernel/smp.c            |  3 +--
>>  3 files changed, 18 insertions(+), 7 deletions(-)
>>
>> -- 
>> 2.25.1
>>
> .
> 

-- 
Regards,
  Zhen Lei


* Re: [PATCH v4 0/2] rcu: Display registers of self-detected stall as far as possible
  2022-08-05  8:09   ` Leizhen (ThunderTown)
@ 2022-08-06  7:19     ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 6+ messages in thread
From: Leizhen (ThunderTown) @ 2022-08-06  7:19 UTC (permalink / raw)
  To: paulmck
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, linux-kernel,
	Frederic Weisbecker, Neeraj Upadhyay, Josh Triplett,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, rcu



On 2022/8/5 16:09, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/8/5 2:09, Paul E. McKenney wrote:
>> On Thu, Aug 04, 2022 at 10:34:18AM +0800, Zhen Lei wrote:
>>> v3 --> v4:
>>> 1. To avoid undo/redo, merge patch 1-2 in v3 into one.
>>>
>>> v2 --> v3:
>>> 1. Patch 1 Add trigger_single_cpu_backtrace(cpu) in synchronize_rcu_expedited_wait()
>>>    Subsequently, we can see that all callers of dump_cpu_task() try
>>>    trigger_single_cpu_backtrace() first. Then I do the cleanup in Patch 2.
>>> 2. Patch 3, as Paul E. McKenney's suggestion, push the code into dump_cpu_task().
>>>
>>> For newcomers:
>>> Currently, dump_cpu_task() is mainly used by RCU, in order to dump the
>>> stack traces of the current task of the specified CPU when a rcu stall
>>> is detected.
>>>
>>> For architectures that do not support NMI interrupts, registers is not
>>> printed when rcu stall is self-detected. This patch series improve it.
>>
>> Thank you!  I have queued both for further testing and review.  I had
>> to rebase them to the -rcu tree's "dev" branch.  There was one trivial
>> conflict, but could you please check the resulting commits, both for
>> my wordsmithing and to make sure that your changes still work in your
>> environment?  (I do not have access to that sort of hardware.)

I tested it on x86, arm64, and arm32 platforms, and all the results are
as expected.

x86:
[   54.750801] rcu: INFO: rcu_preempt self-detected stall on CPU
[   54.754289] rcu:     0-....: (4998 ticks this GP) idle=9e5c/1/0x4000000000000000 softirq=855/855 fqs=1219
[   54.755307]  (t=5005 jiffies g=1 q=36 ncpus=8)
[   54.755311] CPU: 0 PID: 379 Comm: test0 Not tainted 5.19.0-rc3-00108-g0aa4c0b532b6 #1
[   54.755313] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
[   54.755321] RIP: 0010:pvclock_clocksource_read+0x12/0xc0
[   54.755328] Code: b6 47 1d 8b 17 39 ca 75 f1 22 05 79 9e a3 01 c3 0f 1f 84 00 00 00 00 00 55 53 48 83 ec 08 8b 17 89 d6 83 e6 fe 0f ae e8 0f 31 <48> c1 e2 20 48 8b 5f 10 0f b6 6f 1d 48 09 c2 48 89 d0 0f be 57 1c
[   54.755330] RSP: 0018:ffff936040547eb0 EFLAGS: 00000206
[   54.755332] RAX: 0000000088921016 RBX: 0000000c9957cfbb RCX: 0000000000000000
[   54.755333] RDX: 0000000000000026 RSI: 0000000000000006 RDI: ffffffffa46de000
[   54.755335] RBP: 0000000000000000 R08: 0000001336358637 R09: ffff936040547ea0
[   54.755336] R10: ffffffffa3e55dc0 R11: ffffffffa46ee4a1 R12: 0000000000003e42
[   54.755338] R13: ffffffffa4727d40 R14: ffffffffa28b0270 R15: 0000000000000000
[   54.755339] FS:  0000000000000000(0000) GS:ffff8e0cefc00000(0000) knlGS:0000000000000000
[   54.755343] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   54.755344] CR2: 0000563a3b1bae40 CR3: 0000000104378000 CR4: 00000000000006f0
[   54.755345] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   54.755346] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   54.755347] Call Trace:
[   54.755356]  <TASK>
[   54.755358]  kvm_clock_read+0x14/0x30
[   54.755361]  ktime_get+0x39/0x90
[   54.755368]  test_task+0x3f/0x60
[   54.755373]  kthread+0xe3/0x110
[   54.755376]  ? kthread_complete_and_exit+0x20/0x20
[   54.755378]  ret_from_fork+0x22/0x30
[   54.755385]  </TASK>


arm64:
[   27.235111] rcu: INFO: rcu_preempt self-detected stall on CPU
[   27.236915] rcu:     0-....: (1249 ticks this GP) idle=cac4/1/0x4000000000000000 softirq=1434/1434 fqs=625
[   27.237475]  (t=1251 jiffies g=3421 q=18 ncpus=4)
[   27.238994] CPU: 0 PID: 356 Comm: test0 Not tainted 5.19.0-rc3-00107-gbf5cb0bc4689 #1
[   27.239467] Hardware name: linux,dummy-virt (DT)
[   27.240038] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   27.240358] pc : arch_counter_read+0x18/0x24
[   27.241439] lr : arch_counter_read+0x18/0x24
[   27.241645] sp : ffff800008e53df0
[   27.241786] x29: ffff800008e53df0 x28: 0000000000000000 x27: 0000000000000000
[   27.242256] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[   27.242586] x23: 0000000000001f40 x22: ffffd62939e61df0 x21: 00000006556b8c60
[   27.243115] x20: 00000000000028ae x19: ffffd6293aad7300 x18: ffffffffffffffff
[   27.243396] x17: 0000000000000001 x16: 0000000000000000 x15: ffffd6293aabad8e
[   27.243671] x14: ffffffffffffffff x13: ffffd6293aabad85 x12: fffffffffffc4627
[   27.243947] x11: ffffd6293a772c98 x10: 0000000000000a60 x9 : 0000000000000fa0
[   27.244256] x8 : ffffd6293a772c50 x7 : ffff800008e53c40 x6 : 00000000767b62ab
[   27.244541] x5 : 01ffffffffffffff x4 : 0000000000000000 x3 : 0000000000000017
[   27.244813] x2 : 00000000000028ae x1 : ffff800008e53df0 x0 : 0000000076804ac1
[   27.245247] Call trace:
[   27.245457]  arch_counter_read+0x18/0x24
[   27.245845]  ktime_get+0x48/0xa0
[   27.246012]  test_task+0x6c/0xec
[   27.246151]  kthread+0x10c/0x110
[   27.246282]  ret_from_fork+0x10/0x20


arm32:
rcu: INFO: rcu_sched self-detected stall on CPU
rcu:    0-....: (499 ticks this GP) idle=c734/1/0x40000002 softirq=161/161 fqs=249
        (t=500 jiffies g=-899 q=16 ncpus=4)
CPU: 0 PID: 70 Comm: test0 Not tainted 5.19.0-rc3+ #1
Hardware name: ARM-Versatile Express
PC is at ktime_get+0x4c/0xe8
LR is at ktime_get+0x4c/0xe8
pc : [<8019b4fc>]    lr : [<8019b4fc>]    psr: 60000013
sp : c8a71f28  ip : 00000001  fp : 00000001
r10: f8ecbe00  r9 : 431bde82  r8 : d7b634db
r7 : 00000a0a  r6 : e2be6498  r5 : 00000001  r4 : 80ca8700
r3 : ffffffff  r2 : ff3a2de2  r1 : 00000000  r0 : 00c5d21d
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: 627f806a  DAC: 00000051
 ktime_get from test_task+0x44/0x110
 test_task from kthread+0xd8/0xf4
 kthread from ret_from_fork+0x14/0x2c
Exception stack(0xc8a71fb0 to 0xc8a71ff8)
1fa0:                                     00000000 00000000 00000000 00000000
1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
1fe0: 00000000 00000000 00000000 00000000 00000013 00000000


> 
> Your description is much clearer than mine, thanks. I will test it
> tomorrow, the international network speed is too slow during the day.
> 
>>
>> In the future, could you please send your patches against the -rcu
>> tree's "dev" branch?
> 
> Okay, no problem. I'll do it next time.
> 
>>
>> 							Thanx, Paul
>>
>>> v2:
>>> https://lkml.org/lkml/2022/7/27/1800
>>>
>>> Zhen Lei (2):
>>>   sched/debug: Try trigger_single_cpu_backtrace(cpu) in dump_cpu_task()
>>>   sched/debug: Show the registers of 'current' in dump_cpu_task()
>>>
>>>  kernel/rcu/tree_stall.h |  8 +++-----
>>>  kernel/sched/core.c     | 14 ++++++++++++++
>>>  kernel/smp.c            |  3 +--
>>>  3 files changed, 18 insertions(+), 7 deletions(-)
>>>
>>> -- 
>>> 2.25.1
>>>
>> .
>>
> 

-- 
Regards,
  Zhen Lei
