linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* RCU related tracebacks in various boot tests in -next
@ 2016-11-11 14:32 Guenter Roeck
  2016-11-11 18:28 ` Paul E. McKenney
  0 siblings, 1 reply; 2+ messages in thread
From: Guenter Roeck @ 2016-11-11 14:32 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel

Hi,

I see a number of RCU related tracebacks in recent versions of -next.
The problem was first seen in next-20161109.

I didn't bisect, but my wild guess is to blame Paul's related changes.
Please let me know if a bisect is needed.

Detailed logs are at http://kerneltests.org/builders.
Build/test scripts are in the various subdirectories of
https://github.com/groeck/linux-build-test/tree/master/rootfs/. 

Thanks,
Guenter

--------

arm:

[    1.699096] ===============================
[    1.699188] suspicious RCU usage. ]
[    1.699340] 4.9.0-rc4-next-20161111 #4 Not tainted
[    1.699462] -------------------------------
[    1.699554] ./include/trace/events/irq.h:69 suspicious rcu_dereference_check() usage!
[    1.699707] 
[    1.699707] other info that might help us debug this:
[    1.699707] 
[    1.699890] 
[    1.699890] RCU used illegally from idle CPU!
[    1.699890] rcu_scheduler_active = 1, debug_locks = 0
[    1.700103] RCU used illegally from extended quiescent state!
[    1.700256] no locks held by swapper/0/0.
[    1.700317] 
[    1.700317] stack backtrace:
[    1.700500] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-rc4-next-20161111 #4
[    1.700653] Hardware name: Generic OMAP3 (Flattened Device Tree)
[    1.700988] [<c01100ac>] (unwind_backtrace) from [<c010c290>] (show_stack+0x10/0x14)
[    1.701171] [<c010c290>] (show_stack) from [<c048f688>] (dump_stack+0xb0/0xe4)
[    1.701354] [<c048f688>] (dump_stack) from [<c01a319c>] (__handle_irq_event_percpu+0x2fc/0x3d0)
[    1.701538] [<c01a319c>] (__handle_irq_event_percpu) from [<c01a328c>] (handle_irq_event_percpu+0x1c/0x58)
[    1.701751] [<c01a328c>] (handle_irq_event_percpu) from [<c01a3300>] (handle_irq_event+0x38/0x5c)
[    1.701934] [<c01a3300>] (handle_irq_event) from [<c01a65d0>] (handle_level_irq+0xb8/0x14c)
[    1.702087] [<c01a65d0>] (handle_level_irq) from [<c01a2208>] (generic_handle_irq+0x20/0x34)
[    1.702270] [<c01a2208>] (generic_handle_irq) from [<c01a2748>] (__handle_domain_irq+0x64/0xe0)
[    1.702453] [<c01a2748>] (__handle_domain_irq) from [<c07dc7b0>] (__irq_svc+0x70/0x98)
[    1.702606] [<c07dc7b0>] (__irq_svc) from [<c0108360>] (arch_cpu_idle+0x20/0x3c)
[    1.702789] [<c0108360>] (arch_cpu_idle) from [<c01882ec>] (cpu_startup_entry+0x180/0x224)
[    1.702941] [<c01882ec>] (cpu_startup_entry) from [<c0b00c10>] (start_kernel+0x344/0x3b8)
[    1.703186] [<c0b00c10>] (start_kernel) from [<8000807c>] (0x8000807c)

arm64:

[    1.247359] ------------[ cut here ]------------
[    1.247761] WARNING: CPU: 1 PID: 0 at kernel/rcu/tree.c:380 rcu_momentary_dyntick_idle+0xc0/0xc8
[    1.247884] Modules linked in:
[    1.247987] 
[    1.248150] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.9.0-rc4-next-20161111 #1
[    1.248241] Hardware name: ZynqMP EP108 (DT)
[    1.248350] task: ffff80007c4d1900 task.stack: ffff80007c4e0000
[    1.248445] PC is at rcu_momentary_dyntick_idle+0xc0/0xc8
[    1.248523] LR is at rcu_note_context_switch+0x1b4/0x2c0
[    1.248595] pc : [<ffff000008114df0>] lr : [<ffff000008119a0c>] pstate: 600000c5
[    1.248676] sp : ffff80007c4e3ea0
[    1.248738] x29: ffff80007c4e3ea0 x28: ffff000008d88000 
[    1.248842] x27: ffff80007c4e0000 x26: ffff000008da4803 
[    1.248919] x25: 0000000000000000 x24: ffff80007c4e0000 
[    1.248992] x23: ffff0000088b3544 x22: ffff80007c4d1900 
[    1.249065] x21: ffff80007efb1880 x20: ffff000008d88000 
[    1.249138] x19: ffff80007c4d1900 x18: 000000000000000e 
[    1.249211] x17: 0000000000000007 x16: 0000000000000001 
[    1.249285] x15: 000000000000004c x14: 0000000000000000 
[    1.249362] x13: 000000000000000b x12: 0000000000000006 
[    1.249435] x11: ffff0000088e0ad8 x10: ffff80007efb18b0 
[    1.249586] x9 : ffff0000088e0ad4 x8 : 00000000029f6300 
[    1.249715] x7 : 7fffffffffffffff x6 : ffff80007efabbd0 
[    1.249839] x5 : 0000000000000004 x4 : ffff000008da0688 
[    1.249963] x3 : 0000000000000004 x2 : ffff80007efabbdc 
[    1.250088] x1 : 0000000000000001 x0 : ffff000008da4811 
[    1.250213] 
[    1.250325] ---[ end trace 2930a57464ba65da ]---
[    1.250551] Call trace:
[    1.251134] Exception stack(0xffff80007c4e3cd0 to 0xffff80007c4e3e00)
[    1.251396] 3cc0:                                   ffff80007c4d1900 0001000000000000
[    1.251605] 3ce0: ffff80007c4e3ea0 ffff000008114df0 ffff000008e6c388 ffff80007c4e3f80
[    1.251790] 3d00: ffff80007c4e3d90 ffff00000838b804 ffff000008e6c3ac ffff80007c4e3fd0
[    1.251972] 3d20: 00000000ffffffd8 ffff000008910b8c 00000000000003e0 ffff000008b4c1c1
[    1.252151] 3d40: ffff000008e6c388 ffff80007c4e3fd0 ffff000008b4c1bd ffff000008107264
[    1.252331] 3d60: ffff80007c4e3d90 ffff00000838b4e0 ffff000008da4811 0000000000000001
[    1.252509] 3d80: ffff80007efabbdc 0000000000000004 ffff000008da0688 0000000000000004
[    1.252688] 3da0: ffff80007efabbd0 7fffffffffffffff 00000000029f6300 ffff0000088e0ad4
[    1.252866] 3dc0: ffff80007efb18b0 ffff0000088e0ad8 0000000000000006 000000000000000b
[    1.253043] 3de0: 0000000000000000 000000000000004c 0000000000000001 0000000000000007
[    1.253269] [<ffff000008114df0>] rcu_momentary_dyntick_idle+0xc0/0xc8
[    1.253442] [<ffff0000088b3000>] __schedule+0x68/0x570
[    1.253598] [<ffff0000088b3544>] schedule+0x3c/0xa8
[    1.253726] [<ffff0000088b3998>] schedule_preempt_disabled+0x20/0x38
[    1.253884] [<ffff0000080f893c>] cpu_startup_entry+0xd4/0x1d8
[    1.254026] [<ffff00000808e820>] secondary_start_kernel+0x158/0x198
[    1.254268] [<00000000008bd1a4>] 0x8bd1a4

s390:

===============================

suspicious RCU usage. ]
4.9.0-rc4-next-20161111 #1 Not tainted
-------------------------------
./include/trace/events/rcu.h:454 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
RCU used illegally from extended quiescent state!
no locks held by swapper/0/0.
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-rc4-next-20161111 #1
Hardware name: QEMU             QEMU                  QEMU             (KVM)
Stack:
000000001fffbcb0 000000001fffbd40 0000000000000003 0000000000000000
000000001fffbde0 000000001fffbd58 000000001fffbd58 0000000000000020
0000000001e20b70 ffffffffffffffff 0000000000000000 0404c0010000000a
000000000000000c 000000001fffbda8 0000000000000000 0000000000000000
0000000000000000 00000000001168f4 000000001fffbd40 000000001fffbd88

Call Trace:
([<00000000001167b0>] show_trace+0x100/0x104)
[<0000000000116896>] show_stack+0x6e/0xec 
[<00000000007985e0>] dump_stack+0x94/0xd4 
[<000000000021882e>] rcu_eqs_exit_common.isra.49+0x1ee/0x21c 
[<000000000022172a>] rcu_irq_enter+0x1ca/0x284 
[<0000000000182178>] irq_enter+0x40/0xc0 
[<000000000010dcca>] do_IRQ+0x5a/0x94 [<0000000000ad234e>] ext_int_handler+0x11e/0x124 
[<0000000000103c50>] enabled_wait+0x64/0x110 
([<0000000000103c3c>] enabled_wait+0x50/0x110)
[<0000000000103eca>] arch_cpu_idle+0x42/0x54 
[<0000000000ad0576>] default_idle_call+0x4e/0x64 
[<00000000001ddcb0>] cpuidle_idle_call+0x54/0xa4 
[<00000000001dde62>] cpu_startup_entry+0xea/0x12c 
[<0000000000eaec82>] start_kernel+0x3ee/0x3fc 
[<0000000000100020>] _stext+0x20/0x80 

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: RCU related tracebacks in various boot tests in -next
  2016-11-11 14:32 RCU related tracebacks in various boot tests in -next Guenter Roeck
@ 2016-11-11 18:28 ` Paul E. McKenney
  0 siblings, 0 replies; 2+ messages in thread
From: Paul E. McKenney @ 2016-11-11 18:28 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: linux-kernel

On Fri, Nov 11, 2016 at 06:32:43AM -0800, Guenter Roeck wrote:
> Hi,
> 
> I see a number of RCU related tracebacks in recent versions of -next.
> The problem was first seen in next-20161109.
> 
> I didn't bisect, but my wild guess is to blame Paul's related changes.
> Please let me know if a bisect is needed.
> 
> Detailed logs are at http://kerneltests.org/builders.
> Build/test scripts are in the various subdirectories of
> https://github.com/groeck/linux-build-test/tree/master/rootfs/. 
> 
> Thanks,
> Guenter
> 
> --------

This is a failure on my part to convert fully to the new scheme, will fix.
(The third trace makes the situation clear, this one at first glance
looks like a failure to invoke irq_enter(), but it is most likely another
symptom of the problem in the third trace.)

> arm:
> 
> [    1.699096] ===============================
> [    1.699188] suspicious RCU usage. ]
> [    1.699340] 4.9.0-rc4-next-20161111 #4 Not tainted
> [    1.699462] -------------------------------
> [    1.699554] ./include/trace/events/irq.h:69 suspicious rcu_dereference_check() usage!
> [    1.699707] 
> [    1.699707] other info that might help us debug this:
> [    1.699707] 
> [    1.699890] 
> [    1.699890] RCU used illegally from idle CPU!
> [    1.699890] rcu_scheduler_active = 1, debug_locks = 0
> [    1.700103] RCU used illegally from extended quiescent state!
> [    1.700256] no locks held by swapper/0/0.
> [    1.700317] 
> [    1.700317] stack backtrace:
> [    1.700500] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-rc4-next-20161111 #4
> [    1.700653] Hardware name: Generic OMAP3 (Flattened Device Tree)
> [    1.700988] [<c01100ac>] (unwind_backtrace) from [<c010c290>] (show_stack+0x10/0x14)
> [    1.701171] [<c010c290>] (show_stack) from [<c048f688>] (dump_stack+0xb0/0xe4)
> [    1.701354] [<c048f688>] (dump_stack) from [<c01a319c>] (__handle_irq_event_percpu+0x2fc/0x3d0)
> [    1.701538] [<c01a319c>] (__handle_irq_event_percpu) from [<c01a328c>] (handle_irq_event_percpu+0x1c/0x58)
> [    1.701751] [<c01a328c>] (handle_irq_event_percpu) from [<c01a3300>] (handle_irq_event+0x38/0x5c)
> [    1.701934] [<c01a3300>] (handle_irq_event) from [<c01a65d0>] (handle_level_irq+0xb8/0x14c)
> [    1.702087] [<c01a65d0>] (handle_level_irq) from [<c01a2208>] (generic_handle_irq+0x20/0x34)
> [    1.702270] [<c01a2208>] (generic_handle_irq) from [<c01a2748>] (__handle_domain_irq+0x64/0xe0)
> [    1.702453] [<c01a2748>] (__handle_domain_irq) from [<c07dc7b0>] (__irq_svc+0x70/0x98)
> [    1.702606] [<c07dc7b0>] (__irq_svc) from [<c0108360>] (arch_cpu_idle+0x20/0x3c)
> [    1.702789] [<c0108360>] (arch_cpu_idle) from [<c01882ec>] (cpu_startup_entry+0x180/0x224)
> [    1.702941] [<c01882ec>] (cpu_startup_entry) from [<c0b00c10>] (start_kernel+0x344/0x3b8)
> [    1.703186] [<c0b00c10>] (start_kernel) from [<8000807c>] (0x8000807c)
> 
> arm64:

For this second one....

> [    1.247359] ------------[ cut here ]------------
> [    1.247761] WARNING: CPU: 1 PID: 0 at kernel/rcu/tree.c:380 rcu_momentary_dyntick_idle+0xc0/0xc8
> [    1.247884] Modules linked in:
> [    1.247987] 
> [    1.248150] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.9.0-rc4-next-20161111 #1
> [    1.248241] Hardware name: ZynqMP EP108 (DT)
> [    1.248350] task: ffff80007c4d1900 task.stack: ffff80007c4e0000
> [    1.248445] PC is at rcu_momentary_dyntick_idle+0xc0/0xc8
> [    1.248523] LR is at rcu_note_context_switch+0x1b4/0x2c0
> [    1.248595] pc : [<ffff000008114df0>] lr : [<ffff000008119a0c>] pstate: 600000c5
> [    1.248676] sp : ffff80007c4e3ea0
> [    1.248738] x29: ffff80007c4e3ea0 x28: ffff000008d88000 
> [    1.248842] x27: ffff80007c4e0000 x26: ffff000008da4803 
> [    1.248919] x25: 0000000000000000 x24: ffff80007c4e0000 
> [    1.248992] x23: ffff0000088b3544 x22: ffff80007c4d1900 
> [    1.249065] x21: ffff80007efb1880 x20: ffff000008d88000 
> [    1.249138] x19: ffff80007c4d1900 x18: 000000000000000e 
> [    1.249211] x17: 0000000000000007 x16: 0000000000000001 
> [    1.249285] x15: 000000000000004c x14: 0000000000000000 
> [    1.249362] x13: 000000000000000b x12: 0000000000000006 
> [    1.249435] x11: ffff0000088e0ad8 x10: ffff80007efb18b0 
> [    1.249586] x9 : ffff0000088e0ad4 x8 : 00000000029f6300 
> [    1.249715] x7 : 7fffffffffffffff x6 : ffff80007efabbd0 
> [    1.249839] x5 : 0000000000000004 x4 : ffff000008da0688 
> [    1.249963] x3 : 0000000000000004 x2 : ffff80007efabbdc 
> [    1.250088] x1 : 0000000000000001 x0 : ffff000008da4811 
> [    1.250213] 
> [    1.250325] ---[ end trace 2930a57464ba65da ]---
> [    1.250551] Call trace:
> [    1.251134] Exception stack(0xffff80007c4e3cd0 to 0xffff80007c4e3e00)
> [    1.251396] 3cc0:                                   ffff80007c4d1900 0001000000000000
> [    1.251605] 3ce0: ffff80007c4e3ea0 ffff000008114df0 ffff000008e6c388 ffff80007c4e3f80
> [    1.251790] 3d00: ffff80007c4e3d90 ffff00000838b804 ffff000008e6c3ac ffff80007c4e3fd0
> [    1.251972] 3d20: 00000000ffffffd8 ffff000008910b8c 00000000000003e0 ffff000008b4c1c1
> [    1.252151] 3d40: ffff000008e6c388 ffff80007c4e3fd0 ffff000008b4c1bd ffff000008107264
> [    1.252331] 3d60: ffff80007c4e3d90 ffff00000838b4e0 ffff000008da4811 0000000000000001
> [    1.252509] 3d80: ffff80007efabbdc 0000000000000004 ffff000008da0688 0000000000000004
> [    1.252688] 3da0: ffff80007efabbd0 7fffffffffffffff 00000000029f6300 ffff0000088e0ad4
> [    1.252866] 3dc0: ffff80007efb18b0 ffff0000088e0ad8 0000000000000006 000000000000000b
> [    1.253043] 3de0: 0000000000000000 000000000000004c 0000000000000001 0000000000000007
> [    1.253269] [<ffff000008114df0>] rcu_momentary_dyntick_idle+0xc0/0xc8
> [    1.253442] [<ffff0000088b3000>] __schedule+0x68/0x570
> [    1.253598] [<ffff0000088b3544>] schedule+0x3c/0xa8
> [    1.253726] [<ffff0000088b3998>] schedule_preempt_disabled+0x20/0x38
> [    1.253884] [<ffff0000080f893c>] cpu_startup_entry+0xd4/0x1d8
> [    1.254026] [<ffff00000808e820>] secondary_start_kernel+0x158/0x198
> [    1.254268] [<00000000008bd1a4>] 0x8bd1a4

Looks like I got too excited about adding WARN_ON_ONCE(), will fix.

> s390:
> 
> ===============================
> 
> suspicious RCU usage. ]
> 4.9.0-rc4-next-20161111 #1 Not tainted
> -------------------------------
> ./include/trace/events/rcu.h:454 suspicious rcu_dereference_check() usage!
> 
> other info that might help us debug this:
> 
> RCU used illegally from idle CPU!
> rcu_scheduler_active = 1, debug_locks = 0
> RCU used illegally from extended quiescent state!
> no locks held by swapper/0/0.
> stack backtrace:
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-rc4-next-20161111 #1
> Hardware name: QEMU             QEMU                  QEMU             (KVM)
> Stack:
> 000000001fffbcb0 000000001fffbd40 0000000000000003 0000000000000000
> 000000001fffbde0 000000001fffbd58 000000001fffbd58 0000000000000020
> 0000000001e20b70 ffffffffffffffff 0000000000000000 0404c0010000000a
> 000000000000000c 000000001fffbda8 0000000000000000 0000000000000000
> 0000000000000000 00000000001168f4 000000001fffbd40 000000001fffbd88
> 
> Call Trace:
> ([<00000000001167b0>] show_trace+0x100/0x104)
> [<0000000000116896>] show_stack+0x6e/0xec 
> [<00000000007985e0>] dump_stack+0x94/0xd4 
> [<000000000021882e>] rcu_eqs_exit_common.isra.49+0x1ee/0x21c 
> [<000000000022172a>] rcu_irq_enter+0x1ca/0x284 
> [<0000000000182178>] irq_enter+0x40/0xc0 
> [<000000000010dcca>] do_IRQ+0x5a/0x94 [<0000000000ad234e>] ext_int_handler+0x11e/0x124 
> [<0000000000103c50>] enabled_wait+0x64/0x110 
> ([<0000000000103c3c>] enabled_wait+0x50/0x110)
> [<0000000000103eca>] arch_cpu_idle+0x42/0x54 
> [<0000000000ad0576>] default_idle_call+0x4e/0x64 
> [<00000000001ddcb0>] cpuidle_idle_call+0x54/0xa4 
> [<00000000001dde62>] cpu_startup_entry+0xea/0x12c 
> [<0000000000eaec82>] start_kernel+0x3ee/0x3fc 
> [<0000000000100020>] _stext+0x20/0x80 

And here I failed to convert one of the accesses to the ->dynticks
counter to the new scheme.  Again, will fix, patch below for the
-next you were using.

And thank you for your testing efforts!

							Thanx, Paul

------------------------------------------------------------------------

commit ee17957e96a196d3ea70a4b1c20ca09f8394ce66
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Fri Nov 11 10:24:56 2016 -0800

    fixup! rcu: Abstract the dynticks snapshot operation

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index c1e3d4a333b2..03f7bcbc2e99 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1139,7 +1139,9 @@ void rcu_nmi_exit(void)
  */
 bool notrace __rcu_is_watching(void)
 {
-	return atomic_read(this_cpu_ptr(&rcu_dynticks.dynticks)) & 0x1;
+	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
+
+	return rcu_dynticks_in_eqs(rcu_dynticks_snap(rdtp));
 }
 
 /**

^ permalink raw reply related	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2016-11-11 18:29 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-11 14:32 RCU related tracebacks in various boot tests in -next Guenter Roeck
2016-11-11 18:28 ` Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).