----- On Oct 26, 2020, at 4:44 PM, rostedt rostedt@goodmis.org wrote: > On Mon, 26 Oct 2020 10:28:07 -0400 (EDT) > Mathieu Desnoyers wrote: > >> I agree with Peter. Removing the trace_.*_rcuidle weirdness from the tracepoint >> API and fixing all callers to ensure they trace from a context where RCU is >> watching would simplify instrumentation of the Linux kernel, thus making it >> harder >> for subtle bugs to hide and be unearthed only when tracing is enabled. This is > > Note, the lockdep RCU checking of a tracepoint is outside of it being > enabled or disable. So if a non rcuidle() tracepoint is in a location that > RCU is not watching, it will complain loudly, even if you don't enable that > tracepoint. > >> AFAIU the general approach Thomas Gleixner has been aiming for recently, and I >> think it is a good thing. >> >> So if we consider this our target, and that the current state of things is that >> we need to have RCU watching around callback invocation, then removing the >> dependency on SRCU seems like an overall simplification which does not regress >> feature-wise nor speed-wise compared with what we have upstream today. The next >> steps would then be to audit all rcuidle tracepoints and make sure the context >> where they are placed has RCU watching already, so we can remove the tracepoint > > Just remove the _rcuidle() from them, and lockdep will complain if they are > being called without RCU watching. That's the easy part. The patch removing rcuidle is attached to this email, and here are the splats I get just when booting the machine (see below). The part which takes more time is fixing those splats and figuring out how to restructure the code. For instance, what should we do about the rcuidle tracepoint within printk() ? Also, how do we test rcuidle tracepoints triggered only when printk is called from printk warnings ? We'd also need to test on arm32 boards, because some arm architecture code uses rcuidle tracepoints as well. As this is beyond the scope of this patch set, I can either drop this patch entirely (it's not required for sleepable tracepoints), or keep it as an intermediate cleanup step. Removing rcuidle tracepoints entirely is beyond the effort Michael and I can undertake now. ============================= WARNING: suspicious RCU usage 5.9.1+ #72 Not tainted ----------------------------- ============================= ./include/trace/events/preemptirq.h:42 suspicious rcu_dereference_check() usage! WARNING: suspicious RCU usage other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 1 5.9.1+ #72 Not tainted RCU used illegally from extended quiescent state! no locks held by swapper/0/0. ----------------------------- ./include/trace/events/preemptirq.h:38 suspicious rcu_dereference_check() usage! stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.1+ #72 other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: RCU used illegally from extended quiescent state! no locks held by swapper/1/0. dump_stack+0x8d/0xbb stack backtrace: ? default_idle+0xa/0x20 trace_hardirqs_on+0xed/0xf0 default_idle+0xa/0x20 default_idle_call+0x4f/0x1e0 do_idle+0x1ef/0x2c0 cpu_startup_entry+0x19/0x20 start_kernel+0x578/0x59d secondary_startup_64+0xa4/0xb0 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.9.1+ #72 ============================= WARNING: suspicious RCU usage 5.9.1+ #72 Not tainted ----------------------------- ./include/trace/events/lock.h:37 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 1 RCU used illegally from extended quiescent state! no locks held by swapper/0/0. stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.1+ #72 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: dump_stack+0x8d/0xbb lock_acquire+0x346/0x3b0 ? __lock_acquire+0x2e7/0x1da0 _raw_spin_lock+0x2c/0x40 ? vprintk_emit+0x9f/0x410 vprintk_emit+0x9f/0x410 printk+0x52/0x6e lockdep_rcu_suspicious+0x1b/0xe0 ? default_idle+0xa/0x20 trace_hardirqs_on+0xed/0xf0 default_idle+0xa/0x20 default_idle_call+0x4f/0x1e0 do_idle+0x1ef/0x2c0 cpu_startup_entry+0x19/0x20 start_kernel+0x578/0x59d secondary_startup_64+0xa4/0xb0 ============================= WARNING: suspicious RCU usage 5.9.1+ #72 Not tainted ----------------------------- ./include/trace/events/lock.h:63 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 1 RCU used illegally from extended quiescent state! 1 lock held by swapper/0/0: #0: ffffffff97a80158 (logbuf_lock){-...}-{2:2}, at: vprintk_emit+0x9f/0x410 stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.1+ #72 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: dump_stack+0x8d/0xbb lock_release+0x25a/0x280 _raw_spin_unlock+0x17/0x30 vprintk_emit+0xdf/0x410 printk+0x52/0x6e lockdep_rcu_suspicious+0x1b/0xe0 ? default_idle+0xa/0x20 trace_hardirqs_on+0xed/0xf0 default_idle+0xa/0x20 default_idle_call+0x4f/0x1e0 do_idle+0x1ef/0x2c0 cpu_startup_entry+0x19/0x20 start_kernel+0x578/0x59d secondary_startup_64+0xa4/0xb0 ============================= WARNING: suspicious RCU usage 5.9.1+ #72 Not tainted ----------------------------- ./include/trace/events/printk.h:33 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 1 RCU used illegally from extended quiescent state! 2 locks held by swapper/0/0: #0: ffffffff97a801a0 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0x126/0x410 #1: ffffffff97a7fec0 (console_owner){....}-{0:0}, at: console_unlock+0x190/0x5d0 stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.1+ #72 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: dump_stack+0x8d/0xbb console_unlock+0x5ad/0x5d0 vprintk_emit+0x133/0x410 printk+0x52/0x6e lockdep_rcu_suspicious+0x1b/0xe0 ? default_idle+0xa/0x20 trace_hardirqs_on+0xed/0xf0 default_idle+0xa/0x20 default_idle_call+0x4f/0x1e0 do_idle+0x1ef/0x2c0 cpu_startup_entry+0x19/0x20 start_kernel+0x578/0x59d secondary_startup_64+0xa4/0xb0 ============================= WARNING: suspicious RCU usage 5.9.1+ #72 Not tainted ----------------------------- ./include/linux/rcupdate.h:636 rcu_read_lock() used illegally while idle! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 1 RCU used illegally from extended quiescent state! 4 locks held by swapper/0/0: #0: ffffffff97a801a0 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0x126/0x410 #1: ffffffff97a7fec0 (console_owner){....}-{0:0}, at: console_unlock+0x190/0x5d0 #2: ffffffff97b7d598 (printing_lock){....}-{2:2}, at: vt_console_print+0x7d/0x3e0 #3: ffffffff97a82d80 (rcu_read_lock){....}-{1:2}, at: __atomic_notifier_call_chain+0x5/0x110 stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.1+ #72 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: dump_stack+0x8d/0xbb __atomic_notifier_call_chain+0xd7/0x110 vt_console_print+0x19e/0x3e0 console_unlock+0x417/0x5d0 vprintk_emit+0x133/0x410 printk+0x52/0x6e lockdep_rcu_suspicious+0x1b/0xe0 ? default_idle+0xa/0x20 trace_hardirqs_on+0xed/0xf0 default_idle+0xa/0x20 default_idle_call+0x4f/0x1e0 do_idle+0x1ef/0x2c0 cpu_startup_entry+0x19/0x20 start_kernel+0x578/0x59d secondary_startup_64+0xa4/0xb0 ============================= WARNING: suspicious RCU usage 5.9.1+ #72 Not tainted ----------------------------- ./include/linux/rcupdate.h:685 rcu_read_unlock() used illegally while idle! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 1 RCU used illegally from extended quiescent state! 4 locks held by swapper/0/0: #0: ffffffff97a801a0 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0x126/0x410 #1: ffffffff97a7fec0 (console_owner){....}-{0:0}, at: console_unlock+0x190/0x5d0 #2: ffffffff97b7d598 (printing_lock){....}-{2:2}, at: vt_console_print+0x7d/0x3e0 #3: ffffffff97a82d80 (rcu_read_lock){....}-{1:2}, at: __atomic_notifier_call_chain+0x5/0x110 stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.1+ #72 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: dump_stack+0x8d/0xbb __atomic_notifier_call_chain+0x101/0x110 vt_console_print+0x19e/0x3e0 console_unlock+0x417/0x5d0 vprintk_emit+0x133/0x410 printk+0x52/0x6e lockdep_rcu_suspicious+0x1b/0xe0 ? default_idle+0xa/0x20 trace_hardirqs_on+0xed/0xf0 default_idle+0xa/0x20 default_idle_call+0x4f/0x1e0 do_idle+0x1ef/0x2c0 cpu_startup_entry+0x19/0x20 start_kernel+0x578/0x59d secondary_startup_64+0xa4/0xb0 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: dump_stack+0x8d/0xbb ? rcu_idle_exit+0x32/0x40 trace_hardirqs_off+0xe3/0xf0 rcu_idle_exit+0x32/0x40 default_idle_call+0x54/0x1e0 do_idle+0x1ef/0x2c0 ? lockdep_hardirqs_on_prepare+0xec/0x180 cpu_startup_entry+0x19/0x20 start_secondary+0x11c/0x160 secondary_startup_64+0xa4/0xb0 Thanks, Mathieu > > -- Steve > > >> rcuidle API. That would effectively remove the calls to >> rcu_irq_{enter,exit}_irqson >> from the tracepoint code. >> >> This is however beyond the scope of the proposed patch set. >> >> Thanks, >> >> Mathieu -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com