All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3] rcu: ftrace: avoid tracing a few functions executed in stop machine
@ 2022-04-26 10:55 Patrick Wang
  2022-04-26 12:11 ` patrick wang
  2022-04-26 15:22 ` Steven Rostedt
  0 siblings, 2 replies; 5+ messages in thread
From: Patrick Wang @ 2022-04-26 10:55 UTC (permalink / raw)
  To: paulmck, rostedt, frederic, quic_neeraju, josh,
	mathieu.desnoyers, joel, jiangshanlai
  Cc: rcu, linux-kernel, patrick.wang.shcn

Because of the change of stop machine implementation, there are functions
being called while waiting currently:

----------------------------------------------------------------
Former stop machine wait loop:
do {
    cpu_relax(); => macro
    ...
} while (curstate != STOPMACHINE_EXIT);
-----------------------------------------------------------------
Current stop machine wait loop:
do {
    stop_machine_yield(cpumask); => function (notraced)
    ...
    touch_nmi_watchdog(); => function (notraced, inside calls also notraced)
    ...
    rcu_momentary_dyntick_idle(); => function (notraced, inside calls traced)
} while (curstate != MULTI_STOP_EXIT);
------------------------------------------------------------------

These functions (including the calls inside) should be marked notrace to avoid
their codes being updated when they are being called. The calls inside
rcu_momentary_dyntick_idle() still remain traced, and will cause crash:

  rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
  rcu: 	1-...!: (0 ticks this GP) idle=14f/1/0x4000000000000000 softirq=3397/3397 fqs=0
  rcu: 	3-...!: (0 ticks this GP) idle=ee9/1/0x4000000000000000 softirq=5168/5168 fqs=0
  	(detected by 0, t=8137 jiffies, g=5889, q=2 ncpus=4)
  Task dump for CPU 1:
  task:migration/1     state:R  running task     stack:    0 pid:   19 ppid:     2 flags:0x00000000
  Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174
  Call Trace:
  Task dump for CPU 3:
  task:migration/3     state:R  running task     stack:    0 pid:   29 ppid:     2 flags:0x00000000
  Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174
  Call Trace:
  rcu: rcu_preempt kthread timer wakeup didn't happen for 8136 jiffies! g5889 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
  rcu: 	Possible timer handling issue on cpu=2 timer-softirq=594
  rcu: rcu_preempt kthread starved for 8137 jiffies! g5889 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=2
  rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
  rcu: RCU grace-period kthread stack dump:
  task:rcu_preempt     state:I stack:    0 pid:   14 ppid:     2 flags:0x00000000
  Call Trace:
    schedule+0x56/0xc2
    schedule_timeout+0x82/0x184
    rcu_gp_fqs_loop+0x19a/0x318
    rcu_gp_kthread+0x11a/0x140
    kthread+0xee/0x118
    ret_from_exception+0x0/0x14
  rcu: Stack dump where RCU GP kthread last ran:
  Task dump for CPU 2:
  task:migration/2     state:R  running task     stack:    0 pid:   24 ppid:     2 flags:0x00000000
  Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174
  Call Trace:

Mark the calls inside rcu_momentary_dyntick_idle():
 rcu_preempt_deferred_qs()
 rcu_preempt_need_deferred_qs()
 rcu_preempt_deferred_qs_irqrestore()
as notrace to prevent this.

Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
---
v1->v2:
 - Modify log message.

v2->v3:
 - Move "notrace" to behind "static" to keep the consistency with the rest of the code.

 kernel/rcu/tree_plugin.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index c8ba0fe17267..440d9e02a26e 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -460,7 +460,7 @@ static bool rcu_preempt_has_tasks(struct rcu_node *rnp)
  * be quite short, for example, in the case of the call from
  * rcu_read_unlock_special().
  */
-static void
+static notrace void
 rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags)
 {
 	bool empty_exp;
@@ -581,7 +581,7 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags)
  * is disabled.  This function cannot be expected to understand these
  * nuances, so the caller must handle them.
  */
-static bool rcu_preempt_need_deferred_qs(struct task_struct *t)
+static notrace bool rcu_preempt_need_deferred_qs(struct task_struct *t)
 {
 	return (__this_cpu_read(rcu_data.cpu_no_qs.b.exp) ||
 		READ_ONCE(t->rcu_read_unlock_special.s)) &&
@@ -595,7 +595,7 @@ static bool rcu_preempt_need_deferred_qs(struct task_struct *t)
  * evaluate safety in terms of interrupt, softirq, and preemption
  * disabling.
  */
-static void rcu_preempt_deferred_qs(struct task_struct *t)
+static notrace void rcu_preempt_deferred_qs(struct task_struct *t)
 {
 	unsigned long flags;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v3] rcu: ftrace: avoid tracing a few functions executed in stop machine
  2022-04-26 10:55 [PATCH v3] rcu: ftrace: avoid tracing a few functions executed in stop machine Patrick Wang
@ 2022-04-26 12:11 ` patrick wang
  2022-04-26 15:22 ` Steven Rostedt
  1 sibling, 0 replies; 5+ messages in thread
From: patrick wang @ 2022-04-26 12:11 UTC (permalink / raw)
  To: paulmck, Steven Rostedt, frederic, quic_neeraju, josh,
	mathieu.desnoyers, joel, jiangshanlai
  Cc: rcu, linux-kernel

[These two v3 mails are duplicate due to some network issue. Please ignore one.]

Thanks,
Patrick

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v3] rcu: ftrace: avoid tracing a few functions executed in stop machine
  2022-04-26 10:55 [PATCH v3] rcu: ftrace: avoid tracing a few functions executed in stop machine Patrick Wang
  2022-04-26 12:11 ` patrick wang
@ 2022-04-26 15:22 ` Steven Rostedt
  2022-04-26 16:15   ` Paul E. McKenney
  1 sibling, 1 reply; 5+ messages in thread
From: Steven Rostedt @ 2022-04-26 15:22 UTC (permalink / raw)
  To: Patrick Wang
  Cc: paulmck, frederic, quic_neeraju, josh, mathieu.desnoyers, joel,
	jiangshanlai, rcu, linux-kernel

On Tue, 26 Apr 2022 18:55:11 +0800
Patrick Wang <patrick.wang.shcn@gmail.com> wrote:

> Because of the change of stop machine implementation, there are functions
> being called while waiting currently:
> 
> ----------------------------------------------------------------
> Former stop machine wait loop:
> do {
>     cpu_relax(); => macro
>     ...
> } while (curstate != STOPMACHINE_EXIT);
> -----------------------------------------------------------------
> Current stop machine wait loop:
> do {
>     stop_machine_yield(cpumask); => function (notraced)
>     ...
>     touch_nmi_watchdog(); => function (notraced, inside calls also notraced)
>     ...
>     rcu_momentary_dyntick_idle(); => function (notraced, inside calls traced)
> } while (curstate != MULTI_STOP_EXIT);
> ------------------------------------------------------------------
> 
> These functions (including the calls inside) should be marked notrace to avoid
> their codes being updated when they are being called. The calls inside
> rcu_momentary_dyntick_idle() still remain traced, and will cause crash:
> 
>   rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
>   rcu: 	1-...!: (0 ticks this GP) idle=14f/1/0x4000000000000000 softirq=3397/3397 fqs=0
>   rcu: 	3-...!: (0 ticks this GP) idle=ee9/1/0x4000000000000000 softirq=5168/5168 fqs=0
>   	(detected by 0, t=8137 jiffies, g=5889, q=2 ncpus=4)
>   Task dump for CPU 1:
>   task:migration/1     state:R  running task     stack:    0 pid:   19 ppid:     2 flags:0x00000000
>   Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174
>   Call Trace:
>   Task dump for CPU 3:
>   task:migration/3     state:R  running task     stack:    0 pid:   29 ppid:     2 flags:0x00000000
>   Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174
>   Call Trace:
>   rcu: rcu_preempt kthread timer wakeup didn't happen for 8136 jiffies! g5889 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
>   rcu: 	Possible timer handling issue on cpu=2 timer-softirq=594
>   rcu: rcu_preempt kthread starved for 8137 jiffies! g5889 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=2
>   rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
>   rcu: RCU grace-period kthread stack dump:
>   task:rcu_preempt     state:I stack:    0 pid:   14 ppid:     2 flags:0x00000000
>   Call Trace:
>     schedule+0x56/0xc2
>     schedule_timeout+0x82/0x184
>     rcu_gp_fqs_loop+0x19a/0x318
>     rcu_gp_kthread+0x11a/0x140
>     kthread+0xee/0x118
>     ret_from_exception+0x0/0x14
>   rcu: Stack dump where RCU GP kthread last ran:
>   Task dump for CPU 2:
>   task:migration/2     state:R  running task     stack:    0 pid:   24 ppid:     2 flags:0x00000000
>   Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174
>   Call Trace:
> 
> Mark the calls inside rcu_momentary_dyntick_idle():
>  rcu_preempt_deferred_qs()
>  rcu_preempt_need_deferred_qs()
>  rcu_preempt_deferred_qs_irqrestore()
> as notrace to prevent this.
> 
> Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
> ---
> v1->v2:
>  - Modify log message.

Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>

-- Steve

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v3] rcu: ftrace: avoid tracing a few functions executed in stop machine
  2022-04-26 15:22 ` Steven Rostedt
@ 2022-04-26 16:15   ` Paul E. McKenney
  0 siblings, 0 replies; 5+ messages in thread
From: Paul E. McKenney @ 2022-04-26 16:15 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Patrick Wang, frederic, quic_neeraju, josh, mathieu.desnoyers,
	joel, jiangshanlai, rcu, linux-kernel

On Tue, Apr 26, 2022 at 11:22:50AM -0400, Steven Rostedt wrote:
> On Tue, 26 Apr 2022 18:55:11 +0800
> Patrick Wang <patrick.wang.shcn@gmail.com> wrote:
> 
> > Because of the change of stop machine implementation, there are functions
> > being called while waiting currently:
> > 
> > ----------------------------------------------------------------
> > Former stop machine wait loop:
> > do {
> >     cpu_relax(); => macro
> >     ...
> > } while (curstate != STOPMACHINE_EXIT);
> > -----------------------------------------------------------------
> > Current stop machine wait loop:
> > do {
> >     stop_machine_yield(cpumask); => function (notraced)
> >     ...
> >     touch_nmi_watchdog(); => function (notraced, inside calls also notraced)
> >     ...
> >     rcu_momentary_dyntick_idle(); => function (notraced, inside calls traced)
> > } while (curstate != MULTI_STOP_EXIT);
> > ------------------------------------------------------------------
> > 
> > These functions (including the calls inside) should be marked notrace to avoid
> > their codes being updated when they are being called. The calls inside
> > rcu_momentary_dyntick_idle() still remain traced, and will cause crash:
> > 
> >   rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> >   rcu: 	1-...!: (0 ticks this GP) idle=14f/1/0x4000000000000000 softirq=3397/3397 fqs=0
> >   rcu: 	3-...!: (0 ticks this GP) idle=ee9/1/0x4000000000000000 softirq=5168/5168 fqs=0
> >   	(detected by 0, t=8137 jiffies, g=5889, q=2 ncpus=4)
> >   Task dump for CPU 1:
> >   task:migration/1     state:R  running task     stack:    0 pid:   19 ppid:     2 flags:0x00000000
> >   Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174
> >   Call Trace:
> >   Task dump for CPU 3:
> >   task:migration/3     state:R  running task     stack:    0 pid:   29 ppid:     2 flags:0x00000000
> >   Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174
> >   Call Trace:
> >   rcu: rcu_preempt kthread timer wakeup didn't happen for 8136 jiffies! g5889 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
> >   rcu: 	Possible timer handling issue on cpu=2 timer-softirq=594
> >   rcu: rcu_preempt kthread starved for 8137 jiffies! g5889 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=2
> >   rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
> >   rcu: RCU grace-period kthread stack dump:
> >   task:rcu_preempt     state:I stack:    0 pid:   14 ppid:     2 flags:0x00000000
> >   Call Trace:
> >     schedule+0x56/0xc2
> >     schedule_timeout+0x82/0x184
> >     rcu_gp_fqs_loop+0x19a/0x318
> >     rcu_gp_kthread+0x11a/0x140
> >     kthread+0xee/0x118
> >     ret_from_exception+0x0/0x14
> >   rcu: Stack dump where RCU GP kthread last ran:
> >   Task dump for CPU 2:
> >   task:migration/2     state:R  running task     stack:    0 pid:   24 ppid:     2 flags:0x00000000
> >   Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174
> >   Call Trace:
> > 
> > Mark the calls inside rcu_momentary_dyntick_idle():
> >  rcu_preempt_deferred_qs()
> >  rcu_preempt_need_deferred_qs()
> >  rcu_preempt_deferred_qs_irqrestore()
> > as notrace to prevent this.
> > 
> > Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
> > ---
> > v1->v2:
> >  - Modify log message.
> 
> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>

I have reverted Patrick's earlier commit and replaced it with this
new one, including Steve's ack.  Thank you both!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v3] rcu: ftrace: avoid tracing a few functions executed in stop machine
@ 2022-04-26 10:45 Patrick Wang
  0 siblings, 0 replies; 5+ messages in thread
From: Patrick Wang @ 2022-04-26 10:45 UTC (permalink / raw)
  To: paulmck, rostedt, frederic, quic_neeraju, josh,
	mathieu.desnoyers, jiangshanlai, joel
  Cc: rcu, linux-kernel, patrick.wang.shcn

Because of the change of stop machine implementation, there are functions
being called while waiting currently:

----------------------------------------------------------------
Former stop machine wait loop:
do {
    cpu_relax(); => macro
    ...
} while (curstate != STOPMACHINE_EXIT);
-----------------------------------------------------------------
Current stop machine wait loop:
do {
    stop_machine_yield(cpumask); => function (notraced)
    ...
    touch_nmi_watchdog(); => function (notraced, inside calls also notraced)
    ...
    rcu_momentary_dyntick_idle(); => function (notraced, inside calls traced)
} while (curstate != MULTI_STOP_EXIT);
------------------------------------------------------------------

These functions (including the calls inside) should be marked notrace to avoid
their codes being updated when they are being called. The calls inside
rcu_momentary_dyntick_idle() still remain traced, and will cause crash:

  rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
  rcu: 	1-...!: (0 ticks this GP) idle=14f/1/0x4000000000000000 softirq=3397/3397 fqs=0
  rcu: 	3-...!: (0 ticks this GP) idle=ee9/1/0x4000000000000000 softirq=5168/5168 fqs=0
  	(detected by 0, t=8137 jiffies, g=5889, q=2 ncpus=4)
  Task dump for CPU 1:
  task:migration/1     state:R  running task     stack:    0 pid:   19 ppid:     2 flags:0x00000000
  Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174
  Call Trace:
  Task dump for CPU 3:
  task:migration/3     state:R  running task     stack:    0 pid:   29 ppid:     2 flags:0x00000000
  Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174
  Call Trace:
  rcu: rcu_preempt kthread timer wakeup didn't happen for 8136 jiffies! g5889 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
  rcu: 	Possible timer handling issue on cpu=2 timer-softirq=594
  rcu: rcu_preempt kthread starved for 8137 jiffies! g5889 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=2
  rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
  rcu: RCU grace-period kthread stack dump:
  task:rcu_preempt     state:I stack:    0 pid:   14 ppid:     2 flags:0x00000000
  Call Trace:
    schedule+0x56/0xc2
    schedule_timeout+0x82/0x184
    rcu_gp_fqs_loop+0x19a/0x318
    rcu_gp_kthread+0x11a/0x140
    kthread+0xee/0x118
    ret_from_exception+0x0/0x14
  rcu: Stack dump where RCU GP kthread last ran:
  Task dump for CPU 2:
  task:migration/2     state:R  running task     stack:    0 pid:   24 ppid:     2 flags:0x00000000
  Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174
  Call Trace:

Mark the calls inside rcu_momentary_dyntick_idle():
 rcu_preempt_deferred_qs()
 rcu_preempt_need_deferred_qs()
 rcu_preempt_deferred_qs_irqrestore()
as notrace to prevent this.

Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
---
v1->v2:
 - Modify log message.

v2->v3:
 - Move "notrace" to behind "static" to keep the consistency with the rest of the code.

 kernel/rcu/tree_plugin.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index c8ba0fe17267..440d9e02a26e 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -460,7 +460,7 @@ static bool rcu_preempt_has_tasks(struct rcu_node *rnp)
  * be quite short, for example, in the case of the call from
  * rcu_read_unlock_special().
  */
-static void
+static notrace void
 rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags)
 {
 	bool empty_exp;
@@ -581,7 +581,7 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags)
  * is disabled.  This function cannot be expected to understand these
  * nuances, so the caller must handle them.
  */
-static bool rcu_preempt_need_deferred_qs(struct task_struct *t)
+static notrace bool rcu_preempt_need_deferred_qs(struct task_struct *t)
 {
 	return (__this_cpu_read(rcu_data.cpu_no_qs.b.exp) ||
 		READ_ONCE(t->rcu_read_unlock_special.s)) &&
@@ -595,7 +595,7 @@ static bool rcu_preempt_need_deferred_qs(struct task_struct *t)
  * evaluate safety in terms of interrupt, softirq, and preemption
  * disabling.
  */
-static void rcu_preempt_deferred_qs(struct task_struct *t)
+static notrace void rcu_preempt_deferred_qs(struct task_struct *t)
 {
 	unsigned long flags;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-04-26 16:15 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-26 10:55 [PATCH v3] rcu: ftrace: avoid tracing a few functions executed in stop machine Patrick Wang
2022-04-26 12:11 ` patrick wang
2022-04-26 15:22 ` Steven Rostedt
2022-04-26 16:15   ` Paul E. McKenney
  -- strict thread matches above, loose matches on Subject: below --
2022-04-26 10:45 Patrick Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.