All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH v2] rcu: Add per-CPU rcuc task info to RCU CPU stall warnings
  2022-01-24 10:36 [PATCH v2] rcu: Add per-CPU rcuc task info to RCU CPU stall warnings Zqiang
@ 2022-01-24 10:03 ` Ammar Faizi
  2022-01-24 10:38 ` Ammar Faizi
  1 sibling, 0 replies; 6+ messages in thread
From: Ammar Faizi @ 2022-01-24 10:03 UTC (permalink / raw)
  To: Zqiang, paulmck; +Cc: linux-kernel

Hi Zqiang,

On Sat, 22 Jan 2022 02:30:00 +0800, Zqiang <qiang1.zhang@intel.com> wrote:
```
 > +static void rcuc_kthread_dump(struct rcu_data *rdp)
 > +{
 > +        int cpu;
 > +        unsigned long j;
 > +        struct task_struct *rcuc = rdp->rcu_cpu_kthread_task;
 > +
 > +        if (rcu_is_rcuc_kthread_starving(rdp, &j)) {
 > +                cpu = rcuc ? task_cpu(rcuc) : -1;
 > +
 > +                if (rcuc) {
 > +                        pr_err("%s kthread starved for %ld jiffies, stack dump:\n",
 > + rcuc->comm, j);
 > +                        sched_show_task(rcuc);
 > +                        if (cpu >= 0) {
 > +                                if (cpu_online(cpu) && !idle_cpu(cpu)) {
 > +                                        pr_err("Dump current CPU stack:\n");
 > +                                        if (!trigger_single_cpu_backtrace(cpu))
 > + dump_cpu_task(cpu);
 > +                                }
 > +                        }
 > +                }
 > +        }
 > +}
```

1) We can reduce the nested if with an early return
    after checking `rcu_is_rcuc_kthread_starving()`.

2) This ternary operator doesn't make sense:

   `cpu = rcuc ? task_cpu(rcuc) : -1;`

If `rcuc` is NULL, then the "if (rcuc)" block will never
be executed, and `cpu` variable won't be used, why should
we perform a conditional with ternary to assign -1 here?

3) We can use an early return as well for the `if (rcuc)` to
    avoid more nested if.


FWIW, this one makes more sense:
```
static void rcuc_kthread_dump(struct rcu_data *rdp)
{
     int cpu;
     unsigned long j;
     struct task_struct *rcuc;

     if (!rcu_is_rcuc_kthread_starving(rdp, &j))
         return;

     rcuc = rdp->rcu_cpu_kthread_task;
     if (!rcuc)
         return;

     pr_err("%s kthread starved for %ld jiffies, stack dump:\n", rcuc->comm, j);
     sched_show_task(rcuc);
     cpu = task_cpu(rcuc);
     if (cpu_online(cpu) && !idle_cpu(cpu)) {
         pr_err("Dump current CPU stack:\n");
         if (!trigger_single_cpu_backtrace(cpu))
             dump_cpu_task(cpu);
     }
}
```

Thank you!

-- 
Ammar Faizi


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2] rcu: Add per-CPU rcuc task info to RCU CPU stall warnings
@ 2022-01-24 10:36 Zqiang
  2022-01-24 10:03 ` Ammar Faizi
  2022-01-24 10:38 ` Ammar Faizi
  0 siblings, 2 replies; 6+ messages in thread
From: Zqiang @ 2022-01-24 10:36 UTC (permalink / raw)
  To: paulmck; +Cc: linux-kernel, qiang1.zhang

When the 'use_softirq' be set zero, all RCU_SOFTIRQ processing
be moved to per-CPU rcuc kthreads, if the rcuc kthreads is
being starved, quiescent state can not report in time. the
RCU stall may be triggered. this commit adds a stack trace of
this CPU and dump rcuc kthreads stack to help analyze what
prevents rcuc kthreads from running.

Signed-off-by: Zqiang <qiang1.zhang@intel.com>
---
 v1->v2:
 Avoid print anything when CPU is offline or idle. 

 kernel/rcu/tree.c        |  3 +++
 kernel/rcu/tree.h        |  1 +
 kernel/rcu/tree_plugin.h |  3 +++
 kernel/rcu/tree_stall.h  | 36 ++++++++++++++++++++++++++++++++++++
 4 files changed, 43 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index a4c25a6283b0..e3fc31a0f546 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2850,10 +2850,12 @@ static void rcu_cpu_kthread(unsigned int cpu)
 {
 	unsigned int *statusp = this_cpu_ptr(&rcu_data.rcu_cpu_kthread_status);
 	char work, *workp = this_cpu_ptr(&rcu_data.rcu_cpu_has_work);
+	unsigned long *j = this_cpu_ptr(&rcu_data.rcuc_activity);
 	int spincnt;
 
 	trace_rcu_utilization(TPS("Start CPU kthread@rcu_run"));
 	for (spincnt = 0; spincnt < 10; spincnt++) {
+		WRITE_ONCE(*j, jiffies);
 		local_bh_disable();
 		*statusp = RCU_KTHREAD_RUNNING;
 		local_irq_disable();
@@ -2874,6 +2876,7 @@ static void rcu_cpu_kthread(unsigned int cpu)
 	schedule_timeout_idle(2);
 	trace_rcu_utilization(TPS("End CPU kthread@rcu_yield"));
 	*statusp = RCU_KTHREAD_WAITING;
+	WRITE_ONCE(*j, jiffies);
 }
 
 static struct smp_hotplug_thread rcu_cpu_thread_spec = {
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 486fc901bd08..4e0fdebb62e8 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -237,6 +237,7 @@ struct rcu_data {
 					/* rcuc per-CPU kthread or NULL. */
 	unsigned int rcu_cpu_kthread_status;
 	char rcu_cpu_has_work;
+	unsigned long rcuc_activity;
 
 	/* 7) Diagnostic data, including RCU CPU stall warnings. */
 	unsigned int softirq_snap;	/* Snapshot of softirq activity. */
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index c5b45c2f68a1..327bbfd79cc6 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -996,12 +996,15 @@ dump_blkd_tasks(struct rcu_node *rnp, int ncheck)
  */
 static void rcu_cpu_kthread_setup(unsigned int cpu)
 {
+	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
 #ifdef CONFIG_RCU_BOOST
 	struct sched_param sp;
 
 	sp.sched_priority = kthread_prio;
 	sched_setscheduler_nocheck(current, SCHED_FIFO, &sp);
 #endif /* #ifdef CONFIG_RCU_BOOST */
+
+	WRITE_ONCE(rdp->rcuc_activity, jiffies);
 }
 
 #ifdef CONFIG_RCU_BOOST
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 21bebf7c9030..739a2d87ea5e 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -379,6 +379,15 @@ static bool rcu_is_gp_kthread_starving(unsigned long *jp)
 	return j > 2 * HZ;
 }
 
+static bool rcu_is_rcuc_kthread_starving(struct rcu_data *rdp, unsigned long *jp)
+{
+	unsigned long j = jiffies - READ_ONCE(rdp->rcuc_activity);
+
+	if (jp)
+		*jp = j;
+	return j > 2 * HZ;
+}
+
 /*
  * Print out diagnostic information for the specified stalled CPU.
  *
@@ -430,6 +439,30 @@ static void print_cpu_stall_info(int cpu)
 	       falsepositive ? " (false positive?)" : "");
 }
 
+static void rcuc_kthread_dump(struct rcu_data *rdp)
+{
+	int cpu;
+	unsigned long j;
+	struct task_struct *rcuc = rdp->rcu_cpu_kthread_task;
+
+	if (rcu_is_rcuc_kthread_starving(rdp, &j)) {
+		cpu = rcuc ? task_cpu(rcuc) : -1;
+
+		if (rcuc) {
+			pr_err("%s kthread starved for %ld jiffies, stack dump:\n",
+									rcuc->comm, j);
+			sched_show_task(rcuc);
+			if (cpu >= 0) {
+				if (cpu_online(cpu) && !idle_cpu(cpu)) {
+					pr_err("Dump current CPU stack:\n");
+					if (!trigger_single_cpu_backtrace(cpu))
+						dump_cpu_task(cpu);
+				}
+			}
+		}
+	}
+}
+
 /* Complain about starvation of grace-period kthread.  */
 static void rcu_check_gp_kthread_starvation(void)
 {
@@ -601,6 +634,9 @@ static void print_cpu_stall(unsigned long gps)
 	rcu_check_gp_kthread_expired_fqs_timer();
 	rcu_check_gp_kthread_starvation();
 
+	if (!use_softirq)
+		rcuc_kthread_dump(rdp);
+
 	rcu_dump_cpu_stacks();
 
 	raw_spin_lock_irqsave_rcu_node(rnp, flags);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] rcu: Add per-CPU rcuc task info to RCU CPU stall warnings
  2022-01-24 10:36 [PATCH v2] rcu: Add per-CPU rcuc task info to RCU CPU stall warnings Zqiang
  2022-01-24 10:03 ` Ammar Faizi
@ 2022-01-24 10:38 ` Ammar Faizi
  2022-01-24 16:42   ` Paul E. McKenney
  1 sibling, 1 reply; 6+ messages in thread
From: Ammar Faizi @ 2022-01-24 10:38 UTC (permalink / raw)
  To: Zqiang, paulmck; +Cc: linux-kernel


[ I resend and fix my reply, my previous reply seems to have an
   issue with the "Date" ]

Hi Zqiang,

On Mon, 24 Jan 2022 18:36:37 +0800, Zqiang <qiang1.zhang@intel.com> wrote:> +static void rcuc_kthread_dump(struct rcu_data *rdp)
> +{
> +	int cpu;
> +	unsigned long j;
> +	struct task_struct *rcuc = rdp->rcu_cpu_kthread_task;
> +
> +	if (rcu_is_rcuc_kthread_starving(rdp, &j)) {
> +		cpu = rcuc ? task_cpu(rcuc) : -1;
> +
> +		if (rcuc) {
> +			pr_err("%s kthread starved for %ld jiffies, stack dump:\n",
> +									rcuc->comm, j);
> +			sched_show_task(rcuc);
> +			if (cpu >= 0) {
> +				if (cpu_online(cpu) && !idle_cpu(cpu)) {
> +					pr_err("Dump current CPU stack:\n");
> +					if (!trigger_single_cpu_backtrace(cpu))
> +						dump_cpu_task(cpu);
> +				}
> +			}
> +		}
> +	}
> +}

1) We can reduce the nested if with an early return after
    checking `rcu_is_rcuc_kthread_starving()`.

2) This ternary operator doesn't make sense:

    `cpu = rcuc ? task_cpu(rcuc) : -1;`

    If `rcuc` is NULL, then the "if (rcuc)" block will never
    be executed, and `cpu` variable won't be used, why should
    we perform a conditional with ternary to assign -1 here?

3) We can use an early return as well for the `if (rcuc)` to
    avoid more nested if.

FWIW, this one makes more sense:
```
static void rcuc_kthread_dump(struct rcu_data *rdp)
{
	 int cpu;
	 unsigned long j;
	 struct task_struct *rcuc;

	 if (!rcu_is_rcuc_kthread_starving(rdp, &j))
		 return;

	 rcuc = rdp->rcu_cpu_kthread_task;
	 if (!rcuc)
		 return;

	 pr_err("%s kthread starved for %ld jiffies, stack dump:\n", rcuc->comm, j);
	 sched_show_task(rcuc);
	 cpu = task_cpu(rcuc);
	 if (cpu_online(cpu) && !idle_cpu(cpu)) {
		 pr_err("Dump current CPU stack:\n");
		 if (!trigger_single_cpu_backtrace(cpu))
			 dump_cpu_task(cpu);
	 }
}
```

Thank you!

-- 
Ammar Faizi

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] rcu: Add per-CPU rcuc task info to RCU CPU stall warnings
  2022-01-24 10:38 ` Ammar Faizi
@ 2022-01-24 16:42   ` Paul E. McKenney
  2022-01-24 22:08     ` Ammar Faizi
  0 siblings, 1 reply; 6+ messages in thread
From: Paul E. McKenney @ 2022-01-24 16:42 UTC (permalink / raw)
  To: Ammar Faizi; +Cc: Zqiang, linux-kernel

On Mon, Jan 24, 2022 at 05:38:21PM +0700, Ammar Faizi wrote:
> 
> [ I resend and fix my reply, my previous reply seems to have an
>   issue with the "Date" ]
> 
> Hi Zqiang,
> 
> On Mon, 24 Jan 2022 18:36:37 +0800, Zqiang <qiang1.zhang@intel.com> wrote:> +static void rcuc_kthread_dump(struct rcu_data *rdp)
> > +{
> > +	int cpu;
> > +	unsigned long j;
> > +	struct task_struct *rcuc = rdp->rcu_cpu_kthread_task;
> > +
> > +	if (rcu_is_rcuc_kthread_starving(rdp, &j)) {
> > +		cpu = rcuc ? task_cpu(rcuc) : -1;
> > +
> > +		if (rcuc) {
> > +			pr_err("%s kthread starved for %ld jiffies, stack dump:\n",
> > +									rcuc->comm, j);
> > +			sched_show_task(rcuc);
> > +			if (cpu >= 0) {
> > +				if (cpu_online(cpu) && !idle_cpu(cpu)) {
> > +					pr_err("Dump current CPU stack:\n");
> > +					if (!trigger_single_cpu_backtrace(cpu))
> > +						dump_cpu_task(cpu);
> > +				}
> > +			}
> > +		}
> > +	}
> > +}
> 
> 1) We can reduce the nested if with an early return after
>    checking `rcu_is_rcuc_kthread_starving()`.
> 
> 2) This ternary operator doesn't make sense:
> 
>    `cpu = rcuc ? task_cpu(rcuc) : -1;`
> 
>    If `rcuc` is NULL, then the "if (rcuc)" block will never
>    be executed, and `cpu` variable won't be used, why should
>    we perform a conditional with ternary to assign -1 here?
> 
> 3) We can use an early return as well for the `if (rcuc)` to
>    avoid more nested if.
> 
> FWIW, this one makes more sense:
> ```
> static void rcuc_kthread_dump(struct rcu_data *rdp)
> {
> 	 int cpu;
> 	 unsigned long j;
> 	 struct task_struct *rcuc;
> 
> 	 if (!rcu_is_rcuc_kthread_starving(rdp, &j))
> 		 return;
> 
> 	 rcuc = rdp->rcu_cpu_kthread_task;
> 	 if (!rcuc)
> 		 return;
> 
> 	 pr_err("%s kthread starved for %ld jiffies, stack dump:\n", rcuc->comm, j);

Thank you for looking this over and for the great feedback, Ammar!

I am also wondering why the above message should be printed when the
corresponding CPU is offline or idle.  Why not move the above pr_err()
line down to replace the pr_err() line below?

							Thanx, Paul

> 	 sched_show_task(rcuc);
> 	 cpu = task_cpu(rcuc);
> 	 if (cpu_online(cpu) && !idle_cpu(cpu)) {
> 		 pr_err("Dump current CPU stack:\n");
> 		 if (!trigger_single_cpu_backtrace(cpu))
> 			 dump_cpu_task(cpu);
> 	 }
> }
> ```
> 
> Thank you!
> 
> -- 
> Ammar Faizi

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] rcu: Add per-CPU rcuc task info to RCU CPU stall warnings
  2022-01-24 16:42   ` Paul E. McKenney
@ 2022-01-24 22:08     ` Ammar Faizi
  2022-01-25  2:24       ` Zhang, Qiang1
  0 siblings, 1 reply; 6+ messages in thread
From: Ammar Faizi @ 2022-01-24 22:08 UTC (permalink / raw)
  To: paulmck; +Cc: Zqiang, linux-kernel

On 1/24/22 11:42 PM, Paul E. McKenney wrote:
> On Mon, Jan 24, 2022 at 05:38:21PM +0700, Ammar Faizi wrote:
>> [snip...]
>> FWIW, this one makes more sense:
>> ```
>> static void rcuc_kthread_dump(struct rcu_data *rdp)
>> {
>> 	 int cpu;
>> 	 unsigned long j;
>> 	 struct task_struct *rcuc;
>>
>> 	 if (!rcu_is_rcuc_kthread_starving(rdp, &j))
>> 		 return;
>>
>> 	 rcuc = rdp->rcu_cpu_kthread_task;
>> 	 if (!rcuc)
>> 		 return;
>>
>> 	 pr_err("%s kthread starved for %ld jiffies, stack dump:\n", rcuc->comm, j);
> 
> Thank you for looking this over and for the great feedback, Ammar!
> 
> I am also wondering why the above message should be printed when the
> corresponding CPU is offline or idle.  Why not move the above pr_err()
> line down to replace the pr_err() line below?
> 
> 							Thanx, Paul

Hi Paul, Thank you for the review. Agree with that.
Hopefully this one looks better (untested):
```
static void rcuc_kthread_dump(struct rcu_data *rdp)
{
	int cpu;
	unsigned long j;
	struct task_struct *rcuc;

	rcuc = rdp->rcu_cpu_kthread_task;
	if (!rcuc)
		return;

	cpu = task_cpu(rcuc);
	if (cpu_is_offline(cpu) || idle_cpu(cpu))
		return;

	if (!rcu_is_rcuc_kthread_starving(rdp, &j))
		return;

	pr_err("%s kthread starved for %ld jiffies\n", rcuc->comm, j);
	sched_show_task(rcuc);
	if (!trigger_single_cpu_backtrace(cpu))
		dump_cpu_task(cpu);
}
```

Recall that dump_cpu_task looks like this:
```
void dump_cpu_task(int cpu)
{
	pr_info("Task dump for CPU %d:\n", cpu);
	sched_show_task(cpu_curr(cpu));
}
```
which already tells us it's a dump, so "stack dump" in the pr_err()
can be omitted. Any comment, Zqiang?

-- 
Ammar Faizi

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH v2] rcu: Add per-CPU rcuc task info to RCU CPU stall warnings
  2022-01-24 22:08     ` Ammar Faizi
@ 2022-01-25  2:24       ` Zhang, Qiang1
  0 siblings, 0 replies; 6+ messages in thread
From: Zhang, Qiang1 @ 2022-01-25  2:24 UTC (permalink / raw)
  To: Ammar Faizi, paulmck; +Cc: linux-kernel


On 1/24/22 11:42 PM, Paul E. McKenney wrote:
> On Mon, Jan 24, 2022 at 05:38:21PM +0700, Ammar Faizi wrote:
>> [snip...]
>> FWIW, this one makes more sense:
>> ```
>> static void rcuc_kthread_dump(struct rcu_data *rdp) {
>> 	 int cpu;
>> 	 unsigned long j;
>> 	 struct task_struct *rcuc;
>>
>> 	 if (!rcu_is_rcuc_kthread_starving(rdp, &j))
>> 		 return;
>>
>> 	 rcuc = rdp->rcu_cpu_kthread_task;
>> 	 if (!rcuc)
>> 		 return;
>>
>> 	 pr_err("%s kthread starved for %ld jiffies, stack dump:\n", 
>> rcuc->comm, j);
> 
> Thank you for looking this over and for the great feedback, Ammar!
> 
> I am also wondering why the above message should be printed when the 
> corresponding CPU is offline or idle.  Why not move the above pr_err() 
> line down to replace the pr_err() line below?
> 
> 							Thanx, Paul

>>Hi Paul, Thank you for the review. Agree with that.
>>Hopefully this one looks better (untested):
>>```
>>static void rcuc_kthread_dump(struct rcu_data *rdp) {
>>	int cpu;
>>	unsigned long j;
>>	struct task_struct *rcuc;
>>
>>	rcuc = rdp->rcu_cpu_kthread_task;
>>	if (!rcuc)
>>		return;
>>
>>	cpu = task_cpu(rcuc);
>>	if (cpu_is_offline(cpu) || idle_cpu(cpu))
>>		return;
>>
>>	if (!rcu_is_rcuc_kthread_starving(rdp, &j))
>>		return;
>>
>>	pr_err("%s kthread starved for %ld jiffies\n", rcuc->comm, j);
>>	sched_show_task(rcuc);
>>	if (!trigger_single_cpu_backtrace(cpu))
>>		dump_cpu_task(cpu);
>>}
>>```

>>Recall that dump_cpu_task looks like this:
>>```
>>void dump_cpu_task(int cpu)
>>{
>>	pr_info("Task dump for CPU %d:\n", cpu);
>>	sched_show_task(cpu_curr(cpu));
>>}
>>```
>>which already tells us it's a dump, so "stack dump" in the pr_err() can be omitted. Any comment, Zqiang?

Thanks Ammar, this look like more compact, I wiil resend.

Thanks
Zqiang

>>>>
>>--
>>Ammar Faizi

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-01-25  3:15 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-24 10:36 [PATCH v2] rcu: Add per-CPU rcuc task info to RCU CPU stall warnings Zqiang
2022-01-24 10:03 ` Ammar Faizi
2022-01-24 10:38 ` Ammar Faizi
2022-01-24 16:42   ` Paul E. McKenney
2022-01-24 22:08     ` Ammar Faizi
2022-01-25  2:24       ` Zhang, Qiang1

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.