All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [tip:timers/urgent] timers: Unconditionally check deferrable base
       [not found] <tip-ed4bbf7910b28ce3c691aef28d245585eaabda06@git.kernel.org>
@ 2018-01-14 22:44 ` Paul E. McKenney
  2018-01-15 23:57   ` Paul E. McKenney
  0 siblings, 1 reply; 3+ messages in thread
From: Paul E. McKenney @ 2018-01-14 22:44 UTC (permalink / raw)
  To: bigeasy, fweisbec, linux-kernel, hpa, peterz, mingo, tglx, anna-maria
  Cc: linux-tip-commits

On Sun, Jan 14, 2018 at 02:31:07PM -0800, tip-bot for Thomas Gleixner wrote:
> Commit-ID:  ed4bbf7910b28ce3c691aef28d245585eaabda06
> Gitweb:     https://git.kernel.org/tip/ed4bbf7910b28ce3c691aef28d245585eaabda06
> Author:     Thomas Gleixner <tglx@linutronix.de>
> AuthorDate: Sun, 14 Jan 2018 23:19:49 +0100
> Committer:  Thomas Gleixner <tglx@linutronix.de>
> CommitDate: Sun, 14 Jan 2018 23:25:33 +0100
> 
> timers: Unconditionally check deferrable base
> 
> When the timer base is checked for expired timers then the deferrable base
> must be checked as well. This was missed when making the deferrable base
> independent of base::nohz_active.

Very cool, thank you!  I have fired up some of the dreaded TREE01 tests.

							Thanx, Paul

> Fixes: ced6d5c11d3e ("timers: Use deferrable base independent of base::nohz_active")
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Sebastian Siewior <bigeasy@linutronix.de>
> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
> Cc: stable@vger.kernel.org
> Cc: rt@linutronix.de
> ---
>  kernel/time/timer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> index 89a9e1b..0bcf00e 100644
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -1696,7 +1696,7 @@ void run_local_timers(void)
>  	hrtimer_run_queues();
>  	/* Raise the softirq only if required. */
>  	if (time_before(jiffies, base->clk)) {
> -		if (!IS_ENABLED(CONFIG_NO_HZ_COMMON) || !base->nohz_active)
> +		if (!IS_ENABLED(CONFIG_NO_HZ_COMMON))
>  			return;
>  		/* CPU is awake, so check the deferrable base. */
>  		base++;
> 

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [tip:timers/urgent] timers: Unconditionally check deferrable base
  2018-01-14 22:44 ` [tip:timers/urgent] timers: Unconditionally check deferrable base Paul E. McKenney
@ 2018-01-15 23:57   ` Paul E. McKenney
  2018-01-16 10:27     ` Thomas Gleixner
  0 siblings, 1 reply; 3+ messages in thread
From: Paul E. McKenney @ 2018-01-15 23:57 UTC (permalink / raw)
  To: bigeasy, fweisbec, linux-kernel, hpa, peterz, mingo, tglx, anna-maria
  Cc: linux-tip-commits

On Sun, Jan 14, 2018 at 02:44:15PM -0800, Paul E. McKenney wrote:
> On Sun, Jan 14, 2018 at 02:31:07PM -0800, tip-bot for Thomas Gleixner wrote:
> > Commit-ID:  ed4bbf7910b28ce3c691aef28d245585eaabda06
> > Gitweb:     https://git.kernel.org/tip/ed4bbf7910b28ce3c691aef28d245585eaabda06
> > Author:     Thomas Gleixner <tglx@linutronix.de>
> > AuthorDate: Sun, 14 Jan 2018 23:19:49 +0100
> > Committer:  Thomas Gleixner <tglx@linutronix.de>
> > CommitDate: Sun, 14 Jan 2018 23:25:33 +0100
> > 
> > timers: Unconditionally check deferrable base
> > 
> > When the timer base is checked for expired timers then the deferrable base
> > must be checked as well. This was missed when making the deferrable base
> > independent of base::nohz_active.
> 
> Very cool, thank you!  I have fired up some of the dreaded TREE01 tests.

And (tentatively) a change in behavior!

In one case, I did not see stall warnings (at least not until the ftrace
dump at test end), but just the occasional "Waylayed timer" message:

[ 1316.628105] schedule_timeout: Waylayed timer base->clk: 0x1000eff23 jiffies: 0x1000f8335 base->next_expiry: 0x1000eff23 timer->flags: 0x3c800006 timer->expires 0x1000f62ca idx: f2 idx_now: 139 base->pending_map 000000080000000000000000000000000000000000000000100400002000000000000000000000000000010000000000000000000000000000000000000000000000000000000000
[ 1316.638372] Torture onoff task state:
[ 1316.639774] rcu_preempt     I14984     8      2 0x80000000
[ 1316.641766] Call Trace:
[ 1316.642757]  ? __schedule+0x33c/0x6f0
[ 1316.644109]  ? trace_event_raw_event_timer_start+0x6d/0xb0
[ 1316.646042]  schedule+0x37/0x90
[ 1316.647219]  schedule_timeout+0x159/0x470
[ 1316.648825]  ? __next_timer_interrupt+0xc0/0xc0
[ 1316.650459]  rcu_gp_kthread+0x9ee/0x1090
[ 1316.652147]  kthread+0xf0/0x130
[ 1316.653415]  ? rcu_oom_notify+0xd0/0xd0
[ 1316.654790]  ? kthread_destroy_worker+0x40/0x40
[ 1316.656470]  ret_from_fork+0x1f/0x30

If I am reading the waylay message correctly, the timer was delayed by
about 8.3 seconds instead of the 21 seconds required to cause an RCU
CPU stall warning.  I am going to rearrange the ftrace-dump code a bit
and run more tests.

The next one got the usual stall warning and:

[  633.772036] schedule_timeout: Waylayed timer base->clk: 0x10004c5cb jiffies: 0x10005181e base->next_expiry: 0x10004c5cb timer->flags: 0x2c00006 timer->expires 0x10004c5ca idx: b idx_now: cd base->pending_map 000000000000080000000000000000000000020000000000000020000000000000000000000000000000000000080000000000000000000000000000000000000000000000000000
[  633.855300] Torture onoff task state:
[  633.855300] rcu_preempt     I14984     8      2 0x80000000
[  633.855300] Call Trace:
[  633.855300]  ? __schedule+0x33c/0x6f0
[  633.855300]  ? _raw_spin_lock_irqsave+0x12/0x40 
[  633.855300]  schedule+0x37/0x90
[  633.855300]  rcu_gp_kthread+0xad/0x1090
[  633.855300]  kthread+0xf0/0x130
[  633.855300]  ? rcu_oom_notify+0xd0/0xd0
[  633.855300]  ? kthread_destroy_worker+0x40/0x40
[  633.855300]  ret_from_fork+0x1f/0x30

And the third one also got the usual stall warning and:

[  382.482240] schedule_timeout: Waylayed timer base->clk: 0x10000322a jiffies: 0x100014232 base->next_expiry: 0x10000322a timer->flags: 0x44000001 timer->expires 0x10000f00f idx: 110 idx_now: 115 base->pending_map 000004000000000000000000000000000000000000000000000000000100000000000000000140000000000000000400000000000000000000000000000000000000000000000000
[  382.482245] Torture onoff task state:
[  382.482246] rcu_preempt     I14984     8      2 0x80000000
[  382.482273] Call Trace:
[  382.482280]  ? __schedule+0x33c/0x6f0
[  382.482283]  ? trace_event_raw_event_timer_start+0x6d/0xb0
[  382.482285]  schedule+0x37/0x90
[  382.482287]  schedule_timeout+0x159/0x470
[  382.482289]  ? __next_timer_interrupt+0xc0/0xc0
[  382.482292]  rcu_gp_kthread+0x9ee/0x1090
[  382.482297]  kthread+0xf0/0x130
[  382.482299]  ? rcu_oom_notify+0xd0/0xd0
[  382.482301]  ? kthread_destroy_worker+0x40/0x40
[  382.482303]  ret_from_fork+0x1f/0x30

I don't have enough runs to be sure, but my current (sad) guess is that
the error rate is unchanged.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [tip:timers/urgent] timers: Unconditionally check deferrable base
  2018-01-15 23:57   ` Paul E. McKenney
@ 2018-01-16 10:27     ` Thomas Gleixner
  0 siblings, 0 replies; 3+ messages in thread
From: Thomas Gleixner @ 2018-01-16 10:27 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: bigeasy, fweisbec, linux-kernel, hpa, peterz, mingo, anna-maria,
	linux-tip-commits

On Mon, 15 Jan 2018, Paul E. McKenney wrote:

> On Sun, Jan 14, 2018 at 02:44:15PM -0800, Paul E. McKenney wrote:
> > On Sun, Jan 14, 2018 at 02:31:07PM -0800, tip-bot for Thomas Gleixner wrote:
> > > Commit-ID:  ed4bbf7910b28ce3c691aef28d245585eaabda06
> > > Gitweb:     https://git.kernel.org/tip/ed4bbf7910b28ce3c691aef28d245585eaabda06
> > > Author:     Thomas Gleixner <tglx@linutronix.de>
> > > AuthorDate: Sun, 14 Jan 2018 23:19:49 +0100
> > > Committer:  Thomas Gleixner <tglx@linutronix.de>
> > > CommitDate: Sun, 14 Jan 2018 23:25:33 +0100
> > > 
> > > timers: Unconditionally check deferrable base
> > > 
> > > When the timer base is checked for expired timers then the deferrable base
> > > must be checked as well. This was missed when making the deferrable base
> > > independent of base::nohz_active.
> > 
> > Very cool, thank you!  I have fired up some of the dreaded TREE01 tests.
> 
> And (tentatively) a change in behavior!

That's the least thing I expected. None of this stuff should use deferrable
timers. But what do I know.

> In one case, I did not see stall warnings (at least not until the ftrace
> dump at test end), but just the occasional "Waylayed timer" message:
> 
> [ 1316.628105] schedule_timeout: Waylayed timer base->clk: 0x1000eff23 jiffies: 0x1000f8335 base->next_expiry: 0x1000eff23 timer->flags: 0x3c800006 timer->expires 0x1000f62ca idx: f2 idx_now: 139 base->pending_map 000000080000000000000000000000000000000000000000100400002000000000000000000000000000010000000000000000000000000000000000000000000000000000000000
> [ 1316.638372] Torture onoff task state:
> [ 1316.639774] rcu_preempt     I14984     8      2 0x80000000
> [ 1316.641766] Call Trace:
> [ 1316.642757]  ? __schedule+0x33c/0x6f0
> [ 1316.644109]  ? trace_event_raw_event_timer_start+0x6d/0xb0
> [ 1316.646042]  schedule+0x37/0x90
> [ 1316.647219]  schedule_timeout+0x159/0x470
> [ 1316.648825]  ? __next_timer_interrupt+0xc0/0xc0
> [ 1316.650459]  rcu_gp_kthread+0x9ee/0x1090
> [ 1316.652147]  kthread+0xf0/0x130
> [ 1316.653415]  ? rcu_oom_notify+0xd0/0xd0
> [ 1316.654790]  ? kthread_destroy_worker+0x40/0x40
> [ 1316.656470]  ret_from_fork+0x1f/0x30
> 
> If I am reading the waylay message correctly, the timer was delayed by
> about 8.3 seconds instead of the 21 seconds required to cause an RCU
> CPU stall warning.  I am going to rearrange the ftrace-dump code a bit
> and run more tests.

Would be interesting to stop the trace right when you detect the waylay.

> [  382.482303]  ret_from_fork+0x1f/0x30
> 
> I don't have enough runs to be sure, but my current (sad) guess is that
> the error rate is unchanged.

That patch really should not make any difference. We are wading through
traces and are still puzzled....

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2018-01-16 10:28 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <tip-ed4bbf7910b28ce3c691aef28d245585eaabda06@git.kernel.org>
2018-01-14 22:44 ` [tip:timers/urgent] timers: Unconditionally check deferrable base Paul E. McKenney
2018-01-15 23:57   ` Paul E. McKenney
2018-01-16 10:27     ` Thomas Gleixner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.