linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: anna-maria@linutronix.de
Cc: tglx@linutronix.de, linux-kernel@vger.kernel.org
Subject: Re: Timer refuses to expire
Date: Mon, 4 Dec 2017 09:34:49 -0800	[thread overview]
Message-ID: <20171204173449.GA8327@linux.vnet.ibm.com> (raw)
In-Reply-To: <20171201182529.GA6073@linux.vnet.ibm.com>

On Fri, Dec 01, 2017 at 10:25:29AM -0800, Paul E. McKenney wrote:
> Hello, Anna-Maria,
> 
> It turned out that one set of problems was due to NO_HZ_FULL issues,
> which are addressed with a pair of patches.  However, there is still one
> case of a timer refusing to expire in rcutorture's TREE01 test scenario.
> The takes from two to four hours to trigger, and the most recent one
> gives the following diagnostic (see patch at the very end of this email):
> 
> [13166.127458] schedule_timeout: Waylayed timer base->clk: 0x100c40004 jiffies: 0x100c4524e base->next_expiry: 0x100c40004 timer->flags: 0x1000003 timer->expires 0x100c40003 idx: 4 idx_now: ea base->pending_map 000000000000001000000000001080000000000000040000000002000000000000000000000000000000000000040000000000000000000000000000000000000000000000000000
> 
> The message appears any time a timer for less than five jiffies takes
> more than eight seconds to expire.  In all cases, this expiry did not
> happen naturally, but rather via an unsolicited wakeup from the RCU CPU
> stall code.  If I am interpreting this message correctly, base->clk
> has advanced past this timer, and the timer is correctly flagged in
> base->pending_map.  This seems like part of the problem because the
> timer's timeout was only three jiffies.  However, base->clk has not
> advanced for more than 20 seconds, which seems like another problem.
> 
> What additional diagnostics would be helpful?  I can capture data
> at the beginning of the schedule_timeout in the timer_list structure,
> and of course can print more information at the time of the wakeup.

And a few more from weekend testing, on the off-chance that any of them
are at all helpful.  One interesting thing about the several messages
starting at time 4731 is that CPU 5 doesn't seem to be expiring timers,
even when new ones are posted.  But at time 4891, base->clk has in fact
advanced.

							Thanx, Paul

[ 4710.845607] schedule_timeout: Waylayed timer base->clk: 0x10042fb85 jiffies: 0x100434ddc base->next_expiry: 0x10042fb85 timer->flags: 0x1400005 timer->expires 0x10042fb84 idx: 5 idx_now: e7 base->pending_map 000000000000002000000000000000000000000000000000000000100000000000000000000000000000000000010000000000000000000000000000000000000000000000000000
[ 4731.872311] schedule_timeout: Waylayed timer base->clk: 0x100434800 jiffies: 0x10043a000 base->next_expiry: 0x100434800 timer->flags: 0x2e000005 timer->expires 0x100434de0 idx: b8 idx_now: d1 base->pending_map 000000000000000000000000000000000100000000000000020000100000140000000000000000000000000000010000000000000000000000000000000000000000000000000000
[ 4750.334637] schedule_timeout: Waylayed timer base->clk: 0x100434800 jiffies: 0x10043e81e base->next_expiry: 0x100434800 timer->flags: 0x33000005 timer->expires 0x10043960b idx: cc idx_now: 13f base->pending_map 000000000000000000000000000000000000000000000000020000100014140000000000000000000000000000010000000000000000000000000000000000000000000000000000
[ 4771.443542] schedule_timeout: Waylayed timer base->clk: 0x100434800 jiffies: 0x100443a93 base->next_expiry: 0x100434800 timer->flags: 0x4fc00005 timer->expires 0x10043e821 idx: 13f idx_now: 104 base->pending_map 000000000000000000000000000000000000000000000000020000100014040080000000000000000000000000010000000000000000000000000000000000000000000000000000
[ 4813.336678] schedule_timeout: Waylayed timer base->clk: 0x100434800 jiffies: 0x10044de38 base->next_expiry: 0x100434800 timer->flags: 0x4fc00005 timer->expires 0x10043e821 idx: 13f idx_now: 10e base->pending_map 000000000000000000000000000000000000000000000000020000100014040080000000000000100000000000010000000000000000000000000000000000000000000000000000
[ 4834.345313] schedule_timeout: Waylayed timer base->clk: 0x100434800 jiffies: 0x100453049 base->next_expiry: 0x100434800 timer->flags: 0x41000005 timer->expires 0x100443a96 idx: 104 idx_now: 114 base->pending_map 000000000000000000000000000000000000000000000000020000100014040000000000000040100000000000010000000000000000000000000000000000000000000000000000
[ 4834.561159] schedule_timeout: Waylayed timer base->clk: 0x100434800 jiffies: 0x100453121 base->next_expiry: 0x100434800 timer->flags: 0x43800005 timer->expires 0x10044de3b idx: 10e idx_now: 114 base->pending_map 000000000000000000000000000000000000000000000000020000100014040000000000000040000000000000000000000000000000000000000000000000000000000000000000
[ 4891.721190] schedule_timeout: Waylayed timer base->clk: 0x10045be44 jiffies: 0x100461069 base->next_expiry: 0x10045be44 timer->flags: 0x1000005 timer->expires 0x10045be43 idx: 4 idx_now: c9 base->pending_map 000000000000001000000000000000000000000000002000008010000000000100000000000000000000000000200000000000000000000000000000000000000000000000000000
[ 4891.929320] schedule_timeout: Waylayed timer base->clk: 0x10045be44 jiffies: 0x100461139 base->next_expiry: 0x10045be44 timer->flags: 0x1000005 timer->expires 0x10045be43 idx: 4 idx_now: c9 base->pending_map 000000000000001000000000000000000000000000002000008010000000000100000000000000000000000000200000000000000000000000000000000000000000000000000000
[ 4929.131690] schedule_timeout: Waylayed timer base->clk: 0x10045c340 jiffies: 0x10046a28b base->next_expiry: 0x10045c340 timer->flags: 0x49800005 timer->expires 0x100465039 idx: 126 idx_now: 12b base->pending_map 0000000000000000000000000000000000000000000020000080100010008401000000e0000000000000000000200000000000000000000000000000000000000000000000000000
[ 4934.785783] schedule_timeout: Waylayed timer base->clk: 0x10045c340 jiffies: 0x10046b8a1 base->next_expiry: 0x10045c340 timer->flags: 0x49c00005 timer->expires 0x100466634 idx: 127 idx_now: 12c base->pending_map 0000000000000000000000000000000000000000000020000080100010008401000000e0000000000000000000200000000000000000000000000000000000000000000000000000
[ 4997.703037] schedule_timeout: Waylayed timer base->clk: 0x10045c340 jiffies: 0x10047ae66 base->next_expiry: 0x10045c340 timer->flags: 0x4b000005 timer->expires 0x10046b8a4 idx: 12c idx_now: 13b base->pending_map 000000000000000000000000000000000000000000002000008010001000840100181060000000000000000000200000000000000000000000000000000000000000000000000000
[ 5018.820354] schedule_timeout: Waylayed timer base->clk: 0x10045c340 jiffies: 0x1004800e4 base->next_expiry: 0x10045c340 timer->flags: 0x4ec00005 timer->expires 0x10047ae69 idx: 13b idx_now: 101 base->pending_map 000000000000000000000000000000000000000000002000008010001000840108180060000000010000000000200000000000000000000000000000000000000000000000000000
[ 5079.106104] schedule_timeout: Waylayed timer base->clk: 0x10045c340 jiffies: 0x10048ecde base->next_expiry: 0x10045c340 timer->flags: 0x42800005 timer->expires 0x100489a5b idx: 10a idx_now: 10f base->pending_map 000000000000000000000000000000000000000000002000008010001000840100180060000084030000000000200000000000000000000000000000000000000000000000000000
[ 5142.123589] schedule_timeout: Waylayed timer base->clk: 0x10045c340 jiffies: 0x10049e28b base->next_expiry: 0x10045c340 timer->flags: 0x43c00005 timer->expires 0x10048ece1 idx: 10f idx_now: 154 base->pending_map 000000000000000000000000000000000000000000002000008010001000840100180060000180030000000000200000000000000000000000000000000000000000000000000000
[ 5163.235561] schedule_timeout: Waylayed timer base->clk: 0x10045c340 jiffies: 0x1004a3503 base->next_expiry: 0x10045c340 timer->flags: 0x55000005 timer->expires 0x10049e28e idx: 154 idx_now: 155 base->pending_map 000000000000000000000000000000000000000000002000008010001000840100180060000180030000000000300000000000000000000000000000000000000000000000000000
[ 5226.136310] schedule_timeout: Waylayed timer base->clk: 0x10045c340 jiffies: 0x1004b2ab8 base->next_expiry: 0x10045c340 timer->flags: 0x55400005 timer->expires 0x1004a3506 idx: 155 idx_now: 157 base->pending_map 000000000000000000000000000000000000000000002000008010001000840100180060000180030000000000200000000000000000000000000000000000000000000000000000
[18001.753494] schedule_timeout: Waylayed timer base->clk: 0x10045c340 jiffies: 0x1010e1b79 base->next_expiry: 0x10045c340 timer->flags: 0x55400005 timer->expires 0x1004a5e11 idx: 155 idx_now: 184 base->pending_map 000000000000000000000000000000000000000000002000008010001000800100180020000180030000000000200000000000000000000000000000000000000000000000000000
[18001.763893] schedule_timeout: Waylayed timer base->clk: 0x10045c340 jiffies: 0x1010e1b83 base->next_expiry: 0x10045c340 timer->flags: 0x49400005 timer->expires 0x100464e01 idx: 125 idx_now: 184 base->pending_map 000000000000000000000000000000000000000000002000008010001000800100180020000180030000000000200000000000000000000000000000000000000000000000000000
[18001.772807] schedule_timeout: Waylayed timer base->clk: 0x10045c340 jiffies: 0x1010e1b8c base->next_expiry: 0x10045c340 timer->flags: 0x40400005 timer->expires 0x100480601 idx: 101 idx_now: 184 base->pending_map 000000000000000000000000000000000000000000002000008010001000800100180000000180030000000000200000000000000000000000000000000000000000000000000000
[18001.782538] schedule_timeout: Waylayed timer base->clk: 0x10045c340 jiffies: 0x1010e1b96 base->next_expiry: 0x10045c340 timer->flags: 0x30000005 timer->expires 0x10045fe01 idx: c0 idx_now: 184 base->pending_map 000000000000000000000000000000000000000000002000008010001000800100180000000180000000000000200000000000000000000000000000000000000000000000000000
[18001.791954] schedule_timeout: Waylayed timer base->clk: 0x10045c340 jiffies: 0x1010e1b9f base->next_expiry: 0x10045c340 timer->flags: 0x44000005 timer->expires 0x10048f601 idx: 110 idx_now: 184 base->pending_map 000000000000000000000000000000000000000000002000000010001000800000180000000180000000000000200000000000000000000000000000000000000000000000000000
[18001.801725] schedule_timeout: Waylayed timer base->clk: 0x10045c340 jiffies: 0x1010e1ba9 base->next_expiry: 0x10045c340 timer->flags: 0x4d000005 timer->expires 0x100473e12 idx: 134 idx_now: 184 base->pending_map 000000000000000000000000000000000000000000002000000010001000800000180000000000000000000000200000000000000000000000000000000000000000000000000000
[18001.811052] schedule_timeout: Waylayed timer base->clk: 0x10045c340 jiffies: 0x1010e1bb2 base->next_expiry: 0x10045c340 timer->flags: 0x3b000005 timer->expires 0x10045d601 idx: ec idx_now: 184 base->pending_map 000000000000000000000000000000000000000000002000000010001000800000000000000000000000000000200000000000000000000000000000000000000000000000000000
[18001.820594] schedule_timeout: Waylayed timer base->clk: 0x10045c340 jiffies: 0x1010e1bbc base->next_expiry: 0x10045c340 timer->flags: 0x37000005 timer->expires 0x1004636e1 idx: dc idx_now: 184 base->pending_map 000000000000000000000000000000000000000000000000000000001000800000000000000000000000000000200000000000000000000000000000000000000000000000000000

  reply	other threads:[~2017-12-04 17:34 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-01 18:25 Timer refuses to expire Paul E. McKenney
2017-12-04 17:34 ` Paul E. McKenney [this message]
2017-12-04 17:42 ` Paul E. McKenney
2017-12-05 23:37   ` Paul E. McKenney
2017-12-06 22:04     ` Paul E. McKenney
2017-12-07  7:03       ` Boqun Feng
2017-12-07 14:56         ` Paul E. McKenney
2017-12-07 21:45           ` Paul E. McKenney
2017-12-08  0:31             ` Paul E. McKenney
2017-12-08  1:06           ` Boqun Feng
2017-12-08  1:26             ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171204173449.GA8327@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=anna-maria@linutronix.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).