All of lore.kernel.org
* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n  - any one else seeing this?
@ 2017-07-25 11:32 Jonathan Cameron
  2017-07-25 12:26   ` Nicholas Piggin
  0 siblings, 1 reply; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-25 11:32 UTC (permalink / raw)
  To: linux-arm-kernel

Hi All,

We observed a regression on our d05 boards (but curiously not
the fairly similar but single socket / smaller core count
d03), initially seen with linux-next prior to the merge window
and still present in v4.13-rc2.

The symptom is:

[ 1982.959365] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 1982.965021] 	2-...: (10 GPs behind) idle=1d4/0/0 softirq=306/306 fqs=0 
[ 1982.971624] 	3-...: (2 GPs behind) idle=700/0/0 softirq=307/307 fqs=0 
[ 1982.978139] 	4-...: (20 GPs behind) idle=9f4/0/0 softirq=651/652 fqs=0 
[ 1982.984740] 	5-...: (18 GPs behind) idle=a78/0/0 softirq=369/371 fqs=0 
[ 1982.991342] 	6-...: (26 GPs behind) idle=e5c/0/0 softirq=217/219 fqs=0 
[ 1982.997944] 	7-...: (1438 GPs behind) idle=eb4/0/0 softirq=260/260 fqs=0 
[ 1983.004719] 	8-...: (18 GPs behind) idle=830/0/0 softirq=1609/1609 fqs=0 
[ 1983.011494] 	9-...: (18 GPs behind) idle=e9c/0/0 softirq=242/242 fqs=0 
[ 1983.018095] 	10-...: (1434 GPs behind) idle=ca0/0/0 softirq=210/212 fqs=0 
[ 1983.024957] 	11-...: (1106 GPs behind) idle=ee0/0/0 softirq=188/191 fqs=0 
[ 1983.031819] 	12-...: (1636 GPs behind) idle=c58/0/0 softirq=215/216 fqs=0 
[ 1983.038680] 	13-...: (1114 GPs behind) idle=c20/0/0 softirq=170/170 fqs=0 
[ 1983.045542] 	14-...: (1106 GPs behind) idle=d90/0/0 softirq=176/178 fqs=0 
[ 1983.052403] 	15-...: (1858 GPs behind) idle=900/0/0 softirq=184/185 fqs=0 
[ 1983.059266] 	16-...: (1621 GPs behind) idle=f04/0/0 softirq=204/206 fqs=0 
[ 1983.066127] 	17-...: (1433 GPs behind) idle=d30/0/0 softirq=202/202 fqs=0 
[ 1983.072988] 	18-...: (18 GPs behind) idle=2d4/0/0 softirq=218/220 fqs=0 
[ 1983.079676] 	19-...: (19 GPs behind) idle=bbc/0/0 softirq=178/180 fqs=0 
[ 1983.086364] 	20-...: (0 ticks this GP) idle=ee0/0/0 softirq=231/231 fqs=0 
[ 1983.093226] 	21-...: (4 GPs behind) idle=140/0/0 softirq=208/208 fqs=0 
[ 1983.099827] 	22-...: (5 GPs behind) idle=100/0/0 softirq=186/188 fqs=0 
[ 1983.106428] 	23-...: (1635 GPs behind) idle=fd4/0/0 softirq=1220/1221 fqs=0 
[ 1983.113463] 	24-...: (1112 GPs behind) idle=ca8/0/0 softirq=231/233 fqs=0 
[ 1983.120325] 	25-...: (1637 GPs behind) idle=9c4/0/0 softirq=164/166 fqs=0 
[ 1983.127187] 	27-...: (0 ticks this GP) idle=b08/0/0 softirq=182/182 fqs=0 
[ 1983.134048] 	28-...: (1110 GPs behind) idle=d28/0/0 softirq=179/181 fqs=0 
[ 1983.140909] 	29-...: (8 GPs behind) idle=1dc/0/0 softirq=196/198 fqs=0 
[ 1983.147511] 	31-...: (1434 GPs behind) idle=74c/0/0 softirq=160/161 fqs=0 
[ 1983.154373] 	32-...: (1432 GPs behind) idle=7d4/0/0 softirq=164/164 fqs=0 
[ 1983.161234] 	33-...: (1632 GPs behind) idle=4dc/0/0 softirq=130/132 fqs=0 
[ 1983.168096] 	34-...: (57 GPs behind) idle=3b0/0/0 softirq=411/411 fqs=0 
[ 1983.174784] 	35-...: (1599 GPs behind) idle=8a0/0/0 softirq=177/179 fqs=0 
[ 1983.181646] 	36-...: (1603 GPs behind) idle=520/0/0 softirq=132/134 fqs=0 
[ 1983.188507] 	37-...: (8 GPs behind) idle=02c/0/0 softirq=176/178 fqs=0 
[ 1983.195108] 	38-...: (1442 GPs behind) idle=d8c/0/0 softirq=3189/3190 fqs=0 
[ 1983.202144] 	39-...: (1431 GPs behind) idle=444/0/0 softirq=117/117 fqs=0 
[ 1983.209005] 	40-...: (4 GPs behind) idle=688/0/0 softirq=134/136 fqs=0 
[ 1983.215606] 	41-...: (1599 GPs behind) idle=554/0/0 softirq=2707/2711 fqs=0 
[ 1983.222642] 	42-...: (1430 GPs behind) idle=15c/0/0 softirq=110/111 fqs=0 
[ 1983.229503] 	43-...: (4 GPs behind) idle=054/0/0 softirq=101/103 fqs=0 
[ 1983.236104] 	46-...: (1117 GPs behind) idle=558/0/0 softirq=251/253 fqs=0 
[ 1983.242966] 	47-...: (1118 GPs behind) idle=5f0/0/0 softirq=110/112 fqs=0 
[ 1983.249827] 	48-...: (1621 GPs behind) idle=ef4/0/0 softirq=241/242 fqs=0 
[ 1983.256689] 	49-...: (1648 GPs behind) idle=92c/0/0 softirq=207/208 fqs=0 
[ 1983.263550] 	52-...: (1439 GPs behind) idle=e40/0/0 softirq=261/263 fqs=0 
[ 1983.270412] 	54-...: (1434 GPs behind) idle=650/0/0 softirq=258/260 fqs=0 
[ 1983.277273] 	55-...: (1646 GPs behind) idle=5e0/0/0 softirq=178/178 fqs=0 
[ 1983.284135] 	56-...: (1646 GPs behind) idle=800/0/0 softirq=249/249 fqs=0 
[ 1983.290996] 	57-...: (1599 GPs behind) idle=c48/0/0 softirq=222/224 fqs=0 
[ 1983.297858] 	58-...: (1648 GPs behind) idle=3e8/0/0 softirq=235/235 fqs=0 
[ 1983.304719] 	59-...: (1434 GPs behind) idle=720/0/0 softirq=201/203 fqs=0 
[ 1983.311581] 	60-...: (1647 GPs behind) idle=c80/0/0 softirq=250/250 fqs=0 
[ 1983.318443] 	61-...: (1598 GPs behind) idle=b18/0/0 softirq=208/208 fqs=0 
[ 1983.325304] 	62-...: (1112 GPs behind) idle=0a4/0/0 softirq=620/620 fqs=0 
[ 1983.332166] 	63-...: (1109 GPs behind) idle=4b0/0/0 softirq=187/188 fqs=0 
[ 1983.339026] 	(detected by 44, t=5335 jiffies, g=1566, c=1565, q=220)
[ 1983.345371] Task dump for CPU 2:
[ 1983.348587] swapper/2       R  running task        0     0      1 0x00000000
[ 1983.355626] Call trace:
[ 1983.358072] [<ffff000008084fb0>] __switch_to+0x90/0xa8
[ 1983.363199] [<7fffffffffffffff>] 0x7fffffffffffffff
[ 1983.368062] Task dump for CPU 3:
[ 1983.371278] swapper/3       R  running task        0     0      1 0x00000000
[ 1983.378315] Call trace:
[ 1983.380750] [<ffff000008084fb0>] __switch_to+0x90/0xa8
[ 1983.385881] [<ffff000008ea9000>] page_wait_table+0x1280/0x1800
[ 1983.391699] Task dump for CPU 4:
[ 1983.394915] swapper/4       R  running task        0     0      1 0x00000000
[ 1983.401951] Call trace:
[ 1983.404386] [<ffff000008084fb0>] __switch_to+0x90/0xa8
[ 1983.409510] [<7fffffffffffffff>] 0x7fffffffffffffff
[ 1983.414374] Task dump for CPU 5:
[ 1983.417590] swapper/5       R  running task        0     0      1 0x00000000
[ 1983.424626] Call trace:
[ 1983.427060] [<ffff000008084fb0>] __switch_to+0x90/0xa8
[ 1983.432185] [<7fffffffffffffff>] 0x7fffffffffffffff
[ 1983.437049] Task dump for CPU 6:
[ 1983.440265] swapper/6       R  running task        0     0      1 0x00000000

<snip>  Mixture of the two forms above for all the cpus

[ 1984.568746] Call trace:
[ 1984.571180] [<ffff000008084fb0>] __switch_to+0x90/0xa8
[ 1984.576305] [<ffff000008ea9000>] page_wait_table+0x1280/0x1800
[ 1984.582124] Task dump for CPU 62:
[ 1984.585426] swapper/62      R  running task        0     0      1 0x00000000
[ 1984.592461] Call trace:
[ 1984.594895] [<ffff000008084fb0>] __switch_to+0x90/0xa8
[ 1984.600020] [<ffff000008ea9000>] page_wait_table+0x1280/0x1800
[ 1984.605839] Task dump for CPU 63:
[ 1984.609141] swapper/63      R  running task        0     0      1 0x00000000
[ 1984.616176] Call trace:
[ 1984.618611] [<ffff000008084fb0>] __switch_to+0x90/0xa8
[ 1984.623735] [<7fffffffffffffff>] 0x7fffffffffffffff
[ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
[ 1984.638153] rcu_preempt     S    0     9      2 0x00000000
[ 1984.643626] Call trace:
[ 1984.646059] [<ffff000008084fb0>] __switch_to+0x90/0xa8
[ 1984.651189] [<ffff000008962274>] __schedule+0x19c/0x5d8
[ 1984.656400] [<ffff0000089626e8>] schedule+0x38/0xa0
[ 1984.661266] [<ffff000008965844>] schedule_timeout+0x124/0x218
[ 1984.667002] [<ffff000008121424>] rcu_gp_kthread+0x4fc/0x748
[ 1984.672564] [<ffff0000080df0b4>] kthread+0xfc/0x128
[ 1984.677429] [<ffff000008082ec0>] ret_from_fork+0x10/0x50


Reducing the RCU CPU stall timeout makes it happen more often,
but we are seeing it even with the default value of 24 seconds.
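
For reference, we have just been shrinking the timeout via the usual
module parameter, i.e. something along these lines (value picked
arbitrarily):

  rcupdate.rcu_cpu_stall_timeout=5    on the kernel command line, or
  echo 5 > /sys/module/rcupdate/parameters/rcu_cpu_stall_timeout    at runtime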

Tends to occur after a period of relatively low usage, but has
also been seen midway through performance tests.

This was not seen with v4.12, so a bisection run later led to
commit 05a4a9527 ("kernel/watchdog: split up config options").

That was odd until we discovered that a side effect of this patch
was to change whether the softlockup detector was enabled or not in
the arm64 defconfig.

On 4.13-rc2, enabling the softlockup detector indeed stopped us
seeing the RCU issue. Disabling the equivalent option on 4.12 made
the issue occur there as well.
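
Concretely, the option at stake - as far as we can tell - is what is now
CONFIG_SOFTLOCKUP_DETECTOR (the pre-split CONFIG_LOCKUP_DETECTOR on 4.12),
i.e. the configurations that do not show the problem all end up with
something like:

  CONFIG_SOFTLOCKUP_DETECTOR=y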

Clearly the softlockup detector results in a thread on every CPU,
which might be related, but beyond that we are still looking into
the issue.
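
The per-cpu threads in question are the usual [watchdog/N] kthreads, so a
quick way to see which configuration a running kernel is in is e.g.:

  ps -e -o pid,comm | grep 'watchdog/'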

So the obvious question is whether anyone else is seeing this as
it might help us to focus in on where to look!

In the meantime we'll carry on digging.

Thanks,

Jonathan

p.s. As a more general question, do we want to have the
soft lockup detector enabled on arm64 by default?

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n  - any one else seeing this?
  2017-07-25 11:32 RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this? Jonathan Cameron
@ 2017-07-25 12:26   ` Nicholas Piggin
  0 siblings, 0 replies; 241+ messages in thread
From: Nicholas Piggin @ 2017-07-25 12:26 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-arm-kernel, Paul E. McKenney, linuxarm, Andrew Morton,
	Abdul Haleem, linuxppc-dev, Don Zickus

On Tue, 25 Jul 2017 19:32:10 +0800
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> Hi All,
> 
> We observed a regression on our d05 boards (but curiously not
> the fairly similar but single socket / smaller core count
> d03), initially seen with linux-next prior to the merge window
> and still present in v4.13-rc2.
> 
> The symptom is:
> 
> [ 1982.959365] INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 1982.965021] 	2-...: (10 GPs behind) idle=1d4/0/0 softirq=306/306 fqs=0 
> [ 1982.971624] 	3-...: (2 GPs behind) idle=700/0/0 softirq=307/307 fqs=0 
> [ 1982.978139] 	4-...: (20 GPs behind) idle=9f4/0/0 softirq=651/652 fqs=0 
> [ 1982.984740] 	5-...: (18 GPs behind) idle=a78/0/0 softirq=369/371 fqs=0 
> [ 1982.991342] 	6-...: (26 GPs behind) idle=e5c/0/0 softirq=217/219 fqs=0 
> [ 1982.997944] 	7-...: (1438 GPs behind) idle=eb4/0/0 softirq=260/260 fqs=0 
> [ 1983.004719] 	8-...: (18 GPs behind) idle=830/0/0 softirq=1609/1609 fqs=0 
> [ 1983.011494] 	9-...: (18 GPs behind) idle=e9c/0/0 softirq=242/242 fqs=0 
> [ 1983.018095] 	10-...: (1434 GPs behind) idle=ca0/0/0 softirq=210/212 fqs=0 
> [ 1983.024957] 	11-...: (1106 GPs behind) idle=ee0/0/0 softirq=188/191 fqs=0 
> [ 1983.031819] 	12-...: (1636 GPs behind) idle=c58/0/0 softirq=215/216 fqs=0 
> [ 1983.038680] 	13-...: (1114 GPs behind) idle=c20/0/0 softirq=170/170 fqs=0 
> [ 1983.045542] 	14-...: (1106 GPs behind) idle=d90/0/0 softirq=176/178 fqs=0 
> [ 1983.052403] 	15-...: (1858 GPs behind) idle=900/0/0 softirq=184/185 fqs=0 
> [ 1983.059266] 	16-...: (1621 GPs behind) idle=f04/0/0 softirq=204/206 fqs=0 
> [ 1983.066127] 	17-...: (1433 GPs behind) idle=d30/0/0 softirq=202/202 fqs=0 
> [ 1983.072988] 	18-...: (18 GPs behind) idle=2d4/0/0 softirq=218/220 fqs=0 
> [ 1983.079676] 	19-...: (19 GPs behind) idle=bbc/0/0 softirq=178/180 fqs=0 
> [ 1983.086364] 	20-...: (0 ticks this GP) idle=ee0/0/0 softirq=231/231 fqs=0 
> [ 1983.093226] 	21-...: (4 GPs behind) idle=140/0/0 softirq=208/208 fqs=0 
> [ 1983.099827] 	22-...: (5 GPs behind) idle=100/0/0 softirq=186/188 fqs=0 
> [ 1983.106428] 	23-...: (1635 GPs behind) idle=fd4/0/0 softirq=1220/1221 fqs=0 
> [ 1983.113463] 	24-...: (1112 GPs behind) idle=ca8/0/0 softirq=231/233 fqs=0 
> [ 1983.120325] 	25-...: (1637 GPs behind) idle=9c4/0/0 softirq=164/166 fqs=0 
> [ 1983.127187] 	27-...: (0 ticks this GP) idle=b08/0/0 softirq=182/182 fqs=0 
> [ 1983.134048] 	28-...: (1110 GPs behind) idle=d28/0/0 softirq=179/181 fqs=0 
> [ 1983.140909] 	29-...: (8 GPs behind) idle=1dc/0/0 softirq=196/198 fqs=0 
> [ 1983.147511] 	31-...: (1434 GPs behind) idle=74c/0/0 softirq=160/161 fqs=0 
> [ 1983.154373] 	32-...: (1432 GPs behind) idle=7d4/0/0 softirq=164/164 fqs=0 
> [ 1983.161234] 	33-...: (1632 GPs behind) idle=4dc/0/0 softirq=130/132 fqs=0 
> [ 1983.168096] 	34-...: (57 GPs behind) idle=3b0/0/0 softirq=411/411 fqs=0 
> [ 1983.174784] 	35-...: (1599 GPs behind) idle=8a0/0/0 softirq=177/179 fqs=0 
> [ 1983.181646] 	36-...: (1603 GPs behind) idle=520/0/0 softirq=132/134 fqs=0 
> [ 1983.188507] 	37-...: (8 GPs behind) idle=02c/0/0 softirq=176/178 fqs=0 
> [ 1983.195108] 	38-...: (1442 GPs behind) idle=d8c/0/0 softirq=3189/3190 fqs=0 
> [ 1983.202144] 	39-...: (1431 GPs behind) idle=444/0/0 softirq=117/117 fqs=0 
> [ 1983.209005] 	40-...: (4 GPs behind) idle=688/0/0 softirq=134/136 fqs=0 
> [ 1983.215606] 	41-...: (1599 GPs behind) idle=554/0/0 softirq=2707/2711 fqs=0 
> [ 1983.222642] 	42-...: (1430 GPs behind) idle=15c/0/0 softirq=110/111 fqs=0 
> [ 1983.229503] 	43-...: (4 GPs behind) idle=054/0/0 softirq=101/103 fqs=0 
> [ 1983.236104] 	46-...: (1117 GPs behind) idle=558/0/0 softirq=251/253 fqs=0 
> [ 1983.242966] 	47-...: (1118 GPs behind) idle=5f0/0/0 softirq=110/112 fqs=0 
> [ 1983.249827] 	48-...: (1621 GPs behind) idle=ef4/0/0 softirq=241/242 fqs=0 
> [ 1983.256689] 	49-...: (1648 GPs behind) idle=92c/0/0 softirq=207/208 fqs=0 
> [ 1983.263550] 	52-...: (1439 GPs behind) idle=e40/0/0 softirq=261/263 fqs=0 
> [ 1983.270412] 	54-...: (1434 GPs behind) idle=650/0/0 softirq=258/260 fqs=0 
> [ 1983.277273] 	55-...: (1646 GPs behind) idle=5e0/0/0 softirq=178/178 fqs=0 
> [ 1983.284135] 	56-...: (1646 GPs behind) idle=800/0/0 softirq=249/249 fqs=0 
> [ 1983.290996] 	57-...: (1599 GPs behind) idle=c48/0/0 softirq=222/224 fqs=0 
> [ 1983.297858] 	58-...: (1648 GPs behind) idle=3e8/0/0 softirq=235/235 fqs=0 
> [ 1983.304719] 	59-...: (1434 GPs behind) idle=720/0/0 softirq=201/203 fqs=0 
> [ 1983.311581] 	60-...: (1647 GPs behind) idle=c80/0/0 softirq=250/250 fqs=0 
> [ 1983.318443] 	61-...: (1598 GPs behind) idle=b18/0/0 softirq=208/208 fqs=0 
> [ 1983.325304] 	62-...: (1112 GPs behind) idle=0a4/0/0 softirq=620/620 fqs=0 
> [ 1983.332166] 	63-...: (1109 GPs behind) idle=4b0/0/0 softirq=187/188 fqs=0 
> [ 1983.339026] 	(detected by 44, t=5335 jiffies, g=1566, c=1565, q=220)
> [ 1983.345371] Task dump for CPU 2:
> [ 1983.348587] swapper/2       R  running task        0     0      1 0x00000000
> [ 1983.355626] Call trace:
> [ 1983.358072] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> [ 1983.363199] [<7fffffffffffffff>] 0x7fffffffffffffff
> [ 1983.368062] Task dump for CPU 3:
> [ 1983.371278] swapper/3       R  running task        0     0      1 0x00000000
> [ 1983.378315] Call trace:
> [ 1983.380750] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> [ 1983.385881] [<ffff000008ea9000>] page_wait_table+0x1280/0x1800
> [ 1983.391699] Task dump for CPU 4:
> [ 1983.394915] swapper/4       R  running task        0     0      1 0x00000000
> [ 1983.401951] Call trace:
> [ 1983.404386] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> [ 1983.409510] [<7fffffffffffffff>] 0x7fffffffffffffff
> [ 1983.414374] Task dump for CPU 5:
> [ 1983.417590] swapper/5       R  running task        0     0      1 0x00000000
> [ 1983.424626] Call trace:
> [ 1983.427060] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> [ 1983.432185] [<7fffffffffffffff>] 0x7fffffffffffffff
> [ 1983.437049] Task dump for CPU 6:
> [ 1983.440265] swapper/6       R  running task        0     0      1 0x00000000
> 
> <snip>  Mixture of the two forms above for all the cpus
> 
> [ 1984.568746] Call trace:
> [ 1984.571180] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> [ 1984.576305] [<ffff000008ea9000>] page_wait_table+0x1280/0x1800
> [ 1984.582124] Task dump for CPU 62:
> [ 1984.585426] swapper/62      R  running task        0     0      1 0x00000000
> [ 1984.592461] Call trace:
> [ 1984.594895] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> [ 1984.600020] [<ffff000008ea9000>] page_wait_table+0x1280/0x1800
> [ 1984.605839] Task dump for CPU 63:
> [ 1984.609141] swapper/63      R  running task        0     0      1 0x00000000
> [ 1984.616176] Call trace:
> [ 1984.618611] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> [ 1984.623735] [<7fffffffffffffff>] 0x7fffffffffffffff
> [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> [ 1984.638153] rcu_preempt     S    0     9      2 0x00000000
> [ 1984.643626] Call trace:
> [ 1984.646059] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> [ 1984.651189] [<ffff000008962274>] __schedule+0x19c/0x5d8
> [ 1984.656400] [<ffff0000089626e8>] schedule+0x38/0xa0
> [ 1984.661266] [<ffff000008965844>] schedule_timeout+0x124/0x218
> [ 1984.667002] [<ffff000008121424>] rcu_gp_kthread+0x4fc/0x748
> [ 1984.672564] [<ffff0000080df0b4>] kthread+0xfc/0x128
> [ 1984.677429] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
> 
> 
> Reducing the RCU CPU stall timeout makes it happen more often,
> but we are seeing it even with the default value of 24 seconds.
> 
> Tends to occur after a period of relatively low usage, but has
> also been seen midway through performance tests.
> 
> This was not seen with v4.12, so a bisection run later led to
> commit 05a4a9527 ("kernel/watchdog: split up config options").
> 
> That was odd until we discovered that a side effect of this patch
> was to change whether the softlockup detector was enabled or not in
> the arm64 defconfig.
> 
> On 4.13-rc2, enabling the softlockup detector indeed stopped us
> seeing the RCU issue. Disabling the equivalent option on 4.12 made
> the issue occur there as well.
> 
> Clearly the softlockup detector results in a thread on every CPU,
> which might be related, but beyond that we are still looking into
> the issue.
> 
> So the obvious question is whether anyone else is seeing this as
> it might help us to focus in on where to look!

Huh. Something similar has been seen very intermittently on powerpc
as well. We couldn't reproduce it reliably enough to bisect it yet, so
this is a good help.

http://marc.info/?l=linuxppc-embedded&m=149872815523646&w=2

It looks like the watchdog patch has a similar effect on powerpc in
that it stops enabling the softlockup detector by default. Haven't
confirmed, but it looks like the same thing.

A bug in RCU stall detection?

> 
> In the meantime we'll carry on digging.
> 
> Thanks,
> 
> Jonathan
> 
> p.s. As a more general question, do we want to have the
> soft lockup detector enabled on arm64 by default?

I've cc'ed Don. My patch should not have changed defconfigs, I
should have been more careful with that.

Thanks,
Nick

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n  - any one else seeing this?
  2017-07-25 12:26   ` Nicholas Piggin
  (?)
@ 2017-07-25 13:46     ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-25 13:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jul 25, 2017 at 10:26:54PM +1000, Nicholas Piggin wrote:
> On Tue, 25 Jul 2017 19:32:10 +0800
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> 
> > Hi All,
> > 
> > We observed a regression on our d05 boards (but curiously not
> > the fairly similar but single socket / smaller core count
> > d03), initially seen with linux-next prior to the merge window
> > and still present in v4.13-rc2.
> > 
> > The symptom is:

Adding Dave Miller and the sparclinux@vger.kernel.org email on CC, as
they have been seeing something similar, and you might well have saved
them the trouble of bisecting.

[ . . . ]

> > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1

This is the cause from an RCU perspective.  You had a lot of idle CPUs,
and RCU is not permitted to disturb them -- the battery-powered embedded
guys get very annoyed by that sort of thing.  What happens instead is
that each CPU updates a per-CPU state variable when entering or exiting
idle, and the grace-period kthread ("rcu_preempt kthread" in the above
message) checks these state variables, and if it sees an idle CPU,
it reports a quiescent state on that CPU's behalf.

But the grace-period kthread can only do this work if it gets a chance
to run.  And the message above says that this kthread hasn't had a chance
to run for a full 5,663 jiffies.  For completeness, the "g1566 c1565"
says that grace period #1566 is in progress, the "f0x0" says that no one
is needing another grace period #1567.  The "RCU_GP_WAIT_FQS(3)" says
that the grace-period kthread has fully initialized the current grace
period and is sleeping for a few jiffies waiting to scan for idle tasks.
Finally, the "->state=0x1" says that the grace-period kthread is in
TASK_INTERRUPTIBLE state, in other words, still sleeping.
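
If a picture helps, here is a deliberately oversimplified userspace-style
sketch of that division of labour.  The names are made up and it ignores
memory ordering, the dynticks counters, and everything else the real code
in kernel/rcu/tree.c has to get right; it is only meant to show who does
what:

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 64

static atomic_bool cpu_idle_state[NR_CPUS];   /* updated on idle entry/exit */
static atomic_bool quiescent[NR_CPUS];        /* QS reported this grace period */

/* Roughly what idle entry/exit achieve from RCU's point of view. */
static void cpu_enter_idle(int cpu) { atomic_store(&cpu_idle_state[cpu], true); }
static void cpu_exit_idle(int cpu)  { atomic_store(&cpu_idle_state[cpu], false); }

/*
 * One force-quiescent-state scan by the grace-period kthread: report a
 * quiescent state on behalf of every CPU that is sitting in idle.  If
 * the kthread never gets CPU time, this never runs, idle CPUs never
 * have a QS reported for them, and the grace period appears to stall.
 */
static void gp_kthread_scan(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (atomic_load(&cpu_idle_state[cpu]))
			atomic_store(&quiescent[cpu], true);
}

int main(void)
{
	cpu_enter_idle(3);      /* CPU 3 goes idle and stays there */
	gp_kthread_scan();      /* only helps if this code actually gets to run */
	printf("QS reported for CPU 3: %d\n", (int)atomic_load(&quiescent[3]));
	cpu_exit_idle(3);
	return 0;
}

The stall above corresponds to the case where whatever would run that
scan (the rcu_preempt kthread here) never gets scheduled in the first
place.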

So my first question is "What did commit 05a4a9527 (kernel/watchdog:
split up config options) do to prevent the grace-period kthread from
getting a chance to run?"  I must confess that I don't see anything
obvious in that commit, so my second question is "Are we sure that
reverting this commit makes the problem go away?" and my third is "Is
this an intermittent problem that led to a false bisection?"

[ . . . ]

> > Reducing the RCU CPU stall timeout makes it happen more often,
> > but we are seeing it even with the default value of 24 seconds.
> > 
> > Tends to occur after a period of relatively low usage, but has
> > also been seen midway through performance tests.
> > 
> > This was not seen with v4.12, so a bisection run later led to
> > commit 05a4a9527 ("kernel/watchdog: split up config options").
> > 
> > That was odd until we discovered that a side effect of this patch
> > was to change whether the softlockup detector was enabled or not in
> > the arm64 defconfig.
> > 
> > On 4.13-rc2, enabling the softlockup detector indeed stopped us
> > seeing the RCU issue. Disabling the equivalent option on 4.12 made
> > the issue occur there as well.
> > 
> > Clearly the softlockup detector results in a thread on every CPU,
> > which might be related, but beyond that we are still looking into
> > the issue.
> > 
> > So the obvious question is whether anyone else is seeing this as
> > it might help us to focus in on where to look!
> 
> Huh. Something similar has been seen very intermittently on powerpc
> as well. We couldn't reproduce it reliably enough to bisect it yet, so
> this is a good help.
> 
> http://marc.info/?l=linuxppc-embedded&m=149872815523646&w=2
> 
> It looks like the watchdog patch has a similar effect on powerpc in
> that it stops enabling the softlockup detector by default. Haven't
> confirmed, but it looks like the same thing.
> 
> A bug in RCU stall detection?

Well, if I am expected to make grace periods complete when my grace-period
kthreads aren't getting any CPU time, I will have to make some substantial
changes.  ;-)

One possibility is that the timer isn't firing and another is that the
timer's wakeup is being lost somehow.

So another thing to try is to boot with rcutree.rcu_kick_kthreads=1.
This will cause RCU to do redundant wakeups on the grace-period kthread
if the grace period is moving slowly.  This is of course a crude hack,
which is why this boot parameter will also cause a splat if it ever has
to do anything.
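
For the record, that is just a kernel-command-line addition, and it can be
combined with the shortened stall timeout mentioned above to speed up
reproduction, e.g. something like:

	rcutree.rcu_kick_kthreads=1 rcupdate.rcu_cpu_stall_timeout=5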

Does this help at all?

							Thanx, Paul

> > In the meantime we'll carry on digging.
> > 
> > Thanks,
> > 
> > Jonathan
> > 
> > p.s. As a more general question, do we want to have the
> > soft lockup detector enabled on arm64 by default?
> 
> I've cc'ed Don. My patch should not have changed defconfigs, I
> should have been more careful with that.
> 
> Thanks,
> Nick
> 


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n  - any one else seeing this?
  2017-07-25 13:46     ` Paul E. McKenney
  (?)
@ 2017-07-25 14:42       ` Jonathan Cameron
  -1 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-25 14:42 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 25 Jul 2017 06:46:26 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Tue, Jul 25, 2017 at 10:26:54PM +1000, Nicholas Piggin wrote:
> > On Tue, 25 Jul 2017 19:32:10 +0800
> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> >   
> > > Hi All,
> > > 
> > > We observed a regression on our d05 boards (but curiously not
> > > the fairly similar but single socket / smaller core count
> > > d03), initially seen with linux-next prior to the merge window
> > > and still present in v4.13-rc2.
> > > 
> > > The symptom is:  
> 
> Adding Dave Miller and the sparclinux@vger.kernel.org email on CC, as
> they have been seeing something similar, and you might well have saved
> them the trouble of bisecting.
> 
> [ . . . ]
> 
> > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1  
> 
> This is the cause from an RCU perspective.  You had a lot of idle CPUs,
> and RCU is not permitted to disturb them -- the battery-powered embedded
> guys get very annoyed by that sort of thing.  What happens instead is
> that each CPU updates a per-CPU state variable when entering or exiting
> idle, and the grace-period kthread ("rcu_preempt kthread" in the above
> message) checks these state variables, and if it sees an idle CPU,
> it reports a quiescent state on that CPU's behalf.
> 
> But the grace-period kthread can only do this work if it gets a chance
> to run.  And the message above says that this kthread hasn't had a chance
> to run for a full 5,663 jiffies.  For completeness, the "g1566 c1565"
> says that grace period #1566 is in progress, the "f0x0" says that no one
> is needing another grace period #1567.  The "RCU_GP_WAIT_FQS(3)" says
> that the grace-period kthread has fully initialized the current grace
> period and is sleeping for a few jiffies waiting to scan for idle tasks.
> Finally, the "->state=0x1" says that the grace-period kthread is in
> TASK_INTERRUPTIBLE state, in other words, still sleeping.
Thanks for the explanation!
> 
> So my first question is "What did commit 05a4a9527 (kernel/watchdog:
> split up config options) do to prevent the grace-period kthread from
> getting a chance to run?" 

As far as we can tell it was a side effect of that patch.

The real cause is that the patch changed what the defconfig produces, so the
softlockup detector no longer runs - the relevant option is now CONFIG_SOFTLOCKUP_DETECTOR.

Enabling that on 4.13-rc2 (and presumably everything in between)
means we don't see the problem any more.

> I must confess that I don't see anything
> obvious in that commit, so my second question is "Are we sure that
> reverting this commit makes the problem go away?"
Simply enabling CONFIG_SOFTLOCKUP_DETECTOR seems to make it go away.
That detector fires up a thread on every CPU, which may be relevant.

> and my third is "Is
> this an intermittent problem that led to a false bisection?"
Whilst it is a bit slow to occur, we verified with long runs on either
side of that patch, and since then with the option enabled on latest mainline.

We can also cause the issue before that patch by disabling the previously
relevant option on 4.12.

> 
> [ . . . ]
> 
> > > Reducing the RCU CPU stall timeout makes it happen more often,
> > > but we are seeing it even with the default value of 24 seconds.
> > > 
> > > Tends to occur after a period of relatively low usage, but has
> > > also been seen midway through performance tests.
> > > 
> > > This was not seen with v4.12, so a bisection run later led to
> > > commit 05a4a9527 ("kernel/watchdog: split up config options").
> > > 
> > > That was odd until we discovered that a side effect of this patch
> > > was to change whether the softlockup detector was enabled or not in
> > > the arm64 defconfig.
> > > 
> > > On 4.13-rc2, enabling the softlockup detector indeed stopped us
> > > seeing the RCU issue. Disabling the equivalent option on 4.12 made
> > > the issue occur there as well.
> > > 
> > > Clearly the softlockup detector results in a thread on every CPU,
> > > which might be related, but beyond that we are still looking into
> > > the issue.
> > > 
> > > So the obvious question is whether anyone else is seeing this as
> > > it might help us to focus in on where to look!  
> > 
> > Huh. Something similar has been seen very intermittently on powerpc
> > as well. We couldn't reproduce it reliably enough to bisect it yet, so
> > this is a good help.
> > 
> > http://marc.info/?l=linuxppc-embedded&m=149872815523646&w=2
> > 
> > It looks like the watchdog patch has a similar effect on powerpc in
> > that it stops enabling the softlockup detector by default. Haven't
> > confirmed, but it looks like the same thing.
> > 
> > A bug in RCU stall detection?  
> 
> Well, if I am expected to make grace periods complete when my grace-period
> kthreads aren't getting any CPU time, I will have to make some substantial
> changes.  ;-)
> 
> One possibility is that the timer isn't firing and another is that the
> timer's wakeup is being lost somehow.
> 
> So another thing to try is to boot with rcutree.rcu_kick_kthreads=1.
> This will cause RCU to do redundant wakeups on the grace-period kthread
> if the grace period is moving slowly.  This is of course a crude hack,
> which is why this boot parameter will also cause a splat if it ever has
> to do anything.
Running that now; will let you know how it goes.  Not seen the issue yet,
but it might just be a 'lucky' run - will give it a few hours.

Jonathan
> 
> Does this help at all?
> 
> 							Thanx, Paul
> 
> > > In the meantime we'll carry on digging.
> > > 
> > > Thanks,
> > > 
> > > Jonathan
> > > 
> > > p.s. As a more general question, do we want to have the
> > > soft lockup detector enabled on arm64 by default?
> > 
> > I've cc'ed Don. My patch should not have changed defconfigs, I
> > should have been more careful with that.
> > 
> > Thanks,
> > Nick
> >   
> 



^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n  - any one else seeing this?
@ 2017-07-25 14:42       ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-25 14:42 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Nicholas Piggin, linux-arm-kernel, linuxarm, Andrew Morton,
	Abdul Haleem, linuxppc-dev, Don Zickus, David Miller, sparclinux,
	Stephen Rothwell

On Tue, 25 Jul 2017 06:46:26 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Tue, Jul 25, 2017 at 10:26:54PM +1000, Nicholas Piggin wrote:
> > On Tue, 25 Jul 2017 19:32:10 +0800
> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> >   
> > > Hi All,
> > > 
> > > We observed a regression on our d05 boards (but curiously not
> > > the fairly similar but single socket / smaller core count
> > > d03), initially seen with linux-next prior to the merge window
> > > and still present in v4.13-rc2.
> > > 
> > > The symptom is:  
> 
> Adding Dave Miller and the sparclinux@vger.kernel.org email on CC, as
> they have been seeing something similar, and you might well have saved
> them the trouble of bisecting.
> 
> [ . . . ]
> 
> > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1  
> 
> This is the cause from an RCU perspective.  You had a lot of idle CPUs,
> and RCU is not permitted to disturb them -- the battery-powered embedded
> guys get very annoyed by that sort of thing.  What happens instead is
> that each CPU updates a per-CPU state variable when entering or exiting
> idle, and the grace-period kthread ("rcu_preempt kthread" in the above
> message) checks these state variables, and if when sees an idle CPU,
> it reports a quiescent state on that CPU's behalf.
> 
> But the grace-period kthread can only do this work if it gets a chance
> to run.  And the message above says that this kthread hasn't had a chance
> to run for a full 5,663 jiffies.  For completeness, the "g1566 c1565"
> says that grace period #1566 is in progress, the "f0x0" says that no one
> is needing another grace period #1567.  The "RCU_GP_WAIT_FQS(3)" says
> that the grace-period kthread has fully initialized the current grace
> period and is sleeping for a few jiffies waiting to scan for idle tasks.
> Finally, the "->state=0x1" says that the grace-period kthread is in
> TASK_INTERRUPTIBLE state, in other words, still sleeping.
Thanks for the explanation!
> 
> So my first question is "What did commit 05a4a9527 (kernel/watchdog:
> split up config options) do to prevent the grace-period kthread from
> getting a chance to run?" 

As far as we can tell it was a side effect of that patch.

The real cause is that patch changed the result of defconfigs to stop running
the softlockup detector - now CONFIG_SOFTLOCKUP_DETECTOR

Enabling that on 4.13-rc2 (and presumably everything in between)
means we don't see the problem any more.

> I must confess that I don't see anything
> obvious in that commit, so my second question is "Are we sure that
> reverting this commit makes the problem go away?"
Simply enabling CONFIG_SOFTLOCKUP_DETECTOR seems to make it go away.
That detector fires up a thread on every cpu, which may be relevant.

> and my third is "Is
> this an intermittent problem that led to a false bisection?"
Whilst it is a bit slow to occur, we verified with long runs on either
side of that patch and since with the option enabled on latest mainline.

Also can cause the issue before that patch by disabling the previous
relevant option on 4.12.

> 
> [ . . . ]
> 
> > > Reducing the RCU CPU stall timeout makes it happen more often,
> > > but we are seeing even with the default value of 24 seconds.
> > > 
> > > Tends to occur after a period or relatively low usage, but has
> > > also been seen mid way through performance tests.
> > > 
> > > This was not seen with v4.12 so a bisection run later lead to
> > > commit 05a4a9527 (kernel/watchdog: split up config options).
> > > 
> > > Which was odd until we discovered that a side effect of this patch
> > > was to change whether the softlockup detector was enabled or not in
> > > the arm64 defconfig.
> > > 
> > > On 4.13-rc2 enabling the softlockup detector indeed stopped us
> > > seeing the rcu issue. Disabling the equivalent on 4.12 made the
> > > issue occur there as well.
> > > 
> > > Clearly the softlockup detector results in a thread on every cpu,
> > > which might be related but beyond that we are still looking into
> > > the issue.
> > > 
> > > So the obvious question is whether anyone else is seeing this as
> > > it might help us to focus in on where to look!  
> > 
> > Huh. Something similar has been seen very intermittently on powerpc
> > as well. We couldn't reproduce it reliably to bisect it already, so
> > this is a good help.
> > 
> > http://marc.info/?l=linuxppc-embedded&m=149872815523646&w=2
> > 
> > It looks like the watchdog patch has a similar effect on powerpc in
> > that it stops enabling the softlockup detector by default. Haven't
> > confirmed, but it looks like the same thing.
> > 
> > A bug in RCU stall detection?  
> 
> Well, if I am expected to make grace periods complete when my grace-period
> kthreads aren't getting any CPU time, I will have to make some substantial
> changes.  ;-)
> 
> One possibility is that the timer isn't firing and another is that the
> timer's wakeup is being lost somehow.
> 
> So another thing to try is to boot with rcutree.rcu_kick_kthreads=1.
> This will cause RCU to do redundant wakeups on the grace-period kthread
> if the grace period is moving slowly.  This is of course a crude hack,
> which is why this boot parameter will also cause a splat if it ever has
> to do anything.
Running that now will let you know how it goes.  Not seen the issue yet
but might just be a 'lucky' run - will give it a few hours.

Jonathan
> 
> Does this help at all?
> 
> 							Thanx, Paul
> 
> > > In the meantime we'll carry on digging.
> > > 
> > > Thanks,
> > > 
> > > Jonathan
> > > 
> > > p.s. As a more general question, do we want to have the
> > > soft lockup detector enabledon arm64 by default?  
> > 
> > I've cc'ed Don. My patch should not have changed defconfigs, I
> > should have been more careful with that.
> > 
> > Thanks,
> > Nick
> >   
> 

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n  - any one else seeing this?
  2017-07-25 14:42       ` Jonathan Cameron
  (?)
@ 2017-07-25 15:12         ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-25 15:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jul 25, 2017 at 10:42:45PM +0800, Jonathan Cameron wrote:
> On Tue, 25 Jul 2017 06:46:26 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Tue, Jul 25, 2017 at 10:26:54PM +1000, Nicholas Piggin wrote:
> > > On Tue, 25 Jul 2017 19:32:10 +0800
> > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> > >   
> > > > Hi All,
> > > > 
> > > > We observed a regression on our d05 boards (but curiously not
> > > > the fairly similar but single socket / smaller core count
> > > > d03), initially seen with linux-next prior to the merge window
> > > > and still present in v4.13-rc2.
> > > > 
> > > > The symptom is:  
> > 
> > Adding Dave Miller and the sparclinux@vger.kernel.org email on CC, as
> > they have been seeing something similar, and you might well have saved
> > them the trouble of bisecting.
> > 
> > [ . . . ]
> > 
> > > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1  
> > 
> > This is the cause from an RCU perspective.  You had a lot of idle CPUs,
> > and RCU is not permitted to disturb them -- the battery-powered embedded
> > guys get very annoyed by that sort of thing.  What happens instead is
> > that each CPU updates a per-CPU state variable when entering or exiting
> > idle, and the grace-period kthread ("rcu_preempt kthread" in the above
> > message) checks these state variables, and when it sees an idle CPU,
> > it reports a quiescent state on that CPU's behalf.
> > 
> > But the grace-period kthread can only do this work if it gets a chance
> > to run.  And the message above says that this kthread hasn't had a chance
> > to run for a full 5,663 jiffies.  For completeness, the "g1566 c1565"
> > says that grace period #1566 is in progress, the "f0x0" says that no one
> > is needing another grace period #1567.  The "RCU_GP_WAIT_FQS(3)" says
> > that the grace-period kthread has fully initialized the current grace
> > period and is sleeping for a few jiffies waiting to scan for idle tasks.
> > Finally, the "->state=0x1" says that the grace-period kthread is in
> > TASK_INTERRUPTIBLE state, in other words, still sleeping.
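
For illustration, that idle-tracking scheme is roughly the following shape -
a deliberately simplified sketch with made-up names, not the real
implementation (which tracks this with a per-CPU counter updated on every
idle entry and exit):

#include <linux/atomic.h>
#include <linux/cpumask.h>
#include <linux/percpu.h>

/* Made-up convention for the sketch: odd value means the CPU is idle. */
static DEFINE_PER_CPU(atomic_t, example_idle_state);

void example_enter_idle(void)		/* idle entry path */
{
	atomic_inc(this_cpu_ptr(&example_idle_state));	/* becomes odd */
}

void example_exit_idle(void)		/* idle exit path */
{
	atomic_inc(this_cpu_ptr(&example_idle_state));	/* becomes even */
}

/*
 * What the grace-period kthread does each time it wakes to scan:
 * any CPU currently marked idle has its quiescent state reported
 * on its behalf, without the CPU itself being disturbed.
 */
static void example_scan_idle_cpus(void (*report_qs)(int cpu))
{
	int cpu;

	for_each_online_cpu(cpu)
		if (atomic_read(per_cpu_ptr(&example_idle_state, cpu)) & 1)
			report_qs(cpu);
}

The starvation message above means the kthread never got the CPU time to
run this kind of scan at all, not that the scan itself went wrong.
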
> 
> Thanks for the explanation!
> > 
> > So my first question is "What did commit 05a4a9527 (kernel/watchdog:
> > split up config options) do to prevent the grace-period kthread from
> > getting a chance to run?" 
> 
> As far as we can tell it was a side effect of that patch.
> 
> The real cause is that the patch changed the defconfig result so that the
> softlockup detector - now CONFIG_SOFTLOCKUP_DETECTOR - is no longer enabled.
> 
> Enabling that on 4.13-rc2 (and presumably everything in between)
> means we don't see the problem any more.
> 
> > I must confess that I don't see anything
> > obvious in that commit, so my second question is "Are we sure that
> > reverting this commit makes the problem go away?"
> 
> Simply enabling CONFIG_SOFTLOCKUP_DETECTOR seems to make it go away.
> That detector fires up a thread on every cpu, which may be relevant.

Interesting...  Why should it be necessary to fire up a thread on every
CPU in order to make sure that RCU's grace-period kthreads get some
> CPU time?  Especially given how many idle CPUs you had on your system.

So I have to ask if there is some other bug that the softlockup detector
is masking.

> > and my third is "Is
> > this an intermittent problem that led to a false bisection?"
> 
> Whilst it is a bit slow to occur, we verified it with long runs on either
> side of that patch, and since then with the option enabled on latest mainline.
> 
> We can also cause the issue before that patch by disabling the previous
> relevant option on 4.12.

OK, thank you -- hard to argue with that!  ;-)

Except that I am still puzzled as to why per-CPU softlockup threads
are needed for RCU's kthreads to get their wakeups.  We really should
be able to disable softlockup and still have kthreads get wakeups and
access to CPU, after all.

> > [ . . . ]
> > 
> > > > Reducing the RCU CPU stall timeout makes it happen more often,
> > > > but we are seeing it even with the default value of 24 seconds.
> > > > 
> > > > Tends to occur after a period of relatively low usage, but has
> > > > also been seen midway through performance tests.
> > > > 
> > > > This was not seen with v4.12, so a bisection run later led to
> > > > commit 05a4a9527 (kernel/watchdog: split up config options).
> > > > 
> > > > Which was odd until we discovered that a side effect of this patch
> > > > was to change whether the softlockup detector was enabled or not in
> > > > the arm64 defconfig.
> > > > 
> > > > On 4.13-rc2 enabling the softlockup detector indeed stopped us
> > > > seeing the rcu issue. Disabling the equivalent on 4.12 made the
> > > > issue occur there as well.
> > > > 
> > > > Clearly the softlockup detector results in a thread on every cpu,
> > > > which might be related but beyond that we are still looking into
> > > > the issue.
> > > > 
> > > > So the obvious question is whether anyone else is seeing this as
> > > > it might help us to focus in on where to look!  
> > > 
> > > Huh. Something similar has been seen very intermittently on powerpc
> > > as well. We couldn't reproduce it reliably enough to bisect it, so
> > > this is a good help.
> > > 
> > > http://marc.info/?l=linuxppc-embedded&m=149872815523646&w=2
> > > 
> > > It looks like the watchdog patch has a similar effect on powerpc in
> > > that it stops enabling the softlockup detector by default. Haven't
> > > confirmed, but it looks like the same thing.
> > > 
> > > A bug in RCU stall detection?  
> > 
> > Well, if I am expected to make grace periods complete when my grace-period
> > kthreads aren't getting any CPU time, I will have to make some substantial
> > changes.  ;-)
> > 
> > One possibility is that the timer isn't firing and another is that the
> > timer's wakeup is being lost somehow.
> > 
> > So another thing to try is to boot with rcutree.rcu_kick_kthreads=1.
> > This will cause RCU to do redundant wakeups on the grace-period kthread
> > if the grace period is moving slowly.  This is of course a crude hack,
> > which is why this boot parameter will also cause a splat if it ever has
> > to do anything.
> 
> Running that now; will let you know how it goes.  Not seen the issue yet,
> but it might just be a 'lucky' run - will give it a few hours.

Thank you very much!

							Thanx, Paul

> Jonathan
> > 
> > Does this help at all?
> > 
> > 							Thanx, Paul
> > 
> > > > In the meantime we'll carry on digging.
> > > > 
> > > > Thanks,
> > > > 
> > > > Jonathan
> > > > 
> > > > p.s. As a more general question, do we want to have the
> > > > soft lockup detector enabled on arm64 by default?  
> > > 
> > > I've cc'ed Don. My patch should not have changed defconfigs, I
> > > should have been more careful with that.
> > > 
> > > Thanks,
> > > Nick
> > >   
> > 
> 
> 


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n  - any one else seeing this?
  2017-07-25 15:12         ` Paul E. McKenney
  (?)
@ 2017-07-25 16:52           ` Jonathan Cameron
  -1 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-25 16:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 25 Jul 2017 08:12:45 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Tue, Jul 25, 2017 at 10:42:45PM +0800, Jonathan Cameron wrote:
> > On Tue, 25 Jul 2017 06:46:26 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > On Tue, Jul 25, 2017 at 10:26:54PM +1000, Nicholas Piggin wrote:  
> > > > On Tue, 25 Jul 2017 19:32:10 +0800
> > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> > > >     
> > > > > Hi All,
> > > > > 
> > > > > We observed a regression on our d05 boards (but curiously not
> > > > > the fairly similar but single socket / smaller core count
> > > > > d03), initially seen with linux-next prior to the merge window
> > > > > and still present in v4.13-rc2.
> > > > > 
> > > > > The symptom is:    
> > > 
> > > Adding Dave Miller and the sparclinux@vger.kernel.org email on CC, as
> > > they have been seeing something similar, and you might well have saved
> > > them the trouble of bisecting.
> > > 
> > > [ . . . ]
> > >   
> > > > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1    
> > > 
> > > This is the cause from an RCU perspective.  You had a lot of idle CPUs,
> > > and RCU is not permitted to disturb them -- the battery-powered embedded
> > > guys get very annoyed by that sort of thing.  What happens instead is
> > > that each CPU updates a per-CPU state variable when entering or exiting
> > > idle, and the grace-period kthread ("rcu_preempt kthread" in the above
> > > message) checks these state variables, and when it sees an idle CPU,
> > > it reports a quiescent state on that CPU's behalf.
> > > 
> > > But the grace-period kthread can only do this work if it gets a chance
> > > to run.  And the message above says that this kthread hasn't had a chance
> > > to run for a full 5,663 jiffies.  For completeness, the "g1566 c1565"
> > > says that grace period #1566 is in progress, the "f0x0" says that no one
> > > is needing another grace period #1567.  The "RCU_GP_WAIT_FQS(3)" says
> > > that the grace-period kthread has fully initialized the current grace
> > > period and is sleeping for a few jiffies waiting to scan for idle tasks.
> > > Finally, the "->state=0x1" says that the grace-period kthread is in
> > > TASK_INTERRUPTIBLE state, in other words, still sleeping.  
> > 
> > Thanks for the explanation!  
> > > 
> > > So my first question is "What did commit 05a4a9527 (kernel/watchdog:
> > > split up config options) do to prevent the grace-period kthread from
> > > getting a chance to run?"   
> > 
> > As far as we can tell it was a side effect of that patch.
> > 
> > The real cause is that the patch changed the defconfig result so that the
> > softlockup detector - now CONFIG_SOFTLOCKUP_DETECTOR - is no longer enabled.
> > 
> > Enabling that on 4.13-rc2 (and presumably everything in between)
> > means we don't see the problem any more.
> >   
> > > I must confess that I don't see anything
> > > obvious in that commit, so my second question is "Are we sure that
> > > reverting this commit makes the problem go away?"  
> > 
> > Simply enabling CONFIG_SOFTLOCKUP_DETECTOR seems to make it go away.
> > That detector fires up a thread on every cpu, which may be relevant.  
> 
> Interesting...  Why should it be necessary to fire up a thread on every
> CPU in order to make sure that RCU's grace-period kthreads get some
> CPU time?  Especially given how many idle CPUs you had on your system.
> 
> So I have to ask if there is some other bug that the softlockup detector
> is masking.
I am thinking the same.  We can try going back further than 4.12 tomorrow
(we think we can realistically go back to 4.8 and possibly 4.6
with this board)
> 
> > > and my third is "Is
> > > this an intermittent problem that led to a false bisection?"  
> > 
> > Whilst it is a bit slow to occur, we verified it with long runs on either
> > side of that patch, and since then with the option enabled on latest mainline.
> > 
> > We can also cause the issue before that patch by disabling the previous
> > relevant option on 4.12.
> 
> OK, thank you -- hard to argue with that!  ;-)
We thought it was a pretty unlikely bisection result,
hence hammered it thoroughly ;)
> 
> Except that I am still puzzled as to why per-CPU softlockup threads
> are needed for RCU's kthreads to get their wakeups.  We really should
> be able to disable softlockup and still have kthreads get wakeups and
> access to CPU, after all.
> 
> > > [ . . . ]
> > >   
> > > > > Reducing the RCU CPU stall timeout makes it happen more often,
> > > > > but we are seeing it even with the default value of 24 seconds.
> > > > > 
> > > > > Tends to occur after a period of relatively low usage, but has
> > > > > also been seen midway through performance tests.
> > > > > 
> > > > > This was not seen with v4.12, so a bisection run later led to
> > > > > commit 05a4a9527 (kernel/watchdog: split up config options).
> > > > > 
> > > > > Which was odd until we discovered that a side effect of this patch
> > > > > was to change whether the softlockup detector was enabled or not in
> > > > > the arm64 defconfig.
> > > > > 
> > > > > On 4.13-rc2 enabling the softlockup detector indeed stopped us
> > > > > seeing the rcu issue. Disabling the equivalent on 4.12 made the
> > > > > issue occur there as well.
> > > > > 
> > > > > Clearly the softlockup detector results in a thread on every cpu,
> > > > > which might be related but beyond that we are still looking into
> > > > > the issue.
> > > > > 
> > > > > So the obvious question is whether anyone else is seeing this as
> > > > > it might help us to focus in on where to look!    
> > > > 
> > > > Huh. Something similar has been seen very intermittently on powerpc
> > > > as well. We couldn't reproduce it reliably enough to bisect it, so
> > > > this is a good help.
> > > > 
> > > > http://marc.info/?l=linuxppc-embedded&m=149872815523646&w=2
> > > > 
> > > > It looks like the watchdog patch has a similar effect on powerpc in
> > > > that it stops enabling the softlockup detector by default. Haven't
> > > > confirmed, but it looks like the same thing.
> > > > 
> > > > A bug in RCU stall detection?    
> > > 
> > > Well, if I am expected to make grace periods complete when my grace-period
> > > kthreads aren't getting any CPU time, I will have to make some substantial
> > > changes.  ;-)
> > > 
> > > One possibility is that the timer isn't firing and another is that the
> > > timer's wakeup is being lost somehow.
> > > 
> > > So another thing to try is to boot with rcutree.rcu_kick_kthreads=1.
> > > This will cause RCU to do redundant wakeups on the grace-period kthread
> > > if the grace period is moving slowly.  This is of course a crude hack,
> > > which is why this boot parameter will also cause a splat if it ever has
> > > to do anything.  
> > 
> > Running that now; will let you know how it goes.  Not seen the issue yet,
> > but it might just be a 'lucky' run - will give it a few hours.  
> 
> Thank you very much!
So far it's not actually shown any splats.  I did a quick drop back to running
without the parameter and got the original splat in less than 5 minutes.

I've spun up another board with this parameter set as well and will leave
them both running overnight to see if anything interesting happens.

Thanks for your help with this,

Jonathan

> 
> 							Thanx, Paul
> 
> > Jonathan  
> > > 
> > > Does this help at all?
> > > 
> > > 							Thanx, Paul
> > >   
> > > > > In the meantime we'll carry on digging.
> > > > > 
> > > > > Thanks,
> > > > > 
> > > > > Jonathan
> > > > > 
> > > > > p.s. As a more general question, do we want to have the
> > > > > soft lockup detector enabled on arm64 by default?    
> > > > 
> > > > I've cc'ed Don. My patch should not have changed defconfigs, I
> > > > should have been more careful with that.
> > > > 
> > > > Thanks,
> > > > Nick
> > > >     
> > >   
> > 
> >   
> 



^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-25 16:52           ` Jonathan Cameron
  (?)
@ 2017-07-25 21:10             ` David Miller
  -1 siblings, 0 replies; 241+ messages in thread
From: David Miller @ 2017-07-25 21:10 UTC (permalink / raw)
  To: linux-arm-kernel

From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Date: Wed, 26 Jul 2017 00:52:07 +0800

> On Tue, 25 Jul 2017 08:12:45 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
>> On Tue, Jul 25, 2017 at 10:42:45PM +0800, Jonathan Cameron wrote:
>> > On Tue, 25 Jul 2017 06:46:26 -0700
>> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
>> >   
>> > > On Tue, Jul 25, 2017 at 10:26:54PM +1000, Nicholas Piggin wrote:  
>> > > > On Tue, 25 Jul 2017 19:32:10 +0800
>> > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
>> > > >     
>> > > > > Hi All,
>> > > > > 
>> > > > > We observed a regression on our d05 boards (but curiously not
>> > > > > the fairly similar but single socket / smaller core count
>> > > > > d03), initially seen with linux-next prior to the merge window
>> > > > > and still present in v4.13-rc2.
>> > > > > 
>> > > > > The symptom is:    
>> > > 
>> > > Adding Dave Miller and the sparclinux@vger.kernel.org email on CC, as
>> > > they have been seeing something similar, and you might well have saved
>> > > them the trouble of bisecting.
>> > > 
>> > > [ . . . ]
>> > >   
>> > > > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1    
>> > > 
>> > > This is the cause from an RCU perspective.  You had a lot of idle CPUs,
>> > > and RCU is not permitted to disturb them -- the battery-powered embedded
>> > > guys get very annoyed by that sort of thing.  What happens instead is
>> > > that each CPU updates a per-CPU state variable when entering or exiting
>> > > idle, and the grace-period kthread ("rcu_preempt kthread" in the above
>> > > message) checks these state variables, and when it sees an idle CPU,
>> > > it reports a quiescent state on that CPU's behalf.
>> > > 
>> > > But the grace-period kthread can only do this work if it gets a chance
>> > > to run.  And the message above says that this kthread hasn't had a chance
>> > > to run for a full 5,663 jiffies.  For completeness, the "g1566 c1565"
>> > > says that grace period #1566 is in progress, the "f0x0" says that no one
>> > > is needing another grace period #1567.  The "RCU_GP_WAIT_FQS(3)" says
>> > > that the grace-period kthread has fully initialized the current grace
>> > > period and is sleeping for a few jiffies waiting to scan for idle tasks.
>> > > Finally, the "->state=0x1" says that the grace-period kthread is in
>> > > TASK_INTERRUPTIBLE state, in other words, still sleeping.  
>> > 
>> > Thanks for the explanation!  
>> > > 
>> > > So my first question is "What did commit 05a4a9527 (kernel/watchdog:
>> > > split up config options) do to prevent the grace-period kthread from
>> > > getting a chance to run?"   
>> > 
>> > As far as we can tell it was a side effect of that patch.
>> > 
>> > The real cause is that the patch changed the defconfig result so that the
>> > softlockup detector - now CONFIG_SOFTLOCKUP_DETECTOR - is no longer enabled.
>> > 
>> > Enabling that on 4.13-rc2 (and presumably everything in between)
>> > means we don't see the problem any more.
>> >   
>> > > I must confess that I don't see anything
>> > > obvious in that commit, so my second question is "Are we sure that
>> > > reverting this commit makes the problem go away?"  
>> > 
>> > Simply enabling CONFIG_SOFTLOCKUP_DETECTOR seems to make it go away.
>> > That detector fires up a thread on every cpu, which may be relevant.  
>> 
>> Interesting...  Why should it be necessary to fire up a thread on every
>> CPU in order to make sure that RCU's grace-period kthreads get some
>> CPU time?  Especially given how many idle CPUs you had on your system.
>> 
>> So I have to ask if there is some other bug that the softlockup detector
>> is masking.
> I am thinking the same.  We can try going back further than 4.12 tomorrow
> (we think we can realistically go back to 4.8 and possibly 4.6
> with this board)

Just to report, turning softlockup back on fixes things for me on
sparc64 too.

The thing about softlockup is it runs an hrtimer, which seems to run
about every 4 seconds.

So I wonder if this is a NO_HZ problem.
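
For reference, the softlockup side is, very roughly, a per-CPU hrtimer that
periodically pokes a per-CPU watchdog thread - something like the sketch
below (made-up names and period, not the actual kernel/watchdog.c code):

#include <linux/hrtimer.h>
#include <linux/ktime.h>
#include <linux/percpu.h>
#include <linux/sched.h>

static DEFINE_PER_CPU(struct hrtimer, example_wd_hrtimer);
static DEFINE_PER_CPU(struct task_struct *, example_wd_thread);

static enum hrtimer_restart example_wd_timer_fn(struct hrtimer *hrtimer)
{
	/* Poke the CPU-bound thread; if it runs, this CPU can still schedule. */
	wake_up_process(__this_cpu_read(example_wd_thread));

	hrtimer_forward_now(hrtimer, ns_to_ktime(4ULL * NSEC_PER_SEC));
	return HRTIMER_RESTART;
}

static void example_wd_enable_this_cpu(struct task_struct *thread)
{
	struct hrtimer *hrtimer = this_cpu_ptr(&example_wd_hrtimer);

	__this_cpu_write(example_wd_thread, thread);
	hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
	hrtimer->function = example_wd_timer_fn;
	hrtimer_start(hrtimer, ns_to_ktime(4ULL * NSEC_PER_SEC),
		      HRTIMER_MODE_REL);
}

If that is what is papering over the problem, then with the detector enabled
every CPU gets a periodic timer event and a runnable thread no matter how
idle it is, so a lost wakeup or a missing tick elsewhere never stays lost
for long - which would fit a NO_HZ theory.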

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n  - any one else seeing this?
  2017-07-25 16:52           ` Jonathan Cameron
@ 2017-07-26  3:53             ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-26  3:53 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 26, 2017 at 12:52:07AM +0800, Jonathan Cameron wrote:
> On Tue, 25 Jul 2017 08:12:45 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Tue, Jul 25, 2017 at 10:42:45PM +0800, Jonathan Cameron wrote:
> > > On Tue, 25 Jul 2017 06:46:26 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >   
> > > > On Tue, Jul 25, 2017 at 10:26:54PM +1000, Nicholas Piggin wrote:  
> > > > > On Tue, 25 Jul 2017 19:32:10 +0800
> > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> > > > >     
> > > > > > Hi All,
> > > > > > 
> > > > > > We observed a regression on our d05 boards (but curiously not
> > > > > > the fairly similar but single socket / smaller core count
> > > > > > d03), initially seen with linux-next prior to the merge window
> > > > > > and still present in v4.13-rc2.
> > > > > > 
> > > > > > The symptom is:    
> > > > 
> > > > Adding Dave Miller and the sparclinux@vger.kernel.org email on CC, as
> > > > they have been seeing something similar, and you might well have saved
> > > > them the trouble of bisecting.
> > > > 
> > > > [ . . . ]
> > > >   
> > > > > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1    
> > > > 
> > > > This is the cause from an RCU perspective.  You had a lot of idle CPUs,
> > > > and RCU is not permitted to disturb them -- the battery-powered embedded
> > > > guys get very annoyed by that sort of thing.  What happens instead is
> > > > that each CPU updates a per-CPU state variable when entering or exiting
> > > > idle, and the grace-period kthread ("rcu_preempt kthread" in the above
> > > > message) checks these state variables, and when it sees an idle CPU,
> > > > it reports a quiescent state on that CPU's behalf.
> > > > 
> > > > But the grace-period kthread can only do this work if it gets a chance
> > > > to run.  And the message above says that this kthread hasn't had a chance
> > > > to run for a full 5,663 jiffies.  For completeness, the "g1566 c1565"
> > > > says that grace period #1566 is in progress, the "f0x0" says that no one
> > > > is needing another grace period #1567.  The "RCU_GP_WAIT_FQS(3)" says
> > > > that the grace-period kthread has fully initialized the current grace
> > > > period and is sleeping for a few jiffies waiting to scan for idle tasks.
> > > > Finally, the "->state=0x1" says that the grace-period kthread is in
> > > > TASK_INTERRUPTIBLE state, in other words, still sleeping.  
> > > 
> > > Thanks for the explanation!  
> > > > 
> > > > So my first question is "What did commit 05a4a9527 (kernel/watchdog:
> > > > split up config options) do to prevent the grace-period kthread from
> > > > getting a chance to run?"   
> > > 
> > > As far as we can tell it was a side effect of that patch.
> > > 
> > > The real cause is that the patch changed what the defconfigs produce, so the
> > > softlockup detector (now CONFIG_SOFTLOCKUP_DETECTOR) is no longer enabled.
> > > 
> > > Enabling that on 4.13-rc2 (and presumably everything in between)
> > > means we don't see the problem any more.
> > >   
> > > > I must confess that I don't see anything
> > > > obvious in that commit, so my second question is "Are we sure that
> > > > reverting this commit makes the problem go away?"  
> > > 
> > > Simply enabling CONFIG_SOFTLOCKUP_DETECTOR seems to make it go away.
> > > That detector fires up a thread on every cpu, which may be relevant.  
> > 
> > Interesting...  Why should it be necessary to fire up a thread on every
> > CPU in order to make sure that RCU's grace-period kthreads get some
> > CPU time?  Especially given how many idle CPUs you had on your system.
> > 
> > So I have to ask if there is some other bug that the softlockup detector
> > is masking.
> I am thinking the same.  We can try going back further than 4.12 tomorrow
> (we think we can realistically go back to 4.8 and possibly 4.6
> with this board)

Looking forward to seeing results!

> > > > and my third is "Is
> > > > this an intermittent problem that led to a false bisection?"  
> > > 
> > > Whilst it is a bit slow to occur, we verified with long runs on either
> > > side of that patch, and since then with the option enabled on latest mainline.
> > > 
> > > We can also cause the issue before that patch by disabling the previous
> > > relevant option on 4.12.  
> > 
> > OK, thank you -- hard to argue with that!  ;-)
> 
> We thought it was a pretty unlikely bisection result
> hence hammered it thoroughly ;)

Glad that I am not the only paranoid one out here.  ;-)

> > Except that I am still puzzled as to why per-CPU softlockup threads
> > are needed for RCU's kthreads to get their wakeups.  We really should
> > be able to disable softlockup and still have kthreads get wakeups and
> > access to CPU, after all.
> > 
> > > > [ . . . ]
> > > >   
> > > > > > Reducing the RCU CPU stall timeout makes it happen more often,
> > > > > > but we are seeing it even with the default value of 24 seconds.
> > > > > > 
> > > > > > Tends to occur after a period of relatively low usage, but has
> > > > > > also been seen mid way through performance tests.
> > > > > > 
> > > > > > This was not seen with v4.12, so a bisection run later led to
> > > > > > commit 05a4a9527 (kernel/watchdog: split up config options).
> > > > > > 
> > > > > > Which was odd until we discovered that a side effect of this patch
> > > > > > was to change whether the softlockup detector was enabled or not in
> > > > > > the arm64 defconfig.
> > > > > > 
> > > > > > On 4.13-rc2 enabling the softlockup detector indeed stopped us
> > > > > > seeing the rcu issue. Disabling the equivalent on 4.12 made the
> > > > > > issue occur there as well.
> > > > > > 
> > > > > > Clearly the softlockup detector results in a thread on every cpu,
> > > > > > which might be related but beyond that we are still looking into
> > > > > > the issue.
> > > > > > 
> > > > > > So the obvious question is whether anyone else is seeing this as
> > > > > > it might help us to focus in on where to look!    
> > > > > 
> > > > > Huh. Something similar has been seen very intermittently on powerpc
> > > > > as well. We couldn't reproduce it reliably enough to bisect it, so
> > > > > this is a good help.
> > > > > 
> > > > > http://marc.info/?l=linuxppc-embedded&m=149872815523646&w=2
> > > > > 
> > > > > It looks like the watchdog patch has a similar effect on powerpc in
> > > > > that it stops enabling the softlockup detector by default. Haven't
> > > > > confirmed, but it looks like the same thing.
> > > > > 
> > > > > A bug in RCU stall detection?    
> > > > 
> > > > Well, if I am expected to make grace periods complete when my grace-period
> > > > kthreads aren't getting any CPU time, I will have to make some substantial
> > > > changes.  ;-)
> > > > 
> > > > One possibility is that the timer isn't firing and another is that the
> > > > timer's wakeup is being lost somehow.
> > > > 
> > > > So another thing to try is to boot with rcutree.rcu_kick_kthreads=1.
> > > > This will cause RCU to do redundant wakeups on the grace-period kthread
> > > > if the grace period is moving slowly.  This is of course a crude hack,
> > > > which is why this boot parameter will also cause a splat if it ever has
> > > > to do anything.  
> > > 
> > > Running that now - will let you know how it goes.  Not seen the issue yet
> > > but might just be a 'lucky' run - will give it a few hours.  
> > 
> > Thank you very much!
> 
> So far it's not actually shown any splats.  I did a quick drop back to running
> without the parameter and got the original splat in less than 5 minutes.

That is a bit strange.  Sensitive to code position or some such???
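
(For completeness, the rcu_kick_kthreads hack being tested here amounts to
something like the sketch below - gp_appears_stalled() is an invented name and
this is not the actual tree.c code.  The "splat" being watched for is the
WARN_ONCE firing, which only happens when a redundant wakeup was actually
attempted.)

/*
 * Sketch of the rcutree.rcu_kick_kthreads=1 behaviour; the helper name
 * is invented and this is not the in-tree implementation.  If the
 * grace period appears to be moving too slowly, wake the grace-period
 * kthread redundantly and leave a warning in the log so the workaround
 * is visible.
 */
if (READ_ONCE(rcu_kick_kthreads) && gp_appears_stalled(rsp)) {
        WARN_ONCE(1, "Kicking %s grace-period kthread\n", rsp->name);
        wake_up_process(rsp->gp_kthread);
}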

> I've spun up another board with this parameter set as well and will leave
> them both running overnight to see if anything interesting happens.
> 
> Thanks for your help with this,

And thanks to you as well!!!

							Thanx, Paul

> Jonathan
> 
> > 
> > 							Thanx, Paul
> > 
> > > Jonathan  
> > > > 
> > > > Does this help at all?
> > > > 
> > > > 							Thanx, Paul
> > > >   
> > > > > > In the meantime we'll carry on digging.
> > > > > > 
> > > > > > Thanks,
> > > > > > 
> > > > > > Jonathan
> > > > > > 
> > > > > > p.s. As a more general question, do we want to have the
> > > > > > soft lockup detector enabled on arm64 by default?
> > > > > 
> > > > > I've cc'ed Don. My patch should not have changed defconfigs, I
> > > > > should have been more careful with that.
> > > > > 
> > > > > Thanks,
> > > > > Nick
> > > > >     
> > > >   
> > > 
> > >   
> > 
> 
> 


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-25 21:10             ` David Miller
@ 2017-07-26  3:55               ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-26  3:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote:
> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Date: Wed, 26 Jul 2017 00:52:07 +0800
> 
> > On Tue, 25 Jul 2017 08:12:45 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > 
> >> On Tue, Jul 25, 2017 at 10:42:45PM +0800, Jonathan Cameron wrote:
> >> > On Tue, 25 Jul 2017 06:46:26 -0700
> >> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >> >   
> >> > > On Tue, Jul 25, 2017 at 10:26:54PM +1000, Nicholas Piggin wrote:  
> >> > > > On Tue, 25 Jul 2017 19:32:10 +0800
> >> > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> >> > > >     
> >> > > > > Hi All,
> >> > > > > 
> >> > > > > We observed a regression on our d05 boards (but curiously not
> >> > > > > the fairly similar but single socket / smaller core count
> >> > > > > d03), initially seen with linux-next prior to the merge window
> >> > > > > and still present in v4.13-rc2.
> >> > > > > 
> >> > > > > The symptom is:    
> >> > > 
> >> > > Adding Dave Miller and the sparclinux@vger.kernel.org email on CC, as
> >> > > they have been seeing something similar, and you might well have saved
> >> > > them the trouble of bisecting.
> >> > > 
> >> > > [ . . . ]
> >> > >   
> >> > > > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1    
> >> > > 
> >> > > This is the cause from an RCU perspective.  You had a lot of idle CPUs,
> >> > > and RCU is not permitted to disturb them -- the battery-powered embedded
> >> > > guys get very annoyed by that sort of thing.  What happens instead is
> >> > > that each CPU updates a per-CPU state variable when entering or exiting
> >> > > idle, and the grace-period kthread ("rcu_preempt kthread" in the above
> >> > > message) checks these state variables, and when it sees an idle CPU,
> >> > > it reports a quiescent state on that CPU's behalf.
> >> > > 
> >> > > But the grace-period kthread can only do this work if it gets a chance
> >> > > to run.  And the message above says that this kthread hasn't had a chance
> >> > > to run for a full 5,663 jiffies.  For completeness, the "g1566 c1565"
> >> > > says that grace period #1566 is in progress, the "f0x0" says that no one
> >> > > is needing another grace period #1567.  The "RCU_GP_WAIT_FQS(3)" says
> >> > > that the grace-period kthread has fully initialized the current grace
> >> > > period and is sleeping for a few jiffies waiting to scan for idle tasks.
> >> > > Finally, the "->state=0x1" says that the grace-period kthread is in
> >> > > TASK_INTERRUPTIBLE state, in other words, still sleeping.  
> >> > 
> >> > Thanks for the explanation!  
> >> > > 
> >> > > So my first question is "What did commit 05a4a9527 (kernel/watchdog:
> >> > > split up config options) do to prevent the grace-period kthread from
> >> > > getting a chance to run?"   
> >> > 
> >> > As far as we can tell it was a side effect of that patch.
> >> > 
> >> > The real cause is that the patch changed what the defconfigs produce, so the
> >> > softlockup detector (now CONFIG_SOFTLOCKUP_DETECTOR) is no longer enabled.
> >> > 
> >> > Enabling that on 4.13-rc2 (and presumably everything in between)
> >> > means we don't see the problem any more.
> >> >   
> >> > > I must confess that I don't see anything
> >> > > obvious in that commit, so my second question is "Are we sure that
> >> > > reverting this commit makes the problem go away?"  
> >> > 
> >> > Simply enabling CONFIG_SOFTLOCKUP_DETECTOR seems to make it go away.
> >> > That detector fires up a thread on every cpu, which may be relevant.  
> >> 
> >> Interesting...  Why should it be necessary to fire up a thread on every
> >> CPU in order to make sure that RCU's grace-period kthreads get some
> >> CPU time?  Especially given how many idle CPUs you had on your system.
> >> 
> >> So I have to ask if there is some other bug that the softlockup detector
> >> is masking.
> > I am thinking the same.  We can try going back further than 4.12 tomorrow
> > (we think we can realistically go back to 4.8 and possibly 4.6
> > with this board)
> 
> Just to report, turning softlockup back on fixes things for me on
> sparc64 too.

Very good!

> The thing about softlockup is it runs an hrtimer, which seems to run
> about every 4 seconds.

I could see where that could shake things loose, but I am surprised that
it would be needed.  I ran a short run with CONFIG_SOFTLOCKUP_DETECTOR=y
with no trouble, but I will be running a longer test later on.

> So I wonder if this is a NO_HZ problem.

Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  What are
you running?  (Again, my symptoms are slightly different, so I might
be seeing a different bug.)
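
As a rough illustration of why the NO_HZ choice may matter here (a simplified
sketch, not the real kernel idle loop): with NO_HZ_IDLE=y an idle CPU stops
its periodic tick entirely, so if a needed timer or wakeup goes missing there
is no regular tick left to paper over it, which could be why the softlockup
hrtimer appears to help.

/*
 * Very simplified sketch of the NO_HZ_IDLE idle path; not the actual
 * kernel code.  With the tick stopped, nothing runs on this CPU until
 * an interrupt or an explicit wakeup arrives.
 */
static void idle_loop_sketch(void)
{
        tick_nohz_idle_enter();         /* stop the periodic tick */
        while (!need_resched())
                arch_cpu_idle();        /* wait for the next interrupt */
        tick_nohz_idle_exit();          /* restart the tick on exit */
}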

							Thanx, Paul


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-26  3:55               ` Paul E. McKenney
@ 2017-07-26  4:02                 ` David Miller
  -1 siblings, 0 replies; 241+ messages in thread
From: David Miller @ 2017-07-26  4:02 UTC (permalink / raw)
  To: linux-arm-kernel

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Date: Tue, 25 Jul 2017 20:55:45 -0700

> On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote:
>> Just to report, turning softlockup back on fixes things for me on
>> sparc64 too.
> 
> Very good!
> 
>> The thing about softlockup is it runs an hrtimer, which seems to run
>> about every 4 seconds.
> 
> I could see where that could shake things loose, but I am surprised that
> it would be needed.  I ran a short run with CONFIG_SOFTLOCKUP_DETECTOR=y
> with no trouble, but I will be running a longer test later on.
> 
>> So I wonder if this is a NO_HZ problem.
> 
> Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  What are
> you running?  (Again, my symptoms are slightly different, so I might
> be seeing a different bug.)

I run with NO_HZ_FULL=n and NO_HZ_IDLE=y, just like you.

To clarify, the symptoms show up with SOFTLOCKUP_DETECTOR disabled.

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-26  4:02                 ` David Miller
@ 2017-07-26  4:12                   ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-26  4:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jul 25, 2017 at 09:02:33PM -0700, David Miller wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Date: Tue, 25 Jul 2017 20:55:45 -0700
> 
> > On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote:
> >> Just to report, turning softlockup back on fixes things for me on
> >> sparc64 too.
> > 
> > Very good!
> > 
> >> The thing about softlockup is it runs an hrtimer, which seems to run
> >> about every 4 seconds.
> > 
> > I could see where that could shake things loose, but I am surprised that
> > it would be needed.  I ran a short run with CONFIG_SOFTLOCKUP_DETECTOR=y
> > with no trouble, but I will be running a longer test later on.
> > 
> >> So I wonder if this is a NO_HZ problem.
> > 
> > Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  What are
> > you running?  (Again, my symptoms are slightly different, so I might
> > be seeing a different bug.)
> 
> I run with NO_HZ_FULL=n and NO_HZ_IDLE=y, just like you.
> 
> To clarify, the symptoms show up with SOFTLOCKUP_DETECTOR disabled.

Same here -- but my failure case happens fairly rarely, so it will take
some time to gain reasonable confidence that enabling SOFTLOCKUP_DETECTOR
had an effect.

But you are right, might be interesting to try NO_HZ_PERIODIC=y
or NO_HZ_FULL=y.  So many possible tests, and so little time.  ;-)

							Thanx, Paul


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-26  4:12                   ` Paul E. McKenney
  0 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-26  4:12 UTC (permalink / raw)
  To: David Miller
  Cc: Jonathan.Cameron, npiggin, linux-arm-kernel, linuxarm, akpm,
	abdhalee, linuxppc-dev, dzickus, sparclinux, sfr

On Tue, Jul 25, 2017 at 09:02:33PM -0700, David Miller wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Date: Tue, 25 Jul 2017 20:55:45 -0700
> 
> > On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote:
> >> Just to report, turning softlockup back on fixes things for me on
> >> sparc64 too.
> > 
> > Very good!
> > 
> >> The thing about softlockup is it runs an hrtimer, which seems to run
> >> about every 4 seconds.
> > 
> > I could see where that could shake things loose, but I am surprised that
> > it would be needed.  I ran a short run with CONFIG_SOFTLOCKUP_DETECTOR=y
> > with no trouble, but I will be running a longer test later on.
> > 
> >> So I wonder if this is a NO_HZ problem.
> > 
> > Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  What are
> > you running?  (Again, my symptoms are slightly different, so I might
> > be seeing a different bug.)
> 
> I run with NO_HZ_FULL=n and NO_HZ_IDLE=y, just like you.
> 
> To clarify, the symptoms show up with SOFTLOCKUP_DETECTOR disabled.

Same here -- but my failure case happens fairly rarely, so it will take
some time to gain reasonable confidence that enabling SOFTLOCKUP_DETECTOR
had an effect.

But you are right, might be interesting to try NO_HZ_PERIODIC=y
or NO_HZ_FULL=y.  So many possible tests, and so little time.  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n  - any one else seeing this?
@ 2017-07-26  7:51               ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-26  7:51 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Nicholas Piggin, linux-arm-kernel, linuxarm, Andrew Morton,
	Abdul Haleem, linuxppc-dev, Don Zickus, David Miller, sparclinux,
	Stephen Rothwell

On Tue, 25 Jul 2017 20:53:06 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Wed, Jul 26, 2017 at 12:52:07AM +0800, Jonathan Cameron wrote:
> > On Tue, 25 Jul 2017 08:12:45 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > On Tue, Jul 25, 2017 at 10:42:45PM +0800, Jonathan Cameron wrote:  
> > > > On Tue, 25 Jul 2017 06:46:26 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > >     
> > > > > On Tue, Jul 25, 2017 at 10:26:54PM +1000, Nicholas Piggin wrote:    
> > > > > > On Tue, 25 Jul 2017 19:32:10 +0800
> > > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> > > > > >       
> > > > > > > Hi All,
> > > > > > > 
> > > > > > > We observed a regression on our d05 boards (but curiously not
> > > > > > > the fairly similar but single socket / smaller core count
> > > > > > > d03), initially seen with linux-next prior to the merge window
> > > > > > > and still present in v4.13-rc2.
> > > > > > > 
> > > > > > > The symptom is:      
> > > > > 
> > > > > Adding Dave Miller and the sparclinux@vger.kernel.org email on CC, as
> > > > > they have been seeing something similar, and you might well have saved
> > > > > them the trouble of bisecting.
> > > > > 
> > > > > [ . . . ]
> > > > >     
> > > > > > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1      
> > > > > 
> > > > > This is the cause from an RCU perspective.  You had a lot of idle CPUs,
> > > > > and RCU is not permitted to disturb them -- the battery-powered embedded
> > > > > guys get very annoyed by that sort of thing.  What happens instead is
> > > > > that each CPU updates a per-CPU state variable when entering or exiting
> > > > > idle, and the grace-period kthread ("rcu_preempt kthread" in the above
> > > > > message) checks these state variables, and if it sees an idle CPU,
> > > > > it reports a quiescent state on that CPU's behalf.
> > > > > 
> > > > > But the grace-period kthread can only do this work if it gets a chance
> > > > > to run.  And the message above says that this kthread hasn't had a chance
> > > > > to run for a full 5,663 jiffies.  For completeness, the "g1566 c1565"
> > > > > says that grace period #1566 is in progress, the "f0x0" says that no one
> > > > > is needing another grace period #1567.  The "RCU_GP_WAIT_FQS(3)" says
> > > > > that the grace-period kthread has fully initialized the current grace
> > > > > period and is sleeping for a few jiffies waiting to scan for idle tasks.
> > > > > Finally, the "->state=0x1" says that the grace-period kthread is in
> > > > > TASK_INTERRUPTIBLE state, in other words, still sleeping.    
> > > > 
> > > > Thanks for the explanation!    
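
(For anyone else following along, the rough shape of the mechanism described
above, as I understand it.  This is purely a sketch, not the real tree-RCU
code; note_quiescent_state() is just a stand-in for the real bookkeeping.)

#include <linux/cpumask.h>
#include <linux/kthread.h>
#include <linux/percpu.h>
#include <linux/sched.h>

/* Sketch only: per-CPU idle state plus a scanning grace-period kthread. */
static DEFINE_PER_CPU(bool, cpu_is_idle);  /* updated on idle entry/exit */

static void note_quiescent_state(int cpu)
{
        /* stand-in for the real RCU quiescent-state bookkeeping */
}

static int gp_kthread_fn(void *unused)
{
        int cpu;

        while (!kthread_should_stop()) {
                /*
                 * RCU_GP_WAIT_FQS: sleep a few jiffies between scans.  If
                 * this wakeup is lost or the kthread never gets CPU time,
                 * you get exactly the "kthread starved" message above.
                 */
                schedule_timeout_interruptible(3);

                for_each_online_cpu(cpu)
                        if (per_cpu(cpu_is_idle, cpu))
                                note_quiescent_state(cpu); /* on its behalf */
        }
        return 0;
}
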
> > > > > 
> > > > > So my first question is "What did commit 05a4a9527 (kernel/watchdog:
> > > > > split up config options) do to prevent the grace-period kthread from
> > > > > getting a chance to run?"     
> > > > 
> > > > As far as we can tell it was a side effect of that patch.
> > > > 
> > > > The real cause is that the patch changed the outcome of the defconfigs so they stop
> > > > enabling the softlockup detector - the relevant option is now CONFIG_SOFTLOCKUP_DETECTOR.
> > > > 
> > > > Enabling that on 4.13-rc2 (and presumably everything in between)
> > > > means we don't see the problem any more.
> > > >     
> > > > > I must confess that I don't see anything
> > > > > obvious in that commit, so my second question is "Are we sure that
> > > > > reverting this commit makes the problem go away?"    
> > > > 
> > > > Simply enabling CONFIG_SOFTLOCKUP_DETECTOR seems to make it go away.
> > > > That detector fires up a thread on every cpu, which may be relevant.    
> > > 
> > > Interesting...  Why should it be necessary to fire up a thread on every
> > > CPU in order to make sure that RCU's grace-period kthreads get some
> > > CPU time?  Especially given how many idle CPUs you had on your system.
> > > 
> > > So I have to ask if there is some other bug that the softlockup detector
> > > is masking.  
> > I am thinking the same.  We can try going back further than 4.12 tomorrow
> > (we think we can realistically go back to 4.8 and possibly 4.6
> > with this board)  
> 
> Looking forward to seeing results!
> 
> > > > > and my third is "Is
> > > > > this an intermittent problem that led to a false bisection?"    
> > > > 
> > > > Whilst it is a bit slow to occur, we verified with long runs on either
> > > > side of that patch, and since then with the option enabled on latest mainline.
> > > > 
> > > > We can also cause the issue before that patch by disabling the previously
> > > > relevant option on 4.12.
> > > 
> > > OK, thank you -- hard to argue with that!  ;-)  
> > 
> > We thought it was a pretty unlikely bisection result
> > hence hammered it thoroughly ;)  
> 
> Glad that I am not the only paranoid one out here.  ;-)
> 
> > > Except that I am still puzzled as to why per-CPU softlockup threads
> > > are needed for RCU's kthreads to get their wakeups.  We really should
> > > be able to disable softlockup and still have kthreads get wakeups and
> > > access to CPU, after all.
> > >   
> > > > > [ . . . ]
> > > > >     
> > > > > > > Reducing the RCU CPU stall timeout makes it happen more often,
> > > > > > > but we are seeing even with the default value of 24 seconds.
> > > > > > > 
> > > > > > > Tends to occur after a period of relatively low usage, but has
> > > > > > > also been seen mid way through performance tests.
> > > > > > > 
> > > > > > > This was not seen with v4.12, so a bisection run later led to
> > > > > > > commit 05a4a9527 (kernel/watchdog: split up config options).
> > > > > > > 
> > > > > > > Which was odd until we discovered that a side effect of this patch
> > > > > > > was to change whether the softlockup detector was enabled or not in
> > > > > > > the arm64 defconfig.
> > > > > > > 
> > > > > > > On 4.13-rc2 enabling the softlockup detector indeed stopped us
> > > > > > > seeing the rcu issue. Disabling the equivalent on 4.12 made the
> > > > > > > issue occur there as well.
> > > > > > > 
> > > > > > > Clearly the softlockup detector results in a thread on every cpu,
> > > > > > > which might be related but beyond that we are still looking into
> > > > > > > the issue.
> > > > > > > 
> > > > > > > So the obvious question is whether anyone else is seeing this as
> > > > > > > it might help us to focus in on where to look!      
> > > > > > 
> > > > > > Huh. Something similar has been seen very intermittently on powerpc
> > > > > > as well. We couldn't reproduce it reliably enough to bisect it yet, so
> > > > > > this is a good help.
> > > > > > 
> > > > > > http://marc.info/?l=linuxppc-embedded&m=149872815523646&w=2
> > > > > > 
> > > > > > It looks like the watchdog patch has a similar effect on powerpc in
> > > > > > that it stops enabling the softlockup detector by default. Haven't
> > > > > > confirmed, but it looks like the same thing.
> > > > > > 
> > > > > > A bug in RCU stall detection?      
> > > > > 
> > > > > Well, if I am expected to make grace periods complete when my grace-period
> > > > > kthreads aren't getting any CPU time, I will have to make some substantial
> > > > > changes.  ;-)
> > > > > 
> > > > > One possibility is that the timer isn't firing and another is that the
> > > > > timer's wakeup is being lost somehow.
> > > > > 
> > > > > So another thing to try is to boot with rcutree.rcu_kick_kthreads=1.
> > > > > This will cause RCU to do redundant wakeups on the grace-period kthread
> > > > > if the grace period is moving slowly.  This is of course a crude hack,
> > > > > which is why this boot parameter will also cause a splat if it ever has
> > > > > to do anything.    
> > > > 
> > > > Running that now - will let you know how it goes.  Not seen the issue yet
> > > > but might just be a 'lucky' run - will give it a few hours.    
> > > 
> > > Thank you very much!  
> > 
> > So far it's not actually shown any splats.  I did a quick drop back to running
> > without the parameter and got the original splat in less than 5 minutes.
> 
> That is a bit strange.  Sensitive to code position or some such???
> 
Could be.  An overnight run on two boards showed no splats either.

If we are looking at code position then we may find it is hidden again
in older kernels even if nothing 'relevant' changed.
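
For my own notes, the shape of what rcutree.rcu_kick_kthreads=1 does, going
by the description quoted above - again only a sketch, not the real tree-RCU
code, and gp_kthread / last_gp_activity below are stand-ins for the real
RCU state:

#include <linux/bug.h>
#include <linux/jiffies.h>
#include <linux/sched.h>

/*
 * Sketch of the behaviour described above: if the grace-period kthread
 * has shown no signs of life for a while, complain once and issue a
 * redundant wakeup.
 */
static void maybe_kick_gp_kthread(struct task_struct *gp_kthread,
                                  unsigned long last_gp_activity)
{
        if (time_after(jiffies, last_gp_activity + 2 * HZ)) {
                WARN_ONCE(1, "grace-period kthread looks starved, kicking it\n");
                wake_up_process(gp_kthread);    /* the redundant wakeup */
        }
}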

Jonathan

> > I've spun up another board with this parameter set as well and will leave
> > them both running overnight to see if anything interesting happens.
> > 
> > Thanks for your help with this,  
> 
> And thanks to you as well!!!
> 
> 							Thanx, Paul
> 
> > Jonathan
> >   
> > > 
> > > 							Thanx, Paul
> > >   
> > > > Jonathan    
> > > > > 
> > > > > Does this help at all?
> > > > > 
> > > > > 							Thanx, Paul
> > > > >     
> > > > > > > In the meantime we'll carry on digging.
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > 
> > > > > > > Jonathan
> > > > > > > 
> > > > > > > p.s. As a more general question, do we want to have the
> > > > > > > soft lockup detector enabled on arm64 by default?      
> > > > > > 
> > > > > > I've cc'ed Don. My patch should not have changed defconfigs, I
> > > > > > should have been more careful with that.
> > > > > > 
> > > > > > Thanks,
> > > > > > Nick
> > > > > >       
> > > > >     
> > > > 
> > > >     
> > >   
> > 
> >   
> 

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-26  8:16                     ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-26  8:16 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: David Miller, npiggin, linux-arm-kernel, linuxarm, akpm,
	abdhalee, linuxppc-dev, dzickus, sparclinux, sfr

On Tue, 25 Jul 2017 21:12:17 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Tue, Jul 25, 2017 at 09:02:33PM -0700, David Miller wrote:
> > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > Date: Tue, 25 Jul 2017 20:55:45 -0700
> >   
> > > On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote:  
> > >> Just to report, turning softlockup back on fixes things for me on
> > >> sparc64 too.  
> > > 
> > > Very good!
> > >   
> > >> The thing about softlockup is it runs an hrtimer, which seems to run
> > >> about every 4 seconds.  
> > > 
> > > I could see where that could shake things loose, but I am surprised that
> > > it would be needed.  I ran a short run with CONFIG_SOFTLOCKUP_DETECTOR=y
> > > with no trouble, but I will be running a longer test later on.
> > >   
> > >> So I wonder if this is a NO_HZ problem.  
> > > 
> > > Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  What are
> > > you running?  (Again, my symptoms are slightly different, so I might
> > > be seeing a different bug.)  
> > 
> > I run with NO_HZ_FULL=n and NO_HZ_IDLE=y, just like you.
> > 
> > To clarify, the symptoms show up with SOFTLOCKUP_DETECTOR disabled.  
> 
> Same here -- but my failure case happens fairly rarely, so it will take
> some time to gain reasonable confidence that enabling SOFTLOCKUP_DETECTOR
> had effect.
> 
> But you are right, might be interesting to try NO_HZ_PERIODIC=y
> or NO_HZ_FULL=y.  So many possible tests, and so little time.  ;-)
> 
> 							Thanx, Paul
> 
I'll be the headless chicken running around and trying as many tests
as I can fit in.  Typical time to see the failure for us is sub 10
minutes so we'll see how far we get.

Make me a list to run if you like ;)

NO_HZ_PERIODIC=y running now.

Jonathan

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-26  9:32                       ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-26  9:32 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: dzickus, sfr, linuxarm, npiggin, abdhalee, sparclinux, akpm,
	linuxppc-dev, David Miller, linux-arm-kernel

On Wed, 26 Jul 2017 09:16:23 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Tue, 25 Jul 2017 21:12:17 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Tue, Jul 25, 2017 at 09:02:33PM -0700, David Miller wrote:  
> > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > Date: Tue, 25 Jul 2017 20:55:45 -0700
> > >     
> > > > On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote:    
> > > >> Just to report, turning softlockup back on fixes things for me on
> > > >> sparc64 too.    
> > > > 
> > > > Very good!
> > > >     
> > > >> The thing about softlockup is it runs an hrtimer, which seems to run
> > > >> about every 4 seconds.    
> > > > 
> > > > I could see where that could shake things loose, but I am surprised that
> > > > it would be needed.  I ran a short run with CONFIG_SOFTLOCKUP_DETECTOR=y
> > > > with no trouble, but I will be running a longer test later on.
> > > >     
> > > >> So I wonder if this is a NO_HZ problem.    
> > > > 
> > > > Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  What are
> > > > you running?  (Again, my symptoms are slightly different, so I might
> > > > be seeing a different bug.)    
> > > 
> > > I run with NO_HZ_FULL=n and NO_HZ_IDLE=y, just like you.
> > > 
> > > To clarify, the symptoms show up with SOFTLOCKUP_DETECTOR disabled.    
> > 
> > Same here -- but my failure case happens fairly rarely, so it will take
> > some time to gain reasonable confidence that enabling SOFTLOCKUP_DETECTOR
> > had effect.
> > 
> > But you are right, might be interesting to try NO_HZ_PERIODIC=y
> > or NO_HZ_FULL=y.  So many possible tests, and so little time.  ;-)
> > 
> > 							Thanx, Paul
> >   
> I'll be the headless chicken running around and trying as many tests
> as I can fit in.  Typical time to see the failure for us is sub 10
> minutes so we'll see how far we get.
> 
> Make me a list to run if you like ;)
> 
> NO_HZ_PERIODIC=y running now.
By which I mean CONFIG_HZ_PERIODIC=y

Anyhow, it has run for 40 minutes without seeing a splat, but my sanity check
on the NO_HZ_FULL=n and NO_HZ_IDLE=y config this morning took 20 minutes, so
I won't have much confidence until we are a few hours in on this.

Anyhow, certainly looking like a promising direction for investigation!

Jonathan

> 
> Jonathan

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-26 12:28                         ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-26 12:28 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: dzickus, sfr, linuxarm, npiggin, abdhalee, sparclinux, akpm,
	linuxppc-dev, David Miller, linux-arm-kernel

On Wed, 26 Jul 2017 10:32:32 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Wed, 26 Jul 2017 09:16:23 +0100
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> 
> > On Tue, 25 Jul 2017 21:12:17 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > On Tue, Jul 25, 2017 at 09:02:33PM -0700, David Miller wrote:    
> > > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > > Date: Tue, 25 Jul 2017 20:55:45 -0700
> > > >       
> > > > > On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote:      
> > > > >> Just to report, turning softlockup back on fixes things for me on
> > > > >> sparc64 too.      
> > > > > 
> > > > > Very good!
> > > > >       
> > > > >> The thing about softlockup is it runs an hrtimer, which seems to run
> > > > >> about every 4 seconds.      
> > > > > 
> > > > > I could see where that could shake things loose, but I am surprised that
> > > > > it would be needed.  I ran a short run with CONFIG_SOFTLOCKUP_DETECTOR=y
> > > > > with no trouble, but I will be running a longer test later on.
> > > > >       
> > > > >> So I wonder if this is a NO_HZ problem.      
> > > > > 
> > > > > Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  What are
> > > > > you running?  (Again, my symptoms are slightly different, so I might
> > > > > be seeing a different bug.)      
> > > > 
> > > > I run with NO_HZ_FULL=n and NO_HZ_IDLE=y, just like you.
> > > > 
> > > > To clarify, the symptoms show up with SOFTLOCKUP_DETECTOR disabled.      
> > > 
> > > Same here -- but my failure case happens fairly rarely, so it will take
> > > some time to gain reasonable confidence that enabling SOFTLOCKUP_DETECTOR
> > > had effect.
> > > 
> > > But you are right, might be interesting to try NO_HZ_PERIODIC=y
> > > or NO_HZ_FULL=y.  So many possible tests, and so little time.  ;-)
> > > 
> > > 							Thanx, Paul
> > >     
> > I'll be the headless chicken running around and trying as many tests
> > as I can fit in.  Typical time to see the failure for us is sub 10
> > minutes so we'll see how far we get.
> > 
> > Make me a list to run if you like ;)
> > 
> > NO_HZ_PERIODIC=y running now.  
> By which I mean CONFIG_HZ_PERIODIC=y
> 
> Anyhow, it has run for 40 minutes without seeing a splat, but my sanity check
> on the NO_HZ_FULL=n and NO_HZ_IDLE=y config this morning took 20 minutes, so
> I won't have much confidence until we are a few hours in on this.
> 
> Anyhow, certainly looking like a promising direction for investigation!
> 
Well, it's done over 3 hours without a splat, so I think it is fine with
CONFIG_HZ_PERIODIC=y


> Jonathan
> 
> > 
> > Jonathan

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-26 12:49                           ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-26 12:49 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: dzickus, sfr, linuxarm, npiggin, abdhalee, sparclinux, akpm,
	linuxppc-dev, David Miller, linux-arm-kernel

On Wed, 26 Jul 2017 13:28:01 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Wed, 26 Jul 2017 10:32:32 +0100
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> 
> > On Wed, 26 Jul 2017 09:16:23 +0100
> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> >   
> > > On Tue, 25 Jul 2017 21:12:17 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >     
> > > > On Tue, Jul 25, 2017 at 09:02:33PM -0700, David Miller wrote:      
> > > > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > > > Date: Tue, 25 Jul 2017 20:55:45 -0700
> > > > >         
> > > > > > On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote:        
> > > > > >> Just to report, turning softlockup back on fixes things for me on
> > > > > >> sparc64 too.        
> > > > > > 
> > > > > > Very good!
> > > > > >         
> > > > > >> The thing about softlockup is it runs an hrtimer, which seems to run
> > > > > >> about every 4 seconds.        
> > > > > > 
> > > > > > I could see where that could shake things loose, but I am surprised that
> > > > > > it would be needed.  I ran a short run with CONFIG_SOFTLOCKUP_DETECTOR=y
> > > > > > with no trouble, but I will be running a longer test later on.
> > > > > >         
> > > > > >> So I wonder if this is a NO_HZ problem.        
> > > > > > 
> > > > > > Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  What are
> > > > > > you running?  (Again, my symptoms are slightly different, so I might
> > > > > > be seeing a different bug.)        
> > > > > 
> > > > > I run with NO_HZ_FULL=n and NO_HZ_IDLE=y, just like you.
> > > > > 
> > > > > To clarify, the symptoms show up with SOFTLOCKUP_DETECTOR disabled.        
> > > > 
> > > > Same here -- but my failure case happens fairly rarely, so it will take
> > > > some time to gain reasonable confidence that enabling SOFTLOCKUP_DETECTOR
> > > > had effect.
> > > > 
> > > > But you are right, might be interesting to try NO_HZ_PERIODIC=y
> > > > or NO_HZ_FULL=y.  So many possible tests, and so little time.  ;-)
> > > > 
> > > > 							Thanx, Paul
> > > >       
> > > I'll be the headless chicken running around and trying as many tests
> > > as I can fit in.  Typical time to see the failure for us is sub 10
> > > minutes so we'll see how far we get.
> > > 
> > > Make me a list to run if you like ;)
> > > 
> > > NO_HZ_PERIODIC=y running now.    
> > By which I mean CONFIG_HZ_PERIODIC=y
> > 
> > Anyhow, run for 40 minutes without seeing a splat but my sanity check
> > on the NO_HZ_FULL=n and NO_HZ_IDLE=y this morning took 20 minutes so
> > I won't have much confidence until we are a few hours in on this.
> > 
> > Anyhow, certainly looking like a promising direction for investigation!
> >   
> Well it's done over 3 hours without a splat so I think it is fine with
> CONFIG_HZ_PERIODIC=y
> 
As I think we expected, the problem occurs with NO_HZ_FULL.
It happened pretty quickly, but given the somewhat random nature, it
might just be coincidence.
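
For anyone trying to reproduce the NO_HZ_FULL case: in kernels of this
vintage it typically needs both the config option and a boot parameter
naming which CPUs actually run tickless (the boot CPU has to stay a
housekeeping CPU).  A sketch, with the CPU list obviously
machine-specific:

    CONFIG_NO_HZ_FULL=y
    # CONFIG_NO_HZ_IDLE is not set
    # CONFIG_HZ_PERIODIC is not set
    # plus, on the kernel command line, something like this on a
    # 64-CPU system:
    #   nohz_full=1-63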

Jonathan
> 
> > Jonathan
> >   
> > > 
> > > Jonathan
> > > 
> > 
> > 
> 
> 

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-26 14:14                           ` Paul E. McKenney
  0 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-26 14:14 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: dzickus, sfr, linuxarm, npiggin, abdhalee, sparclinux, akpm,
	linuxppc-dev, David Miller, linux-arm-kernel

On Wed, Jul 26, 2017 at 01:28:01PM +0100, Jonathan Cameron wrote:
> On Wed, 26 Jul 2017 10:32:32 +0100
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> 
> > On Wed, 26 Jul 2017 09:16:23 +0100
> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> > 
> > > On Tue, 25 Jul 2017 21:12:17 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >   
> > > > On Tue, Jul 25, 2017 at 09:02:33PM -0700, David Miller wrote:    
> > > > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > > > Date: Tue, 25 Jul 2017 20:55:45 -0700
> > > > >       
> > > > > > On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote:      
> > > > > >> Just to report, turning softlockup back on fixes things for me on
> > > > > >> sparc64 too.      
> > > > > > 
> > > > > > Very good!
> > > > > >       
> > > > > >> The thing about softlockup is it runs an hrtimer, which seems to run
> > > > > >> about every 4 seconds.      
> > > > > > 
> > > > > > I could see where that could shake things loose, but I am surprised that
> > > > > > it would be needed.  I ran a short run with CONFIG_SOFTLOCKUP_DETECTOR=y
> > > > > > with no trouble, but I will be running a longer test later on.
> > > > > >       
> > > > > >> So I wonder if this is a NO_HZ problem.      
> > > > > > 
> > > > > > Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  What are
> > > > > > you running?  (Again, my symptoms are slightly different, so I might
> > > > > > be seeing a different bug.)      
> > > > > 
> > > > > I run with NO_HZ_FULL=n and NO_HZ_IDLE=y, just like you.
> > > > > 
> > > > > To clarify, the symptoms show up with SOFTLOCKUP_DETECTOR disabled.      
> > > > 
> > > > Same here -- but my failure case happens fairly rarely, so it will take
> > > > some time to gain reasonable confidence that enabling SOFTLOCKUP_DETECTOR
> > > > had effect.
> > > > 
> > > > But you are right, might be interesting to try NO_HZ_PERIODIC=y
> > > > or NO_HZ_FULL=y.  So many possible tests, and so little time.  ;-)
> > > > 
> > > > 							Thanx, Paul
> > > >     
> > > I'll be the headless chicken running around and trying as many tests
> > > as I can fit in.  Typical time to see the failure for us is sub 10
> > > minutes so we'll see how far we get.
> > > 
> > > Make me a list to run if you like ;)
> > > 
> > > NO_HZ_PERIODIC=y running now.  
> > By which I mean CONFIG_HZ_PERIODIC=y

I did get that messed up, didn't I?  Sorry for my confusion!

> > Anyhow, run for 40 minutes without seeing a splat but my sanity check
> > on the NO_HZ_FULL=n and NO_HZ_IDLE=y this morning took 20 minutes so
> > I won't have much confidence until we are a few hours in on this.
> > 
> > Anyhow, certainly looking like a promising direction for investigation!
> > 
> Well it's done over 3 hours without a splat so I think it is fine with
> CONFIG_HZ_PERIODIC=y

Thank you!

If you run with SOFTLOCKUP_DETECTOR=n and NO_HZ_IDLE=y, but have a normal
user task waking up every few seconds on each CPU, does the problem occur?
(The question is whether any disturbance gets things going, or whether there
is something special about SOFTLOCKUP_DETECTOR=y and HZ_PERIODIC=y.)
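
A minimal sketch of that disturbance test: one ordinary user task per
CPU, each pinned to its CPU and waking every few seconds.  The shape of
the program and the 3-second interval are arbitrary choices here, not
something prescribed in this thread:

/* percpu_waker.c: fork one pinned waker per online CPU. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
	long ncpus = sysconf(_SC_NPROCESSORS_ONLN);
	long cpu;

	for (cpu = 0; cpu < ncpus; cpu++) {
		pid_t pid = fork();

		if (pid < 0) {
			perror("fork");
			exit(1);
		}
		if (pid == 0) {
			cpu_set_t set;

			/* Pin this child to its own CPU. */
			CPU_ZERO(&set);
			CPU_SET(cpu, &set);
			if (sched_setaffinity(0, sizeof(set), &set))
				perror("sched_setaffinity");
			for (;;)
				sleep(3);	/* wake every few seconds */
		}
	}
	pause();	/* parent idles; kill the process group to stop */
	return 0;
}

Run it in the background for the duration of the test; if the stalls
stay away with this running but return without it, that would point at
"any disturbance" rather than anything specific to the softlockup
hrtimer or the periodic tick.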

Dave, any other ideas on what might be causing this or what might be
tested?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-26 14:23                             ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-26 14:23 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: dzickus, sfr, linuxarm, npiggin, abdhalee, sparclinux, akpm,
	linuxppc-dev, David Miller, linux-arm-kernel

On Wed, 26 Jul 2017 07:14:17 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Wed, Jul 26, 2017 at 01:28:01PM +0100, Jonathan Cameron wrote:
> > On Wed, 26 Jul 2017 10:32:32 +0100
> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> >   
> > > On Wed, 26 Jul 2017 09:16:23 +0100
> > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> > >   
> > > > On Tue, 25 Jul 2017 21:12:17 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > >     
> > > > > On Tue, Jul 25, 2017 at 09:02:33PM -0700, David Miller wrote:      
> > > > > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > > > > Date: Tue, 25 Jul 2017 20:55:45 -0700
> > > > > >         
> > > > > > > On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote:        
> > > > > > >> Just to report, turning softlockup back on fixes things for me on
> > > > > > >> sparc64 too.        
> > > > > > > 
> > > > > > > Very good!
> > > > > > >         
> > > > > > >> The thing about softlockup is it runs an hrtimer, which seems to run
> > > > > > >> about every 4 seconds.        
> > > > > > > 
> > > > > > > I could see where that could shake things loose, but I am surprised that
> > > > > > > it would be needed.  I ran a short run with CONFIG_SOFTLOCKUP_DETECTOR=y
> > > > > > > with no trouble, but I will be running a longer test later on.
> > > > > > >         
> > > > > > >> So I wonder if this is a NO_HZ problem.        
> > > > > > > 
> > > > > > > Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  What are
> > > > > > > you running?  (Again, my symptoms are slightly different, so I might
> > > > > > > be seeing a different bug.)        
> > > > > > 
> > > > > > I run with NO_HZ_FULL=n and NO_HZ_IDLE=y, just like you.
> > > > > > 
> > > > > > To clarify, the symptoms show up with SOFTLOCKUP_DETECTOR disabled.        
> > > > > 
> > > > > Same here -- but my failure case happens fairly rarely, so it will take
> > > > > some time to gain reasonable confidence that enabling SOFTLOCKUP_DETECTOR
> > > > > had effect.
> > > > > 
> > > > > But you are right, might be interesting to try NO_HZ_PERIODIC=y
> > > > > or NO_HZ_FULL=y.  So many possible tests, and so little time.  ;-)
> > > > > 
> > > > > 							Thanx, Paul
> > > > >       
> > > > I'll be the headless chicken running around and trying as many tests
> > > > as I can fit in.  Typical time to see the failure for us is sub 10
> > > > minutes so we'll see how far we get.
> > > > 
> > > > Make me a list to run if you like ;)
> > > > 
> > > > NO_HZ_PERIODIC=y running now.    
> > > By which I mean CONFIG_HZ_PERIODIC=y  
> 
> I did get that messed up, didn't I?  Sorry for my confusion!
> 
> > > Anyhow, run for 40 minutes without seeing a splat but my sanity check
> > > on the NO_HZ_FULL=n and NO_HZ_IDLE=y this morning took 20 minutes so
> > > I won't have much confidence until we are a few hours in on this.
> > > 
> > > Anyhow, certainly looking like a promising direction for investigation!
> > >   
> > Well it's done over 3 hours without a splat so I think it is fine with
> > CONFIG_HZ_PERIODIC=y  
> 
> Thank you!
> 
> If you run with SOFTLOCKUP_DETECTOR=n and NO_HZ_IDLE=y, but have a normal
> user task waking up every few seconds on each CPU, does the problem occur?
> (The question is whether any disturbance gets things going, or whether there
> is something special about SOFTLOCKUP_DETECTOR=y and HZ_PERIODIC=y.)
> 
> Dave, any other ideas on what might be causing this or what might be
> tested?
> 
> 							Thanx, Paul
> 

Although it's still early days (40 mins in), it looks like the issue first
occurred between 4.10-rc7 and 4.11-rc1 (don't ask why those particular RCs).

It is bad, just as with the current kernel, on 4.11-rc1 and good on 4.10-rc7.

It could be that something different was hiding it in 4.10, though.  We have
a fair delta from mainline back then, unfortunately, so bisecting will be
'interesting'.

I'll see if I can get the test you suggest running.

Jonathan

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-26 15:33                               ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-26 15:33 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: dzickus, sfr, linuxarm, npiggin, abdhalee, sparclinux, akpm,
	linuxppc-dev, David Miller, linux-arm-kernel

On Wed, 26 Jul 2017 15:23:15 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Wed, 26 Jul 2017 07:14:17 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Wed, Jul 26, 2017 at 01:28:01PM +0100, Jonathan Cameron wrote:  
> > > On Wed, 26 Jul 2017 10:32:32 +0100
> > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> > >     
> > > > On Wed, 26 Jul 2017 09:16:23 +0100
> > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> > > >     
> > > > > On Tue, 25 Jul 2017 21:12:17 -0700
> > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > >       
> > > > > > On Tue, Jul 25, 2017 at 09:02:33PM -0700, David Miller wrote:        
> > > > > > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > > > > > Date: Tue, 25 Jul 2017 20:55:45 -0700
> > > > > > >           
> > > > > > > > On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote:          
> > > > > > > >> Just to report, turning softlockup back on fixes things for me on
> > > > > > > >> sparc64 too.          
> > > > > > > > 
> > > > > > > > Very good!
> > > > > > > >           
> > > > > > > >> The thing about softlockup is it runs an hrtimer, which seems to run
> > > > > > > >> about every 4 seconds.          
> > > > > > > > 
> > > > > > > > I could see where that could shake things loose, but I am surprised that
> > > > > > > > it would be needed.  I ran a short run with CONFIG_SOFTLOCKUP_DETECTOR=y
> > > > > > > > with no trouble, but I will be running a longer test later on.
> > > > > > > >           
> > > > > > > >> So I wonder if this is a NO_HZ problem.          
> > > > > > > > 
> > > > > > > > Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  What are
> > > > > > > > you running?  (Again, my symptoms are slightly different, so I might
> > > > > > > > be seeing a different bug.)          
> > > > > > > 
> > > > > > > I run with NO_HZ_FULL=n and NO_HZ_IDLE=y, just like you.
> > > > > > > 
> > > > > > > To clarify, the symptoms show up with SOFTLOCKUP_DETECTOR disabled.          
> > > > > > 
> > > > > > Same here -- but my failure case happens fairly rarely, so it will take
> > > > > > some time to gain reasonable confidence that enabling SOFTLOCKUP_DETECTOR
> > > > > > had effect.
> > > > > > 
> > > > > > But you are right, might be interesting to try NO_HZ_PERIODIC=y
> > > > > > or NO_HZ_FULL=y.  So many possible tests, and so little time.  ;-)
> > > > > > 
> > > > > > 							Thanx, Paul
> > > > > >         
> > > > > I'll be the headless chicken running around and trying as many tests
> > > > > as I can fit in.  Typical time to see the failure for us is sub 10
> > > > > minutes so we'll see how far we get.
> > > > > 
> > > > > Make me a list to run if you like ;)
> > > > > 
> > > > > NO_HZ_PERIODIC=y running now.      
> > > > By which I mean CONFIG_HZ_PERIODIC=y    
> > 
> > I did get that messed up, didn't I?  Sorry for my confusion!
> >   
> > > > Anyhow, run for 40 minutes without seeing a splat but my sanity check
> > > > on the NO_HZ_FULL=n and NO_HZ_IDLE=y this morning took 20 minutes so
> > > > I won't have much confidence until we are a few hours in on this.
> > > > 
> > > > Anyhow, certainly looking like a promising direction for investigation!
> > > >     
> > > Well it's done over 3 hours without a splat so I think it is fine with
> > > CONFIG_HZ_PERIODIC=y    
> > 
> > Thank you!
> > 
> > If you run with SOFTLOCKUP_DETECTOR=n and NO_HZ_IDLE=y, but have a normal
> > user task waking up every few seconds on each CPU, does the problem occur?
> > (The question is whether any disturbance gets things going, or whether there
> > is something special about SOFTLOCKUP_DETECTOR=y and HZ_PERIODIC=y.)
> > 
> > Dave, any other ideas on what might be causing this or what might be
> > tested?
> > 
> > 							Thanx, Paul
> >   
> 
> Although it's still early days (40 mins in) it looks like the issue first
> occurred between 4.10-rc7 and 4.11-rc1 (don't ask why those particular RCs)
> 
> Bad as with current kernel on 4.11-rc1 and good on 4.10-rc7.
I didn't leave it long enough: it is still bad on 4.10-rc7, it just took
over an hour to occur.
> 
> Could be something different was hiding it in 4.10 though.  We have a fair
> delta from mainline back then unfortunately so bisecting will be
> 'interesting'.
> 
> I'll see if I can get the test you suggest running.
> 
> Jonathan

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-26 14:14                           ` Paul E. McKenney
@ 2017-07-26 16:48                             ` David Miller
  -1 siblings, 0 replies; 241+ messages in thread
From: David Miller @ 2017-07-26 16:48 UTC (permalink / raw)
  To: linux-arm-kernel

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Date: Wed, 26 Jul 2017 07:14:17 -0700

> Dave, any other ideas on what might be causing this or what might be
> tested?

I was going to go through the changes that happened between v4.12
and now to kernel/time/tick-sched.c and see if reverting any of
those helps.

But I don't know when I would get to that.

Commit 411fe24e6b7c283c3a1911450cdba6dd3aaea56e looks particularly juicy.
:-)

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-26 15:49                                 ` Paul E. McKenney
@ 2017-07-26 16:54                                   ` David Miller
  -1 siblings, 0 replies; 241+ messages in thread
From: David Miller @ 2017-07-26 16:54 UTC (permalink / raw)
  To: linux-arm-kernel

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Date: Wed, 26 Jul 2017 08:49:00 -0700

> On Wed, Jul 26, 2017 at 04:33:40PM +0100, Jonathan Cameron wrote:
>> Didn't leave it long enough. Still bad on 4.10-rc7 just took over
>> an hour to occur.
> 
> And it is quite possible that SOFTLOCKUP_DETECTOR=y and HZ_PERIODIC=y
> are just greatly reducing the probability of the problem rather than
> completely preventing it.
> 
> Still, hopefully useful information, thank you for the testing!

I guess that invalidates my idea to test reverting recent changes to
the tick-sched.c code... :-/

In NO_HZ_IDLE mode, what is really supposed to happen on a completely
idle system?

All the cpus enter the idle loop, have no timers programmed, and they
all just go to sleep until an external event happens.

What ensures that grace periods get processed in this regime?

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-26 16:54                                   ` David Miller
@ 2017-07-26 17:13                                     ` Jonathan Cameron
  -1 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-26 17:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 26 Jul 2017 09:54:32 -0700
David Miller <davem@davemloft.net> wrote:

> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Date: Wed, 26 Jul 2017 08:49:00 -0700
> 
> > On Wed, Jul 26, 2017 at 04:33:40PM +0100, Jonathan Cameron wrote:  
> >> Didn't leave it long enough. Still bad on 4.10-rc7 just took over
> >> an hour to occur.  
> > 
> > And it is quite possible that SOFTLOCKUP_DETECTOR=y and HZ_PERIODIC=y
> > are just greatly reducing the probability of the problem rather than
> > completely preventing it.
> > 
> > Still, hopefully useful information, thank you for the testing!  

Not sure it actually gives us much information, but no issues yet
with a simple program running on every cpu that wakes up every 3 seconds.

Will leave it running overnight and report back in the morning.
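
For reference, a minimal sketch of the sort of per-CPU waker described
above (the 3-second period, the fork-per-CPU layout, and the name
keepalive.c are illustrative assumptions, not the exact program used):

/* keepalive.c: fork one child per online CPU, pin it there, wake every 3s. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
	long ncpus = sysconf(_SC_NPROCESSORS_ONLN);

	for (long cpu = 0; cpu < ncpus; cpu++) {
		if (fork() == 0) {		/* child: pin to one CPU */
			cpu_set_t set;

			CPU_ZERO(&set);
			CPU_SET(cpu, &set);
			if (sched_setaffinity(0, sizeof(set), &set) != 0)
				exit(1);
			for (;;)		/* periodic wakeup on that CPU */
				sleep(3);
		}
	}
	pause();				/* parent just waits */
	return 0;
}

Something along the lines of "gcc -O2 -o keepalive keepalive.c" and leaving
the binary running in the background is all it takes.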

> 
> I guess that invalidates my idea to test reverting recent changes to
> the tick-sched.c code... :-/
> 
> In NO_HZ_IDLE mode, what is really supposed to happen on a completely
> idle system?
> 
> All the cpus enter the idle loop, have no timers programmed, and they
> all just go to sleep until an external event happens.
> 
> What ensures that grace periods get processed in this regime?

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-26 16:54                                   ` David Miller
@ 2017-07-26 17:50                                     ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-26 17:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 26, 2017 at 09:54:32AM -0700, David Miller wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Date: Wed, 26 Jul 2017 08:49:00 -0700
> 
> > On Wed, Jul 26, 2017 at 04:33:40PM +0100, Jonathan Cameron wrote:
> >> Didn't leave it long enough. Still bad on 4.10-rc7 just took over
> >> an hour to occur.
> > 
> > And it is quite possible that SOFTLOCKUP_DETECTOR=y and HZ_PERIODIC=y
> > are just greatly reducing the probability of the problem rather than
> > completely preventing it.
> > 
> > Still, hopefully useful information, thank you for the testing!
> 
> I guess that invalidates my idea to test reverting recent changes to
> the tick-sched.c code... :-/
> 
> In NO_HZ_IDLE mode, what is really supposed to happen on a completely
> idle system?
> 
> All the cpus enter the idle loop, have no timers programmed, and they
> all just go to sleep until an external event happens.
> 
> What ensures that grace periods get processed in this regime?

There are several different situations with different mechanisms:

1.	No grace period is in progress and no RCU callbacks are pending
	anywhere in the system.  In this case, some other event would
	need to start a grace period, so RCU just stays idle until that
	happens, possibly indefinitely.  According to the battery-powered
	embedded guys, this is a feature, not a bug.  ;-)

2.	No grace period is in progress, but there is at least one RCU
	callback somewhere in the system.  In this case, the mechanism
	depends on CONFIG_RCU_FAST_NO_HZ:

	CONFIG_RCU_FAST_NO_HZ=n:  The CPU on which the callback is
		queued will return "true" in response to the call to
		rcu_needs_cpu() that is made shortly before that CPU
		enters idle.  This will cause the scheduling-clock
		interrupt to remain on, despite the CPU being idle,
		which will in turn allow RCU's state machine to continue
		running out of softirq, triggered by the scheduling-clock
		interrupts.

	CONFIG_RCU_FAST_NO_HZ=y:  The CPU on which the callback is queued
		will return "false" in response to the call to
		rcu_needs_cpu() that is made shortly before that CPU
		enters idle.  However, it will also request a next event
		about six seconds in the future if all callbacks do
		nothing but free memory (kfree_rcu()), or about four
		jiffies in the future if at least one callback does
		something more than just free memory.

		There is also a rcu_prepare_for_idle() function that
		is invoked later in the idle-entry process in this case
		which will wake up the grace-period kthread if need be.

3.	A grace period is in progress.  In this case the grace-period
	kthread is either currently running (in which case there will be
	at least one non-idle CPU) or is in a timed wait for its next
	scan for idle/offline CPUs (such CPUs need the grace-period
	kthread to report quiescent states on their behalf).  In this
	latter case, the timer subsystem will post a next event that
	will be the wakeup time for the grace-period kthread, or some
	earlier event.

	This is where we have been seeing trouble, if for no other
	reason than that RCU CPU stall warnings only happen when there
	is a grace period in progress.

That is the theory, anyway...
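
To make the case-2 arithmetic above concrete, here is a crude standalone
userspace model (this is not the kernel's rcu_needs_cpu()/tick-sched code;
the function and variable names and the HZ value are assumptions made
purely for illustration):

/* nohz_model.c: toy model of the case-2 idle-entry decision above. */
#include <stdbool.h>
#include <stdio.h>

#define HZ 250					/* assumed CONFIG_HZ */
static const bool rcu_fast_no_hz = true;	/* model CONFIG_RCU_FAST_NO_HZ=y */

/*
 * May the idling CPU stop its scheduling-clock tick?  If it may, *delta
 * is the requested next timer event, in jiffies from now (0 means no
 * deadline at all).  Case 3 (grace period in progress) is not modeled;
 * there the grace-period kthread's timed wait provides the next event.
 */
static bool tick_may_stop(bool cbs_pending, bool only_kfree_rcu,
			  unsigned long *delta)
{
	*delta = 0;
	if (!cbs_pending)		/* case 1: RCU is completely idle */
		return true;
	if (!rcu_fast_no_hz)		/* case 2, CONFIG_RCU_FAST_NO_HZ=n */
		return false;		/* keep the tick running */
	/* case 2, CONFIG_RCU_FAST_NO_HZ=y: stop the tick, bound the sleep */
	*delta = only_kfree_rcu ? 6 * HZ : 4;
	return true;
}

int main(void)
{
	unsigned long delta;
	bool stop;

	stop = tick_may_stop(false, false, &delta);
	printf("no callbacks:    stop=%d\n", stop);

	stop = tick_may_stop(true, true, &delta);
	printf("kfree_rcu only:  stop=%d, next event %lu jiffies out\n", stop, delta);

	stop = tick_may_stop(true, false, &delta);
	printf("other callbacks: stop=%d, next event %lu jiffies out\n", stop, delta);
	return 0;
}

Nothing deep there; it is only meant to make the six-seconds-versus-four-jiffies
distinction in case 2 easy to see.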

And when I enabled CONFIG_SOFTLOCKUP_DETECTOR, I still see failures.
I did 24 half-hour rcutorture runs on the TREE01 scenario, and two of them
saw RCU CPU stall warnings with starvation of the grace-period kthread.
I just now started another test but without CONFIG_SOFTLOCKUP_DETECTOR
to see if it makes a significant difference for my testing.  I do have
CONFIG_RCU_FAST_NO_HZ=y in my runs.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-26 17:50                                     ` Paul E. McKenney
@ 2017-07-26 22:36                                       ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-26 22:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 26, 2017 at 10:50:13AM -0700, Paul E. McKenney wrote:
> On Wed, Jul 26, 2017 at 09:54:32AM -0700, David Miller wrote:
> > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > Date: Wed, 26 Jul 2017 08:49:00 -0700
> > 
> > > On Wed, Jul 26, 2017 at 04:33:40PM +0100, Jonathan Cameron wrote:
> > >> Didn't leave it long enough. Still bad on 4.10-rc7 just took over
> > >> an hour to occur.
> > > 
> > > And it is quite possible that SOFTLOCKUP_DETECTOR=y and HZ_PERIODIC=y
> > > are just greatly reducing the probability of the problem rather than
> > > completely preventing it.
> > > 
> > > Still, hopefully useful information, thank you for the testing!
> > 
> > I guess that invalidates my idea to test reverting recent changes to
> > the tick-sched.c code... :-/
> > 
> > In NO_HZ_IDLE mode, what is really supposed to happen on a completely
> > idle system?
> > 
> > All the cpus enter the idle loop, have no timers programmed, and they
> > all just go to sleep until an external event happens.
> > 
> > What ensures that grace periods get processed in this regime?
> 
> There are several different situations with different mechanisms:
> 
> 1.	No grace period is in progress and no RCU callbacks are pending
> 	anywhere in the system.  In this case, some other event would
> 	need to start a grace period, so RCU just stays idle until that
> 	happens, possibly indefinitely.  According to the battery-powered
> 	embedded guys, this is a feature, not a bug.  ;-)
> 
> 2.	No grace period is in progress, but there is at least one RCU
> 	callback somewhere in the system.  In this case, the mechanism
> 	depends on CONFIG_RCU_FAST_NO_HZ:
> 
> 	CONFIG_RCU_FAST_NO_HZ=n:  The CPU on which the callback is
> 		queued will return "true" in response to the call to
> 		rcu_needs_cpu() that is made shortly before that CPU
> 		enters idle.  This will cause the scheduling-clock
> 		interrupt to remain on, despite the CPU being idle,
> 		which will in turn allow RCU's state machine to continue
> 		running out of softirq, triggered by the scheduling-clock
> 		interrupts.
> 
> 	CONFIG_RCU_FAST_NO_HZ=y:  The CPU on which the callback is queued
> 		will return "false" in response to the call to
> 		rcu_needs_cpu() that is made shortly before that CPU
> 		enters idle.  However, it will also request a next event
> 		about six seconds in the future if all callbacks do
> 		nothing but free memory (kfree_rcu()), or about four
> 		jiffies in the future if at least one callback does
> 		something more than just free memory.
> 
> 		There is also a rcu_prepare_for_idle() function that
> 		is invoked later in the idle-entry process in this case
> 		which will wake up the grace-period kthread if need be.
> 
> 3.	A grace period is in progress.  In this case the grace-period
> 	kthread is either currently running (in which case there will be
> 	at least one non-idle CPU) or is in a timed wait for its next
> 	scan for idle/offline CPUs (such CPUs need the grace-period
> 	kthread to report quiescent states on their behalf).  In this
> 	latter case, the timer subsystem will post a next event that
> 	will be the wakeup time for the grace-period kthread, or some
> 	earlier event.
> 
> 	This is where we have been seeing trouble, if for no other
> 	reason than that RCU CPU stall warnings only happen when there
> 	is a grace period in progress.
> 
> That is the theory, anyway...
> 
> And when I enabled CONFIG_SOFTLOCKUP_DETECTOR, I still see failures.
> I did 24 half-hour rcutorture runs on the TREE01 scenario, and two of them
> saw RCU CPU stall warnings with starvation of the grace-period kthread.
> I just now started another test but without CONFIG_SOFTLOCKUP_DETECTOR
> to see if it makes a significant difference for my testing.  I do have
> CONFIG_RCU_FAST_NO_HZ=y in my runs.

And without CONFIG_SOFTLOCKUP_DETECTOR, I see five runs of 24 with RCU
CPU stall warnings.  So it seems likely that CONFIG_SOFTLOCKUP_DETECTOR
really is having an effect.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-26 22:36                                       ` Paul E. McKenney
@ 2017-07-26 22:45                                         ` David Miller
  -1 siblings, 0 replies; 241+ messages in thread
From: David Miller @ 2017-07-26 22:45 UTC (permalink / raw)
  To: linux-arm-kernel

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Date: Wed, 26 Jul 2017 15:36:58 -0700

> And without CONFIG_SOFTLOCKUP_DETECTOR, I see five runs of 24 with RCU
> CPU stall warnings.  So it seems likely that CONFIG_SOFTLOCKUP_DETECTOR
> really is having an effect.

Thanks for all of the info Paul, I'll digest this and scan over the
code myself.

Just out of curiosity, what x86 idle method is your machine using?
The mwait one or the one which simply uses 'halt'?  The mwait variant
might mask this bug, and halt would be a lot closer to how sparc64 and
Jonathan's system operates.
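
One quick way to check that at run time, assuming the usual cpuidle sysfs
layout (roughly: intel_idle means mwait, acpi_idle can go either way, and
"none" usually means the default halt-based idle loop):

/* idlecheck.c: print which cpuidle driver, if any, is in charge. */
#include <stdio.h>
#include <string.h>

int main(void)
{
	char buf[64] = "unknown";
	FILE *f = fopen("/sys/devices/system/cpu/cpuidle/current_driver", "r");

	if (f) {
		if (fgets(buf, sizeof(buf), f))
			buf[strcspn(buf, "\n")] = '\0';	/* strip newline */
		fclose(f);
	}
	printf("cpuidle driver: %s\n", buf);	/* e.g. intel_idle, acpi_idle, none */
	return 0;
}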

On sparc64 the cpu yield we do in the idle loop sleeps the cpu.  Its
local TICK register keeps advancing, and the local timer therefore
will still trigger.  Also, any externally generated interrupts
(including cross calls) will wake up the cpu as well.

The tick-sched code is really tricky wrt. NO_HZ even in the NO_HZ_IDLE
case.  One of my running theories is that we miss scheduling a tick
due to a race.  That would be consistent with the behavior we see
in the RCU dumps, I think.

Anyways, just a theory, and that's why I keep mentioning that commit
about the revert of the revert (specifically
411fe24e6b7c283c3a1911450cdba6dd3aaea56e).

:-)

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-26 22:45                                         ` David Miller
@ 2017-07-26 23:15                                           ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-26 23:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 26, 2017 at 03:45:40PM -0700, David Miller wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Date: Wed, 26 Jul 2017 15:36:58 -0700
> 
> > And without CONFIG_SOFTLOCKUP_DETECTOR, I see five runs of 24 with RCU
> > CPU stall warnings.  So it seems likely that CONFIG_SOFTLOCKUP_DETECTOR
> > really is having an effect.
> 
> Thanks for all of the info Paul, I'll digest this and scan over the
> code myself.
> 
> Just out of curiosity, what x86 idle method is your machine using?
> The mwait one or the one which simply uses 'halt'?  The mwait variant
> might mask this bug, and halt would be a lot closer to how sparc64 and
> Jonathan's system operates.

My kernel builds with CONFIG_INTEL_IDLE=n, which I believe means that
I am not using the mwait one.  Here is a grep for IDLE in my .config:

	CONFIG_NO_HZ_IDLE=y
	CONFIG_GENERIC_SMP_IDLE_THREAD=y
	# CONFIG_IDLE_PAGE_TRACKING is not set
	CONFIG_ACPI_PROCESSOR_IDLE=y
	CONFIG_CPU_IDLE=y
	# CONFIG_CPU_IDLE_GOV_LADDER is not set
	CONFIG_CPU_IDLE_GOV_MENU=y
	# CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
	# CONFIG_INTEL_IDLE is not set

> On sparc64 the cpu yield we do in the idle loop sleeps the cpu.  Its
> local TICK register keeps advancing, and the local timer therefore
> will still trigger.  Also, any externally generated interrupts
> (including cross calls) will wake up the cpu as well.
> 
> The tick-sched code is really tricky wrt. NO_HZ even in the NO_HZ_IDLE
> case.  One of my running theories is that we miss scheduling a tick
> due to a race.  That would be consistent with the behavior we see
> in the RCU dumps, I think.

But wouldn't you have to miss a -lot- of ticks to get an RCU CPU stall
warning?  By default, your grace period needs to extend for more than
21 seconds (more than one-third of a -minute-) to get one.  Or do
you mean that the ticks get shut off now and forever, as opposed to
just losing one of them?

> Anyways, just a theory, and that's why I keep mentioning that commit
> about the revert of the revert (specifically
> 411fe24e6b7c283c3a1911450cdba6dd3aaea56e).
> 
> :-)

I am running an overnight test in preparation for attempting to push
some fixes for regressions into 4.12, but will try reverting this
and enabling CONFIG_HZ_PERIODIC tomorrow.

Jonathan, might the commit that Dave points out above be what reduces
the probability of occurrence as you test older releases?

							Thanx, Paul


^ permalink raw reply	[flat|nested] 241+ messages in thread

* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-26 23:15                                           ` Paul E. McKenney
  0 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-26 23:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 26, 2017 at 03:45:40PM -0700, David Miller wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Date: Wed, 26 Jul 2017 15:36:58 -0700
> 
> > And without CONFIG_SOFTLOCKUP_DETECTOR, I see five runs of 24 with RCU
> > CPU stall warnings.  So it seems likely that CONFIG_SOFTLOCKUP_DETECTOR
> > really is having an effect.
> 
> Thanks for all of the info Paul, I'll digest this and scan over the
> code myself.
> 
> Just out of curiousity, what x86 idle method is your machine using?
> The mwait one or the one which simply uses 'halt'?  The mwait variant
> might mask this bug, and halt would be a lot closer to how sparc64 and
> Jonathan's system operates.

My kernel builds with CONFIG_INTEL_IDLE=n, which I believe means that
I am not using the mwait one.  Here is a grep for IDLE in my .config:

	CONFIG_NO_HZ_IDLE=y
	CONFIG_GENERIC_SMP_IDLE_THREAD=y
	# CONFIG_IDLE_PAGE_TRACKING is not set
	CONFIG_ACPI_PROCESSOR_IDLE=y
	CONFIG_CPU_IDLE=y
	# CONFIG_CPU_IDLE_GOV_LADDER is not set
	CONFIG_CPU_IDLE_GOV_MENU=y
	# CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
	# CONFIG_INTEL_IDLE is not set

> On sparc64 the cpu yield we do in the idle loop sleeps the cpu.  It's
> local TICK register keeps advancing, and the local timer therefore
> will still trigger.  Also, any externally generated interrupts
> (including cross calls) will wake up the cpu as well.
> 
> The tick-sched code is really tricky wrt. NO_HZ even in the NO_HZ_IDLE
> case.  One of my running theories is that we miss scheduling a tick
> due to a race.  That would be consistent with the behavior we see
> in the RCU dumps, I think.

But wouldn't you have to miss a -lot- of ticks to get an RCU CPU stall
warning?  By default, your grace period needs to extend for more than
21 seconds (more than one-third of a -minute-) to get one.  Or do
you mean that the ticks get shut off now and forever, as opposed to
just losing one of them?

> Anyways, just a theory, and that's why I keep mentioning that commit
> about the revert of the revert (specifically
> 411fe24e6b7c283c3a1911450cdba6dd3aaea56e).
> 
> :-)

I am running an overnight test in preparation for attempting to push
some fixes for regressions into 4.12, but will try reverting this
and enabling CONFIG_HZ_PERIODIC tomorrow.

Jonathan, might the commit that Dave points out above be what reduces
the probability of occurrence as you test older releases?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-26 23:15                                           ` Paul E. McKenney
  (?)
@ 2017-07-26 23:22                                             ` David Miller
  -1 siblings, 0 replies; 241+ messages in thread
From: David Miller @ 2017-07-26 23:22 UTC (permalink / raw)
  To: linux-arm-kernel

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Date: Wed, 26 Jul 2017 16:15:05 -0700

> On Wed, Jul 26, 2017 at 03:45:40PM -0700, David Miller wrote:
>> Just out of curiousity, what x86 idle method is your machine using?
>> The mwait one or the one which simply uses 'halt'?  The mwait variant
>> might mask this bug, and halt would be a lot closer to how sparc64 and
>> Jonathan's system operates.
> 
> My kernel builds with CONFIG_INTEL_IDLE=n, which I believe means that
> I am not using the mwait one.  Here is a grep for IDLE in my .config:
> 
> 	CONFIG_NO_HZ_IDLE=y
> 	CONFIG_GENERIC_SMP_IDLE_THREAD=y
> 	# CONFIG_IDLE_PAGE_TRACKING is not set
> 	CONFIG_ACPI_PROCESSOR_IDLE=y
> 	CONFIG_CPU_IDLE=y
> 	# CONFIG_CPU_IDLE_GOV_LADDER is not set
> 	CONFIG_CPU_IDLE_GOV_MENU=y
> 	# CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
> 	# CONFIG_INTEL_IDLE is not set

No, that doesn't influence it.  It is determined by cpu features at
run time.

If you are using mwait, it'll say so in your kernel log like this:

	using mwait in idle threads
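
Two quick ways to check the same thing on a running box (only standard
tools assumed here; "monitor" is the /proc/cpuinfo flag name for
MONITOR/MWAIT support):

	dmesg | grep -i "using mwait"       # the boot-time message quoted above
	grep -c -w monitor /proc/cpuinfo    # non-zero if the CPUs advertise MONITOR/MWAIT at all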

>> On sparc64 the cpu yield we do in the idle loop sleeps the cpu.  It's
>> local TICK register keeps advancing, and the local timer therefore
>> will still trigger.  Also, any externally generated interrupts
>> (including cross calls) will wake up the cpu as well.
>> 
>> The tick-sched code is really tricky wrt. NO_HZ even in the NO_HZ_IDLE
>> case.  One of my running theories is that we miss scheduling a tick
>> due to a race.  That would be consistent with the behavior we see
>> in the RCU dumps, I think.
> 
> But wouldn't you have to miss a -lot- of ticks to get an RCU CPU stall
> warning?  By default, your grace period needs to extend for more than
> 21 seconds (more than one-third of a -minute-) to get one.  Or do
> you mean that the ticks get shut off now and forever, as opposed to
> just losing one of them?

Hmmm, good point.  And I was talking about simply missing one tick.

Indeed, that really wouldn't explain how we end up with a RCU stall
dump listing almost all of the cpus as having missed a grace period.

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-26 23:22                                             ` David Miller
  (?)
@ 2017-07-27  1:42                                               ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-27  1:42 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Date: Wed, 26 Jul 2017 16:15:05 -0700
> 
> > On Wed, Jul 26, 2017 at 03:45:40PM -0700, David Miller wrote:
> >> Just out of curiousity, what x86 idle method is your machine using?
> >> The mwait one or the one which simply uses 'halt'?  The mwait variant
> >> might mask this bug, and halt would be a lot closer to how sparc64 and
> >> Jonathan's system operates.
> > 
> > My kernel builds with CONFIG_INTEL_IDLE=n, which I believe means that
> > I am not using the mwait one.  Here is a grep for IDLE in my .config:
> > 
> > 	CONFIG_NO_HZ_IDLE=y
> > 	CONFIG_GENERIC_SMP_IDLE_THREAD=y
> > 	# CONFIG_IDLE_PAGE_TRACKING is not set
> > 	CONFIG_ACPI_PROCESSOR_IDLE=y
> > 	CONFIG_CPU_IDLE=y
> > 	# CONFIG_CPU_IDLE_GOV_LADDER is not set
> > 	CONFIG_CPU_IDLE_GOV_MENU=y
> > 	# CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
> > 	# CONFIG_INTEL_IDLE is not set
> 
> No, that doesn't influence it.  It is determined by cpu features at
> run time.
> 
> If you are using mwait, it'll say so in your kernel log like this:
> 
> 	using mwait in idle threads

Thank you for the hint!

And vim says:

"E486: Pattern not found: using mwait in idle threads"

> >> On sparc64 the cpu yield we do in the idle loop sleeps the cpu.  It's
> >> local TICK register keeps advancing, and the local timer therefore
> >> will still trigger.  Also, any externally generated interrupts
> >> (including cross calls) will wake up the cpu as well.
> >> 
> >> The tick-sched code is really tricky wrt. NO_HZ even in the NO_HZ_IDLE
> >> case.  One of my running theories is that we miss scheduling a tick
> >> due to a race.  That would be consistent with the behavior we see
> >> in the RCU dumps, I think.
> > 
> > But wouldn't you have to miss a -lot- of ticks to get an RCU CPU stall
> > warning?  By default, your grace period needs to extend for more than
> > 21 seconds (more than one-third of a -minute-) to get one.  Or do
> > you mean that the ticks get shut off now and forever, as opposed to
> > just losing one of them?
> 
> Hmmm, good point.  And I was talking about simply missing one tick.
> 
> Indeed, that really wouldn't explain how we end up with a RCU stall
> dump listing almost all of the cpus as having missed a grace period.

I have seen stranger things, but admittedly not often.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-27  1:42                                               ` Paul E. McKenney
  (?)
@ 2017-07-27  4:34                                                 ` Nicholas Piggin
  -1 siblings, 0 replies; 241+ messages in thread
From: Nicholas Piggin @ 2017-07-27  4:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 26 Jul 2017 18:42:14 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:

> > Indeed, that really wouldn't explain how we end up with a RCU stall
> > dump listing almost all of the cpus as having missed a grace period.  
> 
> I have seen stranger things, but admittedly not often.

So the backtraces show the RCU gp thread in schedule_timeout.

Are you sure that its timeout has expired and it's not being scheduled,
or could it be a bad (large) timeout (looks unlikely) or that it's being
scheduled but not correctly noting gps on other CPUs?

It's not in R state, so if it's not being scheduled at all, then it's
because the timer has not fired:

[ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
[ 1984.638153] rcu_preempt     S    0     9      2 0x00000000
[ 1984.643626] Call trace:
[ 1984.646059] [<ffff000008084fb0>] __switch_to+0x90/0xa8
[ 1984.651189] [<ffff000008962274>] __schedule+0x19c/0x5d8
[ 1984.656400] [<ffff0000089626e8>] schedule+0x38/0xa0
[ 1984.661266] [<ffff000008965844>] schedule_timeout+0x124/0x218
[ 1984.667002] [<ffff000008121424>] rcu_gp_kthread+0x4fc/0x748
[ 1984.672564] [<ffff0000080df0b4>] kthread+0xfc/0x128
[ 1984.677429] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
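
(Back-of-the-envelope on those numbers, assuming CONFIG_HZ=250 for this
arm64 build, which is an assumption: 5663 jiffies is roughly 22.7 s, so
the kthread appears not to have run for essentially the whole stall
period, which fits the "timer has not fired" theory rather than a merely
late wakeup.)

	# rough conversion; the HZ value is assumed, check CONFIG_HZ in the .config
	awk 'BEGIN { printf "%.1f s\n", 5663 / 250 }'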

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-26 17:13                                     ` Jonathan Cameron
  (?)
@ 2017-07-27  7:41                                       ` Jonathan Cameron
  -1 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-27  7:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 26 Jul 2017 18:13:12 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Wed, 26 Jul 2017 09:54:32 -0700
> David Miller <davem@davemloft.net> wrote:
> 
> > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > Date: Wed, 26 Jul 2017 08:49:00 -0700
> >   
> > > On Wed, Jul 26, 2017 at 04:33:40PM +0100, Jonathan Cameron wrote:    
> > >> Didn't leave it long enough. Still bad on 4.10-rc7 just took over
> > >> an hour to occur.    
> > > 
> > > And it is quite possible that SOFTLOCKUP_DETECTOR=y and HZ_PERIODIC=y
> > > are just greatly reducing the probability of the problem rather than
> > > completely preventing it.
> > > 
> > > Still, hopefully useful information, thank you for the testing!    
> 
> Not sure it actually gives us much information, but no issues yet
> with a simple program running every cpu that wakes up every 3 seconds.
> 
> Will leave it running overnight and report back in the morning.
Perhaps unsurprisingly, the above test didn't show any splats.

So it appears a userspace wakeup is enough to stop the issue happening
(or at least make it a lot less likely).
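
For reference, a minimal sketch of that kind of per-CPU wakeup load
(not necessarily the exact program used; the CPU range and the 3 s
interval are placeholders for a 64-core d05):

	# pin one sleeping loop to each CPU so every core sees a wakeup every few seconds
	for cpu in $(seq 0 63); do
		taskset -c "$cpu" sh -c 'while :; do sleep 3; done' &
	done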

Jonathan
> 
> > 
> > I guess that invalidates my idea to test reverting recent changes to
> > the tick-sched.c code... :-/
> > 
> > In NO_HZ_IDLE mode, what is really supposed to happen on a completely
> > idle system?
> > 
> > All the cpus enter the idle loop, have no timers programmed, and they
> > all just go to sleep until an external event happens.
> > 
> > What ensures that grace periods get processed in this regime?  
> _______________________________________________
> linuxarm mailing list
> linuxarm@huawei.com
> http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-27  4:34                                                 ` Nicholas Piggin
  (?)
@ 2017-07-27 12:49                                                   ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-27 12:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:
> On Wed, 26 Jul 2017 18:42:14 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:
> 
> > > Indeed, that really wouldn't explain how we end up with a RCU stall
> > > dump listing almost all of the cpus as having missed a grace period.  
> > 
> > I have seen stranger things, but admittedly not often.
> 
> So the backtraces show the RCU gp thread in schedule_timeout.
> 
> Are you sure that its timeout has expired and it's not being scheduled,
> or could it be a bad (large) timeout (looks unlikely) or that it's being
> scheduled but not correctly noting gps on other CPUs?
> 
> It's not in R state, so if it's not being scheduled at all, then it's
> because the timer has not fired:

Good point, Nick!

Jonathan, could you please reproduce collecting timer event tracing?

							Thanx, Paul

> [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> [ 1984.638153] rcu_preempt     S    0     9      2 0x00000000
> [ 1984.643626] Call trace:
> [ 1984.646059] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> [ 1984.651189] [<ffff000008962274>] __schedule+0x19c/0x5d8
> [ 1984.656400] [<ffff0000089626e8>] schedule+0x38/0xa0
> [ 1984.661266] [<ffff000008965844>] schedule_timeout+0x124/0x218
> [ 1984.667002] [<ffff000008121424>] rcu_gp_kthread+0x4fc/0x748
> [ 1984.672564] [<ffff0000080df0b4>] kthread+0xfc/0x128
> [ 1984.677429] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
> 


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-27 12:49                                                   ` Paul E. McKenney
  (?)
@ 2017-07-27 13:49                                                     ` Jonathan Cameron
  -1 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-27 13:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 27 Jul 2017 05:49:13 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:
> > On Wed, 26 Jul 2017 18:42:14 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:  
> >   
> > > > Indeed, that really wouldn't explain how we end up with a RCU stall
> > > > dump listing almost all of the cpus as having missed a grace period.    
> > > 
> > > I have seen stranger things, but admittedly not often.  
> > 
> > So the backtraces show the RCU gp thread in schedule_timeout.
> > 
> > Are you sure that its timeout has expired and it's not being scheduled,
> > or could it be a bad (large) timeout (looks unlikely) or that it's being
> > scheduled but not correctly noting gps on other CPUs?
> > 
> > It's not in R state, so if it's not being scheduled at all, then it's
> > because the timer has not fired:  
> 
> Good point, Nick!
> 
> Jonathan, could you please reproduce collecting timer event tracing?
I'm a little new to tracing (I only started playing with it last week),
so fingers crossed I've set it up right.  No splats yet.  I was getting
splats on reading out the trace when running with the RCU stall timeout
set to 4, so I have increased that back to the default and am rerunning.

This may take a while.  To save time, correct me if I've gotten this wrong:

echo "timer:*" > /sys/kernel/debug/tracing/set_event

When it dumps, I just send you the relevant part of what is in
/sys/kernel/debug/tracing/trace?
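
For completeness, a slightly fuller capture sequence along the same
lines (assuming debugfs is mounted at /sys/kernel/debug; the buffer
size is an arbitrary example) would be:

	cd /sys/kernel/debug/tracing
	echo 0 > tracing_on
	echo 8192 > buffer_size_kb     # per-CPU buffer in KB, just an example value
	echo "timer:*" > set_event
	echo 1 > tracing_on
	# ... reproduce the stall, then pull the buffer:
	cat trace > /tmp/timer-trace.txt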

Thanks,

Jonathan
> 
> 							Thanx, Paul
> 
> > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > [ 1984.638153] rcu_preempt     S    0     9      2 0x00000000
> > [ 1984.643626] Call trace:
> > [ 1984.646059] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> > [ 1984.651189] [<ffff000008962274>] __schedule+0x19c/0x5d8
> > [ 1984.656400] [<ffff0000089626e8>] schedule+0x38/0xa0
> > [ 1984.661266] [<ffff000008965844>] schedule_timeout+0x124/0x218
> > [ 1984.667002] [<ffff000008121424>] rcu_gp_kthread+0x4fc/0x748
> > [ 1984.672564] [<ffff0000080df0b4>] kthread+0xfc/0x128
> > [ 1984.677429] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
> >   
> 


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-27 13:49                                                     ` Jonathan Cameron
  (?)
@ 2017-07-27 16:39                                                       ` Jonathan Cameron
  -1 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-27 16:39 UTC (permalink / raw)
  To: linux-arm-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="windows-1254", Size: 68745 bytes --]

On Thu, 27 Jul 2017 14:49:03 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Thu, 27 Jul 2017 05:49:13 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:  
> > > On Wed, 26 Jul 2017 18:42:14 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >     
> > > > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:    
> > >     
> > > > > Indeed, that really wouldn't explain how we end up with a RCU stall
> > > > > dump listing almost all of the cpus as having missed a grace period.      
> > > > 
> > > > I have seen stranger things, but admittedly not often.    
> > > 
> > > So the backtraces show the RCU gp thread in schedule_timeout.
> > > 
> > > Are you sure that its timeout has expired and it's not being scheduled,
> > > or could it be a bad (large) timeout (looks unlikely) or that it's being
> > > scheduled but not correctly noting gps on other CPUs?
> > > 
> > > It's not in R state, so if it's not being scheduled at all, then it's
> > > because the timer has not fired:    
> > 
> > Good point, Nick!
> > 
> > Jonathan, could you please reproduce collecting timer event tracing?  
> I'm a little new to tracing (only started playing with it last week)
> so fingers crossed I've set it up right.  No splats yet.  Was getting
> splats on reading out the trace when running with the RCU stall timer
> set to 4 so have increased that back to the default and am rerunning.
> 
> This may take a while.  Correct me if I've gotten this wrong to save time
> 
> echo "timer:*" > /sys/kernel/debug/tracing/set_event
> 
> when it dumps, just send you the relevant part of what is in
> /sys/kernel/debug/tracing/trace?

Interestingly, the only thing that can make it trip for me with tracing on
is peeking in the tracing buffers.  Not sure whether this is a valid case or
not.

Anyhow, all timer activity seems to stop around the area of interest.
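
(If pulling the whole buffer at once is awkward, the same ring buffer
is also exposed per CPU, which makes it easy to grab just the stalled
CPUs; the CPU number below is only an example:)

	cat /sys/kernel/debug/tracing/per_cpu/cpu1/trace > /tmp/cpu1-timer-trace.txt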


[ 9442.413624] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 9442.419107] 	1-...: (1 GPs behind) idle„4/0/0 softirq'747/27755 fqs=0 last_accelerate: dd6a/de80, nonlazy_posted: 0, L.
[ 9442.430224] 	3-...: (2 GPs behind) idle8/0/0 softirq2197/32198 fqs=0 last_accelerate: 29b1/de80, nonlazy_posted: 0, L.
[ 9442.441340] 	4-...: (7 GPs behind) idlet0/0/0 softirq"351/22352 fqs=0 last_accelerate: ca88/de80, nonlazy_posted: 0, L.
[ 9442.452456] 	5-...: (2 GPs behind) idle›0/0/0 softirq!315/21319 fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
[ 9442.463572] 	6-...: (2 GPs behind) idley4/0/0 softirq\x19699/19707 fqs=0 last_accelerate: ba62/de88, nonlazy_posted: 0, L.
[ 9442.474688] 	7-...: (2 GPs behind) idle¬4/0/0 softirq"547/22554 fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
[ 9442.485803] 	8-...: (9 GPs behind) idle\x118/0/0 softirq(1/291 fqs=0 last_accelerate: c3fe/de88, nonlazy_posted: 0, L.
[ 9442.496571] 	9-...: (9 GPs behind) idlec/0/0 softirq(4/292 fqs=0 last_accelerate: 6030/de88, nonlazy_posted: 0, L.
[ 9442.507339] 	10-...: (14 GPs behind) idle÷8/0/0 softirq%4/254 fqs=0 last_accelerate: 5487/de88, nonlazy_posted: 0, L.
[ 9442.518281] 	11-...: (9 GPs behind) idleÉc/0/0 softirq01/308 fqs=0 last_accelerate: 3d3e/de99, nonlazy_posted: 0, L.
[ 9442.529136] 	12-...: (9 GPs behind) idleJ4/0/0 softirqs5/737 fqs=0 last_accelerate: 6010/de99, nonlazy_posted: 0, L.
[ 9442.539992] 	13-...: (9 GPs behind) idle4c/0/0 softirq\x1121/1131 fqs=0 last_accelerate: b280/de99, nonlazy_posted: 0, L.
[ 9442.551020] 	14-...: (9 GPs behind) idle/4/0/0 softirqp7/713 fqs=0 last_accelerate: 6030/de99, nonlazy_posted: 0, L.
[ 9442.561875] 	15-...: (2 GPs behind) idle³0/0/0 softirq‚1/976 fqs=0 last_accelerate: c208/de99, nonlazy_posted: 0, L.
[ 9442.572730] 	17-...: (2 GPs behind) idleZ8/0/0 softirq\x1456/1565 fqs=0 last_accelerate: ca88/de99, nonlazy_posted: 0, L.
[ 9442.583759] 	18-...: (2 GPs behind) idle.4/0/0 softirq\x1923/1936 fqs=0 last_accelerate: ca88/dea7, nonlazy_posted: 0, L.
[ 9442.594787] 	19-...: (2 GPs behind) idle\x138/0/0 softirq\x1421/1432 fqs=0 last_accelerate: b280/dea7, nonlazy_posted: 0, L.
[ 9442.605816] 	20-...: (50 GPs behind) idlec4/0/0 softirq!7/219 fqs=0 last_accelerate: c96f/dea7, nonlazy_posted: 0, L.
[ 9442.616758] 	21-...: (2 GPs behind) idleë8/0/0 softirq\x1368/1369 fqs=0 last_accelerate: b599/deb2, nonlazy_posted: 0, L.
[ 9442.627786] 	22-...: (1 GPs behind) idleª8/0/0 softirq"9/232 fqs=0 last_accelerate: c604/deb2, nonlazy_posted: 0, L.
[ 9442.638641] 	23-...: (1 GPs behind) idleH8/0/0 softirq$7/248 fqs=0 last_accelerate: c600/deb2, nonlazy_posted: 0, L.
[ 9442.649496] 	24-...: (33 GPs behind) idle÷c/0/0 softirq19/319 fqs=0 last_accelerate: 5290/deb2, nonlazy_posted: 0, L.
[ 9442.660437] 	25-...: (33 GPs behind) idle”4/0/0 softirq08/308 fqs=0 last_accelerate: 52c0/deb2, nonlazy_posted: 0, L.
[ 9442.671379] 	26-...: (9 GPs behind) idlem4/0/0 softirq&5/275 fqs=0 last_accelerate: 6034/dec0, nonlazy_posted: 0, L.
[ 9442.682234] 	27-...: (115 GPs behind) idleãc/0/0 softirq!2/226 fqs=0 last_accelerate: 5420/dec0, nonlazy_posted: 0, L.
[ 9442.693263] 	28-...: (9 GPs behind) idleê4/0/0 softirqT0/552 fqs=0 last_accelerate: 603c/dec0, nonlazy_posted: 0, L.
[ 9442.704118] 	29-...: (115 GPs behind) idleƒc/0/0 softirq42/380 fqs=0 last_accelerate: 5420/dec0, nonlazy_posted: 0, L.
[ 9442.715147] 	30-...: (33 GPs behind) idleãc/0/0 softirqP9/509 fqs=0 last_accelerate: 52bc/dec0, nonlazy_posted: 0, L.
[ 9442.726088] 	31-...: (9 GPs behind) idleß4/0/0 softirqa9/641 fqs=0 last_accelerate: 603c/decb, nonlazy_posted: 0, L.
[ 9442.736944] 	32-...: (9 GPs behind) idleª4/0/0 softirq\x1841/1848 fqs=0 last_accelerate: 6030/decb, nonlazy_posted: 0, L.
[ 9442.747972] 	34-...: (9 GPs behind) idleæc/0/0 softirqP82/5086 fqs=0 last_accelerate: 6039/decb, nonlazy_posted: 0, L.
[ 9442.759001] 	35-...: (9 GPs behind) idle\x7fc/0/0 softirq\x1396/1406 fqs=0 last_accelerate: 603e/decb, nonlazy_posted: 0, L.
[ 9442.770030] 	36-...: (0 ticks this GP) idleò8/0/0 softirq%5/255 fqs=0 last_accelerate: c9fc/decb, nonlazy_posted: 0, L.
[ 9442.781145] 	37-...: (50 GPs behind) idleSc/0/0 softirq"7/230 fqs=0 last_accelerate: 45c0/decb, nonlazy_posted: 0, L.
[ 9442.792087] 	38-...: (9 GPs behind) idle•8/0/0 softirq\x185/192 fqs=0 last_accelerate: 6030/decb, nonlazy_posted: 0, L.
[ 9442.802942] 	40-...: (389 GPs behind) idleAc/0/0 softirq\x131/136 fqs=0 last_accelerate: 5800/decb, nonlazy_posted: 0, L.
[ 9442.813971] 	41-...: (389 GPs behind) idle%8/0/0 softirq\x133/138 fqs=0 last_accelerate: c00f/decb, nonlazy_posted: 0, L.
[ 9442.825000] 	43-...: (50 GPs behind) idle%4/0/0 softirq\x113/117 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
[ 9442.835942] 	44-...: (115 GPs behind) idle\x178/0/0 softirq\x1271/1276 fqs=0 last_accelerate: 68e9/dee5, nonlazy_posted: 0, L.
[ 9442.847144] 	45-...: (2 GPs behind) idle\x04a/1/0 softirq64/389 fqs=0 last_accelerate: dee5/dee5, nonlazy_posted: 0, L.
[ 9442.857999] 	46-...: (9 GPs behind) idleì4/0/0 softirq\x183/189 fqs=0 last_accelerate: 6030/dee5, nonlazy_posted: 0, L.
[ 9442.868854] 	47-...: (115 GPs behind) idle\b8/0/0 softirq\x135/149 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
[ 9442.879883] 	48-...: (389 GPs behind) idle 0/0/0 softirq\x103/110 fqs=0 last_accelerate: 58b0/dee5, nonlazy_posted: 0, L.
[ 9442.890911] 	49-...: (9 GPs behind) idle¢4/0/0 softirq 5/211 fqs=0 last_accelerate: 6030/dee5, nonlazy_posted: 0, L.
[ 9442.901766] 	50-...: (25 GPs behind) idle§4/0/0 softirq\x144/144 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
[ 9442.912708] 	51-...: (50 GPs behind) idleö8/0/0 softirq\x116/122 fqs=0 last_accelerate: 57bc/dee5, nonlazy_posted: 0, L.
[ 9442.923650] 	52-...: (9 GPs behind) idleà8/0/0 softirq 2/486 fqs=0 last_accelerate: c87f/defe, nonlazy_posted: 0, L.
[ 9442.934505] 	53-...: (2 GPs behind) idle\x128/0/0 softirq65/366 fqs=0 last_accelerate: ca88/defe, nonlazy_posted: 0, L.
[ 9442.945360] 	54-...: (9 GPs behind) idleÎ8/0/0 softirq\x126/373 fqs=0 last_accelerate: bef8/defe, nonlazy_posted: 0, L.
[ 9442.956215] 	56-...: (9 GPs behind) idle30/0/0 softirq!16/2126 fqs=0 last_accelerate: 6030/defe, nonlazy_posted: 0, L.
[ 9442.967243] 	57-...: (1 GPs behind) idle(8/0/0 softirq\x1707/1714 fqs=0 last_accelerate: c87c/defe, nonlazy_posted: 0, L.
[ 9442.978272] 	58-...: (37 GPs behind) idle90/0/0 softirq\x1716/1721 fqs=0 last_accelerate: 53f7/defe, nonlazy_posted: 0, L.
[ 9442.989387] 	59-...: (37 GPs behind) idleå4/0/0 softirq\x1700/1701 fqs=0 last_accelerate: 40a1/defe, nonlazy_posted: 0, L.
[ 9443.000502] 	60-...: (116 GPs behind) idle{4/0/0 softirq’/96 fqs=0 last_accelerate: 57d8/df10, nonlazy_posted: 0, L.
[ 9443.011357] 	61-...: (9 GPs behind) idle8/0/0 softirq\x161/170 fqs=0 last_accelerate: 6030/df10, nonlazy_posted: 0, L.
[ 9443.022212] 	62-...: (115 GPs behind) idleª8/0/0 softirq•/101 fqs=0 last_accelerate: 5420/df17, nonlazy_posted: 0, L.
[ 9443.033154] 	63-...: (50 GPs behind) idle•8/0/0 softirq/84 fqs=0 last_accelerate: 57b8/df17, nonlazy_posted: 0, L.
[ 9443.043920] 	(detected by 39, tT03 jiffies, gD3, cD2, q=1)
[ 9443.049919] Task dump for CPU 1:
[ 9443.053134] swapper/1       R  running task        0     0      1 0x00000000
[ 9443.060173] Call trace:
[ 9443.062619] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.067744] [<          (null)>]           (null)
[ 9443.072434] Task dump for CPU 3:
[ 9443.075650] swapper/3       R  running task        0     0      1 0x00000000
[ 9443.082686] Call trace:
[ 9443.085121] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.090246] [<          (null)>]           (null)
[ 9443.094936] Task dump for CPU 4:
[ 9443.098152] swapper/4       R  running task        0     0      1 0x00000000
[ 9443.105188] Call trace:
[ 9443.107623] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.112752] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.118224] Task dump for CPU 5:
[ 9443.121440] swapper/5       R  running task        0     0      1 0x00000000
[ 9443.128476] Call trace:
[ 9443.130910] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.136035] [<          (null)>]           (null)
[ 9443.140725] Task dump for CPU 6:
[ 9443.143941] swapper/6       R  running task        0     0      1 0x00000000
[ 9443.150976] Call trace:
[ 9443.153411] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.158535] [<          (null)>]           (null)
[ 9443.163226] Task dump for CPU 7:
[ 9443.166442] swapper/7       R  running task        0     0      1 0x00000000
[ 9443.173478] Call trace:
[ 9443.175912] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.181037] [<          (null)>]           (null)
[ 9443.185727] Task dump for CPU 8:
[ 9443.188943] swapper/8       R  running task        0     0      1 0x00000000
[ 9443.195979] Call trace:
[ 9443.198412] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.203537] [<          (null)>]           (null)
[ 9443.208227] Task dump for CPU 9:
[ 9443.211443] swapper/9       R  running task        0     0      1 0x00000000
[ 9443.218479] Call trace:
[ 9443.220913] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.226039] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.231510] Task dump for CPU 10:
[ 9443.234812] swapper/10      R  running task        0     0      1 0x00000000
[ 9443.241848] Call trace:
[ 9443.244283] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.249408] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.254879] Task dump for CPU 11:
[ 9443.258182] swapper/11      R  running task        0     0      1 0x00000000
[ 9443.265218] Call trace:
[ 9443.267652] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.272776] [<          (null)>]           (null)
[ 9443.277467] Task dump for CPU 12:
[ 9443.280769] swapper/12      R  running task        0     0      1 0x00000000
[ 9443.287806] Call trace:
[ 9443.290240] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.295364] [<          (null)>]           (null)
[ 9443.300054] Task dump for CPU 13:
[ 9443.303357] swapper/13      R  running task        0     0      1 0x00000000
[ 9443.310394] Call trace:
[ 9443.312828] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.317953] [<          (null)>]           (null)
[ 9443.322643] Task dump for CPU 14:
[ 9443.325945] swapper/14      R  running task        0     0      1 0x00000000
[ 9443.332981] Call trace:
[ 9443.335416] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.340540] [<          (null)>]           (null)
[ 9443.345230] Task dump for CPU 15:
[ 9443.348533] swapper/15      R  running task        0     0      1 0x00000000
[ 9443.355568] Call trace:
[ 9443.358002] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.363128] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.368599] Task dump for CPU 17:
[ 9443.371901] swapper/17      R  running task        0     0      1 0x00000000
[ 9443.378937] Call trace:
[ 9443.381372] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.386497] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.391968] Task dump for CPU 18:
[ 9443.395270] swapper/18      R  running task        0     0      1 0x00000000
[ 9443.402306] Call trace:
[ 9443.404740] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.409865] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.415336] Task dump for CPU 19:
[ 9443.418639] swapper/19      R  running task        0     0      1 0x00000000
[ 9443.425675] Call trace:
[ 9443.428109] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.433234] [<          (null)>]           (null)
[ 9443.437924] Task dump for CPU 20:
[ 9443.441226] swapper/20      R  running task        0     0      1 0x00000000
[ 9443.448263] Call trace:
[ 9443.450697] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.455826] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9443.462600] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9443.467986] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.473458] Task dump for CPU 21:
[ 9443.476760] swapper/21      R  running task        0     0      1 0x00000000
[ 9443.483796] Call trace:
[ 9443.486230] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.491354] [<          (null)>]           (null)
[ 9443.496045] Task dump for CPU 22:
[ 9443.499347] swapper/22      R  running task        0     0      1 0x00000000
[ 9443.506383] Call trace:
[ 9443.508817] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.513943] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.519414] Task dump for CPU 23:
[ 9443.522716] swapper/23      R  running task        0     0      1 0x00000000
[ 9443.529752] Call trace:
[ 9443.532186] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.537312] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.542784] Task dump for CPU 24:
[ 9443.546086] swapper/24      R  running task        0     0      1 0x00000000
[ 9443.553122] Call trace:
[ 9443.555556] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.560681] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.566153] Task dump for CPU 25:
[ 9443.569455] swapper/25      R  running task        0     0      1 0x00000000
[ 9443.576491] Call trace:
[ 9443.578925] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.584051] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9443.590825] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9443.596211] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.601682] Task dump for CPU 26:
[ 9443.604985] swapper/26      R  running task        0     0      1 0x00000000
[ 9443.612021] Call trace:
[ 9443.614455] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.619581] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.625052] Task dump for CPU 27:
[ 9443.628355] swapper/27      R  running task        0     0      1 0x00000000
[ 9443.635390] Call trace:
[ 9443.637824] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.642949] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.648421] Task dump for CPU 28:
[ 9443.651723] swapper/28      R  running task        0     0      1 0x00000000
[ 9443.658759] Call trace:
[ 9443.661193] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.666318] [<          (null)>]           (null)
[ 9443.671008] Task dump for CPU 29:
[ 9443.674310] swapper/29      R  running task        0     0      1 0x00000000
[ 9443.681346] Call trace:
[ 9443.683780] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.688905] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.694377] Task dump for CPU 30:
[ 9443.697679] swapper/30      R  running task        0     0      1 0x00000000
[ 9443.704715] Call trace:
[ 9443.707150] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.712275] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9443.719050] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9443.724436] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.729907] Task dump for CPU 31:
[ 9443.733210] swapper/31      R  running task        0     0      1 0x00000000
[ 9443.740246] Call trace:
[ 9443.742680] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.747805] [<          (null)>]           (null)
[ 9443.752496] Task dump for CPU 32:
[ 9443.755798] swapper/32      R  running task        0     0      1 0x00000000
[ 9443.762833] Call trace:
[ 9443.765267] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.770392] [<          (null)>]           (null)
[ 9443.775082] Task dump for CPU 34:
[ 9443.778384] swapper/34      R  running task        0     0      1 0x00000000
[ 9443.785420] Call trace:
[ 9443.787854] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.792980] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.798451] Task dump for CPU 35:
[ 9443.801753] swapper/35      R  running task        0     0      1 0x00000000
[ 9443.808789] Call trace:
[ 9443.811224] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.816348] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.821820] Task dump for CPU 36:
[ 9443.825122] swapper/36      R  running task        0     0      1 0x00000000
[ 9443.832158] Call trace:
[ 9443.834592] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.839718] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9443.846493] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9443.851878] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.857350] Task dump for CPU 37:
[ 9443.860652] swapper/37      R  running task        0     0      1 0x00000000
[ 9443.867688] Call trace:
[ 9443.870122] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.875248] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9443.882022] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9443.887408] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.892880] Task dump for CPU 38:
[ 9443.896182] swapper/38      R  running task        0     0      1 0x00000000
[ 9443.903218] Call trace:
[ 9443.905652] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.910776] [<          (null)>]           (null)
[ 9443.915466] Task dump for CPU 40:
[ 9443.918769] swapper/40      R  running task        0     0      1 0x00000000
[ 9443.925805] Call trace:
[ 9443.928239] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.933365] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.938836] Task dump for CPU 41:
[ 9443.942138] swapper/41      R  running task        0     0      1 0x00000000
[ 9443.949174] Call trace:
[ 9443.951609] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.956733] [<          (null)>]           (null)
[ 9443.961423] Task dump for CPU 43:
[ 9443.964725] swapper/43      R  running task        0     0      1 0x00000000
[ 9443.971761] Call trace:
[ 9443.974195] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.979320] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.984791] Task dump for CPU 44:
[ 9443.988093] swapper/44      R  running task        0     0      1 0x00000000
[ 9443.995130] Call trace:
[ 9443.997564] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.002688] [<          (null)>]           (null)
[ 9444.007378] Task dump for CPU 45:
[ 9444.010680] swapper/45      R  running task        0     0      1 0x00000000
[ 9444.017716] Call trace:
[ 9444.020151] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.025275] [<          (null)>]           (null)
[ 9444.029965] Task dump for CPU 46:
[ 9444.033267] swapper/46      R  running task        0     0      1 0x00000000
[ 9444.040302] Call trace:
[ 9444.042737] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.047862] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.053333] Task dump for CPU 47:
[ 9444.056636] swapper/47      R  running task        0     0      1 0x00000000
[ 9444.063672] Call trace:
[ 9444.066106] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.071231] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.076702] Task dump for CPU 48:
[ 9444.080004] swapper/48      R  running task        0     0      1 0x00000000
[ 9444.087041] Call trace:
[ 9444.089475] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.094600] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.100071] Task dump for CPU 49:
[ 9444.103374] swapper/49      R  running task        0     0      1 0x00000000
[ 9444.110409] Call trace:
[ 9444.112844] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.117968] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.123440] Task dump for CPU 50:
[ 9444.126742] swapper/50      R  running task        0     0      1 0x00000000
[ 9444.133777] Call trace:
[ 9444.136211] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.141336] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.146807] Task dump for CPU 51:
[ 9444.150109] swapper/51      R  running task        0     0      1 0x00000000
[ 9444.157144] Call trace:
[ 9444.159578] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.164703] [<          (null)>]           (null)
[ 9444.169393] Task dump for CPU 52:
[ 9444.172695] swapper/52      R  running task        0     0      1 0x00000000
[ 9444.179731] Call trace:
[ 9444.182165] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.187290] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.192761] Task dump for CPU 53:
[ 9444.196063] swapper/53      R  running task        0     0      1 0x00000000
[ 9444.203099] Call trace:
[ 9444.205533] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.210658] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.216129] Task dump for CPU 54:
[ 9444.219431] swapper/54      R  running task        0     0      1 0x00000000
[ 9444.226467] Call trace:
[ 9444.228901] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.234026] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.239498] Task dump for CPU 56:
[ 9444.242801] swapper/56      R  running task        0     0      1 0x00000000
[ 9444.249837] Call trace:
[ 9444.252271] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.257396] [<          (null)>]           (null)
[ 9444.262086] Task dump for CPU 57:
[ 9444.265388] swapper/57      R  running task        0     0      1 0x00000000
[ 9444.272424] Call trace:
[ 9444.274858] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.279982] [<          (null)>]           (null)
[ 9444.284672] Task dump for CPU 58:
[ 9444.287975] swapper/58      R  running task        0     0      1 0x00000000
[ 9444.295011] Call trace:
[ 9444.297445] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.302570] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9444.309345] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9444.314731] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.320202] Task dump for CPU 59:
[ 9444.323504] swapper/59      R  running task        0     0      1 0x00000000
[ 9444.330540] Call trace:
[ 9444.332974] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.338100] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.343571] Task dump for CPU 60:
[ 9444.346873] swapper/60      R  running task        0     0      1 0x00000000
[ 9444.353909] Call trace:
[ 9444.356343] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.361469] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9444.368243] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9444.373629] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.379101] Task dump for CPU 61:
[ 9444.382402] swapper/61      R  running task        0     0      1 0x00000000
[ 9444.389438] Call trace:
[ 9444.391872] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.396997] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.402469] Task dump for CPU 62:
[ 9444.405771] swapper/62      R  running task        0     0      1 0x00000000
[ 9444.412808] Call trace:
[ 9444.415242] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.420367] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.425838] Task dump for CPU 63:
[ 9444.429141] swapper/63      R  running task        0     0      1 0x00000000
[ 9444.436177] Call trace:
[ 9444.438611] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.443736] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.449211] rcu_sched kthread starved for 5743 jiffies! g443 c442 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
[ 9444.458416] rcu_sched       S    0    10      2 0x00000000
[ 9444.463889] Call trace:
[ 9444.466324] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.471453] [<ffff000008ab70a4>] __schedule+0x1a4/0x720
[ 9444.476665] [<ffff000008ab7660>] schedule+0x40/0xa8
[ 9444.481530] [<ffff000008abac70>] schedule_timeout+0x178/0x358
[ 9444.487263] [<ffff00000813e694>] rcu_gp_kthread+0x534/0x7b8
[ 9444.492824] [<ffff0000080f33d0>] kthread+0x108/0x138
[ 9444.497775] [<ffff0000080836c0>] ret_from_fork+0x10/0x50
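
For anyone not steeped in the RCU internals, here is a rough sketch of what the
rcu_sched kthread above is doing (this is only an illustration of the shape of
the loop, not the real rcu_gp_kthread(); the interval value is taken from the
[timeout=1] seen in the trace below, everything else is approximate):

/*
 * Illustrative sketch only -- not the actual kernel code.  The
 * grace-period kthread alternates short timed sleeps with
 * force-quiescent-state passes.  RCU_GP_WAIT_FQS(3) in the message
 * above says it was parked in that timed sleep; "starved for 5743
 * jiffies" says nothing ran it for that long, even though each
 * individual sleep should only last a jiffy or so.
 */
#include <linux/kthread.h>
#include <linux/sched.h>

static int gp_kthread_sketch(void *unused)
{
        const unsigned long fqs_interval = 1;   /* jiffies, per [timeout=1] below */

        while (!kthread_should_stop()) {
                set_current_state(TASK_INTERRUPTIBLE);
                /* This is where the backtrace above is sitting. */
                schedule_timeout(fqs_interval);
                /*
                 * ...check which CPUs have not yet reported a quiescent
                 * state for the current grace period and prod them...
                 */
        }
        return 0;
}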



And the relevant chunk of trace is:
(I have a lot more; there are substantial other pauses from time to time, but
none this long.  Note that the trace below goes quiet at around 9421.39 and
only picks up again at around 9444.51, which lines up with the stall reported
above.)
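
As an aside, for anyone decoding the trace: the timers that rcu_preempt-9 arms
with function=process_timeout are its schedule_timeout() sleep itself --
schedule_timeout() queues an ordinary timer-wheel timer, normally on the CPU
the caller is running on (the cpu=NN in the timer_start events), and relies on
that timer firing to wake the task again.  Very roughly it has this shape
(paraphrased sketch, not the literal kernel source; the _sketch names are
mine):

/*
 * Paraphrased sketch of schedule_timeout()/process_timeout().  The
 * timer lands on a particular CPU's timer wheel, so if that CPU's
 * timers stop being serviced the wakeup never arrives and the
 * sleeper stays asleep.
 */
#include <linux/sched.h>
#include <linux/timer.h>

static void process_timeout_sketch(unsigned long data)
{
        wake_up_process((struct task_struct *)data);    /* wake the sleeper */
}

static signed long schedule_timeout_sketch(signed long timeout)
{
        struct timer_list timer;
        unsigned long expire = jiffies + timeout;

        setup_timer_on_stack(&timer, process_timeout_sketch,
                             (unsigned long)current);
        mod_timer(&timer, expire);      /* queued on a timer wheel */
        schedule();                     /* caller already set TASK_*INTERRUPTIBLE */
        del_timer_sync(&timer);
        destroy_timer_on_stack(&timer);

        /* remaining jiffies, 0 if it fired (ignoring wraparound for brevity) */
        return expire > jiffies ? expire - jiffies : 0;
}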


     rcu_preempt-9     [057] ....  9419.837631: timer_init: timer=ffff8017d5fcfda0
     rcu_preempt-9     [057] d..1  9419.837632: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297246837 [timeout=1] cpu=57 idx=0 flags=
          <idle>-0     [057] d..1  9419.837634: tick_stop: success=1 dependency=NONE
          <idle>-0     [057] d..2  9419.837634: hrtimer_cancel: hrtimer=ffff8017db99e808
          <idle>-0     [057] d..2  9419.837635: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418164000000 softexpires=9418164000000
          <idle>-0     [057] d.h2  9419.845621: hrtimer_cancel: hrtimer=ffff8017db99e808
          <idle>-0     [057] d.h1  9419.845621: hrtimer_expire_entry: hrtimer=ffff8017db99e808 function=tick_sched_timer now=9418164001440
          <idle>-0     [057] d.h1  9419.845622: hrtimer_expire_exit: hrtimer=ffff8017db99e808
          <idle>-0     [057] d.s2  9419.845623: timer_cancel: timer=ffff8017d5fcfda0
          <idle>-0     [057] ..s1  9419.845623: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297246838
          <idle>-0     [057] .ns1  9419.845624: timer_expire_exit: timer=ffff8017d5fcfda0
          <idle>-0     [057] dn.2  9419.845628: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418168000000 softexpires=9418168000000
          <idle>-0     [057] d..1  9419.845635: tick_stop: success=1 dependency=NONE
          <idle>-0     [057] d..2  9419.845636: hrtimer_cancel: hrtimer=ffff8017db99e808
          <idle>-0     [057] d..2  9419.845636: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418188000000 softexpires=9418188000000
          <idle>-0     [057] d.h2  9419.869621: hrtimer_cancel: hrtimer=ffff8017db99e808
          <idle>-0     [057] d.h1  9419.869621: hrtimer_expire_entry: hrtimer=ffff8017db99e808 function=tick_sched_timer now=9418188001420
          <idle>-0     [057] d.h1  9419.869622: hrtimer_expire_exit: hrtimer=ffff8017db99e808
          <idle>-0     [057] d..2  9419.869626: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9858983202655 softexpires=9858983202655
          <idle>-0     [016] d.h2  9419.885626: hrtimer_cancel: hrtimerÿff8017fbc3d808
          <idle>-0     [016] d.h1  9419.885627: hrtimer_expire_entry: hrtimerÿff8017fbc3d808 function=tick_sched_timer now”18204006760
          <idle>-0     [016] d.h1  9419.885629: hrtimer_expire_exit: hrtimerÿff8017fbc3d808
          <idle>-0     [016] d.s2  9419.885629: timer_cancel: timerÿff8017d37dbca0
          <idle>-0     [016] ..s1  9419.885630: timer_expire_entry: timerÿff8017d37dbca0 function=process_timeout nowB97246848
          <idle>-0     [016] .ns1  9419.885631: timer_expire_exit: timerÿff8017d37dbca0
          <idle>-0     [016] dn.2  9419.885636: hrtimer_start: hrtimerÿff8017fbc3d808 function=tick_sched_timer expires”18208000000 softexpires”18208000000
      khugepaged-778   [016] ....  9419.885668: timer_init: timerÿff8017d37dbca0
      khugepaged-778   [016] d..1  9419.885668: timer_start: timerÿff8017d37dbca0 function=process_timeout expiresB97249348 [timeout%00] cpu\x16 idx=0 flags          <idle>-0     [016] d..1  9419.885670: tick_stop: success=1 dependency=NONE
          <idle>-0     [016] d..2  9419.885671: hrtimer_cancel: hrtimerÿff8017fbc3d808
          <idle>-0     [016] d..2  9419.885671: hrtimer_start: hrtimerÿff8017fbc3d808 function=tick_sched_timer expires”28444000000 softexpires”28444000000
          <idle>-0     [045] d.h2  9419.890839: hrtimer_cancel: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h1  9419.890839: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18209219940
          <idle>-0     [045] d.h3  9419.890844: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”18310221420 softexpires”18309221420
          <idle>-0     [045] d.h1  9419.890844: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
          <idle>-0     [000] d.h2  9419.917625: hrtimer_cancel: hrtimerÿff8017fbe40808
          <idle>-0     [000] d.h1  9419.917626: hrtimer_expire_entry: hrtimerÿff8017fbe40808 function=tick_sched_timer now”18236005860
          <idle>-0     [000] d.h1  9419.917628: hrtimer_expire_exit: hrtimerÿff8017fbe40808
          <idle>-0     [000] d.s2  9419.917628: timer_cancel: timerÿff80177fdc0840
          <idle>-0     [000] ..s1  9419.917629: timer_expire_entry: timerÿff80177fdc0840 function=link_timeout_disable_link nowB97246856
          <idle>-0     [000] d.s2  9419.917630: timer_start: timerÿff80177fdc0840 function=link_timeout_enable_link expiresB97246881 [timeout%] cpu=0 idx flags          <idle>-0     [000] ..s1  9419.917633: timer_expire_exit: timerÿff80177fdc0840
          <idle>-0     [000] d..2  9419.917648: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”18340000000 softexpires”18340000000
          <idle>-0     [045] d.h2  9419.991845: hrtimer_cancel: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h1  9419.991845: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18310225960
          <idle>-0     [045] d.h3  9419.991849: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”18411227320 softexpires”18410227320
          <idle>-0     [045] d.h1  9419.991850: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
          <idle>-0     [000] d.h2  9420.021625: hrtimer_cancel: hrtimerÿff8017fbe40808
          <idle>-0     [000] d.h1  9420.021625: hrtimer_expire_entry: hrtimerÿff8017fbe40808 function=tick_sched_timer now”18340005520
          <idle>-0     [000] d.h1  9420.021627: hrtimer_expire_exit: hrtimerÿff8017fbe40808
          <idle>-0     [000] d.s2  9420.021627: timer_cancel: timerÿff80177fdc0840
          <idle>-0     [000] ..s1  9420.021628: timer_expire_entry: timerÿff80177fdc0840 function=link_timeout_enable_link nowB97246882
          <idle>-0     [000] d.s2  9420.021629: timer_start: timerÿff80177fdc0840 function=link_timeout_disable_link expiresB97247107 [timeout"5] cpu=0 idx4 flags          <idle>-0     [000] ..s1  9420.021632: timer_expire_exit: timerÿff80177fdc0840
          <idle>-0     [000] d..2  9420.021639: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19260000000 softexpires”19260000000
          <idle>-0     [045] d.h2  9420.092851: hrtimer_cancel: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h1  9420.092852: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18411231780
          <idle>-0     [045] d.h3  9420.092856: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”18512233720 softexpires”18511233720
          <idle>-0     [045] d.h1  9420.092856: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
          <idle>-0     [055] d.h2  9420.141622: hrtimer_cancel: hrtimerÿff8017db968808
          <idle>-0     [055] d.h1  9420.141623: hrtimer_expire_entry: hrtimerÿff8017db968808 function=tick_sched_timer now”18460002540
          <idle>-0     [055] d.h1  9420.141625: hrtimer_expire_exit: hrtimerÿff8017db968808
          <idle>-0     [055] d.s2  9420.141626: timer_cancel: timerÿff80177db6cc08
          <idle>-0     [055] d.s1  9420.141626: timer_expire_entry: timerÿff80177db6cc08 functionÞlayed_work_timer_fn nowB97246912
          <idle>-0     [055] dns1  9420.141628: timer_expire_exit: timerÿff80177db6cc08
          <idle>-0     [055] dn.2  9420.141632: hrtimer_start: hrtimerÿff8017db968808 function=tick_sched_timer expires”18464000000 softexpires”18464000000
    kworker/55:1-1246  [055] d..1  9420.141634: timer_start: timerÿff80177db6cc08 functionÞlayed_work_timer_fn expiresB97247162 [timeout%0] cpuU idxˆ flags=I
          <idle>-0     [055] d..1  9420.141637: tick_stop: success=1 dependency=NONE
          <idle>-0     [055] d..2  9420.141637: hrtimer_cancel: hrtimerÿff8017db968808
          <idle>-0     [055] d..2  9420.141637: hrtimer_start: hrtimerÿff8017db968808 function=tick_sched_timer expires”19484000000 softexpires”19484000000
          <idle>-0     [045] d.h2  9420.193855: hrtimer_cancel: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h1  9420.193855: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18512235660
          <idle>-0     [045] d.h3  9420.193859: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”18613237260 softexpires”18612237260
          <idle>-0     [045] d.h1  9420.193860: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h2  9420.294858: hrtimer_cancel: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h1  9420.294858: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18613238380
          <idle>-0     [045] d.h3  9420.294862: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”18714240000 softexpires”18713240000
          <idle>-0     [045] d.h1  9420.294863: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h2  9420.395861: hrtimer_cancel: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h1  9420.395861: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18714241380
          <idle>-0     [045] d.h3  9420.395865: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”18815242920 softexpires”18814242920
          <idle>-0     [045] d.h1  9420.395865: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
          <idle>-0     [042] d.h2  9420.461621: hrtimer_cancel: hrtimerÿff8017dbb69808
          <idle>-0     [042] d.h1  9420.461622: hrtimer_expire_entry: hrtimerÿff8017dbb69808 function=tick_sched_timer now”18780002180
          <idle>-0     [042] d.h1  9420.461623: hrtimer_expire_exit: hrtimerÿff8017dbb69808
          <idle>-0     [042] d.s2  9420.461624: timer_cancel: timerÿff80177db6d408
          <idle>-0     [042] d.s1  9420.461625: timer_expire_entry: timerÿff80177db6d408 functionÞlayed_work_timer_fn nowB97246992
          <idle>-0     [042] dns1  9420.461627: timer_expire_exit: timerÿff80177db6d408
          <idle>-0     [042] dns2  9420.461627: timer_cancel: timerÿff8017797d7868
          <idle>-0     [042] .ns1  9420.461628: timer_expire_entry: timerÿff8017797d7868 function=hns_nic_service_timer nowB97246992
          <idle>-0     [042] dns2  9420.461628: timer_start: timerÿff8017797d7868 function=hns_nic_service_timer expiresB97247242 [timeout%0] cpuB idx˜ flags          <idle>-0     [042] .ns1  9420.461629: timer_expire_exit: timerÿff8017797d7868
          <idle>-0     [042] dn.2  9420.461632: hrtimer_start: hrtimerÿff8017dbb69808 function=tick_sched_timer expires”18784000000 softexpires”18784000000
    kworker/42:1-1223  [042] d..1  9420.461773: timer_start: timerÿff80177db6d408 functionÞlayed_work_timer_fn expiresB97247242 [timeout%0] cpuB idx˜ flags=I
          <idle>-0     [042] d..1  9420.461866: tick_stop: success=1 dependency=NONE
          <idle>-0     [042] d..2  9420.461867: hrtimer_cancel: hrtimerÿff8017dbb69808
          <idle>-0     [042] d..2  9420.461867: hrtimer_start: hrtimerÿff8017dbb69808 function=tick_sched_timer expires”19804000000 softexpires”19804000000
          <idle>-0     [045] d.h2  9420.496864: hrtimer_cancel: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h1  9420.496864: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18815244580
          <idle>-0     [045] d.h3  9420.496868: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”18916246140 softexpires”18915246140
          <idle>-0     [045] d.h1  9420.496868: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h2  9420.597866: hrtimer_cancel: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h1  9420.597867: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18916247280
          <idle>-0     [045] d.h3  9420.597871: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19017248760 softexpires”19016248760
          <idle>-0     [045] d.h1  9420.597871: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
          <idle>-0     [033] d.h2  9420.621621: hrtimer_cancel: hrtimerÿff8017dba76808
          <idle>-0     [033] d.h1  9420.621622: hrtimer_expire_entry: hrtimerÿff8017dba76808 function=tick_sched_timer now”18940002160
          <idle>-0     [033] d.h1  9420.621623: hrtimer_expire_exit: hrtimerÿff8017dba76808
          <idle>-0     [033] d.s2  9420.621624: timer_cancel: timerÿff00000917be40
          <idle>-0     [033] d.s1  9420.621625: timer_expire_entry: timerÿff00000917be40 functionÞlayed_work_timer_fn nowB97247032
          <idle>-0     [033] dns1  9420.621626: timer_expire_exit: timerÿff00000917be40
          <idle>-0     [033] dn.2  9420.621630: hrtimer_start: hrtimerÿff8017dba76808 function=tick_sched_timer expires”18944000000 softexpires”18944000000
           <...>-1631  [033] d..1  9420.621636: timer_start: timerÿff00000917be40 functionÞlayed_work_timer_fn expiresB97247282 [timeout%0] cpu3 idx\x103 flags=I
          <idle>-0     [033] d..1  9420.621639: tick_stop: success=1 dependency=NONE
          <idle>-0     [033] d..2  9420.621639: hrtimer_cancel: hrtimerÿff8017dba76808
          <idle>-0     [033] d..2  9420.621639: hrtimer_start: hrtimerÿff8017dba76808 function=tick_sched_timer expires”19964000000 softexpires”19964000000
          <idle>-0     [000] dn.2  9420.691401: hrtimer_cancel: hrtimerÿff8017fbe40808
          <idle>-0     [000] dn.2  9420.691401: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19012000000 softexpires”19012000000
          <idle>-0     [002] dn.2  9420.691408: hrtimer_cancel: hrtimerÿff8017fbe76808
          <idle>-0     [002] dn.2  9420.691408: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”19012000000 softexpires”19012000000
          <idle>-0     [000] d..1  9420.691409: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2  9420.691409: hrtimer_cancel: hrtimerÿff8017fbe40808
          <idle>-0     [000] d..2  9420.691409: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19260000000 softexpires”19260000000
          <idle>-0     [002] d..1  9420.691423: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9420.691423: hrtimer_cancel: hrtimerÿff8017fbe76808
          <idle>-0     [002] d..2  9420.691424: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires˜59803202655 softexpires˜59803202655
          <idle>-0     [045] d.h2  9420.698872: hrtimer_cancel: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h1  9420.698873: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19017253180
          <idle>-0     [045] d.h3  9420.698877: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19118254640 softexpires”19117254640
          <idle>-0     [045] d.h1  9420.698877: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h2  9420.799875: hrtimer_cancel: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h1  9420.799875: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19118255760
          <idle>-0     [045] d.h3  9420.799879: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19219257140 softexpires”19218257140
          <idle>-0     [045] d.h1  9420.799880: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
          <idle>-0     [000] dn.2  9420.871369: hrtimer_cancel: hrtimerÿff8017fbe40808
          <idle>-0     [000] dn.2  9420.871370: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19192000000 softexpires”19192000000
          <idle>-0     [002] dn.2  9420.871375: hrtimer_cancel: hrtimerÿff8017fbe76808
          <idle>-0     [000] d..1  9420.871376: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] dn.2  9420.871376: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”19192000000 softexpires”19192000000
          <idle>-0     [000] d..2  9420.871376: hrtimer_cancel: hrtimerÿff8017fbe40808
          <idle>-0     [000] d..2  9420.871376: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19260000000 softexpires”19260000000
          <idle>-0     [002] d..1  9420.871398: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9420.871398: hrtimer_cancel: hrtimerÿff8017fbe76808
          <idle>-0     [002] d..2  9420.871398: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires˜59983202655 softexpires˜59983202655
          <idle>-0     [045] d.h2  9420.900881: hrtimer_cancel: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h1  9420.900881: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19219261580
          <idle>-0     [045] d.h3  9420.900885: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19320263160 softexpires”19319263160
          <idle>-0     [045] d.h1  9420.900886: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
          <idle>-0     [001] d..2  9420.913601: hrtimer_cancel: hrtimerÿff8017fbe5b808
          <idle>-0     [001] d..2  9420.913601: hrtimer_start: hrtimerÿff8017fbe5b808 function=tick_sched_timer expires˜60023202655 softexpires˜60023202655
          <idle>-0     [000] d.h2  9420.941621: hrtimer_cancel: hrtimerÿff8017fbe40808
          <idle>-0     [000] d.h1  9420.941621: hrtimer_expire_entry: hrtimerÿff8017fbe40808 function=tick_sched_timer now”19260001400
          <idle>-0     [000] d.h1  9420.941623: hrtimer_expire_exit: hrtimerÿff8017fbe40808
          <idle>-0     [000] d.s2  9420.941623: timer_cancel: timerÿff80177fdc0840
          <idle>-0     [000] ..s1  9420.941624: timer_expire_entry: timerÿff80177fdc0840 function=link_timeout_disable_link nowB97247112
          <idle>-0     [000] d.s2  9420.941624: timer_start: timerÿff80177fdc0840 function=link_timeout_enable_link expiresB97247137 [timeout%] cpu=0 idx\x113 flags          <idle>-0     [000] ..s1  9420.941628: timer_expire_exit: timerÿff80177fdc0840
          <idle>-0     [000] d.s2  9420.941629: timer_cancel: timerÿff8017fbe42558
          <idle>-0     [000] d.s1  9420.941629: timer_expire_entry: timerÿff8017fbe42558 functionÞlayed_work_timer_fn nowB97247112
          <idle>-0     [000] dns1  9420.941630: timer_expire_exit: timerÿff8017fbe42558
          <idle>-0     [000] dns2  9420.941631: timer_cancel: timerÿff00000910a628
          <idle>-0     [000] dns1  9420.941631: timer_expire_entry: timerÿff00000910a628 functionÞlayed_work_timer_fn nowB97247112
          <idle>-0     [000] dns1  9420.941631: timer_expire_exit: timerÿff00000910a628
          <idle>-0     [000] dn.2  9420.941634: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19264000000 softexpires”19264000000
          <idle>-0     [002] dn.2  9420.941643: hrtimer_cancel: hrtimerÿff8017fbe76808
          <idle>-0     [002] dn.2  9420.941643: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”19264000000 softexpires”19264000000
     kworker/0:0-3     [000] d..1  9420.941650: timer_start: timerÿff00000910a628 functionÞlayed_work_timer_fn expiresB97247500 [timeout88] cpu=0 idx\x100 flags=D|I
     kworker/2:0-22    [002] d..1  9420.941651: timer_start: timerÿff8017fbe78558 functionÞlayed_work_timer_fn expiresB97247494 [timeout82] cpu=2 idx\x114 flags=D|I
          <idle>-0     [000] d..1  9420.941652: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2  9420.941652: hrtimer_cancel: hrtimerÿff8017fbe40808
          <idle>-0     [000] d..2  9420.941653: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19364000000 softexpires”19364000000
          <idle>-0     [002] d..1  9420.941654: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9420.941654: hrtimer_cancel: hrtimerÿff8017fbe76808
          <idle>-0     [002] d..2  9420.941654: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires˜60055202655 softexpires˜60055202655
          <idle>-0     [045] d.h2  9421.001887: hrtimer_cancel: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h1  9421.001887: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19320267640
          <idle>-0     [045] d.h3  9421.001891: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19421269000 softexpires”19420269000
          <idle>-0     [045] d.h1  9421.001892: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
          <idle>-0     [000] d.h2  9421.045625: hrtimer_cancel: hrtimerÿff8017fbe40808
          <idle>-0     [000] d.h1  9421.045625: hrtimer_expire_entry: hrtimerÿff8017fbe40808 function=tick_sched_timer now”19364005380
          <idle>-0     [000] d.h1  9421.045626: hrtimer_expire_exit: hrtimerÿff8017fbe40808
          <idle>-0     [000] d.s2  9421.045627: timer_cancel: timerÿff80177fdc0840
          <idle>-0     [000] ..s1  9421.045628: timer_expire_entry: timerÿff80177fdc0840 function=link_timeout_enable_link nowB97247138
          <idle>-0     [000] d.s2  9421.045628: timer_start: timerÿff80177fdc0840 function=link_timeout_disable_link expiresB97247363 [timeout"5] cpu=0 idx4 flags          <idle>-0     [000] ..s1  9421.045631: timer_expire_exit: timerÿff80177fdc0840
          <idle>-0     [000] d..2  9421.045644: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”20284000000 softexpires”20284000000
          <idle>-0     [045] d.h2  9421.102893: hrtimer_cancel: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h1  9421.102893: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19421273420
          <idle>-0     [045] d.h3  9421.102897: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19522275040 softexpires”19521275040
          <idle>-0     [045] d.h1  9421.102897: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
          <idle>-0     [055] d.h2  9421.165621: hrtimer_cancel: hrtimerÿff8017db968808
          <idle>-0     [055] d.h1  9421.165622: hrtimer_expire_entry: hrtimerÿff8017db968808 function=tick_sched_timer now”19484002280
          <idle>-0     [055] d.h1  9421.165624: hrtimer_expire_exit: hrtimerÿff8017db968808
          <idle>-0     [055] d.s2  9421.165624: timer_cancel: timerÿff80177db6cc08
          <idle>-0     [055] d.s1  9421.165625: timer_expire_entry: timerÿff80177db6cc08 functionÞlayed_work_timer_fn nowB97247168
          <idle>-0     [055] dns1  9421.165626: timer_expire_exit: timerÿff80177db6cc08
          <idle>-0     [055] dn.2  9421.165629: hrtimer_start: hrtimerÿff8017db968808 function=tick_sched_timer expires”19488000000 softexpires”19488000000
    kworker/55:1-1246  [055] d..1  9421.165632: timer_start: timerÿff80177db6cc08 functionÞlayed_work_timer_fn expiresB97247418 [timeout%0] cpuU idx\x120 flags=I
          <idle>-0     [055] d..1  9421.165634: tick_stop: success=1 dependency=NONE
          <idle>-0     [055] d..2  9421.165634: hrtimer_cancel: hrtimerÿff8017db968808
          <idle>-0     [055] d..2  9421.165635: hrtimer_start: hrtimerÿff8017db968808 function=tick_sched_timer expires”20508000000 softexpires”20508000000
          <idle>-0     [045] d.h2  9421.203896: hrtimer_cancel: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h1  9421.203896: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19522276980
          <idle>-0     [045] d.h3  9421.203900: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19623278460 softexpires”19622278460
          <idle>-0     [045] d.h1  9421.203901: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h2  9421.304899: hrtimer_cancel: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h1  9421.304899: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19623279580
          <idle>-0     [045] d.h3  9421.304903: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19724281060 softexpires”19723281060
          <idle>-0     [045] d.h1  9421.304903: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
          <idle>-0     [000] dn.2  9421.381179: hrtimer_cancel: hrtimerÿff8017fbe40808
          <idle>-0     [000] dn.2  9421.381179: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19700000000 softexpires”19700000000
          <idle>-0     [002] dn.2  9421.381185: hrtimer_cancel: hrtimerÿff8017fbe76808
          <idle>-0     [002] dn.2  9421.381185: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”19700000000 softexpires”19700000000
          <idle>-0     [000] d..1  9421.381185: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2  9421.381186: hrtimer_cancel: hrtimerÿff8017fbe40808
          <idle>-0     [000] d..2  9421.381186: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”20284000000 softexpires”20284000000
              sh-2256  [002] ....  9421.381193: timer_init: timerÿff80176c26fb40
              sh-2256  [002] d..1  9421.381194: timer_start: timerÿff80176c26fb40 function=process_timeout expiresB97247223 [timeout=2] cpu=2 idx=0 flags          <idle>-0     [002] d..1  9421.381196: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9421.381197: hrtimer_cancel: hrtimerÿff8017fbe76808
          <idle>-0     [002] d..2  9421.381197: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”19708000000 softexpires”19708000000
          <idle>-0     [002] d.h2  9421.389621: hrtimer_cancel: hrtimerÿff8017fbe76808
          <idle>-0     [002] d.h1  9421.389622: hrtimer_expire_entry: hrtimerÿff8017fbe76808 function=tick_sched_timer now”19708002000
          <idle>-0     [002] d.h1  9421.389623: hrtimer_expire_exit: hrtimerÿff8017fbe76808
          <idle>-0     [002] d.s2  9421.389624: timer_cancel: timerÿff80176c26fb40
          <idle>-0     [002] ..s1  9421.389624: timer_expire_entry: timerÿff80176c26fb40 function=process_timeout nowB97247224
          <idle>-0     [002] .ns1  9421.389626: timer_expire_exit: timer=ffff80176c26fb40
          <idle>-0     [002] dn.2  9421.389629: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419712000000 softexpires=9419712000000
              sh-2256  [002] ...1  9421.389682: hrtimer_init: hrtimer=ffff8017d4dde8a0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
              sh-2256  [002] ...1  9421.389682: hrtimer_init: hrtimer=ffff8017d4dde8e0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
              sh-2256  [002] ....  9421.389690: hrtimer_init: hrtimer=ffff80176cbb0088 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
          <idle>-0     [039] dn.2  9421.389814: hrtimer_start: hrtimer=ffff8017dbb18808 function=tick_sched_timer expires=9419712000000 softexpires=9419712000000
          <idle>-0     [002] d..1  9421.389896: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9421.389897: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d..2  9421.389898: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419724000000 softexpires=9419724000000
          <idle>-0     [002] dn.2  9444.510766: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] dn.2  9444.510767: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442832000000 softexpires=9442832000000
          <idle>-0     [036] d..1  9444.510812: tick_stop: success=1 dependency=NONE
          <idle>-0     [036] d..2  9444.510814: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d..2  9444.510815: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442844000000 softexpires=9442844000000
              sh-2256  [002] ....  9444.510857: timer_init: timer=ffff80176c26fb40
              sh-2256  [002] d..1  9444.510857: timer_start: timer=ffff80176c26fb40 function=process_timeout expires=4297253006 [timeout=2] cpu=2 idx=0 flags=
          <idle>-0     [002] d..1  9444.510864: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9444.510865: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d..2  9444.510866: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442844000000 softexpires=9442844000000
          <idle>-0     [000] d.h2  9444.525625: hrtimer_cancel: hrtimerÿff8017fbe40808
          <idle>-0     [002] d.h2  9444.525625: hrtimer_cancel: hrtimerÿff8017fbe76808
          <idle>-0     [036] d.h2  9444.525625: hrtimer_cancel: hrtimerÿff8017dbac7808
          <idle>-0     [002] d.h1  9444.525625: hrtimer_expire_entry: hrtimerÿff8017fbe76808 function=tick_sched_timer now”42844005600
          <idle>-0     [036] d.h1  9444.525625: hrtimer_expire_entry: hrtimerÿff8017dbac7808 function=tick_sched_timer now”42844005460
          <idle>-0     [000] d.h1  9444.525627: hrtimer_expire_entry: hrtimerÿff8017fbe40808 function=tick_sched_timer now”42844005300
          <idle>-0     [002] d.h1  9444.525627: hrtimer_expire_exit: hrtimerÿff8017fbe76808
          <idle>-0     [002] d.s2  9444.525629: timer_cancel: timerÿff8017fbe78558
          <idle>-0     [036] d.h1  9444.525629: hrtimer_expire_exit: hrtimerÿff8017dbac7808
          <idle>-0     [002] d.s1  9444.525629: timer_expire_entry: timerÿff8017fbe78558 functionÞlayed_work_timer_fn nowB97253008
          <idle>-0     [000] d.h1  9444.525629: hrtimer_expire_exit: hrtimerÿff8017fbe40808
          <idle>-0     [000] d.s2  9444.525631: timer_cancel: timerÿff80177fdc0840
          <idle>-0     [000] ..s1  9444.525631: timer_expire_entry: timerÿff80177fdc0840 function=link_timeout_disable_link nowB97253008
          <idle>-0     [002] dns1  9444.525631: timer_expire_exit: timerÿff8017fbe78558
          <idle>-0     [000] d.s2  9444.525632: timer_start: timerÿff80177fdc0840 function=link_timeout_enable_link expiresB97253033 [timeout%] cpu=0 idx‚ flags          <idle>-0     [000] ..s1  9444.525633: timer_expire_exit: timerÿff80177fdc0840
          <idle>-0     [000] d.s2  9444.525634: timer_cancel: timerÿff00000910a628
          <idle>-0     [000] d.s1  9444.525634: timer_expire_entry: timerÿff00000910a628 functionÞlayed_work_timer_fn nowB97253008
          <idle>-0     [000] dns1  9444.525636: timer_expire_exit: timerÿff00000910a628
          <idle>-0     [036] dn.2  9444.525639: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42848000000 softexpires”42848000000
          <idle>-0     [000] dn.2  9444.525640: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”42848000000 softexpires”42848000000
          <idle>-0     [002] dn.2  9444.525640: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”42848000000 softexpires”42848000000
     rcu_preempt-9     [036] ....  9444.525648: timer_init: timerÿff8017d5fcfda0
          <idle>-0     [002] d..1  9444.525648: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9444.525648: hrtimer_cancel: hrtimerÿff8017fbe76808
          <idle>-0     [002] d..2  9444.525649: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”42860000000 softexpires”42860000000
     rcu_preempt-9     [036] d..1  9444.525649: timer_start: timerÿff8017d5fcfda0 function=process_timeout expiresB97253009 [timeout=1] cpu6 idx=0 flags     kworker/0:0-3     [000] d..1  9444.525650: timer_start: timerÿff00000910a628 functionÞlayed_work_timer_fn expiresB97253250 [timeout$2] cpu=0 idx‚ flags=D|I
          <idle>-0     [000] d..1  9444.525652: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2  9444.525652: hrtimer_cancel: hrtimerÿff8017fbe40808
          <idle>-0     [000] d..2  9444.525653: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”42948000000 softexpires”42948000000
          <idle>-0     [036] d..1  9444.525653: tick_stop: success=1 dependency=NONE
          <idle>-0     [036] d..2  9444.525654: hrtimer_cancel: hrtimerÿff8017dbac7808
          <idle>-0     [036] d..2  9444.525654: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42852000000 softexpires”42852000000
          <idle>-0     [036] d.h2  9444.533624: hrtimer_cancel: hrtimerÿff8017dbac7808
          <idle>-0     [036] d.h1  9444.533625: hrtimer_expire_entry: hrtimerÿff8017dbac7808 function=tick_sched_timer now”42852004760
          <idle>-0     [036] d.h1  9444.533626: hrtimer_expire_exit: hrtimerÿff8017dbac7808
          <idle>-0     [036] d.s2  9444.533627: timer_cancel: timerÿff8017d5fcfda0
          <idle>-0     [036] ..s1  9444.533628: timer_expire_entry: timerÿff8017d5fcfda0 function=process_timeout nowB97253010
          <idle>-0     [036] .ns1  9444.533629: timer_expire_exit: timerÿff8017d5fcfda0
          <idle>-0     [036] dn.2  9444.533634: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42856000000 softexpires”42856000000
          <idle>-0     [036] d..1  9444.533668: tick_stop: success=1 dependency=NONE
          <idle>-0     [036] d..2  9444.533668: hrtimer_cancel: hrtimerÿff8017dbac7808
          <idle>-0     [036] d..2  9444.533669: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42876000000 softexpires”42876000000
          <idle>-0     [002] dnh2  9444.541626: hrtimer_cancel: hrtimerÿff8017fbe76808
          <idle>-0     [002] dnh1  9444.541627: hrtimer_expire_entry: hrtimerÿff8017fbe76808 function=tick_sched_timer now”42860007120
          <idle>-0     [002] dnh1  9444.541629: hrtimer_expire_exit: hrtimerÿff8017fbe76808
          <idle>-0     [002] dn.2  9444.541630: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”42864000000 softexpires”42864000000
          <idle>-0     [002] d..1  9444.541640: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9444.541640: hrtimer_cancel: hrtimerÿff8017fbe76808
          <idle>-0     [002] d..2  9444.541640: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”44316000000 softexpires”44316000000
          <idle>-0     [036] dnh2  9444.557627: hrtimer_cancel: hrtimerÿff8017dbac7808
          <idle>-0     [036] dnh1  9444.557628: hrtimer_expire_entry: hrtimerÿff8017dbac7808 function=tick_sched_timer now”42876008220
          <idle>-0     [036] dnh1  9444.557630: hrtimer_expire_exit: hrtimerÿff8017dbac7808
          <idle>-0     [036] dn.2  9444.557631: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42880000000 softexpires”42880000000
          <idle>-0     [036] d..1  9444.557644: tick_stop: success=1 dependency=NONE
          <idle>-0     [036] d..2  9444.557645: hrtimer_cancel: hrtimerÿff8017dbac7808
          <idle>-0     [036] d..2  9444.557645: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42892000000 softexpires”42892000000
          <idle>-0     [036] d.h2  9444.573621: hrtimer_cancel: hrtimerÿff8017dbac7808
          <idle>-0     [036] d.h1  9444.573621: hrtimer_expire_entry: hrtimerÿff8017dbac7808 function=tick_sched_timer now”42892001340
          <idle>-0     [036] d.h1  9444.573622: hrtimer_expire_exit: hrtimerÿff8017dbac7808
          <idle>-0     [036] dn.2  9444.573628: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42896000000 softexpires”42896000000
     rcu_preempt-9     [036] ....  9444.573631: timer_init: timerÿff8017d5fcfda0
     rcu_preempt-9     [036] d..1  9444.573632: timer_start: timerÿff8017d5fcfda0 function=process_timeout expiresB97253021 [timeout=1] cpu6 idx=0 flags          <idle>-0     [036] d..1  9444.573634: tick_stop: success=1 dependency=NONE
          <idle>-0     [036] d..2  9444.573635: hrtimer_cancel: hrtimerÿff8017dbac7808
          <idle>-0     [036] d..2  9444.573635: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42900000000 softexpires”42900000000
          <idle>-0     [036] d.h2  9444.581621: hrtimer_cancel: hrtimerÿff8017dbac7808
          <idle>-0     [036] d.h1  9444.581621: hrtimer_expire_entry: hrtimerÿff8017dbac7808 function=tick_sched_timer now”42900001400
          <idle>-0     [036] d.h1  9444.581622: hrtimer_expire_exit: hrtimerÿff8017dbac7808
          <idle>-0     [036] d.s2  9444.581623: timer_cancel: timerÿff8017d5fcfda0
          <idle>-0     [036] ..s1  9444.581623: timer_expire_entry: timerÿff8017d5fcfda0 function=process_timeout nowB97253022
          <idle>-0     [036] .ns1  9444.581625: timer_expire_exit: timerÿff8017d5fcfda0
          <idle>-0     [036] dn.2  9444.581628: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42904000000 softexpires”42904000000
          <idle>-0     [036] d..1  9444.581636: tick_stop: success=1 dependency=NONE
          <idle>-0     [036] d..2  9444.581636: hrtimer_cancel: hrtimerÿff8017dbac7808
          <idle>-0     [036] d..2  9444.581637: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42924000000 softexpires”42924000000
          <idle>-0     [045] d.h2  9444.581718: hrtimer_cancel: hrtimerÿff80176cb7ca90
          <idle>-0     [045] d.h1  9444.581719: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”42900098200
          <idle>-0     [045] d.h3  9444.581724: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”43001101380 softexpires”43000101380
          <idle>-0     [045] d.h1  9444.581725: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
          <idle>-0     [036] d.h2  9444.605621: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d.h1  9444.605621: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442924001600
          <idle>-0     [036] d.h1  9444.605622: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d..2  9444.605629: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9883719202655 softexpires=9883719202655
          <idle>-0     [000] d.h2  9444.629625: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.h1  9444.629625: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9442948005580
          <idle>-0     [000] d.h1  9444.629627: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.s2  9444.629628: timer_cancel: timer=ffff80177fdc0840
          <idle>-0     [000] ..s1  9444.629628: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_enable_link now=4297253034
          <idle>-0     [000] d.s2  9444.629629: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297253259 [timeout=225] cpu=0 idx=42 flags=
          <idle>-0     [000] ..s1  9444.629638: timer_expire_exit: timer=ffff80177fdc0840
          <idle>-0     [000] d..2  9444.629661: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9443868000000 softexpires=9443868000000
          <idle>-0     [045] d.h2  9444.682725: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9444.682725: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443001105940
          <idle>-0     [045] d.h3  9444.682730: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443102107440 softexpires=9443101107440
          <idle>-0     [045] d.h1  9444.682730: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [055] d.h2  9444.717626: hrtimer_cancel: hrtimer=ffff8017db968808
          <idle>-0     [055] d.h1  9444.717627: hrtimer_expire_entry: hrtimer=ffff8017db968808 function=tick_sched_timer now=9443036006240
          <idle>-0     [055] d.h1  9444.717629: hrtimer_expire_exit: hrtimer=ffff8017db968808
          <idle>-0     [055] d.s2  9444.717630: timer_cancel: timer=ffff80177db6cc08
          <idle>-0     [055] d.s1  9444.717630: timer_expire_entry: timer=ffff80177db6cc08 function=delayed_work_timer_fn now=4297253056
          <idle>-0     [055] dns1  9444.717633: timer_expire_exit: timer=ffff80177db6cc08
          <idle>-0     [055] dn.2  9444.717637: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9443040000000 softexpires=9443040000000
    kworker/55:1-1246  [055] d..1  9444.717640: timer_start: timer=ffff80177db6cc08 function=delayed_work_timer_fn expires=4297253306 [timeout=250] cpu=55 idx=88 flags=I
          <idle>-0     [055] d..1  9444.717643: tick_stop: success=1 dependency=NONE
          <idle>-0     [055] d..2  9444.717643: hrtimer_cancel: hrtimer=ffff8017db968808
          <idle>-0     [055] d..2  9444.717644: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9444060000000 softexpires=9444060000000
          <idle>-0     [045] d.h2  9444.783729: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9444.783729: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443102109380
          <idle>-0     [045] d.h3  9444.783733: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443203110880 softexpires=9443202110880
          <idle>-0     [045] d.h1  9444.783733: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h2  9444.884731: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9444.884731: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443203112000
          <idle>-0     [045] d.h3  9444.884735: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443304113380 softexpires=9443303113380
          <idle>-0     [045] d.h1  9444.884736: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h2  9444.985734: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9444.985735: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443304114500
          <idle>-0     [045] d.h3  9444.985738: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443405116440 softexpires=9443404116440
          <idle>-0     [045] d.h1  9444.985739: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [042] d.h2  9445.037622: hrtimer_cancel: hrtimer=ffff8017dbb69808
> 
> Thanks,
> 
> Jonathan
> > 
> > 							Thanx, Paul
> >   
> > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > [ 1984.638153] rcu_preempt     S    0     9      2 0x00000000
> > > [ 1984.643626] Call trace:
> > > [ 1984.646059] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> > > [ 1984.651189] [<ffff000008962274>] __schedule+0x19c/0x5d8
> > > [ 1984.656400] [<ffff0000089626e8>] schedule+0x38/0xa0
> > > [ 1984.661266] [<ffff000008965844>] schedule_timeout+0x124/0x218
> > > [ 1984.667002] [<ffff000008121424>] rcu_gp_kthread+0x4fc/0x748
> > > [ 1984.672564] [<ffff0000080df0b4>] kthread+0xfc/0x128
> > > [ 1984.677429] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
> > >     
> >   
> 


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-27 16:39                                                       ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-27 16:39 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: dzickus, sfr, linuxarm, Nicholas Piggin, abdhalee, sparclinux,
	akpm, linuxppc-dev, David Miller, linux-arm-kernel

On Thu, 27 Jul 2017 14:49:03 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Thu, 27 Jul 2017 05:49:13 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:  
> > > On Wed, 26 Jul 2017 18:42:14 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >     
> > > > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:    
> > >     
> > > > > Indeed, that really wouldn't explain how we end up with a RCU stall
> > > > > dump listing almost all of the cpus as having missed a grace period.      
> > > > 
> > > > I have seen stranger things, but admittedly not often.    
> > > 
> > > So the backtraces show the RCU gp thread in schedule_timeout.
> > > 
> > > Are you sure that its timeout has expired and it's not being scheduled,
> > > or could it be a bad (large) timeout (looks unlikely), or that it's being
> > > scheduled but not correctly noting GPs on other CPUs?
> > > 
> > > It's not in R state, so if it's not being scheduled at all, then it's
> > > because the timer has not fired:    
> > 
> > Good point, Nick!
> > 
> > Jonathan, could you please reproduce this while collecting timer event tracing?  
> I'm a little new to tracing (only started playing with it last week)
> so fingers crossed I've set it up right.  No splats yet.  I was getting
> splats on reading out the trace when running with the RCU stall timer
> set to 4, so I have increased that back to the default and am rerunning.
> 
> This may take a while.  To save time, correct me if I've gotten this wrong:
> 
> echo "timer:*" > /sys/kernel/debug/tracing/set_event
> 
> When it dumps, I just send you the relevant part of what is in
> /sys/kernel/debug/tracing/trace?

Interestingly, the only thing that can make it trip for me with tracing on
is peeking in the tracing buffers.  Not sure whether this is a valid case or
not.

Anyhow all timer activity seems to stop around the area of interest.
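
To make it easier to line the trace up with the stall report below: the GP
kthread that gets reported as starved is blocked in schedule_timeout(), and
schedule_timeout() does nothing more exotic than arm an ordinary timer whose
callback wakes the sleeping task -- the same mechanism behind the
process_timeout timer you can see rcu_preempt-9 arming in the trace at the
bottom of this mail.  Roughly (a simplified sketch along the lines of
kernel/time/timer.c, not the exact source; the caller's task-state setup,
MAX_SCHEDULE_TIMEOUT handling and the timer debug hooks are all omitted):

#include <linux/jiffies.h>
#include <linux/sched.h>
#include <linux/timer.h>

/* Timer callback: wake the task that went to sleep in schedule_timeout(). */
static void process_timeout(unsigned long __data)
{
        wake_up_process((struct task_struct *)__data);
}

/* Caller has already set its task state (e.g. TASK_INTERRUPTIBLE). */
static signed long schedule_timeout_sketch(signed long timeout)
{
        struct timer_list timer;
        unsigned long expire = timeout + jiffies;

        /*
         * These two steps show up as the timer_init / timer_start events
         * the rcu kthread generates in the trace below.
         */
        setup_timer_on_stack(&timer, process_timeout, (unsigned long)current);
        mod_timer(&timer, expire);

        /* If that timer never fires, we never come back from here. */
        schedule();

        del_singleshot_timer_sync(&timer);
        destroy_timer_on_stack(&timer);

        timeout = expire - jiffies;
        return timeout < 0 ? 0 : timeout;
}

So if nothing ever expires that timer -- and the timer trace below goes
completely silent between 9421.389898 and 9444.510766 -- the GP kthread just
stays asleep, which matches both the schedule_timeout frame in its backtrace
and the fqs=0 on every CPU in the stall report.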


[ 9442.413624] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 9442.419107] 	1-...: (1 GPs behind) idle=844/0/0 softirq=27747/27755 fqs=0 last_accelerate: dd6a/de80, nonlazy_posted: 0, L.
[ 9442.430224] 	3-...: (2 GPs behind) idle=8f8/0/0 softirq=32197/32198 fqs=0 last_accelerate: 29b1/de80, nonlazy_posted: 0, L.
[ 9442.441340] 	4-...: (7 GPs behind) idle=740/0/0 softirq=22351/22352 fqs=0 last_accelerate: ca88/de80, nonlazy_posted: 0, L.
[ 9442.452456] 	5-...: (2 GPs behind) idle=9b0/0/0 softirq=21315/21319 fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
[ 9442.463572] 	6-...: (2 GPs behind) idle=794/0/0 softirq=19699/19707 fqs=0 last_accelerate: ba62/de88, nonlazy_posted: 0, L.
[ 9442.474688] 	7-...: (2 GPs behind) idle=ac4/0/0 softirq=22547/22554 fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
[ 9442.485803] 	8-...: (9 GPs behind) idle=118/0/0 softirq=281/291 fqs=0 last_accelerate: c3fe/de88, nonlazy_posted: 0, L.
[ 9442.496571] 	9-...: (9 GPs behind) idle=8fc/0/0 softirq=284/292 fqs=0 last_accelerate: 6030/de88, nonlazy_posted: 0, L.
[ 9442.507339] 	10-...: (14 GPs behind) idle=f78/0/0 softirq=254/254 fqs=0 last_accelerate: 5487/de88, nonlazy_posted: 0, L.
[ 9442.518281] 	11-...: (9 GPs behind) idle=c9c/0/0 softirq=301/308 fqs=0 last_accelerate: 3d3e/de99, nonlazy_posted: 0, L.
[ 9442.529136] 	12-...: (9 GPs behind) idle=4a4/0/0 softirq=735/737 fqs=0 last_accelerate: 6010/de99, nonlazy_posted: 0, L.
[ 9442.539992] 	13-...: (9 GPs behind) idle=34c/0/0 softirq=1121/1131 fqs=0 last_accelerate: b280/de99, nonlazy_posted: 0, L.
[ 9442.551020] 	14-...: (9 GPs behind) idle=2f4/0/0 softirq=707/713 fqs=0 last_accelerate: 6030/de99, nonlazy_posted: 0, L.
[ 9442.561875] 	15-...: (2 GPs behind) idle=b30/0/0 softirq=821/976 fqs=0 last_accelerate: c208/de99, nonlazy_posted: 0, L.
[ 9442.572730] 	17-...: (2 GPs behind) idle=5a8/0/0 softirq=1456/1565 fqs=0 last_accelerate: ca88/de99, nonlazy_posted: 0, L.
[ 9442.583759] 	18-...: (2 GPs behind) idle=2e4/0/0 softirq=1923/1936 fqs=0 last_accelerate: ca88/dea7, nonlazy_posted: 0, L.
[ 9442.594787] 	19-...: (2 GPs behind) idle=138/0/0 softirq=1421/1432 fqs=0 last_accelerate: b280/dea7, nonlazy_posted: 0, L.
[ 9442.605816] 	20-...: (50 GPs behind) idle=634/0/0 softirq=217/219 fqs=0 last_accelerate: c96f/dea7, nonlazy_posted: 0, L.
[ 9442.616758] 	21-...: (2 GPs behind) idle=eb8/0/0 softirq=1368/1369 fqs=0 last_accelerate: b599/deb2, nonlazy_posted: 0, L.
[ 9442.627786] 	22-...: (1 GPs behind) idle=aa8/0/0 softirq=229/232 fqs=0 last_accelerate: c604/deb2, nonlazy_posted: 0, L.
[ 9442.638641] 	23-...: (1 GPs behind) idle=488/0/0 softirq=247/248 fqs=0 last_accelerate: c600/deb2, nonlazy_posted: 0, L.
[ 9442.649496] 	24-...: (33 GPs behind) idle=f7c/0/0 softirq=319/319 fqs=0 last_accelerate: 5290/deb2, nonlazy_posted: 0, L.
[ 9442.660437] 	25-...: (33 GPs behind) idle=944/0/0 softirq=308/308 fqs=0 last_accelerate: 52c0/deb2, nonlazy_posted: 0, L.
[ 9442.671379] 	26-...: (9 GPs behind) idle=6d4/0/0 softirq=265/275 fqs=0 last_accelerate: 6034/dec0, nonlazy_posted: 0, L.
[ 9442.682234] 	27-...: (115 GPs behind) idle=e3c/0/0 softirq=212/226 fqs=0 last_accelerate: 5420/dec0, nonlazy_posted: 0, L.
[ 9442.693263] 	28-...: (9 GPs behind) idle=ea4/0/0 softirq=540/552 fqs=0 last_accelerate: 603c/dec0, nonlazy_posted: 0, L.
[ 9442.704118] 	29-...: (115 GPs behind) idle=83c/0/0 softirq=342/380 fqs=0 last_accelerate: 5420/dec0, nonlazy_posted: 0, L.
[ 9442.715147] 	30-...: (33 GPs behind) idle=e3c/0/0 softirq=509/509 fqs=0 last_accelerate: 52bc/dec0, nonlazy_posted: 0, L.
[ 9442.726088] 	31-...: (9 GPs behind) idle=df4/0/0 softirq=619/641 fqs=0 last_accelerate: 603c/decb, nonlazy_posted: 0, L.
[ 9442.736944] 	32-...: (9 GPs behind) idle=aa4/0/0 softirq=1841/1848 fqs=0 last_accelerate: 6030/decb, nonlazy_posted: 0, L.
[ 9442.747972] 	34-...: (9 GPs behind) idle=e6c/0/0 softirq=5082/5086 fqs=0 last_accelerate: 6039/decb, nonlazy_posted: 0, L.
[ 9442.759001] 	35-...: (9 GPs behind) idle=7fc/0/0 softirq=1396/1406 fqs=0 last_accelerate: 603e/decb, nonlazy_posted: 0, L.
[ 9442.770030] 	36-...: (0 ticks this GP) idle=f28/0/0 softirq=255/255 fqs=0 last_accelerate: c9fc/decb, nonlazy_posted: 0, L.
[ 9442.781145] 	37-...: (50 GPs behind) idle=53c/0/0 softirq=227/230 fqs=0 last_accelerate: 45c0/decb, nonlazy_posted: 0, L.
[ 9442.792087] 	38-...: (9 GPs behind) idle=958/0/0 softirq=185/192 fqs=0 last_accelerate: 6030/decb, nonlazy_posted: 0, L.
[ 9442.802942] 	40-...: (389 GPs behind) idle=41c/0/0 softirq=131/136 fqs=0 last_accelerate: 5800/decb, nonlazy_posted: 0, L.
[ 9442.813971] 	41-...: (389 GPs behind) idle=258/0/0 softirq=133/138 fqs=0 last_accelerate: c00f/decb, nonlazy_posted: 0, L.
[ 9442.825000] 	43-...: (50 GPs behind) idle=254/0/0 softirq=113/117 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
[ 9442.835942] 	44-...: (115 GPs behind) idle=178/0/0 softirq=1271/1276 fqs=0 last_accelerate: 68e9/dee5, nonlazy_posted: 0, L.
[ 9442.847144] 	45-...: (2 GPs behind) idle=04a/1/0 softirq=364/389 fqs=0 last_accelerate: dee5/dee5, nonlazy_posted: 0, L.
[ 9442.857999] 	46-...: (9 GPs behind) idle=ec4/0/0 softirq=183/189 fqs=0 last_accelerate: 6030/dee5, nonlazy_posted: 0, L.
[ 9442.868854] 	47-...: (115 GPs behind) idle=088/0/0 softirq=135/149 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
[ 9442.879883] 	48-...: (389 GPs behind) idle=200/0/0 softirq=103/110 fqs=0 last_accelerate: 58b0/dee5, nonlazy_posted: 0, L.
[ 9442.890911] 	49-...: (9 GPs behind) idle=a24/0/0 softirq=205/211 fqs=0 last_accelerate: 6030/dee5, nonlazy_posted: 0, L.
[ 9442.901766] 	50-...: (25 GPs behind) idle=a74/0/0 softirq=144/144 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
[ 9442.912708] 	51-...: (50 GPs behind) idle=f68/0/0 softirq=116/122 fqs=0 last_accelerate: 57bc/dee5, nonlazy_posted: 0, L.
[ 9442.923650] 	52-...: (9 GPs behind) idle=e08/0/0 softirq=202/486 fqs=0 last_accelerate: c87f/defe, nonlazy_posted: 0, L.
[ 9442.934505] 	53-...: (2 GPs behind) idle=128/0/0 softirq=365/366 fqs=0 last_accelerate: ca88/defe, nonlazy_posted: 0, L.
[ 9442.945360] 	54-...: (9 GPs behind) idle=ce8/0/0 softirq=126/373 fqs=0 last_accelerate: bef8/defe, nonlazy_posted: 0, L.
[ 9442.956215] 	56-...: (9 GPs behind) idle=330/0/0 softirq=2116/2126 fqs=0 last_accelerate: 6030/defe, nonlazy_posted: 0, L.
[ 9442.967243] 	57-...: (1 GPs behind) idle=288/0/0 softirq=1707/1714 fqs=0 last_accelerate: c87c/defe, nonlazy_posted: 0, L.
[ 9442.978272] 	58-...: (37 GPs behind) idle=390/0/0 softirq=1716/1721 fqs=0 last_accelerate: 53f7/defe, nonlazy_posted: 0, L.
[ 9442.989387] 	59-...: (37 GPs behind) idle=e54/0/0 softirq=1700/1701 fqs=0 last_accelerate: 40a1/defe, nonlazy_posted: 0, L.
[ 9443.000502] 	60-...: (116 GPs behind) idle=7b4/0/0 softirq=92/96 fqs=0 last_accelerate: 57d8/df10, nonlazy_posted: 0, L.
[ 9443.011357] 	61-...: (9 GPs behind) idle=9d8/0/0 softirq=161/170 fqs=0 last_accelerate: 6030/df10, nonlazy_posted: 0, L.
[ 9443.022212] 	62-...: (115 GPs behind) idle=aa8/0/0 softirq=95/101 fqs=0 last_accelerate: 5420/df17, nonlazy_posted: 0, L.
[ 9443.033154] 	63-...: (50 GPs behind) idle=958/0/0 softirq=81/84 fqs=0 last_accelerate: 57b8/df17, nonlazy_posted: 0, L.
[ 9443.043920] 	(detected by 39, t=5403 jiffies, g=443, c=442, q=1)
[ 9443.049919] Task dump for CPU 1:
[ 9443.053134] swapper/1       R  running task        0     0      1 0x00000000
[ 9443.060173] Call trace:
[ 9443.062619] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.067744] [<          (null)>]           (null)
[ 9443.072434] Task dump for CPU 3:
[ 9443.075650] swapper/3       R  running task        0     0      1 0x00000000
[ 9443.082686] Call trace:
[ 9443.085121] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.090246] [<          (null)>]           (null)
[ 9443.094936] Task dump for CPU 4:
[ 9443.098152] swapper/4       R  running task        0     0      1 0x00000000
[ 9443.105188] Call trace:
[ 9443.107623] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.112752] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.118224] Task dump for CPU 5:
[ 9443.121440] swapper/5       R  running task        0     0      1 0x00000000
[ 9443.128476] Call trace:
[ 9443.130910] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.136035] [<          (null)>]           (null)
[ 9443.140725] Task dump for CPU 6:
[ 9443.143941] swapper/6       R  running task        0     0      1 0x00000000
[ 9443.150976] Call trace:
[ 9443.153411] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.158535] [<          (null)>]           (null)
[ 9443.163226] Task dump for CPU 7:
[ 9443.166442] swapper/7       R  running task        0     0      1 0x00000000
[ 9443.173478] Call trace:
[ 9443.175912] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.181037] [<          (null)>]           (null)
[ 9443.185727] Task dump for CPU 8:
[ 9443.188943] swapper/8       R  running task        0     0      1 0x00000000
[ 9443.195979] Call trace:
[ 9443.198412] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.203537] [<          (null)>]           (null)
[ 9443.208227] Task dump for CPU 9:
[ 9443.211443] swapper/9       R  running task        0     0      1 0x00000000
[ 9443.218479] Call trace:
[ 9443.220913] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.226039] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.231510] Task dump for CPU 10:
[ 9443.234812] swapper/10      R  running task        0     0      1 0x00000000
[ 9443.241848] Call trace:
[ 9443.244283] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.249408] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.254879] Task dump for CPU 11:
[ 9443.258182] swapper/11      R  running task        0     0      1 0x00000000
[ 9443.265218] Call trace:
[ 9443.267652] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.272776] [<          (null)>]           (null)
[ 9443.277467] Task dump for CPU 12:
[ 9443.280769] swapper/12      R  running task        0     0      1 0x00000000
[ 9443.287806] Call trace:
[ 9443.290240] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.295364] [<          (null)>]           (null)
[ 9443.300054] Task dump for CPU 13:
[ 9443.303357] swapper/13      R  running task        0     0      1 0x00000000
[ 9443.310394] Call trace:
[ 9443.312828] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.317953] [<          (null)>]           (null)
[ 9443.322643] Task dump for CPU 14:
[ 9443.325945] swapper/14      R  running task        0     0      1 0x00000000
[ 9443.332981] Call trace:
[ 9443.335416] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.340540] [<          (null)>]           (null)
[ 9443.345230] Task dump for CPU 15:
[ 9443.348533] swapper/15      R  running task        0     0      1 0x00000000
[ 9443.355568] Call trace:
[ 9443.358002] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.363128] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.368599] Task dump for CPU 17:
[ 9443.371901] swapper/17      R  running task        0     0      1 0x00000000
[ 9443.378937] Call trace:
[ 9443.381372] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.386497] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.391968] Task dump for CPU 18:
[ 9443.395270] swapper/18      R  running task        0     0      1 0x00000000
[ 9443.402306] Call trace:
[ 9443.404740] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.409865] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.415336] Task dump for CPU 19:
[ 9443.418639] swapper/19      R  running task        0     0      1 0x00000000
[ 9443.425675] Call trace:
[ 9443.428109] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.433234] [<          (null)>]           (null)
[ 9443.437924] Task dump for CPU 20:
[ 9443.441226] swapper/20      R  running task        0     0      1 0x00000000
[ 9443.448263] Call trace:
[ 9443.450697] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.455826] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9443.462600] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9443.467986] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.473458] Task dump for CPU 21:
[ 9443.476760] swapper/21      R  running task        0     0      1 0x00000000
[ 9443.483796] Call trace:
[ 9443.486230] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.491354] [<          (null)>]           (null)
[ 9443.496045] Task dump for CPU 22:
[ 9443.499347] swapper/22      R  running task        0     0      1 0x00000000
[ 9443.506383] Call trace:
[ 9443.508817] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.513943] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.519414] Task dump for CPU 23:
[ 9443.522716] swapper/23      R  running task        0     0      1 0x00000000
[ 9443.529752] Call trace:
[ 9443.532186] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.537312] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.542784] Task dump for CPU 24:
[ 9443.546086] swapper/24      R  running task        0     0      1 0x00000000
[ 9443.553122] Call trace:
[ 9443.555556] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.560681] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.566153] Task dump for CPU 25:
[ 9443.569455] swapper/25      R  running task        0     0      1 0x00000000
[ 9443.576491] Call trace:
[ 9443.578925] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.584051] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9443.590825] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9443.596211] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.601682] Task dump for CPU 26:
[ 9443.604985] swapper/26      R  running task        0     0      1 0x00000000
[ 9443.612021] Call trace:
[ 9443.614455] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.619581] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.625052] Task dump for CPU 27:
[ 9443.628355] swapper/27      R  running task        0     0      1 0x00000000
[ 9443.635390] Call trace:
[ 9443.637824] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.642949] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.648421] Task dump for CPU 28:
[ 9443.651723] swapper/28      R  running task        0     0      1 0x00000000
[ 9443.658759] Call trace:
[ 9443.661193] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.666318] [<          (null)>]           (null)
[ 9443.671008] Task dump for CPU 29:
[ 9443.674310] swapper/29      R  running task        0     0      1 0x00000000
[ 9443.681346] Call trace:
[ 9443.683780] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.688905] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.694377] Task dump for CPU 30:
[ 9443.697679] swapper/30      R  running task        0     0      1 0x00000000
[ 9443.704715] Call trace:
[ 9443.707150] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.712275] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9443.719050] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9443.724436] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.729907] Task dump for CPU 31:
[ 9443.733210] swapper/31      R  running task        0     0      1 0x00000000
[ 9443.740246] Call trace:
[ 9443.742680] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.747805] [<          (null)>]           (null)
[ 9443.752496] Task dump for CPU 32:
[ 9443.755798] swapper/32      R  running task        0     0      1 0x00000000
[ 9443.762833] Call trace:
[ 9443.765267] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.770392] [<          (null)>]           (null)
[ 9443.775082] Task dump for CPU 34:
[ 9443.778384] swapper/34      R  running task        0     0      1 0x00000000
[ 9443.785420] Call trace:
[ 9443.787854] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.792980] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.798451] Task dump for CPU 35:
[ 9443.801753] swapper/35      R  running task        0     0      1 0x00000000
[ 9443.808789] Call trace:
[ 9443.811224] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.816348] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.821820] Task dump for CPU 36:
[ 9443.825122] swapper/36      R  running task        0     0      1 0x00000000
[ 9443.832158] Call trace:
[ 9443.834592] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.839718] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9443.846493] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9443.851878] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.857350] Task dump for CPU 37:
[ 9443.860652] swapper/37      R  running task        0     0      1 0x00000000
[ 9443.867688] Call trace:
[ 9443.870122] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.875248] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9443.882022] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9443.887408] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.892880] Task dump for CPU 38:
[ 9443.896182] swapper/38      R  running task        0     0      1 0x00000000
[ 9443.903218] Call trace:
[ 9443.905652] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.910776] [<          (null)>]           (null)
[ 9443.915466] Task dump for CPU 40:
[ 9443.918769] swapper/40      R  running task        0     0      1 0x00000000
[ 9443.925805] Call trace:
[ 9443.928239] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.933365] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.938836] Task dump for CPU 41:
[ 9443.942138] swapper/41      R  running task        0     0      1 0x00000000
[ 9443.949174] Call trace:
[ 9443.951609] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.956733] [<          (null)>]           (null)
[ 9443.961423] Task dump for CPU 43:
[ 9443.964725] swapper/43      R  running task        0     0      1 0x00000000
[ 9443.971761] Call trace:
[ 9443.974195] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.979320] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.984791] Task dump for CPU 44:
[ 9443.988093] swapper/44      R  running task        0     0      1 0x00000000
[ 9443.995130] Call trace:
[ 9443.997564] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.002688] [<          (null)>]           (null)
[ 9444.007378] Task dump for CPU 45:
[ 9444.010680] swapper/45      R  running task        0     0      1 0x00000000
[ 9444.017716] Call trace:
[ 9444.020151] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.025275] [<          (null)>]           (null)
[ 9444.029965] Task dump for CPU 46:
[ 9444.033267] swapper/46      R  running task        0     0      1 0x00000000
[ 9444.040302] Call trace:
[ 9444.042737] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.047862] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.053333] Task dump for CPU 47:
[ 9444.056636] swapper/47      R  running task        0     0      1 0x00000000
[ 9444.063672] Call trace:
[ 9444.066106] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.071231] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.076702] Task dump for CPU 48:
[ 9444.080004] swapper/48      R  running task        0     0      1 0x00000000
[ 9444.087041] Call trace:
[ 9444.089475] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.094600] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.100071] Task dump for CPU 49:
[ 9444.103374] swapper/49      R  running task        0     0      1 0x00000000
[ 9444.110409] Call trace:
[ 9444.112844] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.117968] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.123440] Task dump for CPU 50:
[ 9444.126742] swapper/50      R  running task        0     0      1 0x00000000
[ 9444.133777] Call trace:
[ 9444.136211] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.141336] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.146807] Task dump for CPU 51:
[ 9444.150109] swapper/51      R  running task        0     0      1 0x00000000
[ 9444.157144] Call trace:
[ 9444.159578] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.164703] [<          (null)>]           (null)
[ 9444.169393] Task dump for CPU 52:
[ 9444.172695] swapper/52      R  running task        0     0      1 0x00000000
[ 9444.179731] Call trace:
[ 9444.182165] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.187290] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.192761] Task dump for CPU 53:
[ 9444.196063] swapper/53      R  running task        0     0      1 0x00000000
[ 9444.203099] Call trace:
[ 9444.205533] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.210658] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.216129] Task dump for CPU 54:
[ 9444.219431] swapper/54      R  running task        0     0      1 0x00000000
[ 9444.226467] Call trace:
[ 9444.228901] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.234026] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.239498] Task dump for CPU 56:
[ 9444.242801] swapper/56      R  running task        0     0      1 0x00000000
[ 9444.249837] Call trace:
[ 9444.252271] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.257396] [<          (null)>]           (null)
[ 9444.262086] Task dump for CPU 57:
[ 9444.265388] swapper/57      R  running task        0     0      1 0x00000000
[ 9444.272424] Call trace:
[ 9444.274858] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.279982] [<          (null)>]           (null)
[ 9444.284672] Task dump for CPU 58:
[ 9444.287975] swapper/58      R  running task        0     0      1 0x00000000
[ 9444.295011] Call trace:
[ 9444.297445] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.302570] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9444.309345] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9444.314731] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.320202] Task dump for CPU 59:
[ 9444.323504] swapper/59      R  running task        0     0      1 0x00000000
[ 9444.330540] Call trace:
[ 9444.332974] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.338100] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.343571] Task dump for CPU 60:
[ 9444.346873] swapper/60      R  running task        0     0      1 0x00000000
[ 9444.353909] Call trace:
[ 9444.356343] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.361469] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9444.368243] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9444.373629] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.379101] Task dump for CPU 61:
[ 9444.382402] swapper/61      R  running task        0     0      1 0x00000000
[ 9444.389438] Call trace:
[ 9444.391872] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.396997] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.402469] Task dump for CPU 62:
[ 9444.405771] swapper/62      R  running task        0     0      1 0x00000000
[ 9444.412808] Call trace:
[ 9444.415242] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.420367] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.425838] Task dump for CPU 63:
[ 9444.429141] swapper/63      R  running task        0     0      1 0x00000000
[ 9444.436177] Call trace:
[ 9444.438611] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.443736] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.449211] rcu_sched kthread starved for 5743 jiffies! g443 c442 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
[ 9444.458416] rcu_sched       S    0    10      2 0x00000000
[ 9444.463889] Call trace:
[ 9444.466324] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.471453] [<ffff000008ab70a4>] __schedule+0x1a4/0x720
[ 9444.476665] [<ffff000008ab7660>] schedule+0x40/0xa8
[ 9444.481530] [<ffff000008abac70>] schedule_timeout+0x178/0x358
[ 9444.487263] [<ffff00000813e694>] rcu_gp_kthread+0x534/0x7b8
[ 9444.492824] [<ffff0000080f33d0>] kthread+0x108/0x138
[ 9444.497775] [<ffff0000080836c0>] ret_from_fork+0x10/0x50



And the relevant chunk of trace is:
(I have a lot more.  There are other substantial pauses from time to time, but none this long.)


     rcu_preempt-9     [057] ....  9419.837631: timer_init: timer=ffff8017d5fcfda0
     rcu_preempt-9     [057] d..1  9419.837632: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297246837 [timeout=1] cpu=57 idx=0 flags=
          <idle>-0     [057] d..1  9419.837634: tick_stop: success=1 dependency=NONE
          <idle>-0     [057] d..2  9419.837634: hrtimer_cancel: hrtimer=ffff8017db99e808
          <idle>-0     [057] d..2  9419.837635: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418164000000 softexpires=9418164000000
          <idle>-0     [057] d.h2  9419.845621: hrtimer_cancel: hrtimer=ffff8017db99e808
          <idle>-0     [057] d.h1  9419.845621: hrtimer_expire_entry: hrtimer=ffff8017db99e808 function=tick_sched_timer now=9418164001440
          <idle>-0     [057] d.h1  9419.845622: hrtimer_expire_exit: hrtimer=ffff8017db99e808
          <idle>-0     [057] d.s2  9419.845623: timer_cancel: timer=ffff8017d5fcfda0
          <idle>-0     [057] ..s1  9419.845623: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297246838
          <idle>-0     [057] .ns1  9419.845624: timer_expire_exit: timer=ffff8017d5fcfda0
          <idle>-0     [057] dn.2  9419.845628: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418168000000 softexpires=9418168000000
          <idle>-0     [057] d..1  9419.845635: tick_stop: success=1 dependency=NONE
          <idle>-0     [057] d..2  9419.845636: hrtimer_cancel: hrtimer=ffff8017db99e808
          <idle>-0     [057] d..2  9419.845636: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418188000000 softexpires=9418188000000
          <idle>-0     [057] d.h2  9419.869621: hrtimer_cancel: hrtimer=ffff8017db99e808
          <idle>-0     [057] d.h1  9419.869621: hrtimer_expire_entry: hrtimer=ffff8017db99e808 function=tick_sched_timer now=9418188001420
          <idle>-0     [057] d.h1  9419.869622: hrtimer_expire_exit: hrtimer=ffff8017db99e808
          <idle>-0     [057] d..2  9419.869626: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9858983202655 softexpires=9858983202655
          <idle>-0     [016] d.h2  9419.885626: hrtimer_cancel: hrtimer=ffff8017fbc3d808
          <idle>-0     [016] d.h1  9419.885627: hrtimer_expire_entry: hrtimer=ffff8017fbc3d808 function=tick_sched_timer now=9418204006760
          <idle>-0     [016] d.h1  9419.885629: hrtimer_expire_exit: hrtimer=ffff8017fbc3d808
          <idle>-0     [016] d.s2  9419.885629: timer_cancel: timer=ffff8017d37dbca0
          <idle>-0     [016] ..s1  9419.885630: timer_expire_entry: timer=ffff8017d37dbca0 function=process_timeout now=4297246848
          <idle>-0     [016] .ns1  9419.885631: timer_expire_exit: timer=ffff8017d37dbca0
          <idle>-0     [016] dn.2  9419.885636: hrtimer_start: hrtimer=ffff8017fbc3d808 function=tick_sched_timer expires=9418208000000 softexpires=9418208000000
      khugepaged-778   [016] ....  9419.885668: timer_init: timer=ffff8017d37dbca0
      khugepaged-778   [016] d..1  9419.885668: timer_start: timer=ffff8017d37dbca0 function=process_timeout expires=4297249348 [timeout=2500] cpu=16 idx=0 flags=
          <idle>-0     [016] d..1  9419.885670: tick_stop: success=1 dependency=NONE
          <idle>-0     [016] d..2  9419.885671: hrtimer_cancel: hrtimer=ffff8017fbc3d808
          <idle>-0     [016] d..2  9419.885671: hrtimer_start: hrtimer=ffff8017fbc3d808 function=tick_sched_timer expires=9428444000000 softexpires=9428444000000
          <idle>-0     [045] d.h2  9419.890839: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9419.890839: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418209219940
          <idle>-0     [045] d.h3  9419.890844: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418310221420 softexpires=9418309221420
          <idle>-0     [045] d.h1  9419.890844: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [000] d.h2  9419.917625: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.h1  9419.917626: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9418236005860
          <idle>-0     [000] d.h1  9419.917628: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.s2  9419.917628: timer_cancel: timer=ffff80177fdc0840
          <idle>-0     [000] ..s1  9419.917629: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_disable_link now=4297246856
          <idle>-0     [000] d.s2  9419.917630: timer_start: timer=ffff80177fdc0840 function=link_timeout_enable_link expires=4297246881 [timeout=25] cpu=0 idx=81 flags=
          <idle>-0     [000] ..s1  9419.917633: timer_expire_exit: timer=ffff80177fdc0840
          <idle>-0     [000] d..2  9419.917648: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9418340000000 softexpires=9418340000000
          <idle>-0     [045] d.h2  9419.991845: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9419.991845: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418310225960
          <idle>-0     [045] d.h3  9419.991849: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418411227320 softexpires=9418410227320
          <idle>-0     [045] d.h1  9419.991850: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [000] d.h2  9420.021625: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.h1  9420.021625: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9418340005520
          <idle>-0     [000] d.h1  9420.021627: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.s2  9420.021627: timer_cancel: timer=ffff80177fdc0840
          <idle>-0     [000] ..s1  9420.021628: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_enable_link now=4297246882
          <idle>-0     [000] d.s2  9420.021629: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297247107 [timeout=225] cpu=0 idx=34 flags=
          <idle>-0     [000] ..s1  9420.021632: timer_expire_exit: timer=ffff80177fdc0840
          <idle>-0     [000] d..2  9420.021639: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419260000000 softexpires=9419260000000
          <idle>-0     [045] d.h2  9420.092851: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9420.092852: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418411231780
          <idle>-0     [045] d.h3  9420.092856: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418512233720 softexpires=9418511233720
          <idle>-0     [045] d.h1  9420.092856: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [055] d.h2  9420.141622: hrtimer_cancel: hrtimer=ffff8017db968808
          <idle>-0     [055] d.h1  9420.141623: hrtimer_expire_entry: hrtimer=ffff8017db968808 function=tick_sched_timer now=9418460002540
          <idle>-0     [055] d.h1  9420.141625: hrtimer_expire_exit: hrtimer=ffff8017db968808
          <idle>-0     [055] d.s2  9420.141626: timer_cancel: timer=ffff80177db6cc08
          <idle>-0     [055] d.s1  9420.141626: timer_expire_entry: timer=ffff80177db6cc08 function=delayed_work_timer_fn now=4297246912
          <idle>-0     [055] dns1  9420.141628: timer_expire_exit: timer=ffff80177db6cc08
          <idle>-0     [055] dn.2  9420.141632: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9418464000000 softexpires=9418464000000
    kworker/55:1-1246  [055] d..1  9420.141634: timer_start: timer=ffff80177db6cc08 function=delayed_work_timer_fn expires=4297247162 [timeout=250] cpu=55 idx=88 flags=I
          <idle>-0     [055] d..1  9420.141637: tick_stop: success=1 dependency=NONE
          <idle>-0     [055] d..2  9420.141637: hrtimer_cancel: hrtimer=ffff8017db968808
          <idle>-0     [055] d..2  9420.141637: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9419484000000 softexpires=9419484000000
          <idle>-0     [045] d.h2  9420.193855: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9420.193855: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418512235660
          <idle>-0     [045] d.h3  9420.193859: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418613237260 softexpires=9418612237260
          <idle>-0     [045] d.h1  9420.193860: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h2  9420.294858: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9420.294858: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418613238380
          <idle>-0     [045] d.h3  9420.294862: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418714240000 softexpires=9418713240000
          <idle>-0     [045] d.h1  9420.294863: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h2  9420.395861: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9420.395861: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418714241380
          <idle>-0     [045] d.h3  9420.395865: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418815242920 softexpires=9418814242920
          <idle>-0     [045] d.h1  9420.395865: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [042] d.h2  9420.461621: hrtimer_cancel: hrtimer=ffff8017dbb69808
          <idle>-0     [042] d.h1  9420.461622: hrtimer_expire_entry: hrtimer=ffff8017dbb69808 function=tick_sched_timer now=9418780002180
          <idle>-0     [042] d.h1  9420.461623: hrtimer_expire_exit: hrtimer=ffff8017dbb69808
          <idle>-0     [042] d.s2  9420.461624: timer_cancel: timer=ffff80177db6d408
          <idle>-0     [042] d.s1  9420.461625: timer_expire_entry: timer=ffff80177db6d408 function=delayed_work_timer_fn now=4297246992
          <idle>-0     [042] dns1  9420.461627: timer_expire_exit: timer=ffff80177db6d408
          <idle>-0     [042] dns2  9420.461627: timer_cancel: timer=ffff8017797d7868
          <idle>-0     [042] .ns1  9420.461628: timer_expire_entry: timer=ffff8017797d7868 function=hns_nic_service_timer now=4297246992
          <idle>-0     [042] dns2  9420.461628: timer_start: timer=ffff8017797d7868 function=hns_nic_service_timer expires=4297247242 [timeout=250] cpu=42 idx=98 flags=
          <idle>-0     [042] .ns1  9420.461629: timer_expire_exit: timer=ffff8017797d7868
          <idle>-0     [042] dn.2  9420.461632: hrtimer_start: hrtimer=ffff8017dbb69808 function=tick_sched_timer expires=9418784000000 softexpires=9418784000000
    kworker/42:1-1223  [042] d..1  9420.461773: timer_start: timer=ffff80177db6d408 function=delayed_work_timer_fn expires=4297247242 [timeout=250] cpu=42 idx=98 flags=I
          <idle>-0     [042] d..1  9420.461866: tick_stop: success=1 dependency=NONE
          <idle>-0     [042] d..2  9420.461867: hrtimer_cancel: hrtimer=ffff8017dbb69808
          <idle>-0     [042] d..2  9420.461867: hrtimer_start: hrtimer=ffff8017dbb69808 function=tick_sched_timer expires=9419804000000 softexpires=9419804000000
          <idle>-0     [045] d.h2  9420.496864: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9420.496864: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418815244580
          <idle>-0     [045] d.h3  9420.496868: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418916246140 softexpires=9418915246140
          <idle>-0     [045] d.h1  9420.496868: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h2  9420.597866: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9420.597867: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418916247280
          <idle>-0     [045] d.h3  9420.597871: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419017248760 softexpires=9419016248760
          <idle>-0     [045] d.h1  9420.597871: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [033] d.h2  9420.621621: hrtimer_cancel: hrtimer=ffff8017dba76808
          <idle>-0     [033] d.h1  9420.621622: hrtimer_expire_entry: hrtimer=ffff8017dba76808 function=tick_sched_timer now=9418940002160
          <idle>-0     [033] d.h1  9420.621623: hrtimer_expire_exit: hrtimer=ffff8017dba76808
          <idle>-0     [033] d.s2  9420.621624: timer_cancel: timer=ffff00000917be40
          <idle>-0     [033] d.s1  9420.621625: timer_expire_entry: timer=ffff00000917be40 function=delayed_work_timer_fn now=4297247032
          <idle>-0     [033] dns1  9420.621626: timer_expire_exit: timer=ffff00000917be40
          <idle>-0     [033] dn.2  9420.621630: hrtimer_start: hrtimer=ffff8017dba76808 function=tick_sched_timer expires=9418944000000 softexpires=9418944000000
           <...>-1631  [033] d..1  9420.621636: timer_start: timer=ffff00000917be40 function=delayed_work_timer_fn expires=4297247282 [timeout=250] cpu=33 idx=103 flags=I
          <idle>-0     [033] d..1  9420.621639: tick_stop: success=1 dependency=NONE
          <idle>-0     [033] d..2  9420.621639: hrtimer_cancel: hrtimer=ffff8017dba76808
          <idle>-0     [033] d..2  9420.621639: hrtimer_start: hrtimer=ffff8017dba76808 function=tick_sched_timer expires=9419964000000 softexpires=9419964000000
          <idle>-0     [000] dn.2  9420.691401: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] dn.2  9420.691401: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419012000000 softexpires=9419012000000
          <idle>-0     [002] dn.2  9420.691408: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] dn.2  9420.691408: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419012000000 softexpires=9419012000000
          <idle>-0     [000] d..1  9420.691409: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2  9420.691409: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d..2  9420.691409: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419260000000 softexpires=9419260000000
          <idle>-0     [002] d..1  9420.691423: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9420.691423: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d..2  9420.691424: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9859803202655 softexpires=9859803202655
          <idle>-0     [045] d.h2  9420.698872: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9420.698873: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419017253180
          <idle>-0     [045] d.h3  9420.698877: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419118254640 softexpires=9419117254640
          <idle>-0     [045] d.h1  9420.698877: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h2  9420.799875: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9420.799875: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419118255760
          <idle>-0     [045] d.h3  9420.799879: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419219257140 softexpires=9419218257140
          <idle>-0     [045] d.h1  9420.799880: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [000] dn.2  9420.871369: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] dn.2  9420.871370: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419192000000 softexpires=9419192000000
          <idle>-0     [002] dn.2  9420.871375: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [000] d..1  9420.871376: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] dn.2  9420.871376: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419192000000 softexpires=9419192000000
          <idle>-0     [000] d..2  9420.871376: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d..2  9420.871376: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419260000000 softexpires=9419260000000
          <idle>-0     [002] d..1  9420.871398: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9420.871398: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d..2  9420.871398: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9859983202655 softexpires=9859983202655
          <idle>-0     [045] d.h2  9420.900881: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9420.900881: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419219261580
          <idle>-0     [045] d.h3  9420.900885: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419320263160 softexpires=9419319263160
          <idle>-0     [045] d.h1  9420.900886: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [001] d..2  9420.913601: hrtimer_cancel: hrtimer=ffff8017fbe5b808
          <idle>-0     [001] d..2  9420.913601: hrtimer_start: hrtimer=ffff8017fbe5b808 function=tick_sched_timer expires=9860023202655 softexpires=9860023202655
          <idle>-0     [000] d.h2  9420.941621: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.h1  9420.941621: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9419260001400
          <idle>-0     [000] d.h1  9420.941623: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.s2  9420.941623: timer_cancel: timer=ffff80177fdc0840
          <idle>-0     [000] ..s1  9420.941624: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_disable_link now=4297247112
          <idle>-0     [000] d.s2  9420.941624: timer_start: timer=ffff80177fdc0840 function=link_timeout_enable_link expires=4297247137 [timeout=25] cpu=0 idx=113 flags=
          <idle>-0     [000] ..s1  9420.941628: timer_expire_exit: timer=ffff80177fdc0840
          <idle>-0     [000] d.s2  9420.941629: timer_cancel: timer=ffff8017fbe42558
          <idle>-0     [000] d.s1  9420.941629: timer_expire_entry: timer=ffff8017fbe42558 function=delayed_work_timer_fn now=4297247112
          <idle>-0     [000] dns1  9420.941630: timer_expire_exit: timer=ffff8017fbe42558
          <idle>-0     [000] dns2  9420.941631: timer_cancel: timer=ffff00000910a628
          <idle>-0     [000] dns1  9420.941631: timer_expire_entry: timer=ffff00000910a628 function=delayed_work_timer_fn now=4297247112
          <idle>-0     [000] dns1  9420.941631: timer_expire_exit: timer=ffff00000910a628
          <idle>-0     [000] dn.2  9420.941634: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419264000000 softexpires=9419264000000
          <idle>-0     [002] dn.2  9420.941643: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] dn.2  9420.941643: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419264000000 softexpires=9419264000000
     kworker/0:0-3     [000] d..1  9420.941650: timer_start: timer=ffff00000910a628 function=delayed_work_timer_fn expires=4297247500 [timeout=388] cpu=0 idx=100 flags=D|I
     kworker/2:0-22    [002] d..1  9420.941651: timer_start: timer=ffff8017fbe78558 function=delayed_work_timer_fn expires=4297247494 [timeout=382] cpu=2 idx=114 flags=D|I
          <idle>-0     [000] d..1  9420.941652: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2  9420.941652: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d..2  9420.941653: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419364000000 softexpires=9419364000000
          <idle>-0     [002] d..1  9420.941654: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9420.941654: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d..2  9420.941654: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9860055202655 softexpires=9860055202655
          <idle>-0     [045] d.h2  9421.001887: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9421.001887: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419320267640
          <idle>-0     [045] d.h3  9421.001891: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419421269000 softexpires=9419420269000
          <idle>-0     [045] d.h1  9421.001892: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [000] d.h2  9421.045625: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.h1  9421.045625: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9419364005380
          <idle>-0     [000] d.h1  9421.045626: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.s2  9421.045627: timer_cancel: timer=ffff80177fdc0840
          <idle>-0     [000] ..s1  9421.045628: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_enable_link now=4297247138
          <idle>-0     [000] d.s2  9421.045628: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297247363 [timeout=225] cpu=0 idx=34 flags=
          <idle>-0     [000] ..s1  9421.045631: timer_expire_exit: timer=ffff80177fdc0840
          <idle>-0     [000] d..2  9421.045644: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9420284000000 softexpires=9420284000000
          <idle>-0     [045] d.h2  9421.102893: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9421.102893: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419421273420
          <idle>-0     [045] d.h3  9421.102897: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419522275040 softexpires=9419521275040
          <idle>-0     [045] d.h1  9421.102897: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [055] d.h2  9421.165621: hrtimer_cancel: hrtimer=ffff8017db968808
          <idle>-0     [055] d.h1  9421.165622: hrtimer_expire_entry: hrtimer=ffff8017db968808 function=tick_sched_timer now=9419484002280
          <idle>-0     [055] d.h1  9421.165624: hrtimer_expire_exit: hrtimer=ffff8017db968808
          <idle>-0     [055] d.s2  9421.165624: timer_cancel: timer=ffff80177db6cc08
          <idle>-0     [055] d.s1  9421.165625: timer_expire_entry: timer=ffff80177db6cc08 function=delayed_work_timer_fn now=4297247168
          <idle>-0     [055] dns1  9421.165626: timer_expire_exit: timer=ffff80177db6cc08
          <idle>-0     [055] dn.2  9421.165629: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9419488000000 softexpires=9419488000000
    kworker/55:1-1246  [055] d..1  9421.165632: timer_start: timer=ffff80177db6cc08 function=delayed_work_timer_fn expires=4297247418 [timeout=250] cpu=55 idx=120 flags=I
          <idle>-0     [055] d..1  9421.165634: tick_stop: success=1 dependency=NONE
          <idle>-0     [055] d..2  9421.165634: hrtimer_cancel: hrtimer=ffff8017db968808
          <idle>-0     [055] d..2  9421.165635: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9420508000000 softexpires=9420508000000
          <idle>-0     [045] d.h2  9421.203896: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9421.203896: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419522276980
          <idle>-0     [045] d.h3  9421.203900: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419623278460 softexpires=9419622278460
          <idle>-0     [045] d.h1  9421.203901: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h2  9421.304899: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9421.304899: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419623279580
          <idle>-0     [045] d.h3  9421.304903: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419724281060 softexpires=9419723281060
          <idle>-0     [045] d.h1  9421.304903: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [000] dn.2  9421.381179: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] dn.2  9421.381179: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419700000000 softexpires=9419700000000
          <idle>-0     [002] dn.2  9421.381185: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] dn.2  9421.381185: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419700000000 softexpires=9419700000000
          <idle>-0     [000] d..1  9421.381185: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2  9421.381186: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d..2  9421.381186: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9420284000000 softexpires=9420284000000
              sh-2256  [002] ....  9421.381193: timer_init: timer=ffff80176c26fb40
              sh-2256  [002] d..1  9421.381194: timer_start: timer=ffff80176c26fb40 function=process_timeout expires=4297247223 [timeout=2] cpu=2 idx=0 flags=
          <idle>-0     [002] d..1  9421.381196: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9421.381197: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d..2  9421.381197: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419708000000 softexpires=9419708000000
          <idle>-0     [002] d.h2  9421.389621: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d.h1  9421.389622: hrtimer_expire_entry: hrtimer=ffff8017fbe76808 function=tick_sched_timer now=9419708002000
          <idle>-0     [002] d.h1  9421.389623: hrtimer_expire_exit: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d.s2  9421.389624: timer_cancel: timer=ffff80176c26fb40
          <idle>-0     [002] ..s1  9421.389624: timer_expire_entry: timer=ffff80176c26fb40 function=process_timeout now=4297247224
          <idle>-0     [002] .ns1  9421.389626: timer_expire_exit: timer=ffff80176c26fb40
          <idle>-0     [002] dn.2  9421.389629: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419712000000 softexpires=9419712000000
              sh-2256  [002] ...1  9421.389682: hrtimer_init: hrtimer=ffff8017d4dde8a0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
              sh-2256  [002] ...1  9421.389682: hrtimer_init: hrtimer=ffff8017d4dde8e0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
              sh-2256  [002] ....  9421.389690: hrtimer_init: hrtimer=ffff80176cbb0088 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
          <idle>-0     [039] dn.2  9421.389814: hrtimer_start: hrtimer=ffff8017dbb18808 function=tick_sched_timer expires=9419712000000 softexpires=9419712000000
          <idle>-0     [002] d..1  9421.389896: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9421.389897: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d..2  9421.389898: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419724000000 softexpires=9419724000000
          <idle>-0     [002] dn.2  9444.510766: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] dn.2  9444.510767: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442832000000 softexpires=9442832000000
          <idle>-0     [036] d..1  9444.510812: tick_stop: success=1 dependency=NONE
          <idle>-0     [036] d..2  9444.510814: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d..2  9444.510815: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442844000000 softexpires=9442844000000
              sh-2256  [002] ....  9444.510857: timer_init: timer=ffff80176c26fb40
              sh-2256  [002] d..1  9444.510857: timer_start: timer=ffff80176c26fb40 function=process_timeout expires=4297253006 [timeout=2] cpu=2 idx=0 flags=
          <idle>-0     [002] d..1  9444.510864: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9444.510865: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d..2  9444.510866: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442844000000 softexpires=9442844000000
          <idle>-0     [000] d.h2  9444.525625: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [002] d.h2  9444.525625: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [036] d.h2  9444.525625: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [002] d.h1  9444.525625: hrtimer_expire_entry: hrtimer=ffff8017fbe76808 function=tick_sched_timer now=9442844005600
          <idle>-0     [036] d.h1  9444.525625: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442844005460
          <idle>-0     [000] d.h1  9444.525627: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9442844005300
          <idle>-0     [002] d.h1  9444.525627: hrtimer_expire_exit: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d.s2  9444.525629: timer_cancel: timer=ffff8017fbe78558
          <idle>-0     [036] d.h1  9444.525629: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
          <idle>-0     [002] d.s1  9444.525629: timer_expire_entry: timer=ffff8017fbe78558 function=delayed_work_timer_fn now=4297253008
          <idle>-0     [000] d.h1  9444.525629: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.s2  9444.525631: timer_cancel: timer=ffff80177fdc0840
          <idle>-0     [000] ..s1  9444.525631: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_disable_link now=4297253008
          <idle>-0     [002] dns1  9444.525631: timer_expire_exit: timer=ffff8017fbe78558
          <idle>-0     [000] d.s2  9444.525632: timer_start: timer=ffff80177fdc0840 function=link_timeout_enable_link expires=4297253033 [timeout=25] cpu=0 idx=82 flags=
          <idle>-0     [000] ..s1  9444.525633: timer_expire_exit: timer=ffff80177fdc0840
          <idle>-0     [000] d.s2  9444.525634: timer_cancel: timer=ffff00000910a628
          <idle>-0     [000] d.s1  9444.525634: timer_expire_entry: timer=ffff00000910a628 function=delayed_work_timer_fn now=4297253008
          <idle>-0     [000] dns1  9444.525636: timer_expire_exit: timer=ffff00000910a628
          <idle>-0     [036] dn.2  9444.525639: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442848000000 softexpires=9442848000000
          <idle>-0     [000] dn.2  9444.525640: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9442848000000 softexpires=9442848000000
          <idle>-0     [002] dn.2  9444.525640: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442848000000 softexpires=9442848000000
     rcu_preempt-9     [036] ....  9444.525648: timer_init: timer=ffff8017d5fcfda0
          <idle>-0     [002] d..1  9444.525648: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9444.525648: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d..2  9444.525649: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442860000000 softexpires=9442860000000
     rcu_preempt-9     [036] d..1  9444.525649: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297253009 [timeout=1] cpu=36 idx=0 flags=
     kworker/0:0-3     [000] d..1  9444.525650: timer_start: timer=ffff00000910a628 function=delayed_work_timer_fn expires=4297253250 [timeout=242] cpu=0 idx=82 flags=D|I
          <idle>-0     [000] d..1  9444.525652: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2  9444.525652: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d..2  9444.525653: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9442948000000 softexpires=9442948000000
          <idle>-0     [036] d..1  9444.525653: tick_stop: success=1 dependency=NONE
          <idle>-0     [036] d..2  9444.525654: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d..2  9444.525654: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442852000000 softexpires=9442852000000
          <idle>-0     [036] d.h2  9444.533624: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d.h1  9444.533625: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442852004760
          <idle>-0     [036] d.h1  9444.533626: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d.s2  9444.533627: timer_cancel: timer=ffff8017d5fcfda0
          <idle>-0     [036] ..s1  9444.533628: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297253010
          <idle>-0     [036] .ns1  9444.533629: timer_expire_exit: timer=ffff8017d5fcfda0
          <idle>-0     [036] dn.2  9444.533634: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442856000000 softexpires=9442856000000
          <idle>-0     [036] d..1  9444.533668: tick_stop: success=1 dependency=NONE
          <idle>-0     [036] d..2  9444.533668: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d..2  9444.533669: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442876000000 softexpires=9442876000000
          <idle>-0     [002] dnh2  9444.541626: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] dnh1  9444.541627: hrtimer_expire_entry: hrtimer=ffff8017fbe76808 function=tick_sched_timer now=9442860007120
          <idle>-0     [002] dnh1  9444.541629: hrtimer_expire_exit: hrtimer=ffff8017fbe76808
          <idle>-0     [002] dn.2  9444.541630: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442864000000 softexpires=9442864000000
          <idle>-0     [002] d..1  9444.541640: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9444.541640: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d..2  9444.541640: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9444316000000 softexpires=9444316000000
          <idle>-0     [036] dnh2  9444.557627: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] dnh1  9444.557628: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442876008220
          <idle>-0     [036] dnh1  9444.557630: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
          <idle>-0     [036] dn.2  9444.557631: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442880000000 softexpires=9442880000000
          <idle>-0     [036] d..1  9444.557644: tick_stop: success=1 dependency=NONE
          <idle>-0     [036] d..2  9444.557645: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d..2  9444.557645: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442892000000 softexpires=9442892000000
          <idle>-0     [036] d.h2  9444.573621: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d.h1  9444.573621: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442892001340
          <idle>-0     [036] d.h1  9444.573622: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
          <idle>-0     [036] dn.2  9444.573628: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442896000000 softexpires=9442896000000
     rcu_preempt-9     [036] ....  9444.573631: timer_init: timer=ffff8017d5fcfda0
     rcu_preempt-9     [036] d..1  9444.573632: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297253021 [timeout=1] cpu=36 idx=0 flags=
          <idle>-0     [036] d..1  9444.573634: tick_stop: success=1 dependency=NONE
          <idle>-0     [036] d..2  9444.573635: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d..2  9444.573635: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442900000000 softexpires=9442900000000
          <idle>-0     [036] d.h2  9444.581621: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d.h1  9444.581621: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442900001400
          <idle>-0     [036] d.h1  9444.581622: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d.s2  9444.581623: timer_cancel: timer=ffff8017d5fcfda0
          <idle>-0     [036] ..s1  9444.581623: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297253022
          <idle>-0     [036] .ns1  9444.581625: timer_expire_exit: timer=ffff8017d5fcfda0
          <idle>-0     [036] dn.2  9444.581628: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442904000000 softexpires=9442904000000
          <idle>-0     [036] d..1  9444.581636: tick_stop: success=1 dependency=NONE
          <idle>-0     [036] d..2  9444.581636: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d..2  9444.581637: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442924000000 softexpires=9442924000000
          <idle>-0     [045] d.h2  9444.581718: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9444.581719: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9442900098200
          <idle>-0     [045] d.h3  9444.581724: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443001101380 softexpires=9443000101380
          <idle>-0     [045] d.h1  9444.581725: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [036] d.h2  9444.605621: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d.h1  9444.605621: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442924001600
          <idle>-0     [036] d.h1  9444.605622: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d..2  9444.605629: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9883719202655 softexpires=9883719202655
          <idle>-0     [000] d.h2  9444.629625: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.h1  9444.629625: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9442948005580
          <idle>-0     [000] d.h1  9444.629627: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.s2  9444.629628: timer_cancel: timer=ffff80177fdc0840
          <idle>-0     [000] ..s1  9444.629628: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_enable_link now=4297253034
          <idle>-0     [000] d.s2  9444.629629: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297253259 [timeout=225] cpu=0 idx=42 flags=
          <idle>-0     [000] ..s1  9444.629638: timer_expire_exit: timer=ffff80177fdc0840
          <idle>-0     [000] d..2  9444.629661: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9443868000000 softexpires=9443868000000
          <idle>-0     [045] d.h2  9444.682725: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9444.682725: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443001105940
          <idle>-0     [045] d.h3  9444.682730: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443102107440 softexpires=9443101107440
          <idle>-0     [045] d.h1  9444.682730: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [055] d.h2  9444.717626: hrtimer_cancel: hrtimer=ffff8017db968808
          <idle>-0     [055] d.h1  9444.717627: hrtimer_expire_entry: hrtimer=ffff8017db968808 function=tick_sched_timer now=9443036006240
          <idle>-0     [055] d.h1  9444.717629: hrtimer_expire_exit: hrtimer=ffff8017db968808
          <idle>-0     [055] d.s2  9444.717630: timer_cancel: timer=ffff80177db6cc08
          <idle>-0     [055] d.s1  9444.717630: timer_expire_entry: timer=ffff80177db6cc08 function=delayed_work_timer_fn now=4297253056
          <idle>-0     [055] dns1  9444.717633: timer_expire_exit: timer=ffff80177db6cc08
          <idle>-0     [055] dn.2  9444.717637: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9443040000000 softexpires=9443040000000
    kworker/55:1-1246  [055] d..1  9444.717640: timer_start: timer=ffff80177db6cc08 function=delayed_work_timer_fn expires=4297253306 [timeout=250] cpu=55 idx=88 flags=I
          <idle>-0     [055] d..1  9444.717643: tick_stop: success=1 dependency=NONE
          <idle>-0     [055] d..2  9444.717643: hrtimer_cancel: hrtimer=ffff8017db968808
          <idle>-0     [055] d..2  9444.717644: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9444060000000 softexpires=9444060000000
          <idle>-0     [045] d.h2  9444.783729: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9444.783729: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443102109380
          <idle>-0     [045] d.h3  9444.783733: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443203110880 softexpires=9443202110880
          <idle>-0     [045] d.h1  9444.783733: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h2  9444.884731: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9444.884731: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443203112000
          <idle>-0     [045] d.h3  9444.884735: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443304113380 softexpires=9443303113380
          <idle>-0     [045] d.h1  9444.884736: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h2  9444.985734: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9444.985735: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443304114500
          <idle>-0     [045] d.h3  9444.985738: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443405116440 softexpires=9443404116440
          <idle>-0     [045] d.h1  9444.985739: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [042] d.h2  9445.037622: hrtimer_cancel: hrtimer=ffff8017dbb69808
> 
> Thanks,
> 
> Jonathan
> > 
> > 							Thanx, Paul
> >   
> > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > [ 1984.638153] rcu_preempt     S    0     9      2 0x00000000
> > > [ 1984.643626] Call trace:
> > > [ 1984.646059] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> > > [ 1984.651189] [<ffff000008962274>] __schedule+0x19c/0x5d8
> > > [ 1984.656400] [<ffff0000089626e8>] schedule+0x38/0xa0
> > > [ 1984.661266] [<ffff000008965844>] schedule_timeout+0x124/0x218
> > > [ 1984.667002] [<ffff000008121424>] rcu_gp_kthread+0x4fc/0x748
> > > [ 1984.672564] [<ffff0000080df0b4>] kthread+0xfc/0x128
> > > [ 1984.677429] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
> > >     
> >   
> 
> _______________________________________________
> linuxarm mailing list
> linuxarm@huawei.com
> http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm

^ permalink raw reply	[flat|nested] 241+ messages in thread

* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-27 16:39                                                       ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-27 16:39 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 27 Jul 2017 14:49:03 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Thu, 27 Jul 2017 05:49:13 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:  
> > > On Wed, 26 Jul 2017 18:42:14 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >     
> > > > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:    
> > >     
> > > > > Indeed, that really wouldn't explain how we end up with a RCU stall
> > > > > dump listing almost all of the cpus as having missed a grace period.      
> > > > 
> > > > I have seen stranger things, but admittedly not often.    
> > > 
> > > So the backtraces show the RCU gp thread in schedule_timeout.
> > > 
> > > Are you sure that its timeout has expired and it's not being scheduled,
> > > or could it be a bad (large) timeout (looks unlikely) or that it's being
> > > scheduled but not correctly noting gps on other CPUs?
> > > 
> > > It's not in R state, so if it's not being scheduled at all, then it's
> > > because the timer has not fired:    
> > 
> > Good point, Nick!
> > 
> > Jonathan, could you please reproduce collecting timer event tracing?  
> I'm a little new to tracing (only started playing with it last week)
> so fingers crossed I've set it up right.  No splats yet.  Was getting
> splats on reading out the trace when running with the RCU stall timer
> set to 4 so have increased that back to the default and am rerunning.
> 
> This may take a while.  To save time, correct me if I've gotten this wrong:
> 
> echo "timer:*" > /sys/kernel/debug/tracing/set_event
> 
> and when it dumps, I just send you the relevant part of what is in
> /sys/kernel/debug/tracing/trace?
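
For reference, that boils down to roughly the following (a rough sketch;
it assumes debugfs is mounted at /sys/kernel/debug and that the default
ftrace buffer is large enough to cover the window of interest):

  # enable all timer-related trace events (timer_*, hrtimer_*, tick_stop, ...)
  echo "timer:*" > /sys/kernel/debug/tracing/set_event
  # make sure tracing is enabled (it usually is by default)
  echo 1 > /sys/kernel/debug/tracing/tracing_on
  # ... run until the RCU stall splat appears ...
  # then snapshot the buffer and pull out the window around the stall
  cat /sys/kernel/debug/tracing/trace > timer-trace.txt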

Interestingly, the only thing that can make it trip for me with tracing
on is peeking in the tracing buffers.  Not sure whether that is a valid
case or not.

Anyhow all timer activity seems to stop around the area of interest.
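
One crude way to spot that in the captured trace is to look, per CPU,
for long gaps between consecutive timer events.  A quick awk sketch
(it only keys off the [CPU] column and the timestamp; the 5 second
threshold is arbitrary, and idle NO_HZ CPUs legitimately go quiet for
long stretches, so this only narrows down the window of interest):

  awk '$2 ~ /^\[/ {
          cpu = $2; ts = $4; sub(/:$/, "", ts)
          if (cpu in last && ts - last[cpu] > 5)
                  printf "%s  %.3fs gap ending at %s\n", cpu, ts - last[cpu], ts
          last[cpu] = ts
  }' timer-trace.txt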


[ 9442.413624] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 9442.419107] 	1-...: (1 GPs behind) idle=844/0/0 softirq=27747/27755 fqs=0 last_accelerate: dd6a/de80, nonlazy_posted: 0, L.
[ 9442.430224] 	3-...: (2 GPs behind) idle=8f8/0/0 softirq=32197/32198 fqs=0 last_accelerate: 29b1/de80, nonlazy_posted: 0, L.
[ 9442.441340] 	4-...: (7 GPs behind) idle=740/0/0 softirq=22351/22352 fqs=0 last_accelerate: ca88/de80, nonlazy_posted: 0, L.
[ 9442.452456] 	5-...: (2 GPs behind) idle=9b0/0/0 softirq=21315/21319 fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
[ 9442.463572] 	6-...: (2 GPs behind) idle=794/0/0 softirq=19699/19707 fqs=0 last_accelerate: ba62/de88, nonlazy_posted: 0, L.
[ 9442.474688] 	7-...: (2 GPs behind) idle=ac4/0/0 softirq=22547/22554 fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
[ 9442.485803] 	8-...: (9 GPs behind) idle=118/0/0 softirq=281/291 fqs=0 last_accelerate: c3fe/de88, nonlazy_posted: 0, L.
[ 9442.496571] 	9-...: (9 GPs behind) idle=8fc/0/0 softirq=284/292 fqs=0 last_accelerate: 6030/de88, nonlazy_posted: 0, L.
[ 9442.507339] 	10-...: (14 GPs behind) idle=f78/0/0 softirq=254/254 fqs=0 last_accelerate: 5487/de88, nonlazy_posted: 0, L.
[ 9442.518281] 	11-...: (9 GPs behind) idle=c9c/0/0 softirq=301/308 fqs=0 last_accelerate: 3d3e/de99, nonlazy_posted: 0, L.
[ 9442.529136] 	12-...: (9 GPs behind) idle=4a4/0/0 softirq=735/737 fqs=0 last_accelerate: 6010/de99, nonlazy_posted: 0, L.
[ 9442.539992] 	13-...: (9 GPs behind) idle=34c/0/0 softirq=1121/1131 fqs=0 last_accelerate: b280/de99, nonlazy_posted: 0, L.
[ 9442.551020] 	14-...: (9 GPs behind) idle=2f4/0/0 softirq=707/713 fqs=0 last_accelerate: 6030/de99, nonlazy_posted: 0, L.
[ 9442.561875] 	15-...: (2 GPs behind) idle=b30/0/0 softirq=821/976 fqs=0 last_accelerate: c208/de99, nonlazy_posted: 0, L.
[ 9442.572730] 	17-...: (2 GPs behind) idle=5a8/0/0 softirq=1456/1565 fqs=0 last_accelerate: ca88/de99, nonlazy_posted: 0, L.
[ 9442.583759] 	18-...: (2 GPs behind) idle=2e4/0/0 softirq=1923/1936 fqs=0 last_accelerate: ca88/dea7, nonlazy_posted: 0, L.
[ 9442.594787] 	19-...: (2 GPs behind) idle=138/0/0 softirq=1421/1432 fqs=0 last_accelerate: b280/dea7, nonlazy_posted: 0, L.
[ 9442.605816] 	20-...: (50 GPs behind) idle=634/0/0 softirq=217/219 fqs=0 last_accelerate: c96f/dea7, nonlazy_posted: 0, L.
[ 9442.616758] 	21-...: (2 GPs behind) idle=eb8/0/0 softirq=1368/1369 fqs=0 last_accelerate: b599/deb2, nonlazy_posted: 0, L.
[ 9442.627786] 	22-...: (1 GPs behind) idle=aa8/0/0 softirq=229/232 fqs=0 last_accelerate: c604/deb2, nonlazy_posted: 0, L.
[ 9442.638641] 	23-...: (1 GPs behind) idle=488/0/0 softirq=247/248 fqs=0 last_accelerate: c600/deb2, nonlazy_posted: 0, L.
[ 9442.649496] 	24-...: (33 GPs behind) idle=f7c/0/0 softirq=319/319 fqs=0 last_accelerate: 5290/deb2, nonlazy_posted: 0, L.
[ 9442.660437] 	25-...: (33 GPs behind) idle=944/0/0 softirq=308/308 fqs=0 last_accelerate: 52c0/deb2, nonlazy_posted: 0, L.
[ 9442.671379] 	26-...: (9 GPs behind) idle=6d4/0/0 softirq=265/275 fqs=0 last_accelerate: 6034/dec0, nonlazy_posted: 0, L.
[ 9442.682234] 	27-...: (115 GPs behind) idle=e3c/0/0 softirq=212/226 fqs=0 last_accelerate: 5420/dec0, nonlazy_posted: 0, L.
[ 9442.693263] 	28-...: (9 GPs behind) idle=ea4/0/0 softirq=540/552 fqs=0 last_accelerate: 603c/dec0, nonlazy_posted: 0, L.
[ 9442.704118] 	29-...: (115 GPs behind) idle=83c/0/0 softirq=342/380 fqs=0 last_accelerate: 5420/dec0, nonlazy_posted: 0, L.
[ 9442.715147] 	30-...: (33 GPs behind) idle=e3c/0/0 softirq=509/509 fqs=0 last_accelerate: 52bc/dec0, nonlazy_posted: 0, L.
[ 9442.726088] 	31-...: (9 GPs behind) idle=df4/0/0 softirq=619/641 fqs=0 last_accelerate: 603c/decb, nonlazy_posted: 0, L.
[ 9442.736944] 	32-...: (9 GPs behind) idle=aa4/0/0 softirq=1841/1848 fqs=0 last_accelerate: 6030/decb, nonlazy_posted: 0, L.
[ 9442.747972] 	34-...: (9 GPs behind) idle=e6c/0/0 softirq=5082/5086 fqs=0 last_accelerate: 6039/decb, nonlazy_posted: 0, L.
[ 9442.759001] 	35-...: (9 GPs behind) idle=7fc/0/0 softirq=1396/1406 fqs=0 last_accelerate: 603e/decb, nonlazy_posted: 0, L.
[ 9442.770030] 	36-...: (0 ticks this GP) idle=f28/0/0 softirq=255/255 fqs=0 last_accelerate: c9fc/decb, nonlazy_posted: 0, L.
[ 9442.781145] 	37-...: (50 GPs behind) idle=53c/0/0 softirq=227/230 fqs=0 last_accelerate: 45c0/decb, nonlazy_posted: 0, L.
[ 9442.792087] 	38-...: (9 GPs behind) idle=958/0/0 softirq=185/192 fqs=0 last_accelerate: 6030/decb, nonlazy_posted: 0, L.
[ 9442.802942] 	40-...: (389 GPs behind) idle=41c/0/0 softirq=131/136 fqs=0 last_accelerate: 5800/decb, nonlazy_posted: 0, L.
[ 9442.813971] 	41-...: (389 GPs behind) idle=258/0/0 softirq=133/138 fqs=0 last_accelerate: c00f/decb, nonlazy_posted: 0, L.
[ 9442.825000] 	43-...: (50 GPs behind) idle=254/0/0 softirq=113/117 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
[ 9442.835942] 	44-...: (115 GPs behind) idle=178/0/0 softirq=1271/1276 fqs=0 last_accelerate: 68e9/dee5, nonlazy_posted: 0, L.
[ 9442.847144] 	45-...: (2 GPs behind) idle=04a/1/0 softirq=364/389 fqs=0 last_accelerate: dee5/dee5, nonlazy_posted: 0, L.
[ 9442.857999] 	46-...: (9 GPs behind) idle=ec4/0/0 softirq=183/189 fqs=0 last_accelerate: 6030/dee5, nonlazy_posted: 0, L.
[ 9442.868854] 	47-...: (115 GPs behind) idle=088/0/0 softirq=135/149 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
[ 9442.879883] 	48-...: (389 GPs behind) idle=200/0/0 softirq=103/110 fqs=0 last_accelerate: 58b0/dee5, nonlazy_posted: 0, L.
[ 9442.890911] 	49-...: (9 GPs behind) idle=a24/0/0 softirq=205/211 fqs=0 last_accelerate: 6030/dee5, nonlazy_posted: 0, L.
[ 9442.901766] 	50-...: (25 GPs behind) idle=a74/0/0 softirq=144/144 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
[ 9442.912708] 	51-...: (50 GPs behind) idle=f68/0/0 softirq=116/122 fqs=0 last_accelerate: 57bc/dee5, nonlazy_posted: 0, L.
[ 9442.923650] 	52-...: (9 GPs behind) idle=e08/0/0 softirq=202/486 fqs=0 last_accelerate: c87f/defe, nonlazy_posted: 0, L.
[ 9442.934505] 	53-...: (2 GPs behind) idle=128/0/0 softirq=365/366 fqs=0 last_accelerate: ca88/defe, nonlazy_posted: 0, L.
[ 9442.945360] 	54-...: (9 GPs behind) idle=ce8/0/0 softirq=126/373 fqs=0 last_accelerate: bef8/defe, nonlazy_posted: 0, L.
[ 9442.956215] 	56-...: (9 GPs behind) idle=330/0/0 softirq=2116/2126 fqs=0 last_accelerate: 6030/defe, nonlazy_posted: 0, L.
[ 9442.967243] 	57-...: (1 GPs behind) idle=288/0/0 softirq=1707/1714 fqs=0 last_accelerate: c87c/defe, nonlazy_posted: 0, L.
[ 9442.978272] 	58-...: (37 GPs behind) idle=390/0/0 softirq=1716/1721 fqs=0 last_accelerate: 53f7/defe, nonlazy_posted: 0, L.
[ 9442.989387] 	59-...: (37 GPs behind) idle=e54/0/0 softirq=1700/1701 fqs=0 last_accelerate: 40a1/defe, nonlazy_posted: 0, L.
[ 9443.000502] 	60-...: (116 GPs behind) idle=7b4/0/0 softirq=92/96 fqs=0 last_accelerate: 57d8/df10, nonlazy_posted: 0, L.
[ 9443.011357] 	61-...: (9 GPs behind) idle=9d8/0/0 softirq=161/170 fqs=0 last_accelerate: 6030/df10, nonlazy_posted: 0, L.
[ 9443.022212] 	62-...: (115 GPs behind) idle=aa8/0/0 softirq=95/101 fqs=0 last_accelerate: 5420/df17, nonlazy_posted: 0, L.
[ 9443.033154] 	63-...: (50 GPs behind) idle=958/0/0 softirq=81/84 fqs=0 last_accelerate: 57b8/df17, nonlazy_posted: 0, L.
[ 9443.043920] 	(detected by 39, t=5403 jiffies, g=443, c=442, q=1)
[ 9443.049919] Task dump for CPU 1:
[ 9443.053134] swapper/1       R  running task        0     0      1 0x00000000
[ 9443.060173] Call trace:
[ 9443.062619] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.067744] [<          (null)>]           (null)
[ 9443.072434] Task dump for CPU 3:
[ 9443.075650] swapper/3       R  running task        0     0      1 0x00000000
[ 9443.082686] Call trace:
[ 9443.085121] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.090246] [<          (null)>]           (null)
[ 9443.094936] Task dump for CPU 4:
[ 9443.098152] swapper/4       R  running task        0     0      1 0x00000000
[ 9443.105188] Call trace:
[ 9443.107623] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.112752] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.118224] Task dump for CPU 5:
[ 9443.121440] swapper/5       R  running task        0     0      1 0x00000000
[ 9443.128476] Call trace:
[ 9443.130910] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.136035] [<          (null)>]           (null)
[ 9443.140725] Task dump for CPU 6:
[ 9443.143941] swapper/6       R  running task        0     0      1 0x00000000
[ 9443.150976] Call trace:
[ 9443.153411] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.158535] [<          (null)>]           (null)
[ 9443.163226] Task dump for CPU 7:
[ 9443.166442] swapper/7       R  running task        0     0      1 0x00000000
[ 9443.173478] Call trace:
[ 9443.175912] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.181037] [<          (null)>]           (null)
[ 9443.185727] Task dump for CPU 8:
[ 9443.188943] swapper/8       R  running task        0     0      1 0x00000000
[ 9443.195979] Call trace:
[ 9443.198412] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.203537] [<          (null)>]           (null)
[ 9443.208227] Task dump for CPU 9:
[ 9443.211443] swapper/9       R  running task        0     0      1 0x00000000
[ 9443.218479] Call trace:
[ 9443.220913] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.226039] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.231510] Task dump for CPU 10:
[ 9443.234812] swapper/10      R  running task        0     0      1 0x00000000
[ 9443.241848] Call trace:
[ 9443.244283] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.249408] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.254879] Task dump for CPU 11:
[ 9443.258182] swapper/11      R  running task        0     0      1 0x00000000
[ 9443.265218] Call trace:
[ 9443.267652] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.272776] [<          (null)>]           (null)
[ 9443.277467] Task dump for CPU 12:
[ 9443.280769] swapper/12      R  running task        0     0      1 0x00000000
[ 9443.287806] Call trace:
[ 9443.290240] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.295364] [<          (null)>]           (null)
[ 9443.300054] Task dump for CPU 13:
[ 9443.303357] swapper/13      R  running task        0     0      1 0x00000000
[ 9443.310394] Call trace:
[ 9443.312828] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.317953] [<          (null)>]           (null)
[ 9443.322643] Task dump for CPU 14:
[ 9443.325945] swapper/14      R  running task        0     0      1 0x00000000
[ 9443.332981] Call trace:
[ 9443.335416] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.340540] [<          (null)>]           (null)
[ 9443.345230] Task dump for CPU 15:
[ 9443.348533] swapper/15      R  running task        0     0      1 0x00000000
[ 9443.355568] Call trace:
[ 9443.358002] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.363128] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.368599] Task dump for CPU 17:
[ 9443.371901] swapper/17      R  running task        0     0      1 0x00000000
[ 9443.378937] Call trace:
[ 9443.381372] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.386497] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.391968] Task dump for CPU 18:
[ 9443.395270] swapper/18      R  running task        0     0      1 0x00000000
[ 9443.402306] Call trace:
[ 9443.404740] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.409865] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.415336] Task dump for CPU 19:
[ 9443.418639] swapper/19      R  running task        0     0      1 0x00000000
[ 9443.425675] Call trace:
[ 9443.428109] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.433234] [<          (null)>]           (null)
[ 9443.437924] Task dump for CPU 20:
[ 9443.441226] swapper/20      R  running task        0     0      1 0x00000000
[ 9443.448263] Call trace:
[ 9443.450697] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.455826] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9443.462600] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9443.467986] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.473458] Task dump for CPU 21:
[ 9443.476760] swapper/21      R  running task        0     0      1 0x00000000
[ 9443.483796] Call trace:
[ 9443.486230] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.491354] [<          (null)>]           (null)
[ 9443.496045] Task dump for CPU 22:
[ 9443.499347] swapper/22      R  running task        0     0      1 0x00000000
[ 9443.506383] Call trace:
[ 9443.508817] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.513943] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.519414] Task dump for CPU 23:
[ 9443.522716] swapper/23      R  running task        0     0      1 0x00000000
[ 9443.529752] Call trace:
[ 9443.532186] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.537312] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.542784] Task dump for CPU 24:
[ 9443.546086] swapper/24      R  running task        0     0      1 0x00000000
[ 9443.553122] Call trace:
[ 9443.555556] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.560681] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.566153] Task dump for CPU 25:
[ 9443.569455] swapper/25      R  running task        0     0      1 0x00000000
[ 9443.576491] Call trace:
[ 9443.578925] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.584051] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9443.590825] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9443.596211] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.601682] Task dump for CPU 26:
[ 9443.604985] swapper/26      R  running task        0     0      1 0x00000000
[ 9443.612021] Call trace:
[ 9443.614455] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.619581] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.625052] Task dump for CPU 27:
[ 9443.628355] swapper/27      R  running task        0     0      1 0x00000000
[ 9443.635390] Call trace:
[ 9443.637824] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.642949] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.648421] Task dump for CPU 28:
[ 9443.651723] swapper/28      R  running task        0     0      1 0x00000000
[ 9443.658759] Call trace:
[ 9443.661193] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.666318] [<          (null)>]           (null)
[ 9443.671008] Task dump for CPU 29:
[ 9443.674310] swapper/29      R  running task        0     0      1 0x00000000
[ 9443.681346] Call trace:
[ 9443.683780] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.688905] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.694377] Task dump for CPU 30:
[ 9443.697679] swapper/30      R  running task        0     0      1 0x00000000
[ 9443.704715] Call trace:
[ 9443.707150] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.712275] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9443.719050] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9443.724436] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.729907] Task dump for CPU 31:
[ 9443.733210] swapper/31      R  running task        0     0      1 0x00000000
[ 9443.740246] Call trace:
[ 9443.742680] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.747805] [<          (null)>]           (null)
[ 9443.752496] Task dump for CPU 32:
[ 9443.755798] swapper/32      R  running task        0     0      1 0x00000000
[ 9443.762833] Call trace:
[ 9443.765267] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.770392] [<          (null)>]           (null)
[ 9443.775082] Task dump for CPU 34:
[ 9443.778384] swapper/34      R  running task        0     0      1 0x00000000
[ 9443.785420] Call trace:
[ 9443.787854] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.792980] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.798451] Task dump for CPU 35:
[ 9443.801753] swapper/35      R  running task        0     0      1 0x00000000
[ 9443.808789] Call trace:
[ 9443.811224] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.816348] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.821820] Task dump for CPU 36:
[ 9443.825122] swapper/36      R  running task        0     0      1 0x00000000
[ 9443.832158] Call trace:
[ 9443.834592] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.839718] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9443.846493] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9443.851878] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.857350] Task dump for CPU 37:
[ 9443.860652] swapper/37      R  running task        0     0      1 0x00000000
[ 9443.867688] Call trace:
[ 9443.870122] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.875248] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9443.882022] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9443.887408] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.892880] Task dump for CPU 38:
[ 9443.896182] swapper/38      R  running task        0     0      1 0x00000000
[ 9443.903218] Call trace:
[ 9443.905652] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.910776] [<          (null)>]           (null)
[ 9443.915466] Task dump for CPU 40:
[ 9443.918769] swapper/40      R  running task        0     0      1 0x00000000
[ 9443.925805] Call trace:
[ 9443.928239] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.933365] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.938836] Task dump for CPU 41:
[ 9443.942138] swapper/41      R  running task        0     0      1 0x00000000
[ 9443.949174] Call trace:
[ 9443.951609] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.956733] [<          (null)>]           (null)
[ 9443.961423] Task dump for CPU 43:
[ 9443.964725] swapper/43      R  running task        0     0      1 0x00000000
[ 9443.971761] Call trace:
[ 9443.974195] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9443.979320] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9443.984791] Task dump for CPU 44:
[ 9443.988093] swapper/44      R  running task        0     0      1 0x00000000
[ 9443.995130] Call trace:
[ 9443.997564] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.002688] [<          (null)>]           (null)
[ 9444.007378] Task dump for CPU 45:
[ 9444.010680] swapper/45      R  running task        0     0      1 0x00000000
[ 9444.017716] Call trace:
[ 9444.020151] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.025275] [<          (null)>]           (null)
[ 9444.029965] Task dump for CPU 46:
[ 9444.033267] swapper/46      R  running task        0     0      1 0x00000000
[ 9444.040302] Call trace:
[ 9444.042737] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.047862] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.053333] Task dump for CPU 47:
[ 9444.056636] swapper/47      R  running task        0     0      1 0x00000000
[ 9444.063672] Call trace:
[ 9444.066106] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.071231] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.076702] Task dump for CPU 48:
[ 9444.080004] swapper/48      R  running task        0     0      1 0x00000000
[ 9444.087041] Call trace:
[ 9444.089475] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.094600] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.100071] Task dump for CPU 49:
[ 9444.103374] swapper/49      R  running task        0     0      1 0x00000000
[ 9444.110409] Call trace:
[ 9444.112844] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.117968] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.123440] Task dump for CPU 50:
[ 9444.126742] swapper/50      R  running task        0     0      1 0x00000000
[ 9444.133777] Call trace:
[ 9444.136211] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.141336] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.146807] Task dump for CPU 51:
[ 9444.150109] swapper/51      R  running task        0     0      1 0x00000000
[ 9444.157144] Call trace:
[ 9444.159578] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.164703] [<          (null)>]           (null)
[ 9444.169393] Task dump for CPU 52:
[ 9444.172695] swapper/52      R  running task        0     0      1 0x00000000
[ 9444.179731] Call trace:
[ 9444.182165] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.187290] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.192761] Task dump for CPU 53:
[ 9444.196063] swapper/53      R  running task        0     0      1 0x00000000
[ 9444.203099] Call trace:
[ 9444.205533] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.210658] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.216129] Task dump for CPU 54:
[ 9444.219431] swapper/54      R  running task        0     0      1 0x00000000
[ 9444.226467] Call trace:
[ 9444.228901] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.234026] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.239498] Task dump for CPU 56:
[ 9444.242801] swapper/56      R  running task        0     0      1 0x00000000
[ 9444.249837] Call trace:
[ 9444.252271] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.257396] [<          (null)>]           (null)
[ 9444.262086] Task dump for CPU 57:
[ 9444.265388] swapper/57      R  running task        0     0      1 0x00000000
[ 9444.272424] Call trace:
[ 9444.274858] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.279982] [<          (null)>]           (null)
[ 9444.284672] Task dump for CPU 58:
[ 9444.287975] swapper/58      R  running task        0     0      1 0x00000000
[ 9444.295011] Call trace:
[ 9444.297445] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.302570] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9444.309345] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9444.314731] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.320202] Task dump for CPU 59:
[ 9444.323504] swapper/59      R  running task        0     0      1 0x00000000
[ 9444.330540] Call trace:
[ 9444.332974] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.338100] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.343571] Task dump for CPU 60:
[ 9444.346873] swapper/60      R  running task        0     0      1 0x00000000
[ 9444.353909] Call trace:
[ 9444.356343] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.361469] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[ 9444.368243] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[ 9444.373629] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.379101] Task dump for CPU 61:
[ 9444.382402] swapper/61      R  running task        0     0      1 0x00000000
[ 9444.389438] Call trace:
[ 9444.391872] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.396997] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.402469] Task dump for CPU 62:
[ 9444.405771] swapper/62      R  running task        0     0      1 0x00000000
[ 9444.412808] Call trace:
[ 9444.415242] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.420367] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.425838] Task dump for CPU 63:
[ 9444.429141] swapper/63      R  running task        0     0      1 0x00000000
[ 9444.436177] Call trace:
[ 9444.438611] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.443736] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
[ 9444.449211] rcu_sched kthread starved for 5743 jiffies! g443 c442 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
[ 9444.458416] rcu_sched       S    0    10      2 0x00000000
[ 9444.463889] Call trace:
[ 9444.466324] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[ 9444.471453] [<ffff000008ab70a4>] __schedule+0x1a4/0x720
[ 9444.476665] [<ffff000008ab7660>] schedule+0x40/0xa8
[ 9444.481530] [<ffff000008abac70>] schedule_timeout+0x178/0x358
[ 9444.487263] [<ffff00000813e694>] rcu_gp_kthread+0x534/0x7b8
[ 9444.492824] [<ffff0000080f33d0>] kthread+0x108/0x138
[ 9444.497775] [<ffff0000080836c0>] ret_from_fork+0x10/0x50



And the relevant chunk of trace is:
(I have a lot more.  There are substantial other pauses from time to time, but none this long.)


     rcu_preempt-9     [057] ....  9419.837631: timer_init: timer=ffff8017d5fcfda0
     rcu_preempt-9     [057] d..1  9419.837632: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297246837 [timeout=1] cpu=57 idx=0 flags=
          <idle>-0     [057] d..1  9419.837634: tick_stop: success=1 dependency=NONE
          <idle>-0     [057] d..2  9419.837634: hrtimer_cancel: hrtimer=ffff8017db99e808
          <idle>-0     [057] d..2  9419.837635: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418164000000 softexpires=9418164000000
          <idle>-0     [057] d.h2  9419.845621: hrtimer_cancel: hrtimer=ffff8017db99e808
          <idle>-0     [057] d.h1  9419.845621: hrtimer_expire_entry: hrtimer=ffff8017db99e808 function=tick_sched_timer now=9418164001440
          <idle>-0     [057] d.h1  9419.845622: hrtimer_expire_exit: hrtimer=ffff8017db99e808
          <idle>-0     [057] d.s2  9419.845623: timer_cancel: timer=ffff8017d5fcfda0
          <idle>-0     [057] ..s1  9419.845623: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297246838
          <idle>-0     [057] .ns1  9419.845624: timer_expire_exit: timer=ffff8017d5fcfda0
          <idle>-0     [057] dn.2  9419.845628: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418168000000 softexpires=9418168000000
          <idle>-0     [057] d..1  9419.845635: tick_stop: success=1 dependency=NONE
          <idle>-0     [057] d..2  9419.845636: hrtimer_cancel: hrtimer=ffff8017db99e808
          <idle>-0     [057] d..2  9419.845636: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418188000000 softexpires=9418188000000
          <idle>-0     [057] d.h2  9419.869621: hrtimer_cancel: hrtimer=ffff8017db99e808
          <idle>-0     [057] d.h1  9419.869621: hrtimer_expire_entry: hrtimer=ffff8017db99e808 function=tick_sched_timer now=9418188001420
          <idle>-0     [057] d.h1  9419.869622: hrtimer_expire_exit: hrtimer=ffff8017db99e808
          <idle>-0     [057] d..2  9419.869626: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9858983202655 softexpires=9858983202655
          <idle>-0     [016] d.h2  9419.885626: hrtimer_cancel: hrtimer=ffff8017fbc3d808
          <idle>-0     [016] d.h1  9419.885627: hrtimer_expire_entry: hrtimer=ffff8017fbc3d808 function=tick_sched_timer now=9418204006760
          <idle>-0     [016] d.h1  9419.885629: hrtimer_expire_exit: hrtimer=ffff8017fbc3d808
          <idle>-0     [016] d.s2  9419.885629: timer_cancel: timer=ffff8017d37dbca0
          <idle>-0     [016] ..s1  9419.885630: timer_expire_entry: timer=ffff8017d37dbca0 function=process_timeout now=4297246848
          <idle>-0     [016] .ns1  9419.885631: timer_expire_exit: timer=ffff8017d37dbca0
          <idle>-0     [016] dn.2  9419.885636: hrtimer_start: hrtimer=ffff8017fbc3d808 function=tick_sched_timer expires=9418208000000 softexpires=9418208000000
      khugepaged-778   [016] ....  9419.885668: timer_init: timer=ffff8017d37dbca0
      khugepaged-778   [016] d..1  9419.885668: timer_start: timer=ffff8017d37dbca0 function=process_timeout expires=4297249348 [timeout=2500] cpu=16 idx=0 flags=
          <idle>-0     [016] d..1  9419.885670: tick_stop: success=1 dependency=NONE
          <idle>-0     [016] d..2  9419.885671: hrtimer_cancel: hrtimer=ffff8017fbc3d808
          <idle>-0     [016] d..2  9419.885671: hrtimer_start: hrtimer=ffff8017fbc3d808 function=tick_sched_timer expires=9428444000000 softexpires=9428444000000
          <idle>-0     [045] d.h2  9419.890839: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9419.890839: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418209219940
          <idle>-0     [045] d.h3  9419.890844: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418310221420 softexpires=9418309221420
          <idle>-0     [045] d.h1  9419.890844: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [000] d.h2  9419.917625: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.h1  9419.917626: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9418236005860
          <idle>-0     [000] d.h1  9419.917628: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.s2  9419.917628: timer_cancel: timer=ffff80177fdc0840
          <idle>-0     [000] ..s1  9419.917629: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_disable_link now=4297246856
          <idle>-0     [000] d.s2  9419.917630: timer_start: timer=ffff80177fdc0840 function=link_timeout_enable_link expires=4297246881 [timeout=25] cpu=0 idx=81 flags=
          <idle>-0     [000] ..s1  9419.917633: timer_expire_exit: timer=ffff80177fdc0840
          <idle>-0     [000] d..2  9419.917648: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9418340000000 softexpires=9418340000000
          <idle>-0     [045] d.h2  9419.991845: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9419.991845: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418310225960
          <idle>-0     [045] d.h3  9419.991849: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418411227320 softexpires=9418410227320
          <idle>-0     [045] d.h1  9419.991850: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [000] d.h2  9420.021625: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.h1  9420.021625: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9418340005520
          <idle>-0     [000] d.h1  9420.021627: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.s2  9420.021627: timer_cancel: timer=ffff80177fdc0840
          <idle>-0     [000] ..s1  9420.021628: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_enable_link now=4297246882
          <idle>-0     [000] d.s2  9420.021629: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297247107 [timeout=225] cpu=0 idx=34 flags=
          <idle>-0     [000] ..s1  9420.021632: timer_expire_exit: timer=ffff80177fdc0840
          <idle>-0     [000] d..2  9420.021639: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419260000000 softexpires=9419260000000
          <idle>-0     [045] d.h2  9420.092851: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9420.092852: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418411231780
          <idle>-0     [045] d.h3  9420.092856: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418512233720 softexpires=9418511233720
          <idle>-0     [045] d.h1  9420.092856: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [055] d.h2  9420.141622: hrtimer_cancel: hrtimer=ffff8017db968808
          <idle>-0     [055] d.h1  9420.141623: hrtimer_expire_entry: hrtimer=ffff8017db968808 function=tick_sched_timer now=9418460002540
          <idle>-0     [055] d.h1  9420.141625: hrtimer_expire_exit: hrtimer=ffff8017db968808
          <idle>-0     [055] d.s2  9420.141626: timer_cancel: timer=ffff80177db6cc08
          <idle>-0     [055] d.s1  9420.141626: timer_expire_entry: timer=ffff80177db6cc08 function=delayed_work_timer_fn now=4297246912
          <idle>-0     [055] dns1  9420.141628: timer_expire_exit: timer=ffff80177db6cc08
          <idle>-0     [055] dn.2  9420.141632: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9418464000000 softexpires=9418464000000
    kworker/55:1-1246  [055] d..1  9420.141634: timer_start: timer=ffff80177db6cc08 function=delayed_work_timer_fn expires=4297247162 [timeout=250] cpu=55 idx=88 flags=I
          <idle>-0     [055] d..1  9420.141637: tick_stop: success=1 dependency=NONE
          <idle>-0     [055] d..2  9420.141637: hrtimer_cancel: hrtimer=ffff8017db968808
          <idle>-0     [055] d..2  9420.141637: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9419484000000 softexpires=9419484000000
          <idle>-0     [045] d.h2  9420.193855: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9420.193855: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418512235660
          <idle>-0     [045] d.h3  9420.193859: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418613237260 softexpires=9418612237260
          <idle>-0     [045] d.h1  9420.193860: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h2  9420.294858: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9420.294858: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418613238380
          <idle>-0     [045] d.h3  9420.294862: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418714240000 softexpires=9418713240000
          <idle>-0     [045] d.h1  9420.294863: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h2  9420.395861: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9420.395861: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418714241380
          <idle>-0     [045] d.h3  9420.395865: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418815242920 softexpires=9418814242920
          <idle>-0     [045] d.h1  9420.395865: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [042] d.h2  9420.461621: hrtimer_cancel: hrtimer=ffff8017dbb69808
          <idle>-0     [042] d.h1  9420.461622: hrtimer_expire_entry: hrtimer=ffff8017dbb69808 function=tick_sched_timer now=9418780002180
          <idle>-0     [042] d.h1  9420.461623: hrtimer_expire_exit: hrtimer=ffff8017dbb69808
          <idle>-0     [042] d.s2  9420.461624: timer_cancel: timer=ffff80177db6d408
          <idle>-0     [042] d.s1  9420.461625: timer_expire_entry: timer=ffff80177db6d408 function=delayed_work_timer_fn now=4297246992
          <idle>-0     [042] dns1  9420.461627: timer_expire_exit: timer=ffff80177db6d408
          <idle>-0     [042] dns2  9420.461627: timer_cancel: timer=ffff8017797d7868
          <idle>-0     [042] .ns1  9420.461628: timer_expire_entry: timer=ffff8017797d7868 function=hns_nic_service_timer now=4297246992
          <idle>-0     [042] dns2  9420.461628: timer_start: timer=ffff8017797d7868 function=hns_nic_service_timer expires=4297247242 [timeout=250] cpu=42 idx=98 flags=
          <idle>-0     [042] .ns1  9420.461629: timer_expire_exit: timer=ffff8017797d7868
          <idle>-0     [042] dn.2  9420.461632: hrtimer_start: hrtimer=ffff8017dbb69808 function=tick_sched_timer expires=9418784000000 softexpires=9418784000000
    kworker/42:1-1223  [042] d..1  9420.461773: timer_start: timer=ffff80177db6d408 function=delayed_work_timer_fn expires=4297247242 [timeout=250] cpu=42 idx=98 flags=I
          <idle>-0     [042] d..1  9420.461866: tick_stop: success=1 dependency=NONE
          <idle>-0     [042] d..2  9420.461867: hrtimer_cancel: hrtimer=ffff8017dbb69808
          <idle>-0     [042] d..2  9420.461867: hrtimer_start: hrtimer=ffff8017dbb69808 function=tick_sched_timer expires=9419804000000 softexpires=9419804000000
          <idle>-0     [045] d.h2  9420.496864: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9420.496864: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418815244580
          <idle>-0     [045] d.h3  9420.496868: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418916246140 softexpires=9418915246140
          <idle>-0     [045] d.h1  9420.496868: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h2  9420.597866: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9420.597867: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418916247280
          <idle>-0     [045] d.h3  9420.597871: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419017248760 softexpires=9419016248760
          <idle>-0     [045] d.h1  9420.597871: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [033] d.h2  9420.621621: hrtimer_cancel: hrtimer=ffff8017dba76808
          <idle>-0     [033] d.h1  9420.621622: hrtimer_expire_entry: hrtimer=ffff8017dba76808 function=tick_sched_timer now=9418940002160
          <idle>-0     [033] d.h1  9420.621623: hrtimer_expire_exit: hrtimer=ffff8017dba76808
          <idle>-0     [033] d.s2  9420.621624: timer_cancel: timer=ffff00000917be40
          <idle>-0     [033] d.s1  9420.621625: timer_expire_entry: timer=ffff00000917be40 function=delayed_work_timer_fn now=4297247032
          <idle>-0     [033] dns1  9420.621626: timer_expire_exit: timer=ffff00000917be40
          <idle>-0     [033] dn.2  9420.621630: hrtimer_start: hrtimer=ffff8017dba76808 function=tick_sched_timer expires=9418944000000 softexpires=9418944000000
           <...>-1631  [033] d..1  9420.621636: timer_start: timer=ffff00000917be40 function=delayed_work_timer_fn expires=4297247282 [timeout=250] cpu=33 idx=103 flags=I
          <idle>-0     [033] d..1  9420.621639: tick_stop: success=1 dependency=NONE
          <idle>-0     [033] d..2  9420.621639: hrtimer_cancel: hrtimer=ffff8017dba76808
          <idle>-0     [033] d..2  9420.621639: hrtimer_start: hrtimer=ffff8017dba76808 function=tick_sched_timer expires=9419964000000 softexpires=9419964000000
          <idle>-0     [000] dn.2  9420.691401: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] dn.2  9420.691401: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419012000000 softexpires=9419012000000
          <idle>-0     [002] dn.2  9420.691408: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] dn.2  9420.691408: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419012000000 softexpires=9419012000000
          <idle>-0     [000] d..1  9420.691409: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2  9420.691409: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d..2  9420.691409: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419260000000 softexpires=9419260000000
          <idle>-0     [002] d..1  9420.691423: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9420.691423: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d..2  9420.691424: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9859803202655 softexpires=9859803202655
          <idle>-0     [045] d.h2  9420.698872: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9420.698873: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419017253180
          <idle>-0     [045] d.h3  9420.698877: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419118254640 softexpires=9419117254640
          <idle>-0     [045] d.h1  9420.698877: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h2  9420.799875: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9420.799875: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419118255760
          <idle>-0     [045] d.h3  9420.799879: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419219257140 softexpires=9419218257140
          <idle>-0     [045] d.h1  9420.799880: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [000] dn.2  9420.871369: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] dn.2  9420.871370: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419192000000 softexpires=9419192000000
          <idle>-0     [002] dn.2  9420.871375: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [000] d..1  9420.871376: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] dn.2  9420.871376: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419192000000 softexpires=9419192000000
          <idle>-0     [000] d..2  9420.871376: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d..2  9420.871376: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419260000000 softexpires=9419260000000
          <idle>-0     [002] d..1  9420.871398: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9420.871398: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d..2  9420.871398: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9859983202655 softexpires=9859983202655
          <idle>-0     [045] d.h2  9420.900881: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9420.900881: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419219261580
          <idle>-0     [045] d.h3  9420.900885: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419320263160 softexpires=9419319263160
          <idle>-0     [045] d.h1  9420.900886: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [001] d..2  9420.913601: hrtimer_cancel: hrtimer=ffff8017fbe5b808
          <idle>-0     [001] d..2  9420.913601: hrtimer_start: hrtimer=ffff8017fbe5b808 function=tick_sched_timer expires=9860023202655 softexpires=9860023202655
          <idle>-0     [000] d.h2  9420.941621: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.h1  9420.941621: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9419260001400
          <idle>-0     [000] d.h1  9420.941623: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.s2  9420.941623: timer_cancel: timer=ffff80177fdc0840
          <idle>-0     [000] ..s1  9420.941624: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_disable_link now=4297247112
          <idle>-0     [000] d.s2  9420.941624: timer_start: timer=ffff80177fdc0840 function=link_timeout_enable_link expires=4297247137 [timeout=25] cpu=0 idx=113 flags=
          <idle>-0     [000] ..s1  9420.941628: timer_expire_exit: timer=ffff80177fdc0840
          <idle>-0     [000] d.s2  9420.941629: timer_cancel: timer=ffff8017fbe42558
          <idle>-0     [000] d.s1  9420.941629: timer_expire_entry: timer=ffff8017fbe42558 function=delayed_work_timer_fn now=4297247112
          <idle>-0     [000] dns1  9420.941630: timer_expire_exit: timer=ffff8017fbe42558
          <idle>-0     [000] dns2  9420.941631: timer_cancel: timer=ffff00000910a628
          <idle>-0     [000] dns1  9420.941631: timer_expire_entry: timer=ffff00000910a628 function=delayed_work_timer_fn now=4297247112
          <idle>-0     [000] dns1  9420.941631: timer_expire_exit: timer=ffff00000910a628
          <idle>-0     [000] dn.2  9420.941634: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419264000000 softexpires=9419264000000
          <idle>-0     [002] dn.2  9420.941643: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] dn.2  9420.941643: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419264000000 softexpires=9419264000000
     kworker/0:0-3     [000] d..1  9420.941650: timer_start: timer=ffff00000910a628 function=delayed_work_timer_fn expires=4297247500 [timeout=388] cpu=0 idx=100 flags=D|I
     kworker/2:0-22    [002] d..1  9420.941651: timer_start: timer=ffff8017fbe78558 function=delayed_work_timer_fn expires=4297247494 [timeout=382] cpu=2 idx=114 flags=D|I
          <idle>-0     [000] d..1  9420.941652: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2  9420.941652: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d..2  9420.941653: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419364000000 softexpires=9419364000000
          <idle>-0     [002] d..1  9420.941654: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9420.941654: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d..2  9420.941654: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9860055202655 softexpires=9860055202655
          <idle>-0     [045] d.h2  9421.001887: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9421.001887: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419320267640
          <idle>-0     [045] d.h3  9421.001891: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419421269000 softexpires=9419420269000
          <idle>-0     [045] d.h1  9421.001892: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [000] d.h2  9421.045625: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.h1  9421.045625: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9419364005380
          <idle>-0     [000] d.h1  9421.045626: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.s2  9421.045627: timer_cancel: timer=ffff80177fdc0840
          <idle>-0     [000] ..s1  9421.045628: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_enable_link now=4297247138
          <idle>-0     [000] d.s2  9421.045628: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297247363 [timeout=225] cpu=0 idx=34 flags=
          <idle>-0     [000] ..s1  9421.045631: timer_expire_exit: timer=ffff80177fdc0840
          <idle>-0     [000] d..2  9421.045644: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9420284000000 softexpires=9420284000000
          <idle>-0     [045] d.h2  9421.102893: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9421.102893: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419421273420
          <idle>-0     [045] d.h3  9421.102897: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419522275040 softexpires=9419521275040
          <idle>-0     [045] d.h1  9421.102897: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [055] d.h2  9421.165621: hrtimer_cancel: hrtimer=ffff8017db968808
          <idle>-0     [055] d.h1  9421.165622: hrtimer_expire_entry: hrtimer=ffff8017db968808 function=tick_sched_timer now=9419484002280
          <idle>-0     [055] d.h1  9421.165624: hrtimer_expire_exit: hrtimer=ffff8017db968808
          <idle>-0     [055] d.s2  9421.165624: timer_cancel: timer=ffff80177db6cc08
          <idle>-0     [055] d.s1  9421.165625: timer_expire_entry: timer=ffff80177db6cc08 function=delayed_work_timer_fn now=4297247168
          <idle>-0     [055] dns1  9421.165626: timer_expire_exit: timer=ffff80177db6cc08
          <idle>-0     [055] dn.2  9421.165629: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9419488000000 softexpires=9419488000000
    kworker/55:1-1246  [055] d..1  9421.165632: timer_start: timer=ffff80177db6cc08 function=delayed_work_timer_fn expires=4297247418 [timeout=250] cpu=55 idx=120 flags=I
          <idle>-0     [055] d..1  9421.165634: tick_stop: success=1 dependency=NONE
          <idle>-0     [055] d..2  9421.165634: hrtimer_cancel: hrtimer=ffff8017db968808
          <idle>-0     [055] d..2  9421.165635: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9420508000000 softexpires=9420508000000
          <idle>-0     [045] d.h2  9421.203896: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9421.203896: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419522276980
          <idle>-0     [045] d.h3  9421.203900: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419623278460 softexpires=9419622278460
          <idle>-0     [045] d.h1  9421.203901: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h2  9421.304899: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9421.304899: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419623279580
          <idle>-0     [045] d.h3  9421.304903: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419724281060 softexpires=9419723281060
          <idle>-0     [045] d.h1  9421.304903: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [000] dn.2  9421.381179: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] dn.2  9421.381179: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419700000000 softexpires=9419700000000
          <idle>-0     [002] dn.2  9421.381185: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] dn.2  9421.381185: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419700000000 softexpires=9419700000000
          <idle>-0     [000] d..1  9421.381185: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2  9421.381186: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d..2  9421.381186: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9420284000000 softexpires=9420284000000
              sh-2256  [002] ....  9421.381193: timer_init: timer=ffff80176c26fb40
              sh-2256  [002] d..1  9421.381194: timer_start: timer=ffff80176c26fb40 function=process_timeout expires=4297247223 [timeout=2] cpu=2 idx=0 flags=
          <idle>-0     [002] d..1  9421.381196: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9421.381197: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d..2  9421.381197: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419708000000 softexpires=9419708000000
          <idle>-0     [002] d.h2  9421.389621: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d.h1  9421.389622: hrtimer_expire_entry: hrtimer=ffff8017fbe76808 function=tick_sched_timer now=9419708002000
          <idle>-0     [002] d.h1  9421.389623: hrtimer_expire_exit: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d.s2  9421.389624: timer_cancel: timer=ffff80176c26fb40
          <idle>-0     [002] ..s1  9421.389624: timer_expire_entry: timer=ffff80176c26fb40 function=process_timeout now=4297247224
          <idle>-0     [002] .ns1  9421.389626: timer_expire_exit: timer=ffff80176c26fb40
          <idle>-0     [002] dn.2  9421.389629: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419712000000 softexpires=9419712000000
              sh-2256  [002] ...1  9421.389682: hrtimer_init: hrtimer=ffff8017d4dde8a0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
              sh-2256  [002] ...1  9421.389682: hrtimer_init: hrtimer=ffff8017d4dde8e0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
              sh-2256  [002] ....  9421.389690: hrtimer_init: hrtimer=ffff80176cbb0088 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
          <idle>-0     [039] dn.2  9421.389814: hrtimer_start: hrtimer=ffff8017dbb18808 function=tick_sched_timer expires=9419712000000 softexpires=9419712000000
          <idle>-0     [002] d..1  9421.389896: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9421.389897: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d..2  9421.389898: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419724000000 softexpires=9419724000000
          <idle>-0     [002] dn.2  9444.510766: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] dn.2  9444.510767: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442832000000 softexpires=9442832000000
          <idle>-0     [036] d..1  9444.510812: tick_stop: success=1 dependency=NONE
          <idle>-0     [036] d..2  9444.510814: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d..2  9444.510815: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442844000000 softexpires=9442844000000
              sh-2256  [002] ....  9444.510857: timer_init: timer=ffff80176c26fb40
              sh-2256  [002] d..1  9444.510857: timer_start: timer=ffff80176c26fb40 function=process_timeout expires=4297253006 [timeout=2] cpu=2 idx=0 flags=
          <idle>-0     [002] d..1  9444.510864: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9444.510865: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d..2  9444.510866: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442844000000 softexpires=9442844000000
          <idle>-0     [000] d.h2  9444.525625: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [002] d.h2  9444.525625: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [036] d.h2  9444.525625: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [002] d.h1  9444.525625: hrtimer_expire_entry: hrtimer=ffff8017fbe76808 function=tick_sched_timer now=9442844005600
          <idle>-0     [036] d.h1  9444.525625: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442844005460
          <idle>-0     [000] d.h1  9444.525627: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9442844005300
          <idle>-0     [002] d.h1  9444.525627: hrtimer_expire_exit: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d.s2  9444.525629: timer_cancel: timer=ffff8017fbe78558
          <idle>-0     [036] d.h1  9444.525629: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
          <idle>-0     [002] d.s1  9444.525629: timer_expire_entry: timer=ffff8017fbe78558 function=delayed_work_timer_fn now=4297253008
          <idle>-0     [000] d.h1  9444.525629: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.s2  9444.525631: timer_cancel: timer=ffff80177fdc0840
          <idle>-0     [000] ..s1  9444.525631: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_disable_link now=4297253008
          <idle>-0     [002] dns1  9444.525631: timer_expire_exit: timer=ffff8017fbe78558
          <idle>-0     [000] d.s2  9444.525632: timer_start: timer=ffff80177fdc0840 function=link_timeout_enable_link expires=4297253033 [timeout=25] cpu=0 idx=82 flags=
          <idle>-0     [000] ..s1  9444.525633: timer_expire_exit: timer=ffff80177fdc0840
          <idle>-0     [000] d.s2  9444.525634: timer_cancel: timer=ffff00000910a628
          <idle>-0     [000] d.s1  9444.525634: timer_expire_entry: timer=ffff00000910a628 function=delayed_work_timer_fn now=4297253008
          <idle>-0     [000] dns1  9444.525636: timer_expire_exit: timer=ffff00000910a628
          <idle>-0     [036] dn.2  9444.525639: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442848000000 softexpires=9442848000000
          <idle>-0     [000] dn.2  9444.525640: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9442848000000 softexpires=9442848000000
          <idle>-0     [002] dn.2  9444.525640: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442848000000 softexpires=9442848000000
     rcu_preempt-9     [036] ....  9444.525648: timer_init: timer=ffff8017d5fcfda0
          <idle>-0     [002] d..1  9444.525648: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9444.525648: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d..2  9444.525649: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442860000000 softexpires=9442860000000
     rcu_preempt-9     [036] d..1  9444.525649: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297253009 [timeout=1] cpu=36 idx=0 flags=
     kworker/0:0-3     [000] d..1  9444.525650: timer_start: timer=ffff00000910a628 function=delayed_work_timer_fn expires=4297253250 [timeout=242] cpu=0 idx=82 flags=D|I
          <idle>-0     [000] d..1  9444.525652: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2  9444.525652: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d..2  9444.525653: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9442948000000 softexpires=9442948000000
          <idle>-0     [036] d..1  9444.525653: tick_stop: success=1 dependency=NONE
          <idle>-0     [036] d..2  9444.525654: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d..2  9444.525654: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442852000000 softexpires=9442852000000
          <idle>-0     [036] d.h2  9444.533624: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d.h1  9444.533625: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442852004760
          <idle>-0     [036] d.h1  9444.533626: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d.s2  9444.533627: timer_cancel: timer=ffff8017d5fcfda0
          <idle>-0     [036] ..s1  9444.533628: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297253010
          <idle>-0     [036] .ns1  9444.533629: timer_expire_exit: timer=ffff8017d5fcfda0
          <idle>-0     [036] dn.2  9444.533634: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442856000000 softexpires=9442856000000
          <idle>-0     [036] d..1  9444.533668: tick_stop: success=1 dependency=NONE
          <idle>-0     [036] d..2  9444.533668: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d..2  9444.533669: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442876000000 softexpires=9442876000000
          <idle>-0     [002] dnh2  9444.541626: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] dnh1  9444.541627: hrtimer_expire_entry: hrtimer=ffff8017fbe76808 function=tick_sched_timer now=9442860007120
          <idle>-0     [002] dnh1  9444.541629: hrtimer_expire_exit: hrtimer=ffff8017fbe76808
          <idle>-0     [002] dn.2  9444.541630: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442864000000 softexpires=9442864000000
          <idle>-0     [002] d..1  9444.541640: tick_stop: success=1 dependency=NONE
          <idle>-0     [002] d..2  9444.541640: hrtimer_cancel: hrtimer=ffff8017fbe76808
          <idle>-0     [002] d..2  9444.541640: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9444316000000 softexpires=9444316000000
          <idle>-0     [036] dnh2  9444.557627: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] dnh1  9444.557628: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442876008220
          <idle>-0     [036] dnh1  9444.557630: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
          <idle>-0     [036] dn.2  9444.557631: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442880000000 softexpires=9442880000000
          <idle>-0     [036] d..1  9444.557644: tick_stop: success=1 dependency=NONE
          <idle>-0     [036] d..2  9444.557645: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d..2  9444.557645: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442892000000 softexpires=9442892000000
          <idle>-0     [036] d.h2  9444.573621: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d.h1  9444.573621: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442892001340
          <idle>-0     [036] d.h1  9444.573622: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
          <idle>-0     [036] dn.2  9444.573628: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442896000000 softexpires=9442896000000
     rcu_preempt-9     [036] ....  9444.573631: timer_init: timer=ffff8017d5fcfda0
     rcu_preempt-9     [036] d..1  9444.573632: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297253021 [timeout=1] cpu=36 idx=0 flags=
          <idle>-0     [036] d..1  9444.573634: tick_stop: success=1 dependency=NONE
          <idle>-0     [036] d..2  9444.573635: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d..2  9444.573635: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442900000000 softexpires=9442900000000
          <idle>-0     [036] d.h2  9444.581621: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d.h1  9444.581621: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442900001400
          <idle>-0     [036] d.h1  9444.581622: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d.s2  9444.581623: timer_cancel: timer=ffff8017d5fcfda0
          <idle>-0     [036] ..s1  9444.581623: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297253022
          <idle>-0     [036] .ns1  9444.581625: timer_expire_exit: timer=ffff8017d5fcfda0
          <idle>-0     [036] dn.2  9444.581628: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442904000000 softexpires=9442904000000
          <idle>-0     [036] d..1  9444.581636: tick_stop: success=1 dependency=NONE
          <idle>-0     [036] d..2  9444.581636: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d..2  9444.581637: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442924000000 softexpires=9442924000000
          <idle>-0     [045] d.h2  9444.581718: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9444.581719: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9442900098200
          <idle>-0     [045] d.h3  9444.581724: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443001101380 softexpires=9443000101380
          <idle>-0     [045] d.h1  9444.581725: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [036] d.h2  9444.605621: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d.h1  9444.605621: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442924001600
          <idle>-0     [036] d.h1  9444.605622: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d..2  9444.605629: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9883719202655 softexpires=9883719202655
          <idle>-0     [000] d.h2  9444.629625: hrtimer_cancel: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.h1  9444.629625: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9442948005580
          <idle>-0     [000] d.h1  9444.629627: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
          <idle>-0     [000] d.s2  9444.629628: timer_cancel: timer=ffff80177fdc0840
          <idle>-0     [000] ..s1  9444.629628: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_enable_link now=4297253034
          <idle>-0     [000] d.s2  9444.629629: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297253259 [timeout=225] cpu=0 idx=42 flags=
          <idle>-0     [000] ..s1  9444.629638: timer_expire_exit: timer=ffff80177fdc0840
          <idle>-0     [000] d..2  9444.629661: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9443868000000 softexpires=9443868000000
          <idle>-0     [045] d.h2  9444.682725: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9444.682725: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443001105940
          <idle>-0     [045] d.h3  9444.682730: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443102107440 softexpires=9443101107440
          <idle>-0     [045] d.h1  9444.682730: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [055] d.h2  9444.717626: hrtimer_cancel: hrtimer=ffff8017db968808
          <idle>-0     [055] d.h1  9444.717627: hrtimer_expire_entry: hrtimer=ffff8017db968808 function=tick_sched_timer now=9443036006240
          <idle>-0     [055] d.h1  9444.717629: hrtimer_expire_exit: hrtimer=ffff8017db968808
          <idle>-0     [055] d.s2  9444.717630: timer_cancel: timer=ffff80177db6cc08
          <idle>-0     [055] d.s1  9444.717630: timer_expire_entry: timer=ffff80177db6cc08 function=delayed_work_timer_fn now=4297253056
          <idle>-0     [055] dns1  9444.717633: timer_expire_exit: timer=ffff80177db6cc08
          <idle>-0     [055] dn.2  9444.717637: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9443040000000 softexpires=9443040000000
    kworker/55:1-1246  [055] d..1  9444.717640: timer_start: timer=ffff80177db6cc08 function=delayed_work_timer_fn expires=4297253306 [timeout=250] cpu=55 idx=88 flags=I
          <idle>-0     [055] d..1  9444.717643: tick_stop: success=1 dependency=NONE
          <idle>-0     [055] d..2  9444.717643: hrtimer_cancel: hrtimer=ffff8017db968808
          <idle>-0     [055] d..2  9444.717644: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9444060000000 softexpires=9444060000000
          <idle>-0     [045] d.h2  9444.783729: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9444.783729: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443102109380
          <idle>-0     [045] d.h3  9444.783733: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443203110880 softexpires=9443202110880
          <idle>-0     [045] d.h1  9444.783733: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h2  9444.884731: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9444.884731: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443203112000
          <idle>-0     [045] d.h3  9444.884735: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443304113380 softexpires=9443303113380
          <idle>-0     [045] d.h1  9444.884736: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h2  9444.985734: hrtimer_cancel: hrtimer=ffff80176cb7ca90
          <idle>-0     [045] d.h1  9444.985735: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443304114500
          <idle>-0     [045] d.h3  9444.985738: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443405116440 softexpires=9443404116440
          <idle>-0     [045] d.h1  9444.985739: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
          <idle>-0     [042] d.h2  9445.037622: hrtimer_cancel: hrtimer=ffff8017dbb69808
> 
> Thanks,
> 
> Jonathan
> > 
> > 							Thanx, Paul
> >   
> > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > [ 1984.638153] rcu_preempt     S    0     9      2 0x00000000
> > > [ 1984.643626] Call trace:
> > > [ 1984.646059] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> > > [ 1984.651189] [<ffff000008962274>] __schedule+0x19c/0x5d8
> > > [ 1984.656400] [<ffff0000089626e8>] schedule+0x38/0xa0
> > > [ 1984.661266] [<ffff000008965844>] schedule_timeout+0x124/0x218
> > > [ 1984.667002] [<ffff000008121424>] rcu_gp_kthread+0x4fc/0x748
> > > [ 1984.672564] [<ffff0000080df0b4>] kthread+0xfc/0x128
> > > [ 1984.677429] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
> > >     
> >   
> 
> _______________________________________________
> linuxarm mailing list
> linuxarm at huawei.com
> http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-27 16:39                                                       ` Jonathan Cameron
  (?)
@ 2017-07-27 16:52                                                         ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-27 16:52 UTC (permalink / raw)
  To: linux-arm-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="windows-1254", Size: 70580 bytes --]

On Thu, Jul 27, 2017 at 05:39:23PM +0100, Jonathan Cameron wrote:
> On Thu, 27 Jul 2017 14:49:03 +0100
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> 
> > On Thu, 27 Jul 2017 05:49:13 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > 
> > > On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:  
> > > > On Wed, 26 Jul 2017 18:42:14 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > >     
> > > > > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:    
> > > >     
> > > > > > Indeed, that really wouldn't explain how we end up with an RCU stall
> > > > > > dump listing almost all of the CPUs as having missed a grace period.
> > > > > 
> > > > > I have seen stranger things, but admittedly not often.    
> > > > 
> > > > So the backtraces show the RCU gp thread in schedule_timeout.
> > > > 
> > > > Are you sure that its timeout has expired and it's not being scheduled,
> > > > or could it be a bad (large) timeout (looks unlikely) or that it's being
> > > > scheduled but not correctly noting GPs on other CPUs?
> > > > 
> > > > It's not in R state, so if it's not being scheduled at all, then it's
> > > > because the timer has not fired:    
> > > 
> > > Good point, Nick!
> > > 
> > > Jonathan, could you please reproduce collecting timer event tracing?  
> > I'm a little new to tracing (only started playing with it last week)
> > so fingers crossed I've set it up right.  No splats yet.  Was getting
> > splats on reading out the trace when running with the RCU stall timer
> > set to 4, so I have increased that back to the default and am rerunning.
> > 
> > This may take a while.  To save time, correct me if I've gotten this wrong:
> > 
> > echo "timer:*" > /sys/kernel/debug/tracing/set_event
> > 
> > when it dumps, just send you the relevant part of what is in
> > /sys/kernel/debug/tracing/trace?
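
A minimal sketch of the full capture sequence this implies -- the buffer size
and output path are illustrative guesses rather than what was actually used,
and it assumes debugfs is mounted at /sys/kernel/debug:

    cd /sys/kernel/debug/tracing
    echo 0 > tracing_on               # pause tracing while configuring
    echo 'timer:*' > set_event        # all timer and hrtimer trace events
    echo 20480 > buffer_size_kb       # grow the per-CPU ring buffers
    echo 1 > tracing_on
    # ... run the workload until the RCU stall triggers ...
    echo 0 > tracing_on               # freeze the buffers
    cat trace > /tmp/timer-trace.txt  # snapshot for offline inspection

Freezing the buffers before reading also avoids racing the dump against new
events.
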
> 
> Interestingly, the only thing that can make it trip for me with tracing on
> is peeking in the tracing buffers.  Not sure whether this is a valid case or
> not.
> 
> Anyhow all timer activity seems to stop around the area of interest.
> 
> 
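
As a rough way to line the dump below up with the timer-not-firing theory, the
frozen trace can be sliced by the RCU kthread's sleep timer and by the CPU it
was queued on.  The timer address and CPU number here are just the ones visible
for rcu_preempt-9 in these traces, and /tmp/timer-trace.txt is the illustrative
path from the sketch above; the rcu_sched kthread's timer would be a different
address, found the same way from its timer_start lines:

    # every event touching that process_timeout timer
    grep 'ffff8017d5fcfda0' /tmp/timer-trace.txt
    # tick/timer activity on the CPU the timer was queued on
    grep '\[036\]' /tmp/timer-trace.txt | tail -50
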
> [ 9442.413624] INFO: rcu_sched detected stalls on CPUs/tasks:
> [ 9442.419107] 	1-...: (1 GPs behind) idle„4/0/0 softirq'747/27755 fqs=0 last_accelerate: dd6a/de80, nonlazy_posted: 0, L.
> [ 9442.430224] 	3-...: (2 GPs behind) idle8/0/0 softirq2197/32198 fqs=0 last_accelerate: 29b1/de80, nonlazy_posted: 0, L.
> [ 9442.441340] 	4-...: (7 GPs behind) idlet0/0/0 softirq"351/22352 fqs=0 last_accelerate: ca88/de80, nonlazy_posted: 0, L.
> [ 9442.452456] 	5-...: (2 GPs behind) idle›0/0/0 softirq!315/21319 fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
> [ 9442.463572] 	6-...: (2 GPs behind) idley4/0/0 softirq\x19699/19707 fqs=0 last_accelerate: ba62/de88, nonlazy_posted: 0, L.
> [ 9442.474688] 	7-...: (2 GPs behind) idle¬4/0/0 softirq"547/22554 fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
> [ 9442.485803] 	8-...: (9 GPs behind) idle\x118/0/0 softirq(1/291 fqs=0 last_accelerate: c3fe/de88, nonlazy_posted: 0, L.
> [ 9442.496571] 	9-...: (9 GPs behind) idlec/0/0 softirq(4/292 fqs=0 last_accelerate: 6030/de88, nonlazy_posted: 0, L.
> [ 9442.507339] 	10-...: (14 GPs behind) idle÷8/0/0 softirq%4/254 fqs=0 last_accelerate: 5487/de88, nonlazy_posted: 0, L.
> [ 9442.518281] 	11-...: (9 GPs behind) idleÉc/0/0 softirq01/308 fqs=0 last_accelerate: 3d3e/de99, nonlazy_posted: 0, L.
> [ 9442.529136] 	12-...: (9 GPs behind) idleJ4/0/0 softirqs5/737 fqs=0 last_accelerate: 6010/de99, nonlazy_posted: 0, L.
> [ 9442.539992] 	13-...: (9 GPs behind) idle4c/0/0 softirq\x1121/1131 fqs=0 last_accelerate: b280/de99, nonlazy_posted: 0, L.
> [ 9442.551020] 	14-...: (9 GPs behind) idle/4/0/0 softirqp7/713 fqs=0 last_accelerate: 6030/de99, nonlazy_posted: 0, L.
> [ 9442.561875] 	15-...: (2 GPs behind) idle³0/0/0 softirq‚1/976 fqs=0 last_accelerate: c208/de99, nonlazy_posted: 0, L.
> [ 9442.572730] 	17-...: (2 GPs behind) idleZ8/0/0 softirq\x1456/1565 fqs=0 last_accelerate: ca88/de99, nonlazy_posted: 0, L.
> [ 9442.583759] 	18-...: (2 GPs behind) idle.4/0/0 softirq\x1923/1936 fqs=0 last_accelerate: ca88/dea7, nonlazy_posted: 0, L.
> [ 9442.594787] 	19-...: (2 GPs behind) idle\x138/0/0 softirq\x1421/1432 fqs=0 last_accelerate: b280/dea7, nonlazy_posted: 0, L.
> [ 9442.605816] 	20-...: (50 GPs behind) idlec4/0/0 softirq!7/219 fqs=0 last_accelerate: c96f/dea7, nonlazy_posted: 0, L.
> [ 9442.616758] 	21-...: (2 GPs behind) idleë8/0/0 softirq\x1368/1369 fqs=0 last_accelerate: b599/deb2, nonlazy_posted: 0, L.
> [ 9442.627786] 	22-...: (1 GPs behind) idleª8/0/0 softirq"9/232 fqs=0 last_accelerate: c604/deb2, nonlazy_posted: 0, L.
> [ 9442.638641] 	23-...: (1 GPs behind) idleH8/0/0 softirq$7/248 fqs=0 last_accelerate: c600/deb2, nonlazy_posted: 0, L.
> [ 9442.649496] 	24-...: (33 GPs behind) idle÷c/0/0 softirq19/319 fqs=0 last_accelerate: 5290/deb2, nonlazy_posted: 0, L.
> [ 9442.660437] 	25-...: (33 GPs behind) idle”4/0/0 softirq08/308 fqs=0 last_accelerate: 52c0/deb2, nonlazy_posted: 0, L.
> [ 9442.671379] 	26-...: (9 GPs behind) idlem4/0/0 softirq&5/275 fqs=0 last_accelerate: 6034/dec0, nonlazy_posted: 0, L.
> [ 9442.682234] 	27-...: (115 GPs behind) idleãc/0/0 softirq!2/226 fqs=0 last_accelerate: 5420/dec0, nonlazy_posted: 0, L.
> [ 9442.693263] 	28-...: (9 GPs behind) idleê4/0/0 softirqT0/552 fqs=0 last_accelerate: 603c/dec0, nonlazy_posted: 0, L.
> [ 9442.704118] 	29-...: (115 GPs behind) idleƒc/0/0 softirq42/380 fqs=0 last_accelerate: 5420/dec0, nonlazy_posted: 0, L.
> [ 9442.715147] 	30-...: (33 GPs behind) idleãc/0/0 softirqP9/509 fqs=0 last_accelerate: 52bc/dec0, nonlazy_posted: 0, L.
> [ 9442.726088] 	31-...: (9 GPs behind) idleß4/0/0 softirqa9/641 fqs=0 last_accelerate: 603c/decb, nonlazy_posted: 0, L.
> [ 9442.736944] 	32-...: (9 GPs behind) idleª4/0/0 softirq\x1841/1848 fqs=0 last_accelerate: 6030/decb, nonlazy_posted: 0, L.
> [ 9442.747972] 	34-...: (9 GPs behind) idleæc/0/0 softirqP82/5086 fqs=0 last_accelerate: 6039/decb, nonlazy_posted: 0, L.
> [ 9442.759001] 	35-...: (9 GPs behind) idle\x7fc/0/0 softirq\x1396/1406 fqs=0 last_accelerate: 603e/decb, nonlazy_posted: 0, L.
> [ 9442.770030] 	36-...: (0 ticks this GP) idleò8/0/0 softirq%5/255 fqs=0 last_accelerate: c9fc/decb, nonlazy_posted: 0, L.
> [ 9442.781145] 	37-...: (50 GPs behind) idleSc/0/0 softirq"7/230 fqs=0 last_accelerate: 45c0/decb, nonlazy_posted: 0, L.
> [ 9442.792087] 	38-...: (9 GPs behind) idle•8/0/0 softirq\x185/192 fqs=0 last_accelerate: 6030/decb, nonlazy_posted: 0, L.
> [ 9442.802942] 	40-...: (389 GPs behind) idleAc/0/0 softirq\x131/136 fqs=0 last_accelerate: 5800/decb, nonlazy_posted: 0, L.
> [ 9442.813971] 	41-...: (389 GPs behind) idle%8/0/0 softirq\x133/138 fqs=0 last_accelerate: c00f/decb, nonlazy_posted: 0, L.
> [ 9442.825000] 	43-...: (50 GPs behind) idle%4/0/0 softirq\x113/117 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
> [ 9442.835942] 	44-...: (115 GPs behind) idle\x178/0/0 softirq\x1271/1276 fqs=0 last_accelerate: 68e9/dee5, nonlazy_posted: 0, L.
> [ 9442.847144] 	45-...: (2 GPs behind) idle\x04a/1/0 softirq64/389 fqs=0 last_accelerate: dee5/dee5, nonlazy_posted: 0, L.
> [ 9442.857999] 	46-...: (9 GPs behind) idleì4/0/0 softirq\x183/189 fqs=0 last_accelerate: 6030/dee5, nonlazy_posted: 0, L.
> [ 9442.868854] 	47-...: (115 GPs behind) idle\b8/0/0 softirq\x135/149 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
> [ 9442.879883] 	48-...: (389 GPs behind) idle 0/0/0 softirq\x103/110 fqs=0 last_accelerate: 58b0/dee5, nonlazy_posted: 0, L.
> [ 9442.890911] 	49-...: (9 GPs behind) idle¢4/0/0 softirq 5/211 fqs=0 last_accelerate: 6030/dee5, nonlazy_posted: 0, L.
> [ 9442.901766] 	50-...: (25 GPs behind) idle§4/0/0 softirq\x144/144 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
> [ 9442.912708] 	51-...: (50 GPs behind) idleö8/0/0 softirq\x116/122 fqs=0 last_accelerate: 57bc/dee5, nonlazy_posted: 0, L.
> [ 9442.923650] 	52-...: (9 GPs behind) idleà8/0/0 softirq 2/486 fqs=0 last_accelerate: c87f/defe, nonlazy_posted: 0, L.
> [ 9442.934505] 	53-...: (2 GPs behind) idle\x128/0/0 softirq65/366 fqs=0 last_accelerate: ca88/defe, nonlazy_posted: 0, L.
> [ 9442.945360] 	54-...: (9 GPs behind) idleÎ8/0/0 softirq\x126/373 fqs=0 last_accelerate: bef8/defe, nonlazy_posted: 0, L.
> [ 9442.956215] 	56-...: (9 GPs behind) idle30/0/0 softirq!16/2126 fqs=0 last_accelerate: 6030/defe, nonlazy_posted: 0, L.
> [ 9442.967243] 	57-...: (1 GPs behind) idle(8/0/0 softirq\x1707/1714 fqs=0 last_accelerate: c87c/defe, nonlazy_posted: 0, L.
> [ 9442.978272] 	58-...: (37 GPs behind) idle90/0/0 softirq\x1716/1721 fqs=0 last_accelerate: 53f7/defe, nonlazy_posted: 0, L.
> [ 9442.989387] 	59-...: (37 GPs behind) idleå4/0/0 softirq\x1700/1701 fqs=0 last_accelerate: 40a1/defe, nonlazy_posted: 0, L.
> [ 9443.000502] 	60-...: (116 GPs behind) idle{4/0/0 softirq’/96 fqs=0 last_accelerate: 57d8/df10, nonlazy_posted: 0, L.
> [ 9443.011357] 	61-...: (9 GPs behind) idle8/0/0 softirq\x161/170 fqs=0 last_accelerate: 6030/df10, nonlazy_posted: 0, L.
> [ 9443.022212] 	62-...: (115 GPs behind) idleª8/0/0 softirq•/101 fqs=0 last_accelerate: 5420/df17, nonlazy_posted: 0, L.
> [ 9443.033154] 	63-...: (50 GPs behind) idle•8/0/0 softirq/84 fqs=0 last_accelerate: 57b8/df17, nonlazy_posted: 0, L.
> [ 9443.043920] 	(detected by 39, t=5403 jiffies, g=443, c=442, q=1)
> [ 9443.049919] Task dump for CPU 1:
> [ 9443.053134] swapper/1       R  running task        0     0      1 0x00000000
> [ 9443.060173] Call trace:
> [ 9443.062619] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.067744] [<          (null)>]           (null)
> [ 9443.072434] Task dump for CPU 3:
> [ 9443.075650] swapper/3       R  running task        0     0      1 0x00000000
> [ 9443.082686] Call trace:
> [ 9443.085121] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.090246] [<          (null)>]           (null)
> [ 9443.094936] Task dump for CPU 4:
> [ 9443.098152] swapper/4       R  running task        0     0      1 0x00000000
> [ 9443.105188] Call trace:
> [ 9443.107623] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.112752] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.118224] Task dump for CPU 5:
> [ 9443.121440] swapper/5       R  running task        0     0      1 0x00000000
> [ 9443.128476] Call trace:
> [ 9443.130910] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.136035] [<          (null)>]           (null)
> [ 9443.140725] Task dump for CPU 6:
> [ 9443.143941] swapper/6       R  running task        0     0      1 0x00000000
> [ 9443.150976] Call trace:
> [ 9443.153411] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.158535] [<          (null)>]           (null)
> [ 9443.163226] Task dump for CPU 7:
> [ 9443.166442] swapper/7       R  running task        0     0      1 0x00000000
> [ 9443.173478] Call trace:
> [ 9443.175912] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.181037] [<          (null)>]           (null)
> [ 9443.185727] Task dump for CPU 8:
> [ 9443.188943] swapper/8       R  running task        0     0      1 0x00000000
> [ 9443.195979] Call trace:
> [ 9443.198412] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.203537] [<          (null)>]           (null)
> [ 9443.208227] Task dump for CPU 9:
> [ 9443.211443] swapper/9       R  running task        0     0      1 0x00000000
> [ 9443.218479] Call trace:
> [ 9443.220913] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.226039] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.231510] Task dump for CPU 10:
> [ 9443.234812] swapper/10      R  running task        0     0      1 0x00000000
> [ 9443.241848] Call trace:
> [ 9443.244283] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.249408] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.254879] Task dump for CPU 11:
> [ 9443.258182] swapper/11      R  running task        0     0      1 0x00000000
> [ 9443.265218] Call trace:
> [ 9443.267652] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.272776] [<          (null)>]           (null)
> [ 9443.277467] Task dump for CPU 12:
> [ 9443.280769] swapper/12      R  running task        0     0      1 0x00000000
> [ 9443.287806] Call trace:
> [ 9443.290240] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.295364] [<          (null)>]           (null)
> [ 9443.300054] Task dump for CPU 13:
> [ 9443.303357] swapper/13      R  running task        0     0      1 0x00000000
> [ 9443.310394] Call trace:
> [ 9443.312828] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.317953] [<          (null)>]           (null)
> [ 9443.322643] Task dump for CPU 14:
> [ 9443.325945] swapper/14      R  running task        0     0      1 0x00000000
> [ 9443.332981] Call trace:
> [ 9443.335416] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.340540] [<          (null)>]           (null)
> [ 9443.345230] Task dump for CPU 15:
> [ 9443.348533] swapper/15      R  running task        0     0      1 0x00000000
> [ 9443.355568] Call trace:
> [ 9443.358002] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.363128] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.368599] Task dump for CPU 17:
> [ 9443.371901] swapper/17      R  running task        0     0      1 0x00000000
> [ 9443.378937] Call trace:
> [ 9443.381372] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.386497] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.391968] Task dump for CPU 18:
> [ 9443.395270] swapper/18      R  running task        0     0      1 0x00000000
> [ 9443.402306] Call trace:
> [ 9443.404740] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.409865] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.415336] Task dump for CPU 19:
> [ 9443.418639] swapper/19      R  running task        0     0      1 0x00000000
> [ 9443.425675] Call trace:
> [ 9443.428109] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.433234] [<          (null)>]           (null)
> [ 9443.437924] Task dump for CPU 20:
> [ 9443.441226] swapper/20      R  running task        0     0      1 0x00000000
> [ 9443.448263] Call trace:
> [ 9443.450697] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.455826] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9443.462600] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9443.467986] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.473458] Task dump for CPU 21:
> [ 9443.476760] swapper/21      R  running task        0     0      1 0x00000000
> [ 9443.483796] Call trace:
> [ 9443.486230] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.491354] [<          (null)>]           (null)
> [ 9443.496045] Task dump for CPU 22:
> [ 9443.499347] swapper/22      R  running task        0     0      1 0x00000000
> [ 9443.506383] Call trace:
> [ 9443.508817] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.513943] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.519414] Task dump for CPU 23:
> [ 9443.522716] swapper/23      R  running task        0     0      1 0x00000000
> [ 9443.529752] Call trace:
> [ 9443.532186] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.537312] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.542784] Task dump for CPU 24:
> [ 9443.546086] swapper/24      R  running task        0     0      1 0x00000000
> [ 9443.553122] Call trace:
> [ 9443.555556] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.560681] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.566153] Task dump for CPU 25:
> [ 9443.569455] swapper/25      R  running task        0     0      1 0x00000000
> [ 9443.576491] Call trace:
> [ 9443.578925] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.584051] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9443.590825] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9443.596211] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.601682] Task dump for CPU 26:
> [ 9443.604985] swapper/26      R  running task        0     0      1 0x00000000
> [ 9443.612021] Call trace:
> [ 9443.614455] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.619581] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.625052] Task dump for CPU 27:
> [ 9443.628355] swapper/27      R  running task        0     0      1 0x00000000
> [ 9443.635390] Call trace:
> [ 9443.637824] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.642949] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.648421] Task dump for CPU 28:
> [ 9443.651723] swapper/28      R  running task        0     0      1 0x00000000
> [ 9443.658759] Call trace:
> [ 9443.661193] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.666318] [<          (null)>]           (null)
> [ 9443.671008] Task dump for CPU 29:
> [ 9443.674310] swapper/29      R  running task        0     0      1 0x00000000
> [ 9443.681346] Call trace:
> [ 9443.683780] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.688905] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.694377] Task dump for CPU 30:
> [ 9443.697679] swapper/30      R  running task        0     0      1 0x00000000
> [ 9443.704715] Call trace:
> [ 9443.707150] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.712275] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9443.719050] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9443.724436] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.729907] Task dump for CPU 31:
> [ 9443.733210] swapper/31      R  running task        0     0      1 0x00000000
> [ 9443.740246] Call trace:
> [ 9443.742680] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.747805] [<          (null)>]           (null)
> [ 9443.752496] Task dump for CPU 32:
> [ 9443.755798] swapper/32      R  running task        0     0      1 0x00000000
> [ 9443.762833] Call trace:
> [ 9443.765267] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.770392] [<          (null)>]           (null)
> [ 9443.775082] Task dump for CPU 34:
> [ 9443.778384] swapper/34      R  running task        0     0      1 0x00000000
> [ 9443.785420] Call trace:
> [ 9443.787854] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.792980] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.798451] Task dump for CPU 35:
> [ 9443.801753] swapper/35      R  running task        0     0      1 0x00000000
> [ 9443.808789] Call trace:
> [ 9443.811224] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.816348] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.821820] Task dump for CPU 36:
> [ 9443.825122] swapper/36      R  running task        0     0      1 0x00000000
> [ 9443.832158] Call trace:
> [ 9443.834592] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.839718] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9443.846493] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9443.851878] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.857350] Task dump for CPU 37:
> [ 9443.860652] swapper/37      R  running task        0     0      1 0x00000000
> [ 9443.867688] Call trace:
> [ 9443.870122] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.875248] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9443.882022] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9443.887408] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.892880] Task dump for CPU 38:
> [ 9443.896182] swapper/38      R  running task        0     0      1 0x00000000
> [ 9443.903218] Call trace:
> [ 9443.905652] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.910776] [<          (null)>]           (null)
> [ 9443.915466] Task dump for CPU 40:
> [ 9443.918769] swapper/40      R  running task        0     0      1 0x00000000
> [ 9443.925805] Call trace:
> [ 9443.928239] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.933365] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.938836] Task dump for CPU 41:
> [ 9443.942138] swapper/41      R  running task        0     0      1 0x00000000
> [ 9443.949174] Call trace:
> [ 9443.951609] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.956733] [<          (null)>]           (null)
> [ 9443.961423] Task dump for CPU 43:
> [ 9443.964725] swapper/43      R  running task        0     0      1 0x00000000
> [ 9443.971761] Call trace:
> [ 9443.974195] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.979320] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.984791] Task dump for CPU 44:
> [ 9443.988093] swapper/44      R  running task        0     0      1 0x00000000
> [ 9443.995130] Call trace:
> [ 9443.997564] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.002688] [<          (null)>]           (null)
> [ 9444.007378] Task dump for CPU 45:
> [ 9444.010680] swapper/45      R  running task        0     0      1 0x00000000
> [ 9444.017716] Call trace:
> [ 9444.020151] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.025275] [<          (null)>]           (null)
> [ 9444.029965] Task dump for CPU 46:
> [ 9444.033267] swapper/46      R  running task        0     0      1 0x00000000
> [ 9444.040302] Call trace:
> [ 9444.042737] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.047862] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.053333] Task dump for CPU 47:
> [ 9444.056636] swapper/47      R  running task        0     0      1 0x00000000
> [ 9444.063672] Call trace:
> [ 9444.066106] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.071231] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.076702] Task dump for CPU 48:
> [ 9444.080004] swapper/48      R  running task        0     0      1 0x00000000
> [ 9444.087041] Call trace:
> [ 9444.089475] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.094600] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.100071] Task dump for CPU 49:
> [ 9444.103374] swapper/49      R  running task        0     0      1 0x00000000
> [ 9444.110409] Call trace:
> [ 9444.112844] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.117968] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.123440] Task dump for CPU 50:
> [ 9444.126742] swapper/50      R  running task        0     0      1 0x00000000
> [ 9444.133777] Call trace:
> [ 9444.136211] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.141336] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.146807] Task dump for CPU 51:
> [ 9444.150109] swapper/51      R  running task        0     0      1 0x00000000
> [ 9444.157144] Call trace:
> [ 9444.159578] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.164703] [<          (null)>]           (null)
> [ 9444.169393] Task dump for CPU 52:
> [ 9444.172695] swapper/52      R  running task        0     0      1 0x00000000
> [ 9444.179731] Call trace:
> [ 9444.182165] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.187290] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.192761] Task dump for CPU 53:
> [ 9444.196063] swapper/53      R  running task        0     0      1 0x00000000
> [ 9444.203099] Call trace:
> [ 9444.205533] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.210658] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.216129] Task dump for CPU 54:
> [ 9444.219431] swapper/54      R  running task        0     0      1 0x00000000
> [ 9444.226467] Call trace:
> [ 9444.228901] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.234026] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.239498] Task dump for CPU 56:
> [ 9444.242801] swapper/56      R  running task        0     0      1 0x00000000
> [ 9444.249837] Call trace:
> [ 9444.252271] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.257396] [<          (null)>]           (null)
> [ 9444.262086] Task dump for CPU 57:
> [ 9444.265388] swapper/57      R  running task        0     0      1 0x00000000
> [ 9444.272424] Call trace:
> [ 9444.274858] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.279982] [<          (null)>]           (null)
> [ 9444.284672] Task dump for CPU 58:
> [ 9444.287975] swapper/58      R  running task        0     0      1 0x00000000
> [ 9444.295011] Call trace:
> [ 9444.297445] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.302570] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9444.309345] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9444.314731] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.320202] Task dump for CPU 59:
> [ 9444.323504] swapper/59      R  running task        0     0      1 0x00000000
> [ 9444.330540] Call trace:
> [ 9444.332974] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.338100] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.343571] Task dump for CPU 60:
> [ 9444.346873] swapper/60      R  running task        0     0      1 0x00000000
> [ 9444.353909] Call trace:
> [ 9444.356343] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.361469] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9444.368243] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9444.373629] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.379101] Task dump for CPU 61:
> [ 9444.382402] swapper/61      R  running task        0     0      1 0x00000000
> [ 9444.389438] Call trace:
> [ 9444.391872] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.396997] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.402469] Task dump for CPU 62:
> [ 9444.405771] swapper/62      R  running task        0     0      1 0x00000000
> [ 9444.412808] Call trace:
> [ 9444.415242] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.420367] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.425838] Task dump for CPU 63:
> [ 9444.429141] swapper/63      R  running task        0     0      1 0x00000000
> [ 9444.436177] Call trace:
> [ 9444.438611] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.443736] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.449211] rcu_sched kthread starved for 5743 jiffies! g443 c442 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> [ 9444.458416] rcu_sched       S    0    10      2 0x00000000
> [ 9444.463889] Call trace:
> [ 9444.466324] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.471453] [<ffff000008ab70a4>] __schedule+0x1a4/0x720
> [ 9444.476665] [<ffff000008ab7660>] schedule+0x40/0xa8
> [ 9444.481530] [<ffff000008abac70>] schedule_timeout+0x178/0x358
> [ 9444.487263] [<ffff00000813e694>] rcu_gp_kthread+0x534/0x7b8
> [ 9444.492824] [<ffff0000080f33d0>] kthread+0x108/0x138
> [ 9444.497775] [<ffff0000080836c0>] ret_from_fork+0x10/0x50
> 
> 
> 
> And the relevant chunk of trace is:
> (I have a lot more.  There are other substantial pauses from time to time, but not this long.)
> 
> 
>      rcu_preempt-9     [057] ....  9419.837631: timer_init: timer=ffff8017d5fcfda0
>      rcu_preempt-9     [057] d..1  9419.837632: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297246837 [timeout=1] cpu=57 idx=0 flags=
>           <idle>-0     [057] d..1  9419.837634: tick_stop: success=1 dependency=NONE
>           <idle>-0     [057] d..2  9419.837634: hrtimer_cancel: hrtimer=ffff8017db99e808
>           <idle>-0     [057] d..2  9419.837635: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418164000000 softexpires=9418164000000
>           <idle>-0     [057] d.h2  9419.845621: hrtimer_cancel: hrtimer=ffff8017db99e808
>           <idle>-0     [057] d.h1  9419.845621: hrtimer_expire_entry: hrtimer=ffff8017db99e808 function=tick_sched_timer now=9418164001440
>           <idle>-0     [057] d.h1  9419.845622: hrtimer_expire_exit: hrtimer=ffff8017db99e808
>           <idle>-0     [057] d.s2  9419.845623: timer_cancel: timer=ffff8017d5fcfda0
>           <idle>-0     [057] ..s1  9419.845623: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297246838
>           <idle>-0     [057] .ns1  9419.845624: timer_expire_exit: timer=ffff8017d5fcfda0
>           <idle>-0     [057] dn.2  9419.845628: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418168000000 softexpires=9418168000000
>           <idle>-0     [057] d..1  9419.845635: tick_stop: success=1 dependency=NONE
>           <idle>-0     [057] d..2  9419.845636: hrtimer_cancel: hrtimer=ffff8017db99e808
>           <idle>-0     [057] d..2  9419.845636: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418188000000 softexpires=9418188000000
>           <idle>-0     [057] d.h2  9419.869621: hrtimer_cancel: hrtimer=ffff8017db99e808
>           <idle>-0     [057] d.h1  9419.869621: hrtimer_expire_entry: hrtimer=ffff8017db99e808 function=tick_sched_timer now=9418188001420
>           <idle>-0     [057] d.h1  9419.869622: hrtimer_expire_exit: hrtimer=ffff8017db99e808
>           <idle>-0     [057] d..2  9419.869626: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9858983202655 softexpires=9858983202655
>           <idle>-0     [016] d.h2  9419.885626: hrtimer_cancel: hrtimer=ffff8017fbc3d808
>           <idle>-0     [016] d.h1  9419.885627: hrtimer_expire_entry: hrtimer=ffff8017fbc3d808 function=tick_sched_timer now=9418204006760
>           <idle>-0     [016] d.h1  9419.885629: hrtimer_expire_exit: hrtimer=ffff8017fbc3d808
>           <idle>-0     [016] d.s2  9419.885629: timer_cancel: timer=ffff8017d37dbca0
>           <idle>-0     [016] ..s1  9419.885630: timer_expire_entry: timer=ffff8017d37dbca0 function=process_timeout now=4297246848
>           <idle>-0     [016] .ns1  9419.885631: timer_expire_exit: timer=ffff8017d37dbca0
>           <idle>-0     [016] dn.2  9419.885636: hrtimer_start: hrtimer=ffff8017fbc3d808 function=tick_sched_timer expires=9418208000000 softexpires=9418208000000
>       khugepaged-778   [016] ....  9419.885668: timer_init: timer=ffff8017d37dbca0
>       khugepaged-778   [016] d..1  9419.885668: timer_start: timer=ffff8017d37dbca0 function=process_timeout expires=4297249348 [timeout=2500] cpu=16 idx=0 flags=
>           <idle>-0     [016] d..1  9419.885670: tick_stop: success=1 dependency=NONE
>           <idle>-0     [016] d..2  9419.885671: hrtimer_cancel: hrtimer=ffff8017fbc3d808
>           <idle>-0     [016] d..2  9419.885671: hrtimer_start: hrtimer=ffff8017fbc3d808 function=tick_sched_timer expires=9428444000000 softexpires=9428444000000
>           <idle>-0     [045] d.h2  9419.890839: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9419.890839: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418209219940
>           <idle>-0     [045] d.h3  9419.890844: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418310221420 softexpires=9418309221420
>           <idle>-0     [045] d.h1  9419.890844: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [000] d.h2  9419.917625: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.h1  9419.917626: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9418236005860
>           <idle>-0     [000] d.h1  9419.917628: hrtimer_expire_exit: hrtimerÿff8017fbe40808
>           <idle>-0     [000] d.s2  9419.917628: timer_cancel: timerÿff80177fdc0840
>           <idle>-0     [000] ..s1  9419.917629: timer_expire_entry: timerÿff80177fdc0840 function=link_timeout_disable_link nowB97246856
>           <idle>-0     [000] d.s2  9419.917630: timer_start: timerÿff80177fdc0840 function=link_timeout_enable_link expiresB97246881 [timeout%] cpu=0 idx flags>           <idle>-0     [000] ..s1  9419.917633: timer_expire_exit: timerÿff80177fdc0840
>           <idle>-0     [000] d..2  9419.917648: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”18340000000 softexpires”18340000000
>           <idle>-0     [045] d.h2  9419.991845: hrtimer_cancel: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h1  9419.991845: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18310225960
>           <idle>-0     [045] d.h3  9419.991849: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”18411227320 softexpires”18410227320
>           <idle>-0     [045] d.h1  9419.991850: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
>           <idle>-0     [000] d.h2  9420.021625: hrtimer_cancel: hrtimerÿff8017fbe40808
>           <idle>-0     [000] d.h1  9420.021625: hrtimer_expire_entry: hrtimerÿff8017fbe40808 function=tick_sched_timer now”18340005520
>           <idle>-0     [000] d.h1  9420.021627: hrtimer_expire_exit: hrtimerÿff8017fbe40808
>           <idle>-0     [000] d.s2  9420.021627: timer_cancel: timerÿff80177fdc0840
>           <idle>-0     [000] ..s1  9420.021628: timer_expire_entry: timerÿff80177fdc0840 function=link_timeout_enable_link nowB97246882
>           <idle>-0     [000] d.s2  9420.021629: timer_start: timerÿff80177fdc0840 function=link_timeout_disable_link expiresB97247107 [timeout"5] cpu=0 idx4 flags>           <idle>-0     [000] ..s1  9420.021632: timer_expire_exit: timerÿff80177fdc0840
>           <idle>-0     [000] d..2  9420.021639: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19260000000 softexpires”19260000000
>           <idle>-0     [045] d.h2  9420.092851: hrtimer_cancel: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.092852: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18411231780
>           <idle>-0     [045] d.h3  9420.092856: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”18512233720 softexpires”18511233720
>           <idle>-0     [045] d.h1  9420.092856: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
>           <idle>-0     [055] d.h2  9420.141622: hrtimer_cancel: hrtimerÿff8017db968808
>           <idle>-0     [055] d.h1  9420.141623: hrtimer_expire_entry: hrtimerÿff8017db968808 function=tick_sched_timer now”18460002540
>           <idle>-0     [055] d.h1  9420.141625: hrtimer_expire_exit: hrtimerÿff8017db968808
>           <idle>-0     [055] d.s2  9420.141626: timer_cancel: timerÿff80177db6cc08
>           <idle>-0     [055] d.s1  9420.141626: timer_expire_entry: timerÿff80177db6cc08 functionÞlayed_work_timer_fn nowB97246912
>           <idle>-0     [055] dns1  9420.141628: timer_expire_exit: timerÿff80177db6cc08
>           <idle>-0     [055] dn.2  9420.141632: hrtimer_start: hrtimerÿff8017db968808 function=tick_sched_timer expires”18464000000 softexpires”18464000000
>     kworker/55:1-1246  [055] d..1  9420.141634: timer_start: timerÿff80177db6cc08 functionÞlayed_work_timer_fn expiresB97247162 [timeout%0] cpuU idxˆ flags=I
>           <idle>-0     [055] d..1  9420.141637: tick_stop: success=1 dependency=NONE
>           <idle>-0     [055] d..2  9420.141637: hrtimer_cancel: hrtimerÿff8017db968808
>           <idle>-0     [055] d..2  9420.141637: hrtimer_start: hrtimerÿff8017db968808 function=tick_sched_timer expires”19484000000 softexpires”19484000000
>           <idle>-0     [045] d.h2  9420.193855: hrtimer_cancel: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.193855: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18512235660
>           <idle>-0     [045] d.h3  9420.193859: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”18613237260 softexpires”18612237260
>           <idle>-0     [045] d.h1  9420.193860: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h2  9420.294858: hrtimer_cancel: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.294858: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18613238380
>           <idle>-0     [045] d.h3  9420.294862: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”18714240000 softexpires”18713240000
>           <idle>-0     [045] d.h1  9420.294863: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h2  9420.395861: hrtimer_cancel: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.395861: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18714241380
>           <idle>-0     [045] d.h3  9420.395865: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”18815242920 softexpires”18814242920
>           <idle>-0     [045] d.h1  9420.395865: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
>           <idle>-0     [042] d.h2  9420.461621: hrtimer_cancel: hrtimerÿff8017dbb69808
>           <idle>-0     [042] d.h1  9420.461622: hrtimer_expire_entry: hrtimerÿff8017dbb69808 function=tick_sched_timer now”18780002180
>           <idle>-0     [042] d.h1  9420.461623: hrtimer_expire_exit: hrtimerÿff8017dbb69808
>           <idle>-0     [042] d.s2  9420.461624: timer_cancel: timerÿff80177db6d408
>           <idle>-0     [042] d.s1  9420.461625: timer_expire_entry: timerÿff80177db6d408 functionÞlayed_work_timer_fn nowB97246992
>           <idle>-0     [042] dns1  9420.461627: timer_expire_exit: timerÿff80177db6d408
>           <idle>-0     [042] dns2  9420.461627: timer_cancel: timerÿff8017797d7868
>           <idle>-0     [042] .ns1  9420.461628: timer_expire_entry: timerÿff8017797d7868 function=hns_nic_service_timer nowB97246992
>           <idle>-0     [042] dns2  9420.461628: timer_start: timerÿff8017797d7868 function=hns_nic_service_timer expiresB97247242 [timeout%0] cpuB idx˜ flags>           <idle>-0     [042] .ns1  9420.461629: timer_expire_exit: timerÿff8017797d7868
>           <idle>-0     [042] dn.2  9420.461632: hrtimer_start: hrtimerÿff8017dbb69808 function=tick_sched_timer expires”18784000000 softexpires”18784000000
>     kworker/42:1-1223  [042] d..1  9420.461773: timer_start: timerÿff80177db6d408 functionÞlayed_work_timer_fn expiresB97247242 [timeout%0] cpuB idx˜ flags=I
>           <idle>-0     [042] d..1  9420.461866: tick_stop: success=1 dependency=NONE
>           <idle>-0     [042] d..2  9420.461867: hrtimer_cancel: hrtimerÿff8017dbb69808
>           <idle>-0     [042] d..2  9420.461867: hrtimer_start: hrtimerÿff8017dbb69808 function=tick_sched_timer expires”19804000000 softexpires”19804000000
>           <idle>-0     [045] d.h2  9420.496864: hrtimer_cancel: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.496864: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18815244580
>           <idle>-0     [045] d.h3  9420.496868: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”18916246140 softexpires”18915246140
>           <idle>-0     [045] d.h1  9420.496868: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h2  9420.597866: hrtimer_cancel: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.597867: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18916247280
>           <idle>-0     [045] d.h3  9420.597871: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19017248760 softexpires”19016248760
>           <idle>-0     [045] d.h1  9420.597871: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
>           <idle>-0     [033] d.h2  9420.621621: hrtimer_cancel: hrtimerÿff8017dba76808
>           <idle>-0     [033] d.h1  9420.621622: hrtimer_expire_entry: hrtimerÿff8017dba76808 function=tick_sched_timer now”18940002160
>           <idle>-0     [033] d.h1  9420.621623: hrtimer_expire_exit: hrtimerÿff8017dba76808
>           <idle>-0     [033] d.s2  9420.621624: timer_cancel: timerÿff00000917be40
>           <idle>-0     [033] d.s1  9420.621625: timer_expire_entry: timerÿff00000917be40 functionÞlayed_work_timer_fn nowB97247032
>           <idle>-0     [033] dns1  9420.621626: timer_expire_exit: timerÿff00000917be40
>           <idle>-0     [033] dn.2  9420.621630: hrtimer_start: hrtimerÿff8017dba76808 function=tick_sched_timer expires”18944000000 softexpires”18944000000
>            <...>-1631  [033] d..1  9420.621636: timer_start: timerÿff00000917be40 functionÞlayed_work_timer_fn expiresB97247282 [timeout%0] cpu3 idx\x103 flags=I
>           <idle>-0     [033] d..1  9420.621639: tick_stop: success=1 dependency=NONE
>           <idle>-0     [033] d..2  9420.621639: hrtimer_cancel: hrtimerÿff8017dba76808
>           <idle>-0     [033] d..2  9420.621639: hrtimer_start: hrtimerÿff8017dba76808 function=tick_sched_timer expires”19964000000 softexpires”19964000000
>           <idle>-0     [000] dn.2  9420.691401: hrtimer_cancel: hrtimerÿff8017fbe40808
>           <idle>-0     [000] dn.2  9420.691401: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19012000000 softexpires”19012000000
>           <idle>-0     [002] dn.2  9420.691408: hrtimer_cancel: hrtimerÿff8017fbe76808
>           <idle>-0     [002] dn.2  9420.691408: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”19012000000 softexpires”19012000000
>           <idle>-0     [000] d..1  9420.691409: tick_stop: success=1 dependency=NONE
>           <idle>-0     [000] d..2  9420.691409: hrtimer_cancel: hrtimerÿff8017fbe40808
>           <idle>-0     [000] d..2  9420.691409: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19260000000 softexpires”19260000000
>           <idle>-0     [002] d..1  9420.691423: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9420.691423: hrtimer_cancel: hrtimerÿff8017fbe76808
>           <idle>-0     [002] d..2  9420.691424: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires˜59803202655 softexpires˜59803202655
>           <idle>-0     [045] d.h2  9420.698872: hrtimer_cancel: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.698873: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19017253180
>           <idle>-0     [045] d.h3  9420.698877: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19118254640 softexpires”19117254640
>           <idle>-0     [045] d.h1  9420.698877: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h2  9420.799875: hrtimer_cancel: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.799875: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19118255760
>           <idle>-0     [045] d.h3  9420.799879: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19219257140 softexpires”19218257140
>           <idle>-0     [045] d.h1  9420.799880: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
>           <idle>-0     [000] dn.2  9420.871369: hrtimer_cancel: hrtimerÿff8017fbe40808
>           <idle>-0     [000] dn.2  9420.871370: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19192000000 softexpires”19192000000
>           <idle>-0     [002] dn.2  9420.871375: hrtimer_cancel: hrtimerÿff8017fbe76808
>           <idle>-0     [000] d..1  9420.871376: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] dn.2  9420.871376: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”19192000000 softexpires”19192000000
>           <idle>-0     [000] d..2  9420.871376: hrtimer_cancel: hrtimerÿff8017fbe40808
>           <idle>-0     [000] d..2  9420.871376: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19260000000 softexpires”19260000000
>           <idle>-0     [002] d..1  9420.871398: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9420.871398: hrtimer_cancel: hrtimerÿff8017fbe76808
>           <idle>-0     [002] d..2  9420.871398: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires˜59983202655 softexpires˜59983202655
>           <idle>-0     [045] d.h2  9420.900881: hrtimer_cancel: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.900881: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19219261580
>           <idle>-0     [045] d.h3  9420.900885: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19320263160 softexpires”19319263160
>           <idle>-0     [045] d.h1  9420.900886: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
>           <idle>-0     [001] d..2  9420.913601: hrtimer_cancel: hrtimerÿff8017fbe5b808
>           <idle>-0     [001] d..2  9420.913601: hrtimer_start: hrtimerÿff8017fbe5b808 function=tick_sched_timer expires˜60023202655 softexpires˜60023202655
>           <idle>-0     [000] d.h2  9420.941621: hrtimer_cancel: hrtimerÿff8017fbe40808
>           <idle>-0     [000] d.h1  9420.941621: hrtimer_expire_entry: hrtimerÿff8017fbe40808 function=tick_sched_timer now”19260001400
>           <idle>-0     [000] d.h1  9420.941623: hrtimer_expire_exit: hrtimerÿff8017fbe40808
>           <idle>-0     [000] d.s2  9420.941623: timer_cancel: timerÿff80177fdc0840
>           <idle>-0     [000] ..s1  9420.941624: timer_expire_entry: timerÿff80177fdc0840 function=link_timeout_disable_link nowB97247112
>           <idle>-0     [000] d.s2  9420.941624: timer_start: timerÿff80177fdc0840 function=link_timeout_enable_link expiresB97247137 [timeout%] cpu=0 idx\x113 flags>           <idle>-0     [000] ..s1  9420.941628: timer_expire_exit: timerÿff80177fdc0840
>           <idle>-0     [000] d.s2  9420.941629: timer_cancel: timerÿff8017fbe42558
>           <idle>-0     [000] d.s1  9420.941629: timer_expire_entry: timerÿff8017fbe42558 functionÞlayed_work_timer_fn nowB97247112
>           <idle>-0     [000] dns1  9420.941630: timer_expire_exit: timerÿff8017fbe42558
>           <idle>-0     [000] dns2  9420.941631: timer_cancel: timerÿff00000910a628
>           <idle>-0     [000] dns1  9420.941631: timer_expire_entry: timerÿff00000910a628 functionÞlayed_work_timer_fn nowB97247112
>           <idle>-0     [000] dns1  9420.941631: timer_expire_exit: timerÿff00000910a628
>           <idle>-0     [000] dn.2  9420.941634: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19264000000 softexpires”19264000000
>           <idle>-0     [002] dn.2  9420.941643: hrtimer_cancel: hrtimerÿff8017fbe76808
>           <idle>-0     [002] dn.2  9420.941643: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”19264000000 softexpires”19264000000
>      kworker/0:0-3     [000] d..1  9420.941650: timer_start: timerÿff00000910a628 functionÞlayed_work_timer_fn expiresB97247500 [timeout88] cpu=0 idx\x100 flags=D|I
>      kworker/2:0-22    [002] d..1  9420.941651: timer_start: timerÿff8017fbe78558 functionÞlayed_work_timer_fn expiresB97247494 [timeout82] cpu=2 idx\x114 flags=D|I
>           <idle>-0     [000] d..1  9420.941652: tick_stop: success=1 dependency=NONE
>           <idle>-0     [000] d..2  9420.941652: hrtimer_cancel: hrtimerÿff8017fbe40808
>           <idle>-0     [000] d..2  9420.941653: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19364000000 softexpires”19364000000
>           <idle>-0     [002] d..1  9420.941654: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9420.941654: hrtimer_cancel: hrtimerÿff8017fbe76808
>           <idle>-0     [002] d..2  9420.941654: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires˜60055202655 softexpires˜60055202655
>           <idle>-0     [045] d.h2  9421.001887: hrtimer_cancel: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h1  9421.001887: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19320267640
>           <idle>-0     [045] d.h3  9421.001891: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19421269000 softexpires”19420269000
>           <idle>-0     [045] d.h1  9421.001892: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
>           <idle>-0     [000] d.h2  9421.045625: hrtimer_cancel: hrtimerÿff8017fbe40808
>           <idle>-0     [000] d.h1  9421.045625: hrtimer_expire_entry: hrtimerÿff8017fbe40808 function=tick_sched_timer now”19364005380
>           <idle>-0     [000] d.h1  9421.045626: hrtimer_expire_exit: hrtimerÿff8017fbe40808
>           <idle>-0     [000] d.s2  9421.045627: timer_cancel: timerÿff80177fdc0840
>           <idle>-0     [000] ..s1  9421.045628: timer_expire_entry: timerÿff80177fdc0840 function=link_timeout_enable_link nowB97247138
>           <idle>-0     [000] d.s2  9421.045628: timer_start: timerÿff80177fdc0840 function=link_timeout_disable_link expiresB97247363 [timeout"5] cpu=0 idx4 flags>           <idle>-0     [000] ..s1  9421.045631: timer_expire_exit: timerÿff80177fdc0840
>           <idle>-0     [000] d..2  9421.045644: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”20284000000 softexpires”20284000000
>           <idle>-0     [045] d.h2  9421.102893: hrtimer_cancel: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h1  9421.102893: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19421273420
>           <idle>-0     [045] d.h3  9421.102897: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19522275040 softexpires”19521275040
>           <idle>-0     [045] d.h1  9421.102897: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
>           <idle>-0     [055] d.h2  9421.165621: hrtimer_cancel: hrtimerÿff8017db968808
>           <idle>-0     [055] d.h1  9421.165622: hrtimer_expire_entry: hrtimerÿff8017db968808 function=tick_sched_timer now”19484002280
>           <idle>-0     [055] d.h1  9421.165624: hrtimer_expire_exit: hrtimerÿff8017db968808
>           <idle>-0     [055] d.s2  9421.165624: timer_cancel: timerÿff80177db6cc08
>           <idle>-0     [055] d.s1  9421.165625: timer_expire_entry: timerÿff80177db6cc08 functionÞlayed_work_timer_fn nowB97247168
>           <idle>-0     [055] dns1  9421.165626: timer_expire_exit: timerÿff80177db6cc08
>           <idle>-0     [055] dn.2  9421.165629: hrtimer_start: hrtimerÿff8017db968808 function=tick_sched_timer expires”19488000000 softexpires”19488000000
>     kworker/55:1-1246  [055] d..1  9421.165632: timer_start: timerÿff80177db6cc08 functionÞlayed_work_timer_fn expiresB97247418 [timeout%0] cpuU idx\x120 flags=I
>           <idle>-0     [055] d..1  9421.165634: tick_stop: success=1 dependency=NONE
>           <idle>-0     [055] d..2  9421.165634: hrtimer_cancel: hrtimerÿff8017db968808
>           <idle>-0     [055] d..2  9421.165635: hrtimer_start: hrtimerÿff8017db968808 function=tick_sched_timer expires”20508000000 softexpires”20508000000
>           <idle>-0     [045] d.h2  9421.203896: hrtimer_cancel: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h1  9421.203896: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19522276980
>           <idle>-0     [045] d.h3  9421.203900: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19623278460 softexpires”19622278460
>           <idle>-0     [045] d.h1  9421.203901: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h2  9421.304899: hrtimer_cancel: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h1  9421.304899: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19623279580
>           <idle>-0     [045] d.h3  9421.304903: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19724281060 softexpires”19723281060
>           <idle>-0     [045] d.h1  9421.304903: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
>           <idle>-0     [000] dn.2  9421.381179: hrtimer_cancel: hrtimerÿff8017fbe40808
>           <idle>-0     [000] dn.2  9421.381179: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19700000000 softexpires”19700000000
>           <idle>-0     [002] dn.2  9421.381185: hrtimer_cancel: hrtimerÿff8017fbe76808
>           <idle>-0     [002] dn.2  9421.381185: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”19700000000 softexpires”19700000000
>           <idle>-0     [000] d..1  9421.381185: tick_stop: success=1 dependency=NONE
>           <idle>-0     [000] d..2  9421.381186: hrtimer_cancel: hrtimerÿff8017fbe40808
>           <idle>-0     [000] d..2  9421.381186: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”20284000000 softexpires”20284000000
>               sh-2256  [002] ....  9421.381193: timer_init: timerÿff80176c26fb40
>               sh-2256  [002] d..1  9421.381194: timer_start: timerÿff80176c26fb40 function=process_timeout expiresB97247223 [timeout=2] cpu=2 idx=0 flags>           <idle>-0     [002] d..1  9421.381196: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9421.381197: hrtimer_cancel: hrtimerÿff8017fbe76808
>           <idle>-0     [002] d..2  9421.381197: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”19708000000 softexpires”19708000000
>           <idle>-0     [002] d.h2  9421.389621: hrtimer_cancel: hrtimerÿff8017fbe76808
>           <idle>-0     [002] d.h1  9421.389622: hrtimer_expire_entry: hrtimerÿff8017fbe76808 function=tick_sched_timer now”19708002000
>           <idle>-0     [002] d.h1  9421.389623: hrtimer_expire_exit: hrtimerÿff8017fbe76808
>           <idle>-0     [002] d.s2  9421.389624: timer_cancel: timerÿff80176c26fb40
>           <idle>-0     [002] ..s1  9421.389624: timer_expire_entry: timerÿff80176c26fb40 function=process_timeout nowB97247224
>           <idle>-0     [002] .ns1  9421.389626: timer_expire_exit: timerÿff80176c26fb40
>           <idle>-0     [002] dn.2  9421.389629: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”19712000000 softexpires”19712000000
>               sh-2256  [002] ...1  9421.389682: hrtimer_init: hrtimerÿff8017d4dde8a0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
>               sh-2256  [002] ...1  9421.389682: hrtimer_init: hrtimerÿff8017d4dde8e0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
>               sh-2256  [002] ....  9421.389690: hrtimer_init: hrtimerÿff80176cbb0088 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
>           <idle>-0     [039] dn.2  9421.389814: hrtimer_start: hrtimerÿff8017dbb18808 function=tick_sched_timer expires”19712000000 softexpires”19712000000
>           <idle>-0     [002] d..1  9421.389896: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9421.389897: hrtimer_cancel: hrtimerÿff8017fbe76808
>           <idle>-0     [002] d..2  9421.389898: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”19724000000 softexpires”19724000000

This being the gap?

Interesting in that I am not seeing any timeouts at all associated with
the rcu_sched kthread...
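
(A quick and purely illustrative way to double-check that against the
captured buffer is to filter for the GP kthreads' timer events --
"trace.txt" below just stands for wherever the trace output was saved:

	grep -E 'rcu_(sched|preempt)-[0-9]+.*timer_start' trace.txt
	grep 'timer_expire_entry.*process_timeout' trace.txt

If a timer_start armed by the GP kthread never gets a matching
timer_expire_entry for the same timer address, that timer really never
fired.)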

							Thanx, Paul

>           <idle>-0     [002] dn.2  9444.510766: hrtimer_cancel: hrtimerÿff8017fbe76808
>           <idle>-0     [002] dn.2  9444.510767: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”42832000000 softexpires”42832000000
>           <idle>-0     [036] d..1  9444.510812: tick_stop: success=1 dependency=NONE
>           <idle>-0     [036] d..2  9444.510814: hrtimer_cancel: hrtimerÿff8017dbac7808
>           <idle>-0     [036] d..2  9444.510815: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42844000000 softexpires”42844000000
>               sh-2256  [002] ....  9444.510857: timer_init: timerÿff80176c26fb40
>               sh-2256  [002] d..1  9444.510857: timer_start: timerÿff80176c26fb40 function=process_timeout expiresB97253006 [timeout=2] cpu=2 idx=0 flags>           <idle>-0     [002] d..1  9444.510864: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9444.510865: hrtimer_cancel: hrtimerÿff8017fbe76808
>           <idle>-0     [002] d..2  9444.510866: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”42844000000 softexpires”42844000000
>           <idle>-0     [000] d.h2  9444.525625: hrtimer_cancel: hrtimerÿff8017fbe40808
>           <idle>-0     [002] d.h2  9444.525625: hrtimer_cancel: hrtimerÿff8017fbe76808
>           <idle>-0     [036] d.h2  9444.525625: hrtimer_cancel: hrtimerÿff8017dbac7808
>           <idle>-0     [002] d.h1  9444.525625: hrtimer_expire_entry: hrtimerÿff8017fbe76808 function=tick_sched_timer now”42844005600
>           <idle>-0     [036] d.h1  9444.525625: hrtimer_expire_entry: hrtimerÿff8017dbac7808 function=tick_sched_timer now”42844005460
>           <idle>-0     [000] d.h1  9444.525627: hrtimer_expire_entry: hrtimerÿff8017fbe40808 function=tick_sched_timer now”42844005300
>           <idle>-0     [002] d.h1  9444.525627: hrtimer_expire_exit: hrtimerÿff8017fbe76808
>           <idle>-0     [002] d.s2  9444.525629: timer_cancel: timerÿff8017fbe78558
>           <idle>-0     [036] d.h1  9444.525629: hrtimer_expire_exit: hrtimerÿff8017dbac7808
>           <idle>-0     [002] d.s1  9444.525629: timer_expire_entry: timerÿff8017fbe78558 functionÞlayed_work_timer_fn nowB97253008
>           <idle>-0     [000] d.h1  9444.525629: hrtimer_expire_exit: hrtimerÿff8017fbe40808
>           <idle>-0     [000] d.s2  9444.525631: timer_cancel: timerÿff80177fdc0840
>           <idle>-0     [000] ..s1  9444.525631: timer_expire_entry: timerÿff80177fdc0840 function=link_timeout_disable_link nowB97253008
>           <idle>-0     [002] dns1  9444.525631: timer_expire_exit: timerÿff8017fbe78558
>           <idle>-0     [000] d.s2  9444.525632: timer_start: timerÿff80177fdc0840 function=link_timeout_enable_link expiresB97253033 [timeout%] cpu=0 idx‚ flags>           <idle>-0     [000] ..s1  9444.525633: timer_expire_exit: timerÿff80177fdc0840
>           <idle>-0     [000] d.s2  9444.525634: timer_cancel: timerÿff00000910a628
>           <idle>-0     [000] d.s1  9444.525634: timer_expire_entry: timerÿff00000910a628 functionÞlayed_work_timer_fn nowB97253008
>           <idle>-0     [000] dns1  9444.525636: timer_expire_exit: timerÿff00000910a628
>           <idle>-0     [036] dn.2  9444.525639: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42848000000 softexpires”42848000000
>           <idle>-0     [000] dn.2  9444.525640: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”42848000000 softexpires”42848000000
>           <idle>-0     [002] dn.2  9444.525640: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”42848000000 softexpires”42848000000
>      rcu_preempt-9     [036] ....  9444.525648: timer_init: timerÿff8017d5fcfda0
>           <idle>-0     [002] d..1  9444.525648: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9444.525648: hrtimer_cancel: hrtimerÿff8017fbe76808
>           <idle>-0     [002] d..2  9444.525649: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”42860000000 softexpires”42860000000
>      rcu_preempt-9     [036] d..1  9444.525649: timer_start: timerÿff8017d5fcfda0 function=process_timeout expiresB97253009 [timeout=1] cpu6 idx=0 flags>      kworker/0:0-3     [000] d..1  9444.525650: timer_start: timerÿff00000910a628 functionÞlayed_work_timer_fn expiresB97253250 [timeout$2] cpu=0 idx‚ flags=D|I
>           <idle>-0     [000] d..1  9444.525652: tick_stop: success=1 dependency=NONE
>           <idle>-0     [000] d..2  9444.525652: hrtimer_cancel: hrtimerÿff8017fbe40808
>           <idle>-0     [000] d..2  9444.525653: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”42948000000 softexpires”42948000000
>           <idle>-0     [036] d..1  9444.525653: tick_stop: success=1 dependency=NONE
>           <idle>-0     [036] d..2  9444.525654: hrtimer_cancel: hrtimerÿff8017dbac7808
>           <idle>-0     [036] d..2  9444.525654: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42852000000 softexpires”42852000000
>           <idle>-0     [036] d.h2  9444.533624: hrtimer_cancel: hrtimerÿff8017dbac7808
>           <idle>-0     [036] d.h1  9444.533625: hrtimer_expire_entry: hrtimerÿff8017dbac7808 function=tick_sched_timer now”42852004760
>           <idle>-0     [036] d.h1  9444.533626: hrtimer_expire_exit: hrtimerÿff8017dbac7808
>           <idle>-0     [036] d.s2  9444.533627: timer_cancel: timerÿff8017d5fcfda0
>           <idle>-0     [036] ..s1  9444.533628: timer_expire_entry: timerÿff8017d5fcfda0 function=process_timeout nowB97253010
>           <idle>-0     [036] .ns1  9444.533629: timer_expire_exit: timerÿff8017d5fcfda0
>           <idle>-0     [036] dn.2  9444.533634: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42856000000 softexpires”42856000000
>           <idle>-0     [036] d..1  9444.533668: tick_stop: success=1 dependency=NONE
>           <idle>-0     [036] d..2  9444.533668: hrtimer_cancel: hrtimerÿff8017dbac7808
>           <idle>-0     [036] d..2  9444.533669: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42876000000 softexpires”42876000000
>           <idle>-0     [002] dnh2  9444.541626: hrtimer_cancel: hrtimerÿff8017fbe76808
>           <idle>-0     [002] dnh1  9444.541627: hrtimer_expire_entry: hrtimerÿff8017fbe76808 function=tick_sched_timer now”42860007120
>           <idle>-0     [002] dnh1  9444.541629: hrtimer_expire_exit: hrtimerÿff8017fbe76808
>           <idle>-0     [002] dn.2  9444.541630: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”42864000000 softexpires”42864000000
>           <idle>-0     [002] d..1  9444.541640: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9444.541640: hrtimer_cancel: hrtimerÿff8017fbe76808
>           <idle>-0     [002] d..2  9444.541640: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”44316000000 softexpires”44316000000
>           <idle>-0     [036] dnh2  9444.557627: hrtimer_cancel: hrtimerÿff8017dbac7808
>           <idle>-0     [036] dnh1  9444.557628: hrtimer_expire_entry: hrtimerÿff8017dbac7808 function=tick_sched_timer now”42876008220
>           <idle>-0     [036] dnh1  9444.557630: hrtimer_expire_exit: hrtimerÿff8017dbac7808
>           <idle>-0     [036] dn.2  9444.557631: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42880000000 softexpires”42880000000
>           <idle>-0     [036] d..1  9444.557644: tick_stop: success=1 dependency=NONE
>           <idle>-0     [036] d..2  9444.557645: hrtimer_cancel: hrtimerÿff8017dbac7808
>           <idle>-0     [036] d..2  9444.557645: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42892000000 softexpires”42892000000
>           <idle>-0     [036] d.h2  9444.573621: hrtimer_cancel: hrtimerÿff8017dbac7808
>           <idle>-0     [036] d.h1  9444.573621: hrtimer_expire_entry: hrtimerÿff8017dbac7808 function=tick_sched_timer now”42892001340
>           <idle>-0     [036] d.h1  9444.573622: hrtimer_expire_exit: hrtimerÿff8017dbac7808
>           <idle>-0     [036] dn.2  9444.573628: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42896000000 softexpires”42896000000
>      rcu_preempt-9     [036] ....  9444.573631: timer_init: timerÿff8017d5fcfda0
>      rcu_preempt-9     [036] d..1  9444.573632: timer_start: timerÿff8017d5fcfda0 function=process_timeout expiresB97253021 [timeout=1] cpu6 idx=0 flags>           <idle>-0     [036] d..1  9444.573634: tick_stop: success=1 dependency=NONE
>           <idle>-0     [036] d..2  9444.573635: hrtimer_cancel: hrtimerÿff8017dbac7808
>           <idle>-0     [036] d..2  9444.573635: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42900000000 softexpires”42900000000
>           <idle>-0     [036] d.h2  9444.581621: hrtimer_cancel: hrtimerÿff8017dbac7808
>           <idle>-0     [036] d.h1  9444.581621: hrtimer_expire_entry: hrtimerÿff8017dbac7808 function=tick_sched_timer now”42900001400
>           <idle>-0     [036] d.h1  9444.581622: hrtimer_expire_exit: hrtimerÿff8017dbac7808
>           <idle>-0     [036] d.s2  9444.581623: timer_cancel: timerÿff8017d5fcfda0
>           <idle>-0     [036] ..s1  9444.581623: timer_expire_entry: timerÿff8017d5fcfda0 function=process_timeout nowB97253022
>           <idle>-0     [036] .ns1  9444.581625: timer_expire_exit: timerÿff8017d5fcfda0
>           <idle>-0     [036] dn.2  9444.581628: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42904000000 softexpires”42904000000
>           <idle>-0     [036] d..1  9444.581636: tick_stop: success=1 dependency=NONE
>           <idle>-0     [036] d..2  9444.581636: hrtimer_cancel: hrtimerÿff8017dbac7808
>           <idle>-0     [036] d..2  9444.581637: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42924000000 softexpires”42924000000
>           <idle>-0     [045] d.h2  9444.581718: hrtimer_cancel: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h1  9444.581719: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”42900098200
>           <idle>-0     [045] d.h3  9444.581724: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”43001101380 softexpires”43000101380
>           <idle>-0     [045] d.h1  9444.581725: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
>           <idle>-0     [036] d.h2  9444.605621: hrtimer_cancel: hrtimerÿff8017dbac7808
>           <idle>-0     [036] d.h1  9444.605621: hrtimer_expire_entry: hrtimerÿff8017dbac7808 function=tick_sched_timer now”42924001600
>           <idle>-0     [036] d.h1  9444.605622: hrtimer_expire_exit: hrtimerÿff8017dbac7808
>           <idle>-0     [036] d..2  9444.605629: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires˜83719202655 softexpires˜83719202655
>           <idle>-0     [000] d.h2  9444.629625: hrtimer_cancel: hrtimerÿff8017fbe40808
>           <idle>-0     [000] d.h1  9444.629625: hrtimer_expire_entry: hrtimerÿff8017fbe40808 function=tick_sched_timer now”42948005580
>           <idle>-0     [000] d.h1  9444.629627: hrtimer_expire_exit: hrtimerÿff8017fbe40808
>           <idle>-0     [000] d.s2  9444.629628: timer_cancel: timerÿff80177fdc0840
>           <idle>-0     [000] ..s1  9444.629628: timer_expire_entry: timerÿff80177fdc0840 function=link_timeout_enable_link nowB97253034
>           <idle>-0     [000] d.s2  9444.629629: timer_start: timerÿff80177fdc0840 function=link_timeout_disable_link expiresB97253259 [timeout"5] cpu=0 idxB flags>           <idle>-0     [000] ..s1  9444.629638: timer_expire_exit: timerÿff80177fdc0840
>           <idle>-0     [000] d..2  9444.629661: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”43868000000 softexpires”43868000000
>           <idle>-0     [045] d.h2  9444.682725: hrtimer_cancel: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h1  9444.682725: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”43001105940
>           <idle>-0     [045] d.h3  9444.682730: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”43102107440 softexpires”43101107440
>           <idle>-0     [045] d.h1  9444.682730: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
>           <idle>-0     [055] d.h2  9444.717626: hrtimer_cancel: hrtimerÿff8017db968808
>           <idle>-0     [055] d.h1  9444.717627: hrtimer_expire_entry: hrtimerÿff8017db968808 function=tick_sched_timer now”43036006240
>           <idle>-0     [055] d.h1  9444.717629: hrtimer_expire_exit: hrtimerÿff8017db968808
>           <idle>-0     [055] d.s2  9444.717630: timer_cancel: timerÿff80177db6cc08
>           <idle>-0     [055] d.s1  9444.717630: timer_expire_entry: timerÿff80177db6cc08 functionÞlayed_work_timer_fn nowB97253056
>           <idle>-0     [055] dns1  9444.717633: timer_expire_exit: timerÿff80177db6cc08
>           <idle>-0     [055] dn.2  9444.717637: hrtimer_start: hrtimerÿff8017db968808 function=tick_sched_timer expires”43040000000 softexpires”43040000000
>     kworker/55:1-1246  [055] d..1  9444.717640: timer_start: timerÿff80177db6cc08 functionÞlayed_work_timer_fn expiresB97253306 [timeout%0] cpuU idxˆ flags=I
>           <idle>-0     [055] d..1  9444.717643: tick_stop: success=1 dependency=NONE
>           <idle>-0     [055] d..2  9444.717643: hrtimer_cancel: hrtimerÿff8017db968808
>           <idle>-0     [055] d..2  9444.717644: hrtimer_start: hrtimerÿff8017db968808 function=tick_sched_timer expires”44060000000 softexpires”44060000000
>           <idle>-0     [045] d.h2  9444.783729: hrtimer_cancel: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h1  9444.783729: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”43102109380
>           <idle>-0     [045] d.h3  9444.783733: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”43203110880 softexpires”43202110880
>           <idle>-0     [045] d.h1  9444.783733: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h2  9444.884731: hrtimer_cancel: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h1  9444.884731: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”43203112000
>           <idle>-0     [045] d.h3  9444.884735: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”43304113380 softexpires”43303113380
>           <idle>-0     [045] d.h1  9444.884736: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h2  9444.985734: hrtimer_cancel: hrtimerÿff80176cb7ca90
>           <idle>-0     [045] d.h1  9444.985735: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”43304114500
>           <idle>-0     [045] d.h3  9444.985738: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”43405116440 softexpires”43404116440
>           <idle>-0     [045] d.h1  9444.985739: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
>           <idle>-0     [042] d.h2  9445.037622: hrtimer_cancel: hrtimerÿff8017dbb69808
> > 
> > Thanks,
> > 
> > Jonathan
> > > 
> > > 							Thanx, Paul
> > >   
> > > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > [ 1984.638153] rcu_preempt     S    0     9      2 0x00000000
> > > > [ 1984.643626] Call trace:
> > > > [ 1984.646059] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> > > > [ 1984.651189] [<ffff000008962274>] __schedule+0x19c/0x5d8
> > > > [ 1984.656400] [<ffff0000089626e8>] schedule+0x38/0xa0
> > > > [ 1984.661266] [<ffff000008965844>] schedule_timeout+0x124/0x218
> > > > [ 1984.667002] [<ffff000008121424>] rcu_gp_kthread+0x4fc/0x748
> > > > [ 1984.672564] [<ffff0000080df0b4>] kthread+0xfc/0x128
> > > > [ 1984.677429] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
> > > >     
> > >   
> > 
> 


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-27 16:52                                                         ` Paul E. McKenney
  0 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-27 16:52 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: dzickus, sfr, linuxarm, Nicholas Piggin, abdhalee, sparclinux,
	akpm, linuxppc-dev, David Miller, linux-arm-kernel

On Thu, Jul 27, 2017 at 05:39:23PM +0100, Jonathan Cameron wrote:
> On Thu, 27 Jul 2017 14:49:03 +0100
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> 
> > On Thu, 27 Jul 2017 05:49:13 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > 
> > > On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:  
> > > > On Wed, 26 Jul 2017 18:42:14 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > >     
> > > > > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:    
> > > >     
> > > > > > Indeed, that really wouldn't explain how we end up with a RCU stall
> > > > > > dump listing almost all of the cpus as having missed a grace period.      
> > > > > 
> > > > > I have seen stranger things, but admittedly not often.    
> > > > 
> > > > So the backtraces show the RCU gp thread in schedule_timeout.
> > > > 
> > > > Are you sure that its timeout has expired and it's not being scheduled,
> > > > or could it be a bad (large) timeout (looks unlikely) or that it's being
> > > > scheduled but not correctly noting gps on other CPUs?
> > > > 
> > > > It's not in R state, so if it's not being scheduled at all, then it's
> > > > because the timer has not fired:    
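
(If it helps, a minimal userspace sanity check along those lines --
assuming standard procps tools and the kthread name from the splat
(rcu_sched here, rcu_preempt on a preemptible kernel), nothing
kernel-internal:

	ps -o pid,stat,wchan:32,comm -p "$(pgrep -x rcu_sched)"

State S plus the wait channel shown there would confirm it is sleeping
as the backtrace suggests, rather than runnable and simply not getting
scheduled.)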
> > > 
> > > Good point, Nick!
> > > 
> > > Jonathan, could you please reproduce collecting timer event tracing?  
> > I'm a little new to tracing (only started playing with it last week)
> > so fingers crossed I've set it up right.  No splats yet.  Was getting
> > splats on reading out the trace when running with the RCU stall timer
> > set to 4 so have increased that back to the default and am rerunning.
> > 
> > This may take a while.  To save time, please correct me if I've gotten this wrong:
> > 
> > echo "timer:*" > /sys/kernel/debug/tracing/set_event
> > 
> > when it dumps, just send you the relevant part of what is in
> > /sys/kernel/debug/tracing/trace?
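
(For what it is worth, a minimal sketch of one way to capture that --
assuming the usual tracefs mount under /sys/kernel/debug/tracing, with
/tmp/timer-trace.txt purely as an example output file -- so that the
ring buffer does not wrap before the stall is noticed:

	cd /sys/kernel/debug/tracing
	echo 8192 > buffer_size_kb        # per-CPU buffer size, in KiB
	echo "timer:*" > set_event
	echo 1 > tracing_on
	cat trace_pipe > /tmp/timer-trace.txt &
	# ... reproduce the stall, then inspect /tmp/timer-trace.txt

Grabbing a snapshot from the "trace" file instead, as you describe,
works just as well if the buffer is big enough.)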
> 
> Interestingly the only thing that can make it trip for me with tracing on
> is peeking in the tracing buffers.  Not sure whether this is a valid case or
> not.
> 
> Anyhow all timer activity seems to stop around the area of interest.
> 
> 
> [ 9442.413624] INFO: rcu_sched detected stalls on CPUs/tasks:
> [ 9442.419107] 	1-...: (1 GPs behind) idle=844/0/0 softirq=27747/27755 fqs=0 last_accelerate: dd6a/de80, nonlazy_posted: 0, L.
> [ 9442.430224] 	3-...: (2 GPs behind) idle=8f8/0/0 softirq=32197/32198 fqs=0 last_accelerate: 29b1/de80, nonlazy_posted: 0, L.
> [ 9442.441340] 	4-...: (7 GPs behind) idle=740/0/0 softirq=22351/22352 fqs=0 last_accelerate: ca88/de80, nonlazy_posted: 0, L.
> [ 9442.452456] 	5-...: (2 GPs behind) idle=9b0/0/0 softirq=21315/21319 fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
> [ 9442.463572] 	6-...: (2 GPs behind) idle=794/0/0 softirq=19699/19707 fqs=0 last_accelerate: ba62/de88, nonlazy_posted: 0, L.
> [ 9442.474688] 	7-...: (2 GPs behind) idle=ac4/0/0 softirq=22547/22554 fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
> [ 9442.485803] 	8-...: (9 GPs behind) idle=118/0/0 softirq=281/291 fqs=0 last_accelerate: c3fe/de88, nonlazy_posted: 0, L.
> [ 9442.496571] 	9-...: (9 GPs behind) idle=8fc/0/0 softirq=284/292 fqs=0 last_accelerate: 6030/de88, nonlazy_posted: 0, L.
> [ 9442.507339] 	10-...: (14 GPs behind) idle=f78/0/0 softirq=254/254 fqs=0 last_accelerate: 5487/de88, nonlazy_posted: 0, L.
> [ 9442.518281] 	11-...: (9 GPs behind) idle=c9c/0/0 softirq=301/308 fqs=0 last_accelerate: 3d3e/de99, nonlazy_posted: 0, L.
> [ 9442.529136] 	12-...: (9 GPs behind) idle=4a4/0/0 softirq=735/737 fqs=0 last_accelerate: 6010/de99, nonlazy_posted: 0, L.
> [ 9442.539992] 	13-...: (9 GPs behind) idle=34c/0/0 softirq=1121/1131 fqs=0 last_accelerate: b280/de99, nonlazy_posted: 0, L.
> [ 9442.551020] 	14-...: (9 GPs behind) idle=2f4/0/0 softirq=707/713 fqs=0 last_accelerate: 6030/de99, nonlazy_posted: 0, L.
> [ 9442.561875] 	15-...: (2 GPs behind) idle=b30/0/0 softirq=821/976 fqs=0 last_accelerate: c208/de99, nonlazy_posted: 0, L.
> [ 9442.572730] 	17-...: (2 GPs behind) idle=5a8/0/0 softirq=1456/1565 fqs=0 last_accelerate: ca88/de99, nonlazy_posted: 0, L.
> [ 9442.583759] 	18-...: (2 GPs behind) idle=2e4/0/0 softirq=1923/1936 fqs=0 last_accelerate: ca88/dea7, nonlazy_posted: 0, L.
> [ 9442.594787] 	19-...: (2 GPs behind) idle=138/0/0 softirq=1421/1432 fqs=0 last_accelerate: b280/dea7, nonlazy_posted: 0, L.
> [ 9442.605816] 	20-...: (50 GPs behind) idle=634/0/0 softirq=217/219 fqs=0 last_accelerate: c96f/dea7, nonlazy_posted: 0, L.
> [ 9442.616758] 	21-...: (2 GPs behind) idle=eb8/0/0 softirq=1368/1369 fqs=0 last_accelerate: b599/deb2, nonlazy_posted: 0, L.
> [ 9442.627786] 	22-...: (1 GPs behind) idle=aa8/0/0 softirq=229/232 fqs=0 last_accelerate: c604/deb2, nonlazy_posted: 0, L.
> [ 9442.638641] 	23-...: (1 GPs behind) idle=488/0/0 softirq=247/248 fqs=0 last_accelerate: c600/deb2, nonlazy_posted: 0, L.
> [ 9442.649496] 	24-...: (33 GPs behind) idle=f7c/0/0 softirq=319/319 fqs=0 last_accelerate: 5290/deb2, nonlazy_posted: 0, L.
> [ 9442.660437] 	25-...: (33 GPs behind) idle=944/0/0 softirq=308/308 fqs=0 last_accelerate: 52c0/deb2, nonlazy_posted: 0, L.
> [ 9442.671379] 	26-...: (9 GPs behind) idle=6d4/0/0 softirq=265/275 fqs=0 last_accelerate: 6034/dec0, nonlazy_posted: 0, L.
> [ 9442.682234] 	27-...: (115 GPs behind) idle=e3c/0/0 softirq=212/226 fqs=0 last_accelerate: 5420/dec0, nonlazy_posted: 0, L.
> [ 9442.693263] 	28-...: (9 GPs behind) idle=ea4/0/0 softirq=540/552 fqs=0 last_accelerate: 603c/dec0, nonlazy_posted: 0, L.
> [ 9442.704118] 	29-...: (115 GPs behind) idle=83c/0/0 softirq=342/380 fqs=0 last_accelerate: 5420/dec0, nonlazy_posted: 0, L.
> [ 9442.715147] 	30-...: (33 GPs behind) idle=e3c/0/0 softirq=509/509 fqs=0 last_accelerate: 52bc/dec0, nonlazy_posted: 0, L.
> [ 9442.726088] 	31-...: (9 GPs behind) idle=df4/0/0 softirq=619/641 fqs=0 last_accelerate: 603c/decb, nonlazy_posted: 0, L.
> [ 9442.736944] 	32-...: (9 GPs behind) idle=aa4/0/0 softirq=1841/1848 fqs=0 last_accelerate: 6030/decb, nonlazy_posted: 0, L.
> [ 9442.747972] 	34-...: (9 GPs behind) idle=e6c/0/0 softirq=5082/5086 fqs=0 last_accelerate: 6039/decb, nonlazy_posted: 0, L.
> [ 9442.759001] 	35-...: (9 GPs behind) idle=7fc/0/0 softirq=1396/1406 fqs=0 last_accelerate: 603e/decb, nonlazy_posted: 0, L.
> [ 9442.770030] 	36-...: (0 ticks this GP) idle=f28/0/0 softirq=255/255 fqs=0 last_accelerate: c9fc/decb, nonlazy_posted: 0, L.
> [ 9442.781145] 	37-...: (50 GPs behind) idle=53c/0/0 softirq=227/230 fqs=0 last_accelerate: 45c0/decb, nonlazy_posted: 0, L.
> [ 9442.792087] 	38-...: (9 GPs behind) idle=958/0/0 softirq=185/192 fqs=0 last_accelerate: 6030/decb, nonlazy_posted: 0, L.
> [ 9442.802942] 	40-...: (389 GPs behind) idle=41c/0/0 softirq=131/136 fqs=0 last_accelerate: 5800/decb, nonlazy_posted: 0, L.
> [ 9442.813971] 	41-...: (389 GPs behind) idle=258/0/0 softirq=133/138 fqs=0 last_accelerate: c00f/decb, nonlazy_posted: 0, L.
> [ 9442.825000] 	43-...: (50 GPs behind) idle=254/0/0 softirq=113/117 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
> [ 9442.835942] 	44-...: (115 GPs behind) idle=178/0/0 softirq=1271/1276 fqs=0 last_accelerate: 68e9/dee5, nonlazy_posted: 0, L.
> [ 9442.847144] 	45-...: (2 GPs behind) idle=04a/1/0 softirq=364/389 fqs=0 last_accelerate: dee5/dee5, nonlazy_posted: 0, L.
> [ 9442.857999] 	46-...: (9 GPs behind) idle=ec4/0/0 softirq=183/189 fqs=0 last_accelerate: 6030/dee5, nonlazy_posted: 0, L.
> [ 9442.868854] 	47-...: (115 GPs behind) idle=088/0/0 softirq=135/149 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
> [ 9442.879883] 	48-...: (389 GPs behind) idle=200/0/0 softirq=103/110 fqs=0 last_accelerate: 58b0/dee5, nonlazy_posted: 0, L.
> [ 9442.890911] 	49-...: (9 GPs behind) idle=a24/0/0 softirq=205/211 fqs=0 last_accelerate: 6030/dee5, nonlazy_posted: 0, L.
> [ 9442.901766] 	50-...: (25 GPs behind) idle=a74/0/0 softirq=144/144 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
> [ 9442.912708] 	51-...: (50 GPs behind) idle=f68/0/0 softirq=116/122 fqs=0 last_accelerate: 57bc/dee5, nonlazy_posted: 0, L.
> [ 9442.923650] 	52-...: (9 GPs behind) idle=e08/0/0 softirq=202/486 fqs=0 last_accelerate: c87f/defe, nonlazy_posted: 0, L.
> [ 9442.934505] 	53-...: (2 GPs behind) idle=128/0/0 softirq=365/366 fqs=0 last_accelerate: ca88/defe, nonlazy_posted: 0, L.
> [ 9442.945360] 	54-...: (9 GPs behind) idle=ce8/0/0 softirq=126/373 fqs=0 last_accelerate: bef8/defe, nonlazy_posted: 0, L.
> [ 9442.956215] 	56-...: (9 GPs behind) idle=330/0/0 softirq=2116/2126 fqs=0 last_accelerate: 6030/defe, nonlazy_posted: 0, L.
> [ 9442.967243] 	57-...: (1 GPs behind) idle=288/0/0 softirq=1707/1714 fqs=0 last_accelerate: c87c/defe, nonlazy_posted: 0, L.
> [ 9442.978272] 	58-...: (37 GPs behind) idle=390/0/0 softirq=1716/1721 fqs=0 last_accelerate: 53f7/defe, nonlazy_posted: 0, L.
> [ 9442.989387] 	59-...: (37 GPs behind) idle=e54/0/0 softirq=1700/1701 fqs=0 last_accelerate: 40a1/defe, nonlazy_posted: 0, L.
> [ 9443.000502] 	60-...: (116 GPs behind) idle=7b4/0/0 softirq=92/96 fqs=0 last_accelerate: 57d8/df10, nonlazy_posted: 0, L.
> [ 9443.011357] 	61-...: (9 GPs behind) idle=9d8/0/0 softirq=161/170 fqs=0 last_accelerate: 6030/df10, nonlazy_posted: 0, L.
> [ 9443.022212] 	62-...: (115 GPs behind) idle=aa8/0/0 softirq=95/101 fqs=0 last_accelerate: 5420/df17, nonlazy_posted: 0, L.
> [ 9443.033154] 	63-...: (50 GPs behind) idle=958/0/0 softirq=81/84 fqs=0 last_accelerate: 57b8/df17, nonlazy_posted: 0, L.
> [ 9443.043920] 	(detected by 39, t=5403 jiffies, g=443, c=442, q=1)
> [ 9443.049919] Task dump for CPU 1:
> [ 9443.053134] swapper/1       R  running task        0     0      1 0x00000000
> [ 9443.060173] Call trace:
> [ 9443.062619] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.067744] [<          (null)>]           (null)
> [ 9443.072434] Task dump for CPU 3:
> [ 9443.075650] swapper/3       R  running task        0     0      1 0x00000000
> [ 9443.082686] Call trace:
> [ 9443.085121] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.090246] [<          (null)>]           (null)
> [ 9443.094936] Task dump for CPU 4:
> [ 9443.098152] swapper/4       R  running task        0     0      1 0x00000000
> [ 9443.105188] Call trace:
> [ 9443.107623] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.112752] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.118224] Task dump for CPU 5:
> [ 9443.121440] swapper/5       R  running task        0     0      1 0x00000000
> [ 9443.128476] Call trace:
> [ 9443.130910] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.136035] [<          (null)>]           (null)
> [ 9443.140725] Task dump for CPU 6:
> [ 9443.143941] swapper/6       R  running task        0     0      1 0x00000000
> [ 9443.150976] Call trace:
> [ 9443.153411] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.158535] [<          (null)>]           (null)
> [ 9443.163226] Task dump for CPU 7:
> [ 9443.166442] swapper/7       R  running task        0     0      1 0x00000000
> [ 9443.173478] Call trace:
> [ 9443.175912] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.181037] [<          (null)>]           (null)
> [ 9443.185727] Task dump for CPU 8:
> [ 9443.188943] swapper/8       R  running task        0     0      1 0x00000000
> [ 9443.195979] Call trace:
> [ 9443.198412] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.203537] [<          (null)>]           (null)
> [ 9443.208227] Task dump for CPU 9:
> [ 9443.211443] swapper/9       R  running task        0     0      1 0x00000000
> [ 9443.218479] Call trace:
> [ 9443.220913] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.226039] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.231510] Task dump for CPU 10:
> [ 9443.234812] swapper/10      R  running task        0     0      1 0x00000000
> [ 9443.241848] Call trace:
> [ 9443.244283] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.249408] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.254879] Task dump for CPU 11:
> [ 9443.258182] swapper/11      R  running task        0     0      1 0x00000000
> [ 9443.265218] Call trace:
> [ 9443.267652] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.272776] [<          (null)>]           (null)
> [ 9443.277467] Task dump for CPU 12:
> [ 9443.280769] swapper/12      R  running task        0     0      1 0x00000000
> [ 9443.287806] Call trace:
> [ 9443.290240] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.295364] [<          (null)>]           (null)
> [ 9443.300054] Task dump for CPU 13:
> [ 9443.303357] swapper/13      R  running task        0     0      1 0x00000000
> [ 9443.310394] Call trace:
> [ 9443.312828] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.317953] [<          (null)>]           (null)
> [ 9443.322643] Task dump for CPU 14:
> [ 9443.325945] swapper/14      R  running task        0     0      1 0x00000000
> [ 9443.332981] Call trace:
> [ 9443.335416] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.340540] [<          (null)>]           (null)
> [ 9443.345230] Task dump for CPU 15:
> [ 9443.348533] swapper/15      R  running task        0     0      1 0x00000000
> [ 9443.355568] Call trace:
> [ 9443.358002] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.363128] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.368599] Task dump for CPU 17:
> [ 9443.371901] swapper/17      R  running task        0     0      1 0x00000000
> [ 9443.378937] Call trace:
> [ 9443.381372] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.386497] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.391968] Task dump for CPU 18:
> [ 9443.395270] swapper/18      R  running task        0     0      1 0x00000000
> [ 9443.402306] Call trace:
> [ 9443.404740] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.409865] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.415336] Task dump for CPU 19:
> [ 9443.418639] swapper/19      R  running task        0     0      1 0x00000000
> [ 9443.425675] Call trace:
> [ 9443.428109] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.433234] [<          (null)>]           (null)
> [ 9443.437924] Task dump for CPU 20:
> [ 9443.441226] swapper/20      R  running task        0     0      1 0x00000000
> [ 9443.448263] Call trace:
> [ 9443.450697] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.455826] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9443.462600] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9443.467986] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.473458] Task dump for CPU 21:
> [ 9443.476760] swapper/21      R  running task        0     0      1 0x00000000
> [ 9443.483796] Call trace:
> [ 9443.486230] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.491354] [<          (null)>]           (null)
> [ 9443.496045] Task dump for CPU 22:
> [ 9443.499347] swapper/22      R  running task        0     0      1 0x00000000
> [ 9443.506383] Call trace:
> [ 9443.508817] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.513943] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.519414] Task dump for CPU 23:
> [ 9443.522716] swapper/23      R  running task        0     0      1 0x00000000
> [ 9443.529752] Call trace:
> [ 9443.532186] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.537312] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.542784] Task dump for CPU 24:
> [ 9443.546086] swapper/24      R  running task        0     0      1 0x00000000
> [ 9443.553122] Call trace:
> [ 9443.555556] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.560681] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.566153] Task dump for CPU 25:
> [ 9443.569455] swapper/25      R  running task        0     0      1 0x00000000
> [ 9443.576491] Call trace:
> [ 9443.578925] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.584051] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9443.590825] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9443.596211] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.601682] Task dump for CPU 26:
> [ 9443.604985] swapper/26      R  running task        0     0      1 0x00000000
> [ 9443.612021] Call trace:
> [ 9443.614455] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.619581] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.625052] Task dump for CPU 27:
> [ 9443.628355] swapper/27      R  running task        0     0      1 0x00000000
> [ 9443.635390] Call trace:
> [ 9443.637824] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.642949] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.648421] Task dump for CPU 28:
> [ 9443.651723] swapper/28      R  running task        0     0      1 0x00000000
> [ 9443.658759] Call trace:
> [ 9443.661193] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.666318] [<          (null)>]           (null)
> [ 9443.671008] Task dump for CPU 29:
> [ 9443.674310] swapper/29      R  running task        0     0      1 0x00000000
> [ 9443.681346] Call trace:
> [ 9443.683780] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.688905] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.694377] Task dump for CPU 30:
> [ 9443.697679] swapper/30      R  running task        0     0      1 0x00000000
> [ 9443.704715] Call trace:
> [ 9443.707150] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.712275] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9443.719050] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9443.724436] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.729907] Task dump for CPU 31:
> [ 9443.733210] swapper/31      R  running task        0     0      1 0x00000000
> [ 9443.740246] Call trace:
> [ 9443.742680] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.747805] [<          (null)>]           (null)
> [ 9443.752496] Task dump for CPU 32:
> [ 9443.755798] swapper/32      R  running task        0     0      1 0x00000000
> [ 9443.762833] Call trace:
> [ 9443.765267] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.770392] [<          (null)>]           (null)
> [ 9443.775082] Task dump for CPU 34:
> [ 9443.778384] swapper/34      R  running task        0     0      1 0x00000000
> [ 9443.785420] Call trace:
> [ 9443.787854] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.792980] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.798451] Task dump for CPU 35:
> [ 9443.801753] swapper/35      R  running task        0     0      1 0x00000000
> [ 9443.808789] Call trace:
> [ 9443.811224] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.816348] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.821820] Task dump for CPU 36:
> [ 9443.825122] swapper/36      R  running task        0     0      1 0x00000000
> [ 9443.832158] Call trace:
> [ 9443.834592] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.839718] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9443.846493] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9443.851878] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.857350] Task dump for CPU 37:
> [ 9443.860652] swapper/37      R  running task        0     0      1 0x00000000
> [ 9443.867688] Call trace:
> [ 9443.870122] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.875248] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9443.882022] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9443.887408] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.892880] Task dump for CPU 38:
> [ 9443.896182] swapper/38      R  running task        0     0      1 0x00000000
> [ 9443.903218] Call trace:
> [ 9443.905652] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.910776] [<          (null)>]           (null)
> [ 9443.915466] Task dump for CPU 40:
> [ 9443.918769] swapper/40      R  running task        0     0      1 0x00000000
> [ 9443.925805] Call trace:
> [ 9443.928239] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.933365] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.938836] Task dump for CPU 41:
> [ 9443.942138] swapper/41      R  running task        0     0      1 0x00000000
> [ 9443.949174] Call trace:
> [ 9443.951609] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.956733] [<          (null)>]           (null)
> [ 9443.961423] Task dump for CPU 43:
> [ 9443.964725] swapper/43      R  running task        0     0      1 0x00000000
> [ 9443.971761] Call trace:
> [ 9443.974195] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.979320] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.984791] Task dump for CPU 44:
> [ 9443.988093] swapper/44      R  running task        0     0      1 0x00000000
> [ 9443.995130] Call trace:
> [ 9443.997564] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.002688] [<          (null)>]           (null)
> [ 9444.007378] Task dump for CPU 45:
> [ 9444.010680] swapper/45      R  running task        0     0      1 0x00000000
> [ 9444.017716] Call trace:
> [ 9444.020151] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.025275] [<          (null)>]           (null)
> [ 9444.029965] Task dump for CPU 46:
> [ 9444.033267] swapper/46      R  running task        0     0      1 0x00000000
> [ 9444.040302] Call trace:
> [ 9444.042737] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.047862] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.053333] Task dump for CPU 47:
> [ 9444.056636] swapper/47      R  running task        0     0      1 0x00000000
> [ 9444.063672] Call trace:
> [ 9444.066106] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.071231] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.076702] Task dump for CPU 48:
> [ 9444.080004] swapper/48      R  running task        0     0      1 0x00000000
> [ 9444.087041] Call trace:
> [ 9444.089475] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.094600] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.100071] Task dump for CPU 49:
> [ 9444.103374] swapper/49      R  running task        0     0      1 0x00000000
> [ 9444.110409] Call trace:
> [ 9444.112844] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.117968] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.123440] Task dump for CPU 50:
> [ 9444.126742] swapper/50      R  running task        0     0      1 0x00000000
> [ 9444.133777] Call trace:
> [ 9444.136211] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.141336] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.146807] Task dump for CPU 51:
> [ 9444.150109] swapper/51      R  running task        0     0      1 0x00000000
> [ 9444.157144] Call trace:
> [ 9444.159578] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.164703] [<          (null)>]           (null)
> [ 9444.169393] Task dump for CPU 52:
> [ 9444.172695] swapper/52      R  running task        0     0      1 0x00000000
> [ 9444.179731] Call trace:
> [ 9444.182165] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.187290] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.192761] Task dump for CPU 53:
> [ 9444.196063] swapper/53      R  running task        0     0      1 0x00000000
> [ 9444.203099] Call trace:
> [ 9444.205533] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.210658] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.216129] Task dump for CPU 54:
> [ 9444.219431] swapper/54      R  running task        0     0      1 0x00000000
> [ 9444.226467] Call trace:
> [ 9444.228901] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.234026] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.239498] Task dump for CPU 56:
> [ 9444.242801] swapper/56      R  running task        0     0      1 0x00000000
> [ 9444.249837] Call trace:
> [ 9444.252271] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.257396] [<          (null)>]           (null)
> [ 9444.262086] Task dump for CPU 57:
> [ 9444.265388] swapper/57      R  running task        0     0      1 0x00000000
> [ 9444.272424] Call trace:
> [ 9444.274858] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.279982] [<          (null)>]           (null)
> [ 9444.284672] Task dump for CPU 58:
> [ 9444.287975] swapper/58      R  running task        0     0      1 0x00000000
> [ 9444.295011] Call trace:
> [ 9444.297445] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.302570] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9444.309345] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9444.314731] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.320202] Task dump for CPU 59:
> [ 9444.323504] swapper/59      R  running task        0     0      1 0x00000000
> [ 9444.330540] Call trace:
> [ 9444.332974] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.338100] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.343571] Task dump for CPU 60:
> [ 9444.346873] swapper/60      R  running task        0     0      1 0x00000000
> [ 9444.353909] Call trace:
> [ 9444.356343] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.361469] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9444.368243] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9444.373629] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.379101] Task dump for CPU 61:
> [ 9444.382402] swapper/61      R  running task        0     0      1 0x00000000
> [ 9444.389438] Call trace:
> [ 9444.391872] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.396997] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.402469] Task dump for CPU 62:
> [ 9444.405771] swapper/62      R  running task        0     0      1 0x00000000
> [ 9444.412808] Call trace:
> [ 9444.415242] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.420367] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.425838] Task dump for CPU 63:
> [ 9444.429141] swapper/63      R  running task        0     0      1 0x00000000
> [ 9444.436177] Call trace:
> [ 9444.438611] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.443736] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.449211] rcu_sched kthread starved for 5743 jiffies! g443 c442 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> [ 9444.458416] rcu_sched       S    0    10      2 0x00000000
> [ 9444.463889] Call trace:
> [ 9444.466324] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.471453] [<ffff000008ab70a4>] __schedule+0x1a4/0x720
> [ 9444.476665] [<ffff000008ab7660>] schedule+0x40/0xa8
> [ 9444.481530] [<ffff000008abac70>] schedule_timeout+0x178/0x358
> [ 9444.487263] [<ffff00000813e694>] rcu_gp_kthread+0x534/0x7b8
> [ 9444.492824] [<ffff0000080f33d0>] kthread+0x108/0x138
> [ 9444.497775] [<ffff0000080836c0>] ret_from_fork+0x10/0x50
> 
> 
> 
> And the relevant chunk of trace is:
> (I have a lot more.  There are substantial other pauses from time to time, but none this long.)
> 
> 
>    rcu_preempt-9     [057] ....  9419.837631: timer_init: timer=ffff8017d5fcfda0
>      rcu_preempt-9     [057] d..1  9419.837632: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297246837 [timeout=1] cpu=57 idx=0 flags=
>           <idle>-0     [057] d..1  9419.837634: tick_stop: success=1 dependency=NONE
>           <idle>-0     [057] d..2  9419.837634: hrtimer_cancel: hrtimer=ffff8017db99e808
>           <idle>-0     [057] d..2  9419.837635: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418164000000 softexpires=9418164000000
>           <idle>-0     [057] d.h2  9419.845621: hrtimer_cancel: hrtimer=ffff8017db99e808
>           <idle>-0     [057] d.h1  9419.845621: hrtimer_expire_entry: hrtimer=ffff8017db99e808 function=tick_sched_timer now=9418164001440
>           <idle>-0     [057] d.h1  9419.845622: hrtimer_expire_exit: hrtimer=ffff8017db99e808
>           <idle>-0     [057] d.s2  9419.845623: timer_cancel: timer=ffff8017d5fcfda0
>           <idle>-0     [057] ..s1  9419.845623: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297246838
>           <idle>-0     [057] .ns1  9419.845624: timer_expire_exit: timer=ffff8017d5fcfda0
>           <idle>-0     [057] dn.2  9419.845628: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418168000000 softexpires=9418168000000
>           <idle>-0     [057] d..1  9419.845635: tick_stop: success=1 dependency=NONE
>           <idle>-0     [057] d..2  9419.845636: hrtimer_cancel: hrtimer=ffff8017db99e808
>           <idle>-0     [057] d..2  9419.845636: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418188000000 softexpires=9418188000000
>           <idle>-0     [057] d.h2  9419.869621: hrtimer_cancel: hrtimer=ffff8017db99e808
>           <idle>-0     [057] d.h1  9419.869621: hrtimer_expire_entry: hrtimer=ffff8017db99e808 function=tick_sched_timer now=9418188001420
>           <idle>-0     [057] d.h1  9419.869622: hrtimer_expire_exit: hrtimer=ffff8017db99e808
>           <idle>-0     [057] d..2  9419.869626: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9858983202655 softexpires=9858983202655
>           <idle>-0     [016] d.h2  9419.885626: hrtimer_cancel: hrtimer=ffff8017fbc3d808
>           <idle>-0     [016] d.h1  9419.885627: hrtimer_expire_entry: hrtimer=ffff8017fbc3d808 function=tick_sched_timer now=9418204006760
>           <idle>-0     [016] d.h1  9419.885629: hrtimer_expire_exit: hrtimer=ffff8017fbc3d808
>           <idle>-0     [016] d.s2  9419.885629: timer_cancel: timer=ffff8017d37dbca0
>           <idle>-0     [016] ..s1  9419.885630: timer_expire_entry: timer=ffff8017d37dbca0 function=process_timeout now=4297246848
>           <idle>-0     [016] .ns1  9419.885631: timer_expire_exit: timer=ffff8017d37dbca0
>           <idle>-0     [016] dn.2  9419.885636: hrtimer_start: hrtimer=ffff8017fbc3d808 function=tick_sched_timer expires=9418208000000 softexpires=9418208000000
>       khugepaged-778   [016] ....  9419.885668: timer_init: timer=ffff8017d37dbca0
>       khugepaged-778   [016] d..1  9419.885668: timer_start: timer=ffff8017d37dbca0 function=process_timeout expires=4297249348 [timeout=2500] cpu=16 idx=0 flags=
>           <idle>-0     [016] d..1  9419.885670: tick_stop: success=1 dependency=NONE
>           <idle>-0     [016] d..2  9419.885671: hrtimer_cancel: hrtimer=ffff8017fbc3d808
>           <idle>-0     [016] d..2  9419.885671: hrtimer_start: hrtimer=ffff8017fbc3d808 function=tick_sched_timer expires=9428444000000 softexpires=9428444000000
>           <idle>-0     [045] d.h2  9419.890839: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9419.890839: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418209219940
>           <idle>-0     [045] d.h3  9419.890844: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418310221420 softexpires=9418309221420
>           <idle>-0     [045] d.h1  9419.890844: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [000] d.h2  9419.917625: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.h1  9419.917626: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9418236005860
>           <idle>-0     [000] d.h1  9419.917628: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.s2  9419.917628: timer_cancel: timer=ffff80177fdc0840
>           <idle>-0     [000] ..s1  9419.917629: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_disable_link now=4297246856
>           <idle>-0     [000] d.s2  9419.917630: timer_start: timer=ffff80177fdc0840 function=link_timeout_enable_link expires=4297246881 [timeout=25] cpu=0 idx=81 flags=
>           <idle>-0     [000] ..s1  9419.917633: timer_expire_exit: timer=ffff80177fdc0840
>           <idle>-0     [000] d..2  9419.917648: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9418340000000 softexpires=9418340000000
>           <idle>-0     [045] d.h2  9419.991845: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9419.991845: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418310225960
>           <idle>-0     [045] d.h3  9419.991849: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418411227320 softexpires=9418410227320
>           <idle>-0     [045] d.h1  9419.991850: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [000] d.h2  9420.021625: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.h1  9420.021625: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9418340005520
>           <idle>-0     [000] d.h1  9420.021627: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.s2  9420.021627: timer_cancel: timer=ffff80177fdc0840
>           <idle>-0     [000] ..s1  9420.021628: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_enable_link now=4297246882
>           <idle>-0     [000] d.s2  9420.021629: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297247107 [timeout=225] cpu=0 idx=34 flags=
>           <idle>-0     [000] ..s1  9420.021632: timer_expire_exit: timer=ffff80177fdc0840
>           <idle>-0     [000] d..2  9420.021639: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419260000000 softexpires=9419260000000
>           <idle>-0     [045] d.h2  9420.092851: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.092852: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418411231780
>           <idle>-0     [045] d.h3  9420.092856: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418512233720 softexpires=9418511233720
>           <idle>-0     [045] d.h1  9420.092856: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [055] d.h2  9420.141622: hrtimer_cancel: hrtimer=ffff8017db968808
>           <idle>-0     [055] d.h1  9420.141623: hrtimer_expire_entry: hrtimer=ffff8017db968808 function=tick_sched_timer now=9418460002540
>           <idle>-0     [055] d.h1  9420.141625: hrtimer_expire_exit: hrtimer=ffff8017db968808
>           <idle>-0     [055] d.s2  9420.141626: timer_cancel: timer=ffff80177db6cc08
>           <idle>-0     [055] d.s1  9420.141626: timer_expire_entry: timer=ffff80177db6cc08 function=delayed_work_timer_fn now=4297246912
>           <idle>-0     [055] dns1  9420.141628: timer_expire_exit: timer=ffff80177db6cc08
>           <idle>-0     [055] dn.2  9420.141632: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9418464000000 softexpires=9418464000000
>     kworker/55:1-1246  [055] d..1  9420.141634: timer_start: timer=ffff80177db6cc08 function=delayed_work_timer_fn expires=4297247162 [timeout=250] cpu=55 idx=88 flags=I
>           <idle>-0     [055] d..1  9420.141637: tick_stop: success=1 dependency=NONE
>           <idle>-0     [055] d..2  9420.141637: hrtimer_cancel: hrtimer=ffff8017db968808
>           <idle>-0     [055] d..2  9420.141637: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9419484000000 softexpires=9419484000000
>           <idle>-0     [045] d.h2  9420.193855: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.193855: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418512235660
>           <idle>-0     [045] d.h3  9420.193859: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418613237260 softexpires=9418612237260
>           <idle>-0     [045] d.h1  9420.193860: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h2  9420.294858: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.294858: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418613238380
>           <idle>-0     [045] d.h3  9420.294862: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418714240000 softexpires=9418713240000
>           <idle>-0     [045] d.h1  9420.294863: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h2  9420.395861: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.395861: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418714241380
>           <idle>-0     [045] d.h3  9420.395865: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418815242920 softexpires=9418814242920
>           <idle>-0     [045] d.h1  9420.395865: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [042] d.h2  9420.461621: hrtimer_cancel: hrtimer=ffff8017dbb69808
>           <idle>-0     [042] d.h1  9420.461622: hrtimer_expire_entry: hrtimer=ffff8017dbb69808 function=tick_sched_timer now=9418780002180
>           <idle>-0     [042] d.h1  9420.461623: hrtimer_expire_exit: hrtimer=ffff8017dbb69808
>           <idle>-0     [042] d.s2  9420.461624: timer_cancel: timer=ffff80177db6d408
>           <idle>-0     [042] d.s1  9420.461625: timer_expire_entry: timer=ffff80177db6d408 function=delayed_work_timer_fn now=4297246992
>           <idle>-0     [042] dns1  9420.461627: timer_expire_exit: timer=ffff80177db6d408
>           <idle>-0     [042] dns2  9420.461627: timer_cancel: timer=ffff8017797d7868
>           <idle>-0     [042] .ns1  9420.461628: timer_expire_entry: timer=ffff8017797d7868 function=hns_nic_service_timer now=4297246992
>           <idle>-0     [042] dns2  9420.461628: timer_start: timer=ffff8017797d7868 function=hns_nic_service_timer expires=4297247242 [timeout=250] cpu=42 idx=98 flags=
>           <idle>-0     [042] .ns1  9420.461629: timer_expire_exit: timer=ffff8017797d7868
>           <idle>-0     [042] dn.2  9420.461632: hrtimer_start: hrtimer=ffff8017dbb69808 function=tick_sched_timer expires=9418784000000 softexpires=9418784000000
>     kworker/42:1-1223  [042] d..1  9420.461773: timer_start: timer=ffff80177db6d408 function=delayed_work_timer_fn expires=4297247242 [timeout=250] cpu=42 idx=98 flags=I
>           <idle>-0     [042] d..1  9420.461866: tick_stop: success=1 dependency=NONE
>           <idle>-0     [042] d..2  9420.461867: hrtimer_cancel: hrtimer=ffff8017dbb69808
>           <idle>-0     [042] d..2  9420.461867: hrtimer_start: hrtimer=ffff8017dbb69808 function=tick_sched_timer expires=9419804000000 softexpires=9419804000000
>           <idle>-0     [045] d.h2  9420.496864: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.496864: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418815244580
>           <idle>-0     [045] d.h3  9420.496868: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418916246140 softexpires=9418915246140
>           <idle>-0     [045] d.h1  9420.496868: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h2  9420.597866: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.597867: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418916247280
>           <idle>-0     [045] d.h3  9420.597871: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419017248760 softexpires=9419016248760
>           <idle>-0     [045] d.h1  9420.597871: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [033] d.h2  9420.621621: hrtimer_cancel: hrtimer=ffff8017dba76808
>           <idle>-0     [033] d.h1  9420.621622: hrtimer_expire_entry: hrtimer=ffff8017dba76808 function=tick_sched_timer now=9418940002160
>           <idle>-0     [033] d.h1  9420.621623: hrtimer_expire_exit: hrtimer=ffff8017dba76808
>           <idle>-0     [033] d.s2  9420.621624: timer_cancel: timer=ffff00000917be40
>           <idle>-0     [033] d.s1  9420.621625: timer_expire_entry: timer=ffff00000917be40 function=delayed_work_timer_fn now=4297247032
>           <idle>-0     [033] dns1  9420.621626: timer_expire_exit: timer=ffff00000917be40
>           <idle>-0     [033] dn.2  9420.621630: hrtimer_start: hrtimer=ffff8017dba76808 function=tick_sched_timer expires=9418944000000 softexpires=9418944000000
>            <...>-1631  [033] d..1  9420.621636: timer_start: timer=ffff00000917be40 function=delayed_work_timer_fn expires=4297247282 [timeout=250] cpu=33 idx=103 flags=I
>           <idle>-0     [033] d..1  9420.621639: tick_stop: success=1 dependency=NONE
>           <idle>-0     [033] d..2  9420.621639: hrtimer_cancel: hrtimer=ffff8017dba76808
>           <idle>-0     [033] d..2  9420.621639: hrtimer_start: hrtimer=ffff8017dba76808 function=tick_sched_timer expires=9419964000000 softexpires=9419964000000
>           <idle>-0     [000] dn.2  9420.691401: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] dn.2  9420.691401: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419012000000 softexpires=9419012000000
>           <idle>-0     [002] dn.2  9420.691408: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] dn.2  9420.691408: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419012000000 softexpires=9419012000000
>           <idle>-0     [000] d..1  9420.691409: tick_stop: success=1 dependency=NONE
>           <idle>-0     [000] d..2  9420.691409: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d..2  9420.691409: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419260000000 softexpires=9419260000000
>           <idle>-0     [002] d..1  9420.691423: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9420.691423: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d..2  9420.691424: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9859803202655 softexpires=9859803202655
>           <idle>-0     [045] d.h2  9420.698872: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.698873: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419017253180
>           <idle>-0     [045] d.h3  9420.698877: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419118254640 softexpires=9419117254640
>           <idle>-0     [045] d.h1  9420.698877: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h2  9420.799875: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.799875: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419118255760
>           <idle>-0     [045] d.h3  9420.799879: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419219257140 softexpires=9419218257140
>           <idle>-0     [045] d.h1  9420.799880: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [000] dn.2  9420.871369: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] dn.2  9420.871370: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419192000000 softexpires=9419192000000
>           <idle>-0     [002] dn.2  9420.871375: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [000] d..1  9420.871376: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] dn.2  9420.871376: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419192000000 softexpires=9419192000000
>           <idle>-0     [000] d..2  9420.871376: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d..2  9420.871376: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419260000000 softexpires=9419260000000
>           <idle>-0     [002] d..1  9420.871398: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9420.871398: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d..2  9420.871398: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9859983202655 softexpires=9859983202655
>           <idle>-0     [045] d.h2  9420.900881: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.900881: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419219261580
>           <idle>-0     [045] d.h3  9420.900885: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419320263160 softexpires=9419319263160
>           <idle>-0     [045] d.h1  9420.900886: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [001] d..2  9420.913601: hrtimer_cancel: hrtimer=ffff8017fbe5b808
>           <idle>-0     [001] d..2  9420.913601: hrtimer_start: hrtimer=ffff8017fbe5b808 function=tick_sched_timer expires=9860023202655 softexpires=9860023202655
>           <idle>-0     [000] d.h2  9420.941621: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.h1  9420.941621: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9419260001400
>           <idle>-0     [000] d.h1  9420.941623: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.s2  9420.941623: timer_cancel: timer=ffff80177fdc0840
>           <idle>-0     [000] ..s1  9420.941624: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_disable_link now=4297247112
>           <idle>-0     [000] d.s2  9420.941624: timer_start: timer=ffff80177fdc0840 function=link_timeout_enable_link expires=4297247137 [timeout=25] cpu=0 idx=113 flags=
>           <idle>-0     [000] ..s1  9420.941628: timer_expire_exit: timer=ffff80177fdc0840
>           <idle>-0     [000] d.s2  9420.941629: timer_cancel: timer=ffff8017fbe42558
>           <idle>-0     [000] d.s1  9420.941629: timer_expire_entry: timer=ffff8017fbe42558 function=delayed_work_timer_fn now=4297247112
>           <idle>-0     [000] dns1  9420.941630: timer_expire_exit: timer=ffff8017fbe42558
>           <idle>-0     [000] dns2  9420.941631: timer_cancel: timer=ffff00000910a628
>           <idle>-0     [000] dns1  9420.941631: timer_expire_entry: timer=ffff00000910a628 function=delayed_work_timer_fn now=4297247112
>           <idle>-0     [000] dns1  9420.941631: timer_expire_exit: timer=ffff00000910a628
>           <idle>-0     [000] dn.2  9420.941634: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419264000000 softexpires=9419264000000
>           <idle>-0     [002] dn.2  9420.941643: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] dn.2  9420.941643: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419264000000 softexpires=9419264000000
>      kworker/0:0-3     [000] d..1  9420.941650: timer_start: timer=ffff00000910a628 function=delayed_work_timer_fn expires=4297247500 [timeout=388] cpu=0 idx=100 flags=D|I
>      kworker/2:0-22    [002] d..1  9420.941651: timer_start: timer=ffff8017fbe78558 function=delayed_work_timer_fn expires=4297247494 [timeout=382] cpu=2 idx=114 flags=D|I
>           <idle>-0     [000] d..1  9420.941652: tick_stop: success=1 dependency=NONE
>           <idle>-0     [000] d..2  9420.941652: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d..2  9420.941653: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419364000000 softexpires=9419364000000
>           <idle>-0     [002] d..1  9420.941654: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9420.941654: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d..2  9420.941654: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9860055202655 softexpires=9860055202655
>           <idle>-0     [045] d.h2  9421.001887: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9421.001887: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419320267640
>           <idle>-0     [045] d.h3  9421.001891: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419421269000 softexpires=9419420269000
>           <idle>-0     [045] d.h1  9421.001892: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [000] d.h2  9421.045625: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.h1  9421.045625: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9419364005380
>           <idle>-0     [000] d.h1  9421.045626: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.s2  9421.045627: timer_cancel: timer=ffff80177fdc0840
>           <idle>-0     [000] ..s1  9421.045628: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_enable_link now=4297247138
>           <idle>-0     [000] d.s2  9421.045628: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297247363 [timeout=225] cpu=0 idx=34 flags=
>           <idle>-0     [000] ..s1  9421.045631: timer_expire_exit: timer=ffff80177fdc0840
>           <idle>-0     [000] d..2  9421.045644: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9420284000000 softexpires=9420284000000
>           <idle>-0     [045] d.h2  9421.102893: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9421.102893: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419421273420
>           <idle>-0     [045] d.h3  9421.102897: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419522275040 softexpires=9419521275040
>           <idle>-0     [045] d.h1  9421.102897: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [055] d.h2  9421.165621: hrtimer_cancel: hrtimer=ffff8017db968808
>           <idle>-0     [055] d.h1  9421.165622: hrtimer_expire_entry: hrtimer=ffff8017db968808 function=tick_sched_timer now=9419484002280
>           <idle>-0     [055] d.h1  9421.165624: hrtimer_expire_exit: hrtimer=ffff8017db968808
>           <idle>-0     [055] d.s2  9421.165624: timer_cancel: timer=ffff80177db6cc08
>           <idle>-0     [055] d.s1  9421.165625: timer_expire_entry: timer=ffff80177db6cc08 function=delayed_work_timer_fn now=4297247168
>           <idle>-0     [055] dns1  9421.165626: timer_expire_exit: timer=ffff80177db6cc08
>           <idle>-0     [055] dn.2  9421.165629: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9419488000000 softexpires=9419488000000
>     kworker/55:1-1246  [055] d..1  9421.165632: timer_start: timer=ffff80177db6cc08 function=delayed_work_timer_fn expires=4297247418 [timeout=250] cpu=55 idx=120 flags=I
>           <idle>-0     [055] d..1  9421.165634: tick_stop: success=1 dependency=NONE
>           <idle>-0     [055] d..2  9421.165634: hrtimer_cancel: hrtimer=ffff8017db968808
>           <idle>-0     [055] d..2  9421.165635: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9420508000000 softexpires=9420508000000
>           <idle>-0     [045] d.h2  9421.203896: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9421.203896: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419522276980
>           <idle>-0     [045] d.h3  9421.203900: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419623278460 softexpires=9419622278460
>           <idle>-0     [045] d.h1  9421.203901: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h2  9421.304899: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9421.304899: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419623279580
>           <idle>-0     [045] d.h3  9421.304903: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419724281060 softexpires=9419723281060
>           <idle>-0     [045] d.h1  9421.304903: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [000] dn.2  9421.381179: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] dn.2  9421.381179: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419700000000 softexpires=9419700000000
>           <idle>-0     [002] dn.2  9421.381185: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] dn.2  9421.381185: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419700000000 softexpires=9419700000000
>           <idle>-0     [000] d..1  9421.381185: tick_stop: success=1 dependency=NONE
>           <idle>-0     [000] d..2  9421.381186: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d..2  9421.381186: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9420284000000 softexpires=9420284000000
>               sh-2256  [002] ....  9421.381193: timer_init: timer=ffff80176c26fb40
>               sh-2256  [002] d..1  9421.381194: timer_start: timer=ffff80176c26fb40 function=process_timeout expires=4297247223 [timeout=2] cpu=2 idx=0 flags=
>           <idle>-0     [002] d..1  9421.381196: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9421.381197: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d..2  9421.381197: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419708000000 softexpires=9419708000000
>           <idle>-0     [002] d.h2  9421.389621: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d.h1  9421.389622: hrtimer_expire_entry: hrtimer=ffff8017fbe76808 function=tick_sched_timer now=9419708002000
>           <idle>-0     [002] d.h1  9421.389623: hrtimer_expire_exit: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d.s2  9421.389624: timer_cancel: timer=ffff80176c26fb40
>           <idle>-0     [002] ..s1  9421.389624: timer_expire_entry: timer=ffff80176c26fb40 function=process_timeout now=4297247224
>           <idle>-0     [002] .ns1  9421.389626: timer_expire_exit: timer=ffff80176c26fb40
>           <idle>-0     [002] dn.2  9421.389629: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419712000000 softexpires=9419712000000
>               sh-2256  [002] ...1  9421.389682: hrtimer_init: hrtimer=ffff8017d4dde8a0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
>               sh-2256  [002] ...1  9421.389682: hrtimer_init: hrtimer=ffff8017d4dde8e0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
>               sh-2256  [002] ....  9421.389690: hrtimer_init: hrtimer=ffff80176cbb0088 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
>           <idle>-0     [039] dn.2  9421.389814: hrtimer_start: hrtimer=ffff8017dbb18808 function=tick_sched_timer expires=9419712000000 softexpires=9419712000000
>           <idle>-0     [002] d..1  9421.389896: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9421.389897: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d..2  9421.389898: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419724000000 softexpires=9419724000000

This being the gap?

Interesting in that I am not seeing any timeouts at all associated with
the rcu_sched kthread...
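For reference, the timer_start / timer_expire_entry events with
function=process_timeout in the trace above come from schedule_timeout(),
which is where the rcu_sched backtrace shows that kthread blocked.  A
heavily simplified sketch of that path (v4.13-era timer API, omitting the
MAX_SCHEDULE_TIMEOUT and negative-timeout cases -- not the verbatim
kernel/time/timer.c source):

	/* Timer callback: make the sleeping task runnable again. */
	static void process_timeout(unsigned long data)
	{
		wake_up_process((struct task_struct *)data);
	}

	signed long schedule_timeout(signed long timeout)
	{
		struct timer_list timer;
		unsigned long expire = timeout + jiffies;

		/* Arm a one-shot timer for this task: emits timer_start. */
		setup_timer_on_stack(&timer, process_timeout,
				     (unsigned long)current);
		mod_timer(&timer, expire);

		/*
		 * Sleep; when the timer fires, the timer core detaches it
		 * (timer_cancel) and runs process_timeout()
		 * (timer_expire_entry), which wakes us up.
		 */
		schedule();

		/* Clean up; harmless if the timer has already fired. */
		del_timer_sync(&timer);
		destroy_timer_on_stack(&timer);

		timeout = expire - jiffies;
		return timeout < 0 ? 0 : timeout;
	}

So a sleeping RCU grace-period kthread should normally leave timer_start
events with function=process_timeout on its CPU, as rcu_preempt-9 does
above, which is what makes their complete absence for rcu_sched odd.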

							Thanx, Paul

>           <idle>-0     [002] dn.2  9444.510766: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] dn.2  9444.510767: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442832000000 softexpires=9442832000000
>           <idle>-0     [036] d..1  9444.510812: tick_stop: success=1 dependency=NONE
>           <idle>-0     [036] d..2  9444.510814: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d..2  9444.510815: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442844000000 softexpires=9442844000000
>               sh-2256  [002] ....  9444.510857: timer_init: timer=ffff80176c26fb40
>               sh-2256  [002] d..1  9444.510857: timer_start: timer=ffff80176c26fb40 function=process_timeout expires=4297253006 [timeout=2] cpu=2 idx=0 flags=
>           <idle>-0     [002] d..1  9444.510864: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9444.510865: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d..2  9444.510866: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442844000000 softexpires=9442844000000
>           <idle>-0     [000] d.h2  9444.525625: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [002] d.h2  9444.525625: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [036] d.h2  9444.525625: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [002] d.h1  9444.525625: hrtimer_expire_entry: hrtimer=ffff8017fbe76808 function=tick_sched_timer now=9442844005600
>           <idle>-0     [036] d.h1  9444.525625: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442844005460
>           <idle>-0     [000] d.h1  9444.525627: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9442844005300
>           <idle>-0     [002] d.h1  9444.525627: hrtimer_expire_exit: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d.s2  9444.525629: timer_cancel: timer=ffff8017fbe78558
>           <idle>-0     [036] d.h1  9444.525629: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
>           <idle>-0     [002] d.s1  9444.525629: timer_expire_entry: timer=ffff8017fbe78558 function=delayed_work_timer_fn now=4297253008
>           <idle>-0     [000] d.h1  9444.525629: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.s2  9444.525631: timer_cancel: timer=ffff80177fdc0840
>           <idle>-0     [000] ..s1  9444.525631: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_disable_link now=4297253008
>           <idle>-0     [002] dns1  9444.525631: timer_expire_exit: timer=ffff8017fbe78558
>           <idle>-0     [000] d.s2  9444.525632: timer_start: timer=ffff80177fdc0840 function=link_timeout_enable_link expires=4297253033 [timeout=25] cpu=0 idx=82 flags=
>           <idle>-0     [000] ..s1  9444.525633: timer_expire_exit: timer=ffff80177fdc0840
>           <idle>-0     [000] d.s2  9444.525634: timer_cancel: timer=ffff00000910a628
>           <idle>-0     [000] d.s1  9444.525634: timer_expire_entry: timer=ffff00000910a628 function=delayed_work_timer_fn now=4297253008
>           <idle>-0     [000] dns1  9444.525636: timer_expire_exit: timer=ffff00000910a628
>           <idle>-0     [036] dn.2  9444.525639: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442848000000 softexpires=9442848000000
>           <idle>-0     [000] dn.2  9444.525640: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9442848000000 softexpires=9442848000000
>           <idle>-0     [002] dn.2  9444.525640: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442848000000 softexpires=9442848000000
>      rcu_preempt-9     [036] ....  9444.525648: timer_init: timer=ffff8017d5fcfda0
>           <idle>-0     [002] d..1  9444.525648: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9444.525648: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d..2  9444.525649: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442860000000 softexpires=9442860000000
>      rcu_preempt-9     [036] d..1  9444.525649: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297253009 [timeout=1] cpu=36 idx=0 flags=
>      kworker/0:0-3     [000] d..1  9444.525650: timer_start: timer=ffff00000910a628 function=delayed_work_timer_fn expires=4297253250 [timeout=242] cpu=0 idx=82 flags=D|I
>           <idle>-0     [000] d..1  9444.525652: tick_stop: success=1 dependency=NONE
>           <idle>-0     [000] d..2  9444.525652: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d..2  9444.525653: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9442948000000 softexpires=9442948000000
>           <idle>-0     [036] d..1  9444.525653: tick_stop: success=1 dependency=NONE
>           <idle>-0     [036] d..2  9444.525654: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d..2  9444.525654: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442852000000 softexpires=9442852000000
>           <idle>-0     [036] d.h2  9444.533624: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d.h1  9444.533625: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442852004760
>           <idle>-0     [036] d.h1  9444.533626: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d.s2  9444.533627: timer_cancel: timer=ffff8017d5fcfda0
>           <idle>-0     [036] ..s1  9444.533628: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297253010
>           <idle>-0     [036] .ns1  9444.533629: timer_expire_exit: timer=ffff8017d5fcfda0
>           <idle>-0     [036] dn.2  9444.533634: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442856000000 softexpires=9442856000000
>           <idle>-0     [036] d..1  9444.533668: tick_stop: success=1 dependency=NONE
>           <idle>-0     [036] d..2  9444.533668: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d..2  9444.533669: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442876000000 softexpires=9442876000000
>           <idle>-0     [002] dnh2  9444.541626: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] dnh1  9444.541627: hrtimer_expire_entry: hrtimer=ffff8017fbe76808 function=tick_sched_timer now=9442860007120
>           <idle>-0     [002] dnh1  9444.541629: hrtimer_expire_exit: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] dn.2  9444.541630: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442864000000 softexpires=9442864000000
>           <idle>-0     [002] d..1  9444.541640: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9444.541640: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d..2  9444.541640: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9444316000000 softexpires=9444316000000
>           <idle>-0     [036] dnh2  9444.557627: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] dnh1  9444.557628: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442876008220
>           <idle>-0     [036] dnh1  9444.557630: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] dn.2  9444.557631: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442880000000 softexpires=9442880000000
>           <idle>-0     [036] d..1  9444.557644: tick_stop: success=1 dependency=NONE
>           <idle>-0     [036] d..2  9444.557645: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d..2  9444.557645: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442892000000 softexpires=9442892000000
>           <idle>-0     [036] d.h2  9444.573621: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d.h1  9444.573621: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442892001340
>           <idle>-0     [036] d.h1  9444.573622: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] dn.2  9444.573628: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442896000000 softexpires=9442896000000
>      rcu_preempt-9     [036] ....  9444.573631: timer_init: timer=ffff8017d5fcfda0
>      rcu_preempt-9     [036] d..1  9444.573632: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297253021 [timeout=1] cpu=36 idx=0 flags=
>           <idle>-0     [036] d..1  9444.573634: tick_stop: success=1 dependency=NONE
>           <idle>-0     [036] d..2  9444.573635: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d..2  9444.573635: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442900000000 softexpires=9442900000000
>           <idle>-0     [036] d.h2  9444.581621: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d.h1  9444.581621: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442900001400
>           <idle>-0     [036] d.h1  9444.581622: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d.s2  9444.581623: timer_cancel: timer=ffff8017d5fcfda0
>           <idle>-0     [036] ..s1  9444.581623: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297253022
>           <idle>-0     [036] .ns1  9444.581625: timer_expire_exit: timer=ffff8017d5fcfda0
>           <idle>-0     [036] dn.2  9444.581628: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442904000000 softexpires=9442904000000
>           <idle>-0     [036] d..1  9444.581636: tick_stop: success=1 dependency=NONE
>           <idle>-0     [036] d..2  9444.581636: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d..2  9444.581637: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442924000000 softexpires=9442924000000
>           <idle>-0     [045] d.h2  9444.581718: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9444.581719: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9442900098200
>           <idle>-0     [045] d.h3  9444.581724: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443001101380 softexpires=9443000101380
>           <idle>-0     [045] d.h1  9444.581725: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [036] d.h2  9444.605621: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d.h1  9444.605621: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442924001600
>           <idle>-0     [036] d.h1  9444.605622: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d..2  9444.605629: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9883719202655 softexpires=9883719202655
>           <idle>-0     [000] d.h2  9444.629625: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.h1  9444.629625: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9442948005580
>           <idle>-0     [000] d.h1  9444.629627: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.s2  9444.629628: timer_cancel: timer=ffff80177fdc0840
>           <idle>-0     [000] ..s1  9444.629628: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_enable_link now=4297253034
>           <idle>-0     [000] d.s2  9444.629629: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297253259 [timeout=225] cpu=0 idx=42 flags=
>           <idle>-0     [000] ..s1  9444.629638: timer_expire_exit: timer=ffff80177fdc0840
>           <idle>-0     [000] d..2  9444.629661: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9443868000000 softexpires=9443868000000
>           <idle>-0     [045] d.h2  9444.682725: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9444.682725: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443001105940
>           <idle>-0     [045] d.h3  9444.682730: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443102107440 softexpires=9443101107440
>           <idle>-0     [045] d.h1  9444.682730: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [055] d.h2  9444.717626: hrtimer_cancel: hrtimer=ffff8017db968808
>           <idle>-0     [055] d.h1  9444.717627: hrtimer_expire_entry: hrtimer=ffff8017db968808 function=tick_sched_timer now=9443036006240
>           <idle>-0     [055] d.h1  9444.717629: hrtimer_expire_exit: hrtimer=ffff8017db968808
>           <idle>-0     [055] d.s2  9444.717630: timer_cancel: timer=ffff80177db6cc08
>           <idle>-0     [055] d.s1  9444.717630: timer_expire_entry: timer=ffff80177db6cc08 function=delayed_work_timer_fn now=4297253056
>           <idle>-0     [055] dns1  9444.717633: timer_expire_exit: timer=ffff80177db6cc08
>           <idle>-0     [055] dn.2  9444.717637: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9443040000000 softexpires=9443040000000
>     kworker/55:1-1246  [055] d..1  9444.717640: timer_start: timer=ffff80177db6cc08 function=delayed_work_timer_fn expires=4297253306 [timeout=250] cpu=55 idx=88 flags=I
>           <idle>-0     [055] d..1  9444.717643: tick_stop: success=1 dependency=NONE
>           <idle>-0     [055] d..2  9444.717643: hrtimer_cancel: hrtimer=ffff8017db968808
>           <idle>-0     [055] d..2  9444.717644: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9444060000000 softexpires=9444060000000
>           <idle>-0     [045] d.h2  9444.783729: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9444.783729: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443102109380
>           <idle>-0     [045] d.h3  9444.783733: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443203110880 softexpires=9443202110880
>           <idle>-0     [045] d.h1  9444.783733: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h2  9444.884731: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9444.884731: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443203112000
>           <idle>-0     [045] d.h3  9444.884735: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443304113380 softexpires=9443303113380
>           <idle>-0     [045] d.h1  9444.884736: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h2  9444.985734: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9444.985735: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443304114500
>           <idle>-0     [045] d.h3  9444.985738: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443405116440 softexpires=9443404116440
>           <idle>-0     [045] d.h1  9444.985739: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [042] d.h2  9445.037622: hrtimer_cancel: hrtimer=ffff8017dbb69808
> > 
> > Thanks,
> > 
> > Jonathan
> > > 
> > > 							Thanx, Paul
> > >   
> > > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > [ 1984.638153] rcu_preempt     S    0     9      2 0x00000000
> > > > [ 1984.643626] Call trace:
> > > > [ 1984.646059] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> > > > [ 1984.651189] [<ffff000008962274>] __schedule+0x19c/0x5d8
> > > > [ 1984.656400] [<ffff0000089626e8>] schedule+0x38/0xa0
> > > > [ 1984.661266] [<ffff000008965844>] schedule_timeout+0x124/0x218
> > > > [ 1984.667002] [<ffff000008121424>] rcu_gp_kthread+0x4fc/0x748
> > > > [ 1984.672564] [<ffff0000080df0b4>] kthread+0xfc/0x128
> > > > [ 1984.677429] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
> > > >     
> > >   
> > 
> 


* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-27 16:52                                                         ` Paul E. McKenney
  0 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-27 16:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jul 27, 2017 at 05:39:23PM +0100, Jonathan Cameron wrote:
> On Thu, 27 Jul 2017 14:49:03 +0100
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> 
> > On Thu, 27 Jul 2017 05:49:13 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > 
> > > On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:  
> > > > On Wed, 26 Jul 2017 18:42:14 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > >     
> > > > > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:    
> > > >     
> > > > > > Indeed, that really wouldn't explain how we end up with a RCU stall
> > > > > > dump listing almost all of the cpus as having missed a grace period.      
> > > > > 
> > > > > I have seen stranger things, but admittedly not often.    
> > > > 
> > > > So the backtraces show the RCU gp thread in schedule_timeout.
> > > > 
> > > > Are you sure that its timeout has expired and it's not being scheduled,
> > > > or could it be a bad (large) timeout (looks unlikely), or that it's being
> > > > scheduled but not correctly noting GPs on other CPUs?
> > > > 
> > > > It's not in R state, so if it's not being scheduled at all, then it's
> > > > because the timer has not fired:    
> > > 
> > > Good point, Nick!
> > > 
> > > Jonathan, could you please reproduce collecting timer event tracing?  
> > I'm a little new to tracing (I only started playing with it last week),
> > so fingers crossed I've set it up right.  No splats yet.  I was getting
> > splats on reading out the trace when running with the RCU stall timeout
> > set to 4, so I have increased that back to the default and am rerunning.
> > 
> > This may take a while.  To save time, correct me if I've gotten this wrong:
> > 
> > echo "timer:*" > /sys/kernel/debug/tracing/set_event
> > 
> > then when it dumps, I just send you the relevant part of what is in
> > /sys/kernel/debug/tracing/trace?
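
That looks right to me.  For reference, a slightly fuller sequence might look
like the following (just a sketch: the tracefs paths are the ones above, the
buffer size is only an example, and the output filename is arbitrary):

# make the per-CPU buffers large enough that the window of interest survives
echo 8192 > /sys/kernel/debug/tracing/buffer_size_kb
# enable all timer events, as above
echo "timer:*" > /sys/kernel/debug/tracing/set_event
echo 1 > /sys/kernel/debug/tracing/tracing_on
# ... reproduce the stall ...
echo 0 > /sys/kernel/debug/tracing/tracing_on
cat /sys/kernel/debug/tracing/trace > /tmp/timer-trace.txt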
> 
> Interestingly, the only thing that can make it trip for me with tracing on
> is peeking in the tracing buffers.  Not sure whether this is a valid case
> or not.
> 
> Anyhow all timer activity seems to stop around the area of interest.
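
If reading the whole buffer back is what makes it trip, one lighter-weight
option might be to pull only the CPUs of interest from the per-CPU view, for
example (cpu 57 being where the rcu_preempt kthread last armed its timeout in
the trace below):

cat /sys/kernel/debug/tracing/per_cpu/cpu57/trace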
> 
> 
> [ 9442.413624] INFO: rcu_sched detected stalls on CPUs/tasks:
> [ 9442.419107] 	1-...: (1 GPs behind) idle=844/0/0 softirq=27747/27755 fqs=0 last_accelerate: dd6a/de80, nonlazy_posted: 0, L.
> [ 9442.430224] 	3-...: (2 GPs behind) idle=8f8/0/0 softirq=32197/32198 fqs=0 last_accelerate: 29b1/de80, nonlazy_posted: 0, L.
> [ 9442.441340] 	4-...: (7 GPs behind) idle=740/0/0 softirq=22351/22352 fqs=0 last_accelerate: ca88/de80, nonlazy_posted: 0, L.
> [ 9442.452456] 	5-...: (2 GPs behind) idle=9b0/0/0 softirq=21315/21319 fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
> [ 9442.463572] 	6-...: (2 GPs behind) idle=794/0/0 softirq=19699/19707 fqs=0 last_accelerate: ba62/de88, nonlazy_posted: 0, L.
> [ 9442.474688] 	7-...: (2 GPs behind) idle=ac4/0/0 softirq=22547/22554 fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
> [ 9442.485803] 	8-...: (9 GPs behind) idle=118/0/0 softirq=281/291 fqs=0 last_accelerate: c3fe/de88, nonlazy_posted: 0, L.
> [ 9442.496571] 	9-...: (9 GPs behind) idle=8fc/0/0 softirq=284/292 fqs=0 last_accelerate: 6030/de88, nonlazy_posted: 0, L.
> [ 9442.507339] 	10-...: (14 GPs behind) idle=f78/0/0 softirq=254/254 fqs=0 last_accelerate: 5487/de88, nonlazy_posted: 0, L.
> [ 9442.518281] 	11-...: (9 GPs behind) idle=c9c/0/0 softirq=301/308 fqs=0 last_accelerate: 3d3e/de99, nonlazy_posted: 0, L.
> [ 9442.529136] 	12-...: (9 GPs behind) idle=4a4/0/0 softirq=735/737 fqs=0 last_accelerate: 6010/de99, nonlazy_posted: 0, L.
> [ 9442.539992] 	13-...: (9 GPs behind) idle=34c/0/0 softirq=1121/1131 fqs=0 last_accelerate: b280/de99, nonlazy_posted: 0, L.
> [ 9442.551020] 	14-...: (9 GPs behind) idle=2f4/0/0 softirq=707/713 fqs=0 last_accelerate: 6030/de99, nonlazy_posted: 0, L.
> [ 9442.561875] 	15-...: (2 GPs behind) idle=b30/0/0 softirq=821/976 fqs=0 last_accelerate: c208/de99, nonlazy_posted: 0, L.
> [ 9442.572730] 	17-...: (2 GPs behind) idle=5a8/0/0 softirq=1456/1565 fqs=0 last_accelerate: ca88/de99, nonlazy_posted: 0, L.
> [ 9442.583759] 	18-...: (2 GPs behind) idle=2e4/0/0 softirq=1923/1936 fqs=0 last_accelerate: ca88/dea7, nonlazy_posted: 0, L.
> [ 9442.594787] 	19-...: (2 GPs behind) idle=138/0/0 softirq=1421/1432 fqs=0 last_accelerate: b280/dea7, nonlazy_posted: 0, L.
> [ 9442.605816] 	20-...: (50 GPs behind) idle=634/0/0 softirq=217/219 fqs=0 last_accelerate: c96f/dea7, nonlazy_posted: 0, L.
> [ 9442.616758] 	21-...: (2 GPs behind) idle=eb8/0/0 softirq=1368/1369 fqs=0 last_accelerate: b599/deb2, nonlazy_posted: 0, L.
> [ 9442.627786] 	22-...: (1 GPs behind) idle=aa8/0/0 softirq=229/232 fqs=0 last_accelerate: c604/deb2, nonlazy_posted: 0, L.
> [ 9442.638641] 	23-...: (1 GPs behind) idle=488/0/0 softirq=247/248 fqs=0 last_accelerate: c600/deb2, nonlazy_posted: 0, L.
> [ 9442.649496] 	24-...: (33 GPs behind) idle=f7c/0/0 softirq=319/319 fqs=0 last_accelerate: 5290/deb2, nonlazy_posted: 0, L.
> [ 9442.660437] 	25-...: (33 GPs behind) idle=944/0/0 softirq=308/308 fqs=0 last_accelerate: 52c0/deb2, nonlazy_posted: 0, L.
> [ 9442.671379] 	26-...: (9 GPs behind) idle=6d4/0/0 softirq=265/275 fqs=0 last_accelerate: 6034/dec0, nonlazy_posted: 0, L.
> [ 9442.682234] 	27-...: (115 GPs behind) idle=e3c/0/0 softirq=212/226 fqs=0 last_accelerate: 5420/dec0, nonlazy_posted: 0, L.
> [ 9442.693263] 	28-...: (9 GPs behind) idle=ea4/0/0 softirq=540/552 fqs=0 last_accelerate: 603c/dec0, nonlazy_posted: 0, L.
> [ 9442.704118] 	29-...: (115 GPs behind) idle=83c/0/0 softirq=342/380 fqs=0 last_accelerate: 5420/dec0, nonlazy_posted: 0, L.
> [ 9442.715147] 	30-...: (33 GPs behind) idle=e3c/0/0 softirq=509/509 fqs=0 last_accelerate: 52bc/dec0, nonlazy_posted: 0, L.
> [ 9442.726088] 	31-...: (9 GPs behind) idle=df4/0/0 softirq=619/641 fqs=0 last_accelerate: 603c/decb, nonlazy_posted: 0, L.
> [ 9442.736944] 	32-...: (9 GPs behind) idle=aa4/0/0 softirq=1841/1848 fqs=0 last_accelerate: 6030/decb, nonlazy_posted: 0, L.
> [ 9442.747972] 	34-...: (9 GPs behind) idle=e6c/0/0 softirq=5082/5086 fqs=0 last_accelerate: 6039/decb, nonlazy_posted: 0, L.
> [ 9442.759001] 	35-...: (9 GPs behind) idle=7fc/0/0 softirq=1396/1406 fqs=0 last_accelerate: 603e/decb, nonlazy_posted: 0, L.
> [ 9442.770030] 	36-...: (0 ticks this GP) idle=f28/0/0 softirq=255/255 fqs=0 last_accelerate: c9fc/decb, nonlazy_posted: 0, L.
> [ 9442.781145] 	37-...: (50 GPs behind) idle=53c/0/0 softirq=227/230 fqs=0 last_accelerate: 45c0/decb, nonlazy_posted: 0, L.
> [ 9442.792087] 	38-...: (9 GPs behind) idle=958/0/0 softirq=185/192 fqs=0 last_accelerate: 6030/decb, nonlazy_posted: 0, L.
> [ 9442.802942] 	40-...: (389 GPs behind) idle=41c/0/0 softirq=131/136 fqs=0 last_accelerate: 5800/decb, nonlazy_posted: 0, L.
> [ 9442.813971] 	41-...: (389 GPs behind) idle=258/0/0 softirq=133/138 fqs=0 last_accelerate: c00f/decb, nonlazy_posted: 0, L.
> [ 9442.825000] 	43-...: (50 GPs behind) idle=254/0/0 softirq=113/117 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
> [ 9442.835942] 	44-...: (115 GPs behind) idle=178/0/0 softirq=1271/1276 fqs=0 last_accelerate: 68e9/dee5, nonlazy_posted: 0, L.
> [ 9442.847144] 	45-...: (2 GPs behind) idle=04a/1/0 softirq=364/389 fqs=0 last_accelerate: dee5/dee5, nonlazy_posted: 0, L.
> [ 9442.857999] 	46-...: (9 GPs behind) idle=ec4/0/0 softirq=183/189 fqs=0 last_accelerate: 6030/dee5, nonlazy_posted: 0, L.
> [ 9442.868854] 	47-...: (115 GPs behind) idle=088/0/0 softirq=135/149 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
> [ 9442.879883] 	48-...: (389 GPs behind) idle=200/0/0 softirq=103/110 fqs=0 last_accelerate: 58b0/dee5, nonlazy_posted: 0, L.
> [ 9442.890911] 	49-...: (9 GPs behind) idle=a24/0/0 softirq=205/211 fqs=0 last_accelerate: 6030/dee5, nonlazy_posted: 0, L.
> [ 9442.901766] 	50-...: (25 GPs behind) idle=a74/0/0 softirq=144/144 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
> [ 9442.912708] 	51-...: (50 GPs behind) idle=f68/0/0 softirq=116/122 fqs=0 last_accelerate: 57bc/dee5, nonlazy_posted: 0, L.
> [ 9442.923650] 	52-...: (9 GPs behind) idle=e08/0/0 softirq=202/486 fqs=0 last_accelerate: c87f/defe, nonlazy_posted: 0, L.
> [ 9442.934505] 	53-...: (2 GPs behind) idle=128/0/0 softirq=365/366 fqs=0 last_accelerate: ca88/defe, nonlazy_posted: 0, L.
> [ 9442.945360] 	54-...: (9 GPs behind) idle=ce8/0/0 softirq=126/373 fqs=0 last_accelerate: bef8/defe, nonlazy_posted: 0, L.
> [ 9442.956215] 	56-...: (9 GPs behind) idle=330/0/0 softirq=2116/2126 fqs=0 last_accelerate: 6030/defe, nonlazy_posted: 0, L.
> [ 9442.967243] 	57-...: (1 GPs behind) idle=288/0/0 softirq=1707/1714 fqs=0 last_accelerate: c87c/defe, nonlazy_posted: 0, L.
> [ 9442.978272] 	58-...: (37 GPs behind) idle=390/0/0 softirq=1716/1721 fqs=0 last_accelerate: 53f7/defe, nonlazy_posted: 0, L.
> [ 9442.989387] 	59-...: (37 GPs behind) idle=e54/0/0 softirq=1700/1701 fqs=0 last_accelerate: 40a1/defe, nonlazy_posted: 0, L.
> [ 9443.000502] 	60-...: (116 GPs behind) idle=7b4/0/0 softirq=92/96 fqs=0 last_accelerate: 57d8/df10, nonlazy_posted: 0, L.
> [ 9443.011357] 	61-...: (9 GPs behind) idle=9d8/0/0 softirq=161/170 fqs=0 last_accelerate: 6030/df10, nonlazy_posted: 0, L.
> [ 9443.022212] 	62-...: (115 GPs behind) idle=aa8/0/0 softirq=95/101 fqs=0 last_accelerate: 5420/df17, nonlazy_posted: 0, L.
> [ 9443.033154] 	63-...: (50 GPs behind) idle=958/0/0 softirq=81/84 fqs=0 last_accelerate: 57b8/df17, nonlazy_posted: 0, L.
> [ 9443.043920] 	(detected by 39, t=5403 jiffies, g=443, c=442, q=1)
> [ 9443.049919] Task dump for CPU 1:
> [ 9443.053134] swapper/1       R  running task        0     0      1 0x00000000
> [ 9443.060173] Call trace:
> [ 9443.062619] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.067744] [<          (null)>]           (null)
> [ 9443.072434] Task dump for CPU 3:
> [ 9443.075650] swapper/3       R  running task        0     0      1 0x00000000
> [ 9443.082686] Call trace:
> [ 9443.085121] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.090246] [<          (null)>]           (null)
> [ 9443.094936] Task dump for CPU 4:
> [ 9443.098152] swapper/4       R  running task        0     0      1 0x00000000
> [ 9443.105188] Call trace:
> [ 9443.107623] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.112752] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.118224] Task dump for CPU 5:
> [ 9443.121440] swapper/5       R  running task        0     0      1 0x00000000
> [ 9443.128476] Call trace:
> [ 9443.130910] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.136035] [<          (null)>]           (null)
> [ 9443.140725] Task dump for CPU 6:
> [ 9443.143941] swapper/6       R  running task        0     0      1 0x00000000
> [ 9443.150976] Call trace:
> [ 9443.153411] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.158535] [<          (null)>]           (null)
> [ 9443.163226] Task dump for CPU 7:
> [ 9443.166442] swapper/7       R  running task        0     0      1 0x00000000
> [ 9443.173478] Call trace:
> [ 9443.175912] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.181037] [<          (null)>]           (null)
> [ 9443.185727] Task dump for CPU 8:
> [ 9443.188943] swapper/8       R  running task        0     0      1 0x00000000
> [ 9443.195979] Call trace:
> [ 9443.198412] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.203537] [<          (null)>]           (null)
> [ 9443.208227] Task dump for CPU 9:
> [ 9443.211443] swapper/9       R  running task        0     0      1 0x00000000
> [ 9443.218479] Call trace:
> [ 9443.220913] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.226039] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.231510] Task dump for CPU 10:
> [ 9443.234812] swapper/10      R  running task        0     0      1 0x00000000
> [ 9443.241848] Call trace:
> [ 9443.244283] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.249408] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.254879] Task dump for CPU 11:
> [ 9443.258182] swapper/11      R  running task        0     0      1 0x00000000
> [ 9443.265218] Call trace:
> [ 9443.267652] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.272776] [<          (null)>]           (null)
> [ 9443.277467] Task dump for CPU 12:
> [ 9443.280769] swapper/12      R  running task        0     0      1 0x00000000
> [ 9443.287806] Call trace:
> [ 9443.290240] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.295364] [<          (null)>]           (null)
> [ 9443.300054] Task dump for CPU 13:
> [ 9443.303357] swapper/13      R  running task        0     0      1 0x00000000
> [ 9443.310394] Call trace:
> [ 9443.312828] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.317953] [<          (null)>]           (null)
> [ 9443.322643] Task dump for CPU 14:
> [ 9443.325945] swapper/14      R  running task        0     0      1 0x00000000
> [ 9443.332981] Call trace:
> [ 9443.335416] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.340540] [<          (null)>]           (null)
> [ 9443.345230] Task dump for CPU 15:
> [ 9443.348533] swapper/15      R  running task        0     0      1 0x00000000
> [ 9443.355568] Call trace:
> [ 9443.358002] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.363128] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.368599] Task dump for CPU 17:
> [ 9443.371901] swapper/17      R  running task        0     0      1 0x00000000
> [ 9443.378937] Call trace:
> [ 9443.381372] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.386497] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.391968] Task dump for CPU 18:
> [ 9443.395270] swapper/18      R  running task        0     0      1 0x00000000
> [ 9443.402306] Call trace:
> [ 9443.404740] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.409865] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.415336] Task dump for CPU 19:
> [ 9443.418639] swapper/19      R  running task        0     0      1 0x00000000
> [ 9443.425675] Call trace:
> [ 9443.428109] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.433234] [<          (null)>]           (null)
> [ 9443.437924] Task dump for CPU 20:
> [ 9443.441226] swapper/20      R  running task        0     0      1 0x00000000
> [ 9443.448263] Call trace:
> [ 9443.450697] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.455826] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9443.462600] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9443.467986] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.473458] Task dump for CPU 21:
> [ 9443.476760] swapper/21      R  running task        0     0      1 0x00000000
> [ 9443.483796] Call trace:
> [ 9443.486230] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.491354] [<          (null)>]           (null)
> [ 9443.496045] Task dump for CPU 22:
> [ 9443.499347] swapper/22      R  running task        0     0      1 0x00000000
> [ 9443.506383] Call trace:
> [ 9443.508817] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.513943] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.519414] Task dump for CPU 23:
> [ 9443.522716] swapper/23      R  running task        0     0      1 0x00000000
> [ 9443.529752] Call trace:
> [ 9443.532186] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.537312] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.542784] Task dump for CPU 24:
> [ 9443.546086] swapper/24      R  running task        0     0      1 0x00000000
> [ 9443.553122] Call trace:
> [ 9443.555556] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.560681] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.566153] Task dump for CPU 25:
> [ 9443.569455] swapper/25      R  running task        0     0      1 0x00000000
> [ 9443.576491] Call trace:
> [ 9443.578925] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.584051] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9443.590825] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9443.596211] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.601682] Task dump for CPU 26:
> [ 9443.604985] swapper/26      R  running task        0     0      1 0x00000000
> [ 9443.612021] Call trace:
> [ 9443.614455] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.619581] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.625052] Task dump for CPU 27:
> [ 9443.628355] swapper/27      R  running task        0     0      1 0x00000000
> [ 9443.635390] Call trace:
> [ 9443.637824] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.642949] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.648421] Task dump for CPU 28:
> [ 9443.651723] swapper/28      R  running task        0     0      1 0x00000000
> [ 9443.658759] Call trace:
> [ 9443.661193] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.666318] [<          (null)>]           (null)
> [ 9443.671008] Task dump for CPU 29:
> [ 9443.674310] swapper/29      R  running task        0     0      1 0x00000000
> [ 9443.681346] Call trace:
> [ 9443.683780] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.688905] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.694377] Task dump for CPU 30:
> [ 9443.697679] swapper/30      R  running task        0     0      1 0x00000000
> [ 9443.704715] Call trace:
> [ 9443.707150] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.712275] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9443.719050] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9443.724436] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.729907] Task dump for CPU 31:
> [ 9443.733210] swapper/31      R  running task        0     0      1 0x00000000
> [ 9443.740246] Call trace:
> [ 9443.742680] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.747805] [<          (null)>]           (null)
> [ 9443.752496] Task dump for CPU 32:
> [ 9443.755798] swapper/32      R  running task        0     0      1 0x00000000
> [ 9443.762833] Call trace:
> [ 9443.765267] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.770392] [<          (null)>]           (null)
> [ 9443.775082] Task dump for CPU 34:
> [ 9443.778384] swapper/34      R  running task        0     0      1 0x00000000
> [ 9443.785420] Call trace:
> [ 9443.787854] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.792980] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.798451] Task dump for CPU 35:
> [ 9443.801753] swapper/35      R  running task        0     0      1 0x00000000
> [ 9443.808789] Call trace:
> [ 9443.811224] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.816348] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.821820] Task dump for CPU 36:
> [ 9443.825122] swapper/36      R  running task        0     0      1 0x00000000
> [ 9443.832158] Call trace:
> [ 9443.834592] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.839718] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9443.846493] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9443.851878] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.857350] Task dump for CPU 37:
> [ 9443.860652] swapper/37      R  running task        0     0      1 0x00000000
> [ 9443.867688] Call trace:
> [ 9443.870122] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.875248] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9443.882022] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9443.887408] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.892880] Task dump for CPU 38:
> [ 9443.896182] swapper/38      R  running task        0     0      1 0x00000000
> [ 9443.903218] Call trace:
> [ 9443.905652] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.910776] [<          (null)>]           (null)
> [ 9443.915466] Task dump for CPU 40:
> [ 9443.918769] swapper/40      R  running task        0     0      1 0x00000000
> [ 9443.925805] Call trace:
> [ 9443.928239] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.933365] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.938836] Task dump for CPU 41:
> [ 9443.942138] swapper/41      R  running task        0     0      1 0x00000000
> [ 9443.949174] Call trace:
> [ 9443.951609] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.956733] [<          (null)>]           (null)
> [ 9443.961423] Task dump for CPU 43:
> [ 9443.964725] swapper/43      R  running task        0     0      1 0x00000000
> [ 9443.971761] Call trace:
> [ 9443.974195] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9443.979320] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9443.984791] Task dump for CPU 44:
> [ 9443.988093] swapper/44      R  running task        0     0      1 0x00000000
> [ 9443.995130] Call trace:
> [ 9443.997564] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.002688] [<          (null)>]           (null)
> [ 9444.007378] Task dump for CPU 45:
> [ 9444.010680] swapper/45      R  running task        0     0      1 0x00000000
> [ 9444.017716] Call trace:
> [ 9444.020151] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.025275] [<          (null)>]           (null)
> [ 9444.029965] Task dump for CPU 46:
> [ 9444.033267] swapper/46      R  running task        0     0      1 0x00000000
> [ 9444.040302] Call trace:
> [ 9444.042737] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.047862] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.053333] Task dump for CPU 47:
> [ 9444.056636] swapper/47      R  running task        0     0      1 0x00000000
> [ 9444.063672] Call trace:
> [ 9444.066106] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.071231] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.076702] Task dump for CPU 48:
> [ 9444.080004] swapper/48      R  running task        0     0      1 0x00000000
> [ 9444.087041] Call trace:
> [ 9444.089475] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.094600] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.100071] Task dump for CPU 49:
> [ 9444.103374] swapper/49      R  running task        0     0      1 0x00000000
> [ 9444.110409] Call trace:
> [ 9444.112844] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.117968] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.123440] Task dump for CPU 50:
> [ 9444.126742] swapper/50      R  running task        0     0      1 0x00000000
> [ 9444.133777] Call trace:
> [ 9444.136211] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.141336] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.146807] Task dump for CPU 51:
> [ 9444.150109] swapper/51      R  running task        0     0      1 0x00000000
> [ 9444.157144] Call trace:
> [ 9444.159578] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.164703] [<          (null)>]           (null)
> [ 9444.169393] Task dump for CPU 52:
> [ 9444.172695] swapper/52      R  running task        0     0      1 0x00000000
> [ 9444.179731] Call trace:
> [ 9444.182165] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.187290] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.192761] Task dump for CPU 53:
> [ 9444.196063] swapper/53      R  running task        0     0      1 0x00000000
> [ 9444.203099] Call trace:
> [ 9444.205533] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.210658] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.216129] Task dump for CPU 54:
> [ 9444.219431] swapper/54      R  running task        0     0      1 0x00000000
> [ 9444.226467] Call trace:
> [ 9444.228901] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.234026] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.239498] Task dump for CPU 56:
> [ 9444.242801] swapper/56      R  running task        0     0      1 0x00000000
> [ 9444.249837] Call trace:
> [ 9444.252271] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.257396] [<          (null)>]           (null)
> [ 9444.262086] Task dump for CPU 57:
> [ 9444.265388] swapper/57      R  running task        0     0      1 0x00000000
> [ 9444.272424] Call trace:
> [ 9444.274858] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.279982] [<          (null)>]           (null)
> [ 9444.284672] Task dump for CPU 58:
> [ 9444.287975] swapper/58      R  running task        0     0      1 0x00000000
> [ 9444.295011] Call trace:
> [ 9444.297445] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.302570] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9444.309345] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9444.314731] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.320202] Task dump for CPU 59:
> [ 9444.323504] swapper/59      R  running task        0     0      1 0x00000000
> [ 9444.330540] Call trace:
> [ 9444.332974] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.338100] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.343571] Task dump for CPU 60:
> [ 9444.346873] swapper/60      R  running task        0     0      1 0x00000000
> [ 9444.353909] Call trace:
> [ 9444.356343] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.361469] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> [ 9444.368243] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> [ 9444.373629] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.379101] Task dump for CPU 61:
> [ 9444.382402] swapper/61      R  running task        0     0      1 0x00000000
> [ 9444.389438] Call trace:
> [ 9444.391872] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.396997] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.402469] Task dump for CPU 62:
> [ 9444.405771] swapper/62      R  running task        0     0      1 0x00000000
> [ 9444.412808] Call trace:
> [ 9444.415242] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.420367] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.425838] Task dump for CPU 63:
> [ 9444.429141] swapper/63      R  running task        0     0      1 0x00000000
> [ 9444.436177] Call trace:
> [ 9444.438611] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.443736] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> [ 9444.449211] rcu_sched kthread starved for 5743 jiffies! g443 c442 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> [ 9444.458416] rcu_sched       S    0    10      2 0x00000000
> [ 9444.463889] Call trace:
> [ 9444.466324] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> [ 9444.471453] [<ffff000008ab70a4>] __schedule+0x1a4/0x720
> [ 9444.476665] [<ffff000008ab7660>] schedule+0x40/0xa8
> [ 9444.481530] [<ffff000008abac70>] schedule_timeout+0x178/0x358
> [ 9444.487263] [<ffff00000813e694>] rcu_gp_kthread+0x534/0x7b8
> [ 9444.492824] [<ffff0000080f33d0>] kthread+0x108/0x138
> [ 9444.497775] [<ffff0000080836c0>] ret_from_fork+0x10/0x50
> 
> 
> 
> And the relevant chunk of trace is:
> (I have a lot more.  There are substantial other pauses from time to time, but none this long.)
> 
> 
>    rcu_preempt-9     [057] ....  9419.837631: timer_init: timer=ffff8017d5fcfda0
>      rcu_preempt-9     [057] d..1  9419.837632: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297246837 [timeout=1] cpu=57 idx=0 flags=
>           <idle>-0     [057] d..1  9419.837634: tick_stop: success=1 dependency=NONE
>           <idle>-0     [057] d..2  9419.837634: hrtimer_cancel: hrtimer=ffff8017db99e808
>           <idle>-0     [057] d..2  9419.837635: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418164000000 softexpires=9418164000000
>           <idle>-0     [057] d.h2  9419.845621: hrtimer_cancel: hrtimer=ffff8017db99e808
>           <idle>-0     [057] d.h1  9419.845621: hrtimer_expire_entry: hrtimer=ffff8017db99e808 function=tick_sched_timer now=9418164001440
>           <idle>-0     [057] d.h1  9419.845622: hrtimer_expire_exit: hrtimer=ffff8017db99e808
>           <idle>-0     [057] d.s2  9419.845623: timer_cancel: timer=ffff8017d5fcfda0
>           <idle>-0     [057] ..s1  9419.845623: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297246838
>           <idle>-0     [057] .ns1  9419.845624: timer_expire_exit: timer=ffff8017d5fcfda0
>           <idle>-0     [057] dn.2  9419.845628: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418168000000 softexpires=9418168000000
>           <idle>-0     [057] d..1  9419.845635: tick_stop: success=1 dependency=NONE
>           <idle>-0     [057] d..2  9419.845636: hrtimer_cancel: hrtimer=ffff8017db99e808
>           <idle>-0     [057] d..2  9419.845636: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418188000000 softexpires=9418188000000
>           <idle>-0     [057] d.h2  9419.869621: hrtimer_cancel: hrtimer=ffff8017db99e808
>           <idle>-0     [057] d.h1  9419.869621: hrtimer_expire_entry: hrtimer=ffff8017db99e808 function=tick_sched_timer now=9418188001420
>           <idle>-0     [057] d.h1  9419.869622: hrtimer_expire_exit: hrtimer=ffff8017db99e808
>           <idle>-0     [057] d..2  9419.869626: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9858983202655 softexpires=9858983202655
>           <idle>-0     [016] d.h2  9419.885626: hrtimer_cancel: hrtimer=ffff8017fbc3d808
>           <idle>-0     [016] d.h1  9419.885627: hrtimer_expire_entry: hrtimer=ffff8017fbc3d808 function=tick_sched_timer now=9418204006760
>           <idle>-0     [016] d.h1  9419.885629: hrtimer_expire_exit: hrtimer=ffff8017fbc3d808
>           <idle>-0     [016] d.s2  9419.885629: timer_cancel: timer=ffff8017d37dbca0
>           <idle>-0     [016] ..s1  9419.885630: timer_expire_entry: timer=ffff8017d37dbca0 function=process_timeout now=4297246848
>           <idle>-0     [016] .ns1  9419.885631: timer_expire_exit: timer=ffff8017d37dbca0
>           <idle>-0     [016] dn.2  9419.885636: hrtimer_start: hrtimer=ffff8017fbc3d808 function=tick_sched_timer expires=9418208000000 softexpires=9418208000000
>       khugepaged-778   [016] ....  9419.885668: timer_init: timer=ffff8017d37dbca0
>       khugepaged-778   [016] d..1  9419.885668: timer_start: timer=ffff8017d37dbca0 function=process_timeout expires=4297249348 [timeout=2500] cpu=16 idx=0 flags=
>           <idle>-0     [016] d..1  9419.885670: tick_stop: success=1 dependency=NONE
>           <idle>-0     [016] d..2  9419.885671: hrtimer_cancel: hrtimer=ffff8017fbc3d808
>           <idle>-0     [016] d..2  9419.885671: hrtimer_start: hrtimer=ffff8017fbc3d808 function=tick_sched_timer expires=9428444000000 softexpires=9428444000000
>           <idle>-0     [045] d.h2  9419.890839: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9419.890839: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418209219940
>           <idle>-0     [045] d.h3  9419.890844: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418310221420 softexpires=9418309221420
>           <idle>-0     [045] d.h1  9419.890844: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [000] d.h2  9419.917625: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.h1  9419.917626: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9418236005860
>           <idle>-0     [000] d.h1  9419.917628: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.s2  9419.917628: timer_cancel: timer=ffff80177fdc0840
>           <idle>-0     [000] ..s1  9419.917629: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_disable_link now=4297246856
>           <idle>-0     [000] d.s2  9419.917630: timer_start: timer=ffff80177fdc0840 function=link_timeout_enable_link expires=4297246881 [timeout=25] cpu=0 idx=81 flags=
>           <idle>-0     [000] ..s1  9419.917633: timer_expire_exit: timer=ffff80177fdc0840
>           <idle>-0     [000] d..2  9419.917648: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9418340000000 softexpires=9418340000000
>           <idle>-0     [045] d.h2  9419.991845: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9419.991845: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418310225960
>           <idle>-0     [045] d.h3  9419.991849: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418411227320 softexpires=9418410227320
>           <idle>-0     [045] d.h1  9419.991850: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [000] d.h2  9420.021625: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.h1  9420.021625: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9418340005520
>           <idle>-0     [000] d.h1  9420.021627: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.s2  9420.021627: timer_cancel: timer=ffff80177fdc0840
>           <idle>-0     [000] ..s1  9420.021628: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_enable_link now=4297246882
>           <idle>-0     [000] d.s2  9420.021629: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297247107 [timeout=225] cpu=0 idx=34 flags=
>           <idle>-0     [000] ..s1  9420.021632: timer_expire_exit: timer=ffff80177fdc0840
>           <idle>-0     [000] d..2  9420.021639: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419260000000 softexpires=9419260000000
>           <idle>-0     [045] d.h2  9420.092851: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.092852: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418411231780
>           <idle>-0     [045] d.h3  9420.092856: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418512233720 softexpires=9418511233720
>           <idle>-0     [045] d.h1  9420.092856: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [055] d.h2  9420.141622: hrtimer_cancel: hrtimer=ffff8017db968808
>           <idle>-0     [055] d.h1  9420.141623: hrtimer_expire_entry: hrtimer=ffff8017db968808 function=tick_sched_timer now=9418460002540
>           <idle>-0     [055] d.h1  9420.141625: hrtimer_expire_exit: hrtimer=ffff8017db968808
>           <idle>-0     [055] d.s2  9420.141626: timer_cancel: timer=ffff80177db6cc08
>           <idle>-0     [055] d.s1  9420.141626: timer_expire_entry: timer=ffff80177db6cc08 function=delayed_work_timer_fn now=4297246912
>           <idle>-0     [055] dns1  9420.141628: timer_expire_exit: timer=ffff80177db6cc08
>           <idle>-0     [055] dn.2  9420.141632: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9418464000000 softexpires=9418464000000
>     kworker/55:1-1246  [055] d..1  9420.141634: timer_start: timer=ffff80177db6cc08 function=delayed_work_timer_fn expires=4297247162 [timeout=250] cpu=55 idx=88 flags=I
>           <idle>-0     [055] d..1  9420.141637: tick_stop: success=1 dependency=NONE
>           <idle>-0     [055] d..2  9420.141637: hrtimer_cancel: hrtimer=ffff8017db968808
>           <idle>-0     [055] d..2  9420.141637: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9419484000000 softexpires=9419484000000
>           <idle>-0     [045] d.h2  9420.193855: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.193855: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418512235660
>           <idle>-0     [045] d.h3  9420.193859: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418613237260 softexpires=9418612237260
>           <idle>-0     [045] d.h1  9420.193860: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h2  9420.294858: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.294858: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418613238380
>           <idle>-0     [045] d.h3  9420.294862: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418714240000 softexpires=9418713240000
>           <idle>-0     [045] d.h1  9420.294863: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h2  9420.395861: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.395861: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418714241380
>           <idle>-0     [045] d.h3  9420.395865: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418815242920 softexpires=9418814242920
>           <idle>-0     [045] d.h1  9420.395865: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [042] d.h2  9420.461621: hrtimer_cancel: hrtimer=ffff8017dbb69808
>           <idle>-0     [042] d.h1  9420.461622: hrtimer_expire_entry: hrtimer=ffff8017dbb69808 function=tick_sched_timer now=9418780002180
>           <idle>-0     [042] d.h1  9420.461623: hrtimer_expire_exit: hrtimer=ffff8017dbb69808
>           <idle>-0     [042] d.s2  9420.461624: timer_cancel: timer=ffff80177db6d408
>           <idle>-0     [042] d.s1  9420.461625: timer_expire_entry: timer=ffff80177db6d408 function=delayed_work_timer_fn now=4297246992
>           <idle>-0     [042] dns1  9420.461627: timer_expire_exit: timer=ffff80177db6d408
>           <idle>-0     [042] dns2  9420.461627: timer_cancel: timer=ffff8017797d7868
>           <idle>-0     [042] .ns1  9420.461628: timer_expire_entry: timer=ffff8017797d7868 function=hns_nic_service_timer now=4297246992
>           <idle>-0     [042] dns2  9420.461628: timer_start: timer=ffff8017797d7868 function=hns_nic_service_timer expires=4297247242 [timeout=250] cpu=42 idx=98 flags=
>           <idle>-0     [042] .ns1  9420.461629: timer_expire_exit: timer=ffff8017797d7868
>           <idle>-0     [042] dn.2  9420.461632: hrtimer_start: hrtimer=ffff8017dbb69808 function=tick_sched_timer expires=9418784000000 softexpires=9418784000000
>     kworker/42:1-1223  [042] d..1  9420.461773: timer_start: timer=ffff80177db6d408 function=delayed_work_timer_fn expires=4297247242 [timeout=250] cpu=42 idx=98 flags=I
>           <idle>-0     [042] d..1  9420.461866: tick_stop: success=1 dependency=NONE
>           <idle>-0     [042] d..2  9420.461867: hrtimer_cancel: hrtimer=ffff8017dbb69808
>           <idle>-0     [042] d..2  9420.461867: hrtimer_start: hrtimer=ffff8017dbb69808 function=tick_sched_timer expires=9419804000000 softexpires=9419804000000
>           <idle>-0     [045] d.h2  9420.496864: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.496864: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418815244580
>           <idle>-0     [045] d.h3  9420.496868: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418916246140 softexpires=9418915246140
>           <idle>-0     [045] d.h1  9420.496868: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h2  9420.597866: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.597867: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418916247280
>           <idle>-0     [045] d.h3  9420.597871: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419017248760 softexpires=9419016248760
>           <idle>-0     [045] d.h1  9420.597871: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [033] d.h2  9420.621621: hrtimer_cancel: hrtimer=ffff8017dba76808
>           <idle>-0     [033] d.h1  9420.621622: hrtimer_expire_entry: hrtimer=ffff8017dba76808 function=tick_sched_timer now=9418940002160
>           <idle>-0     [033] d.h1  9420.621623: hrtimer_expire_exit: hrtimer=ffff8017dba76808
>           <idle>-0     [033] d.s2  9420.621624: timer_cancel: timer=ffff00000917be40
>           <idle>-0     [033] d.s1  9420.621625: timer_expire_entry: timer=ffff00000917be40 function=delayed_work_timer_fn now=4297247032
>           <idle>-0     [033] dns1  9420.621626: timer_expire_exit: timer=ffff00000917be40
>           <idle>-0     [033] dn.2  9420.621630: hrtimer_start: hrtimer=ffff8017dba76808 function=tick_sched_timer expires=9418944000000 softexpires=9418944000000
>            <...>-1631  [033] d..1  9420.621636: timer_start: timer=ffff00000917be40 function=delayed_work_timer_fn expires=4297247282 [timeout=250] cpu=33 idx=103 flags=I
>           <idle>-0     [033] d..1  9420.621639: tick_stop: success=1 dependency=NONE
>           <idle>-0     [033] d..2  9420.621639: hrtimer_cancel: hrtimer=ffff8017dba76808
>           <idle>-0     [033] d..2  9420.621639: hrtimer_start: hrtimer=ffff8017dba76808 function=tick_sched_timer expires=9419964000000 softexpires=9419964000000
>           <idle>-0     [000] dn.2  9420.691401: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] dn.2  9420.691401: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419012000000 softexpires=9419012000000
>           <idle>-0     [002] dn.2  9420.691408: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] dn.2  9420.691408: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419012000000 softexpires=9419012000000
>           <idle>-0     [000] d..1  9420.691409: tick_stop: success=1 dependency=NONE
>           <idle>-0     [000] d..2  9420.691409: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d..2  9420.691409: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419260000000 softexpires=9419260000000
>           <idle>-0     [002] d..1  9420.691423: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9420.691423: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d..2  9420.691424: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9859803202655 softexpires=9859803202655
>           <idle>-0     [045] d.h2  9420.698872: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.698873: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419017253180
>           <idle>-0     [045] d.h3  9420.698877: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419118254640 softexpires=9419117254640
>           <idle>-0     [045] d.h1  9420.698877: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h2  9420.799875: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.799875: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419118255760
>           <idle>-0     [045] d.h3  9420.799879: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419219257140 softexpires=9419218257140
>           <idle>-0     [045] d.h1  9420.799880: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [000] dn.2  9420.871369: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] dn.2  9420.871370: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419192000000 softexpires=9419192000000
>           <idle>-0     [002] dn.2  9420.871375: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [000] d..1  9420.871376: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] dn.2  9420.871376: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419192000000 softexpires=9419192000000
>           <idle>-0     [000] d..2  9420.871376: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d..2  9420.871376: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419260000000 softexpires=9419260000000
>           <idle>-0     [002] d..1  9420.871398: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9420.871398: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d..2  9420.871398: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9859983202655 softexpires=9859983202655
>           <idle>-0     [045] d.h2  9420.900881: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9420.900881: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419219261580
>           <idle>-0     [045] d.h3  9420.900885: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419320263160 softexpires=9419319263160
>           <idle>-0     [045] d.h1  9420.900886: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [001] d..2  9420.913601: hrtimer_cancel: hrtimer=ffff8017fbe5b808
>           <idle>-0     [001] d..2  9420.913601: hrtimer_start: hrtimer=ffff8017fbe5b808 function=tick_sched_timer expires=9860023202655 softexpires=9860023202655
>           <idle>-0     [000] d.h2  9420.941621: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.h1  9420.941621: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9419260001400
>           <idle>-0     [000] d.h1  9420.941623: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.s2  9420.941623: timer_cancel: timer=ffff80177fdc0840
>           <idle>-0     [000] ..s1  9420.941624: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_disable_link now=4297247112
>           <idle>-0     [000] d.s2  9420.941624: timer_start: timer=ffff80177fdc0840 function=link_timeout_enable_link expires=4297247137 [timeout=25] cpu=0 idx=113 flags=
>           <idle>-0     [000] ..s1  9420.941628: timer_expire_exit: timer=ffff80177fdc0840
>           <idle>-0     [000] d.s2  9420.941629: timer_cancel: timer=ffff8017fbe42558
>           <idle>-0     [000] d.s1  9420.941629: timer_expire_entry: timer=ffff8017fbe42558 function=delayed_work_timer_fn now=4297247112
>           <idle>-0     [000] dns1  9420.941630: timer_expire_exit: timer=ffff8017fbe42558
>           <idle>-0     [000] dns2  9420.941631: timer_cancel: timer=ffff00000910a628
>           <idle>-0     [000] dns1  9420.941631: timer_expire_entry: timer=ffff00000910a628 function=delayed_work_timer_fn now=4297247112
>           <idle>-0     [000] dns1  9420.941631: timer_expire_exit: timer=ffff00000910a628
>           <idle>-0     [000] dn.2  9420.941634: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419264000000 softexpires=9419264000000
>           <idle>-0     [002] dn.2  9420.941643: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] dn.2  9420.941643: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419264000000 softexpires=9419264000000
>      kworker/0:0-3     [000] d..1  9420.941650: timer_start: timer=ffff00000910a628 function=delayed_work_timer_fn expires=4297247500 [timeout=388] cpu=0 idx=100 flags=D|I
>      kworker/2:0-22    [002] d..1  9420.941651: timer_start: timer=ffff8017fbe78558 function=delayed_work_timer_fn expires=4297247494 [timeout=382] cpu=2 idx=114 flags=D|I
>           <idle>-0     [000] d..1  9420.941652: tick_stop: success=1 dependency=NONE
>           <idle>-0     [000] d..2  9420.941652: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d..2  9420.941653: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419364000000 softexpires=9419364000000
>           <idle>-0     [002] d..1  9420.941654: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9420.941654: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d..2  9420.941654: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9860055202655 softexpires=9860055202655
>           <idle>-0     [045] d.h2  9421.001887: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9421.001887: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419320267640
>           <idle>-0     [045] d.h3  9421.001891: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419421269000 softexpires=9419420269000
>           <idle>-0     [045] d.h1  9421.001892: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [000] d.h2  9421.045625: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.h1  9421.045625: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9419364005380
>           <idle>-0     [000] d.h1  9421.045626: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.s2  9421.045627: timer_cancel: timer=ffff80177fdc0840
>           <idle>-0     [000] ..s1  9421.045628: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_enable_link now=4297247138
>           <idle>-0     [000] d.s2  9421.045628: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297247363 [timeout=225] cpu=0 idx=34 flags=
>           <idle>-0     [000] ..s1  9421.045631: timer_expire_exit: timer=ffff80177fdc0840
>           <idle>-0     [000] d..2  9421.045644: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9420284000000 softexpires=9420284000000
>           <idle>-0     [045] d.h2  9421.102893: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9421.102893: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419421273420
>           <idle>-0     [045] d.h3  9421.102897: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419522275040 softexpires=9419521275040
>           <idle>-0     [045] d.h1  9421.102897: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [055] d.h2  9421.165621: hrtimer_cancel: hrtimer=ffff8017db968808
>           <idle>-0     [055] d.h1  9421.165622: hrtimer_expire_entry: hrtimer=ffff8017db968808 function=tick_sched_timer now=9419484002280
>           <idle>-0     [055] d.h1  9421.165624: hrtimer_expire_exit: hrtimer=ffff8017db968808
>           <idle>-0     [055] d.s2  9421.165624: timer_cancel: timer=ffff80177db6cc08
>           <idle>-0     [055] d.s1  9421.165625: timer_expire_entry: timer=ffff80177db6cc08 function=delayed_work_timer_fn now=4297247168
>           <idle>-0     [055] dns1  9421.165626: timer_expire_exit: timer=ffff80177db6cc08
>           <idle>-0     [055] dn.2  9421.165629: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9419488000000 softexpires=9419488000000
>     kworker/55:1-1246  [055] d..1  9421.165632: timer_start: timer=ffff80177db6cc08 function=delayed_work_timer_fn expires=4297247418 [timeout=250] cpu=55 idx=120 flags=I
>           <idle>-0     [055] d..1  9421.165634: tick_stop: success=1 dependency=NONE
>           <idle>-0     [055] d..2  9421.165634: hrtimer_cancel: hrtimer=ffff8017db968808
>           <idle>-0     [055] d..2  9421.165635: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9420508000000 softexpires=9420508000000
>           <idle>-0     [045] d.h2  9421.203896: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9421.203896: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419522276980
>           <idle>-0     [045] d.h3  9421.203900: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419623278460 softexpires=9419622278460
>           <idle>-0     [045] d.h1  9421.203901: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h2  9421.304899: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9421.304899: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419623279580
>           <idle>-0     [045] d.h3  9421.304903: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419724281060 softexpires=9419723281060
>           <idle>-0     [045] d.h1  9421.304903: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [000] dn.2  9421.381179: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] dn.2  9421.381179: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419700000000 softexpires=9419700000000
>           <idle>-0     [002] dn.2  9421.381185: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] dn.2  9421.381185: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419700000000 softexpires=9419700000000
>           <idle>-0     [000] d..1  9421.381185: tick_stop: success=1 dependency=NONE
>           <idle>-0     [000] d..2  9421.381186: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d..2  9421.381186: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9420284000000 softexpires=9420284000000
>               sh-2256  [002] ....  9421.381193: timer_init: timer=ffff80176c26fb40
>               sh-2256  [002] d..1  9421.381194: timer_start: timer=ffff80176c26fb40 function=process_timeout expires=4297247223 [timeout=2] cpu=2 idx=0 flags=
>           <idle>-0     [002] d..1  9421.381196: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9421.381197: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d..2  9421.381197: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419708000000 softexpires=9419708000000
>           <idle>-0     [002] d.h2  9421.389621: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d.h1  9421.389622: hrtimer_expire_entry: hrtimer=ffff8017fbe76808 function=tick_sched_timer now=9419708002000
>           <idle>-0     [002] d.h1  9421.389623: hrtimer_expire_exit: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d.s2  9421.389624: timer_cancel: timer=ffff80176c26fb40
>           <idle>-0     [002] ..s1  9421.389624: timer_expire_entry: timer=ffff80176c26fb40 function=process_timeout now=4297247224
>           <idle>-0     [002] .ns1  9421.389626: timer_expire_exit: timer=ffff80176c26fb40
>           <idle>-0     [002] dn.2  9421.389629: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419712000000 softexpires=9419712000000
>               sh-2256  [002] ...1  9421.389682: hrtimer_init: hrtimer=ffff8017d4dde8a0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
>               sh-2256  [002] ...1  9421.389682: hrtimer_init: hrtimer=ffff8017d4dde8e0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
>               sh-2256  [002] ....  9421.389690: hrtimer_init: hrtimer=ffff80176cbb0088 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
>           <idle>-0     [039] dn.2  9421.389814: hrtimer_start: hrtimer=ffff8017dbb18808 function=tick_sched_timer expires=9419712000000 softexpires=9419712000000
>           <idle>-0     [002] d..1  9421.389896: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9421.389897: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d..2  9421.389898: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419724000000 softexpires=9419724000000

Is this the gap?

Interestingly, I am not seeing any timeouts at all associated with
the rcu_sched kthread...
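
One way to check that from the captured trace is to filter for the
grace-period kthread's own timer traffic; a minimal sketch, assuming the
trace buffer was saved to a file called timer-trace.txt (a made-up name):

# Show the timers the RCU grace-period kthread arms from schedule_timeout()
# and the matching process_timeout expirations.  A long stretch with a
# timer_start but no later timer_expire_entry for the same timer would be
# the missing wakeup.
grep -E 'rcu_(sched|preempt)-[0-9]+ |function=process_timeout' timer-trace.txt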

							Thanx, Paul

>           <idle>-0     [002] dn.2  9444.510766: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] dn.2  9444.510767: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442832000000 softexpires=9442832000000
>           <idle>-0     [036] d..1  9444.510812: tick_stop: success=1 dependency=NONE
>           <idle>-0     [036] d..2  9444.510814: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d..2  9444.510815: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442844000000 softexpires=9442844000000
>               sh-2256  [002] ....  9444.510857: timer_init: timer=ffff80176c26fb40
>               sh-2256  [002] d..1  9444.510857: timer_start: timer=ffff80176c26fb40 function=process_timeout expires=4297253006 [timeout=2] cpu=2 idx=0 flags=
>           <idle>-0     [002] d..1  9444.510864: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9444.510865: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d..2  9444.510866: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442844000000 softexpires=9442844000000
>           <idle>-0     [000] d.h2  9444.525625: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [002] d.h2  9444.525625: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [036] d.h2  9444.525625: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [002] d.h1  9444.525625: hrtimer_expire_entry: hrtimer=ffff8017fbe76808 function=tick_sched_timer now=9442844005600
>           <idle>-0     [036] d.h1  9444.525625: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442844005460
>           <idle>-0     [000] d.h1  9444.525627: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9442844005300
>           <idle>-0     [002] d.h1  9444.525627: hrtimer_expire_exit: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d.s2  9444.525629: timer_cancel: timer=ffff8017fbe78558
>           <idle>-0     [036] d.h1  9444.525629: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
>           <idle>-0     [002] d.s1  9444.525629: timer_expire_entry: timer=ffff8017fbe78558 function=delayed_work_timer_fn now=4297253008
>           <idle>-0     [000] d.h1  9444.525629: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.s2  9444.525631: timer_cancel: timer=ffff80177fdc0840
>           <idle>-0     [000] ..s1  9444.525631: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_disable_link now=4297253008
>           <idle>-0     [002] dns1  9444.525631: timer_expire_exit: timer=ffff8017fbe78558
>           <idle>-0     [000] d.s2  9444.525632: timer_start: timer=ffff80177fdc0840 function=link_timeout_enable_link expires=4297253033 [timeout=25] cpu=0 idx=82 flags=
>           <idle>-0     [000] ..s1  9444.525633: timer_expire_exit: timer=ffff80177fdc0840
>           <idle>-0     [000] d.s2  9444.525634: timer_cancel: timer=ffff00000910a628
>           <idle>-0     [000] d.s1  9444.525634: timer_expire_entry: timer=ffff00000910a628 function=delayed_work_timer_fn now=4297253008
>           <idle>-0     [000] dns1  9444.525636: timer_expire_exit: timer=ffff00000910a628
>           <idle>-0     [036] dn.2  9444.525639: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442848000000 softexpires=9442848000000
>           <idle>-0     [000] dn.2  9444.525640: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9442848000000 softexpires=9442848000000
>           <idle>-0     [002] dn.2  9444.525640: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442848000000 softexpires=9442848000000
>      rcu_preempt-9     [036] ....  9444.525648: timer_init: timer=ffff8017d5fcfda0
>           <idle>-0     [002] d..1  9444.525648: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9444.525648: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d..2  9444.525649: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442860000000 softexpires=9442860000000
>      rcu_preempt-9     [036] d..1  9444.525649: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297253009 [timeout=1] cpu=36 idx=0 flags=
>      kworker/0:0-3     [000] d..1  9444.525650: timer_start: timer=ffff00000910a628 function=delayed_work_timer_fn expires=4297253250 [timeout=242] cpu=0 idx=82 flags=D|I
>           <idle>-0     [000] d..1  9444.525652: tick_stop: success=1 dependency=NONE
>           <idle>-0     [000] d..2  9444.525652: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d..2  9444.525653: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9442948000000 softexpires=9442948000000
>           <idle>-0     [036] d..1  9444.525653: tick_stop: success=1 dependency=NONE
>           <idle>-0     [036] d..2  9444.525654: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d..2  9444.525654: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442852000000 softexpires=9442852000000
>           <idle>-0     [036] d.h2  9444.533624: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d.h1  9444.533625: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442852004760
>           <idle>-0     [036] d.h1  9444.533626: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d.s2  9444.533627: timer_cancel: timer=ffff8017d5fcfda0
>           <idle>-0     [036] ..s1  9444.533628: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297253010
>           <idle>-0     [036] .ns1  9444.533629: timer_expire_exit: timer=ffff8017d5fcfda0
>           <idle>-0     [036] dn.2  9444.533634: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442856000000 softexpires=9442856000000
>           <idle>-0     [036] d..1  9444.533668: tick_stop: success=1 dependency=NONE
>           <idle>-0     [036] d..2  9444.533668: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d..2  9444.533669: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442876000000 softexpires=9442876000000
>           <idle>-0     [002] dnh2  9444.541626: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] dnh1  9444.541627: hrtimer_expire_entry: hrtimer=ffff8017fbe76808 function=tick_sched_timer now=9442860007120
>           <idle>-0     [002] dnh1  9444.541629: hrtimer_expire_exit: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] dn.2  9444.541630: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442864000000 softexpires=9442864000000
>           <idle>-0     [002] d..1  9444.541640: tick_stop: success=1 dependency=NONE
>           <idle>-0     [002] d..2  9444.541640: hrtimer_cancel: hrtimer=ffff8017fbe76808
>           <idle>-0     [002] d..2  9444.541640: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9444316000000 softexpires=9444316000000
>           <idle>-0     [036] dnh2  9444.557627: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] dnh1  9444.557628: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442876008220
>           <idle>-0     [036] dnh1  9444.557630: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] dn.2  9444.557631: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442880000000 softexpires=9442880000000
>           <idle>-0     [036] d..1  9444.557644: tick_stop: success=1 dependency=NONE
>           <idle>-0     [036] d..2  9444.557645: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d..2  9444.557645: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442892000000 softexpires=9442892000000
>           <idle>-0     [036] d.h2  9444.573621: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d.h1  9444.573621: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442892001340
>           <idle>-0     [036] d.h1  9444.573622: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] dn.2  9444.573628: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442896000000 softexpires=9442896000000
>      rcu_preempt-9     [036] ....  9444.573631: timer_init: timer=ffff8017d5fcfda0
>      rcu_preempt-9     [036] d..1  9444.573632: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297253021 [timeout=1] cpu=36 idx=0 flags=
>           <idle>-0     [036] d..1  9444.573634: tick_stop: success=1 dependency=NONE
>           <idle>-0     [036] d..2  9444.573635: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d..2  9444.573635: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442900000000 softexpires=9442900000000
>           <idle>-0     [036] d.h2  9444.581621: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d.h1  9444.581621: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442900001400
>           <idle>-0     [036] d.h1  9444.581622: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d.s2  9444.581623: timer_cancel: timer=ffff8017d5fcfda0
>           <idle>-0     [036] ..s1  9444.581623: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297253022
>           <idle>-0     [036] .ns1  9444.581625: timer_expire_exit: timer=ffff8017d5fcfda0
>           <idle>-0     [036] dn.2  9444.581628: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442904000000 softexpires=9442904000000
>           <idle>-0     [036] d..1  9444.581636: tick_stop: success=1 dependency=NONE
>           <idle>-0     [036] d..2  9444.581636: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d..2  9444.581637: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442924000000 softexpires=9442924000000
>           <idle>-0     [045] d.h2  9444.581718: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9444.581719: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9442900098200
>           <idle>-0     [045] d.h3  9444.581724: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443001101380 softexpires=9443000101380
>           <idle>-0     [045] d.h1  9444.581725: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [036] d.h2  9444.605621: hrtimer_cancel: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d.h1  9444.605621: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442924001600
>           <idle>-0     [036] d.h1  9444.605622: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
>           <idle>-0     [036] d..2  9444.605629: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9883719202655 softexpires=9883719202655
>           <idle>-0     [000] d.h2  9444.629625: hrtimer_cancel: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.h1  9444.629625: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9442948005580
>           <idle>-0     [000] d.h1  9444.629627: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
>           <idle>-0     [000] d.s2  9444.629628: timer_cancel: timer=ffff80177fdc0840
>           <idle>-0     [000] ..s1  9444.629628: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_enable_link now=4297253034
>           <idle>-0     [000] d.s2  9444.629629: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297253259 [timeout=225] cpu=0 idx=42 flags=
>           <idle>-0     [000] ..s1  9444.629638: timer_expire_exit: timer=ffff80177fdc0840
>           <idle>-0     [000] d..2  9444.629661: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9443868000000 softexpires=9443868000000
>           <idle>-0     [045] d.h2  9444.682725: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9444.682725: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443001105940
>           <idle>-0     [045] d.h3  9444.682730: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443102107440 softexpires=9443101107440
>           <idle>-0     [045] d.h1  9444.682730: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [055] d.h2  9444.717626: hrtimer_cancel: hrtimer=ffff8017db968808
>           <idle>-0     [055] d.h1  9444.717627: hrtimer_expire_entry: hrtimer=ffff8017db968808 function=tick_sched_timer now=9443036006240
>           <idle>-0     [055] d.h1  9444.717629: hrtimer_expire_exit: hrtimer=ffff8017db968808
>           <idle>-0     [055] d.s2  9444.717630: timer_cancel: timer=ffff80177db6cc08
>           <idle>-0     [055] d.s1  9444.717630: timer_expire_entry: timer=ffff80177db6cc08 function=delayed_work_timer_fn now=4297253056
>           <idle>-0     [055] dns1  9444.717633: timer_expire_exit: timer=ffff80177db6cc08
>           <idle>-0     [055] dn.2  9444.717637: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9443040000000 softexpires=9443040000000
>     kworker/55:1-1246  [055] d..1  9444.717640: timer_start: timer=ffff80177db6cc08 function=delayed_work_timer_fn expires=4297253306 [timeout=250] cpu=55 idx=88 flags=I
>           <idle>-0     [055] d..1  9444.717643: tick_stop: success=1 dependency=NONE
>           <idle>-0     [055] d..2  9444.717643: hrtimer_cancel: hrtimer=ffff8017db968808
>           <idle>-0     [055] d..2  9444.717644: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9444060000000 softexpires=9444060000000
>           <idle>-0     [045] d.h2  9444.783729: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9444.783729: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443102109380
>           <idle>-0     [045] d.h3  9444.783733: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443203110880 softexpires=9443202110880
>           <idle>-0     [045] d.h1  9444.783733: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h2  9444.884731: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9444.884731: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443203112000
>           <idle>-0     [045] d.h3  9444.884735: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443304113380 softexpires=9443303113380
>           <idle>-0     [045] d.h1  9444.884736: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h2  9444.985734: hrtimer_cancel: hrtimer=ffff80176cb7ca90
>           <idle>-0     [045] d.h1  9444.985735: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443304114500
>           <idle>-0     [045] d.h3  9444.985738: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443405116440 softexpires=9443404116440
>           <idle>-0     [045] d.h1  9444.985739: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
>           <idle>-0     [042] d.h2  9445.037622: hrtimer_cancel: hrtimer=ffff8017dbb69808
> > 
> > Thanks,
> > 
> > Jonathan
> > > 
> > > 							Thanx, Paul
> > >   
> > > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > [ 1984.638153] rcu_preempt     S    0     9      2 0x00000000
> > > > [ 1984.643626] Call trace:
> > > > [ 1984.646059] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> > > > [ 1984.651189] [<ffff000008962274>] __schedule+0x19c/0x5d8
> > > > [ 1984.656400] [<ffff0000089626e8>] schedule+0x38/0xa0
> > > > [ 1984.661266] [<ffff000008965844>] schedule_timeout+0x124/0x218
> > > > [ 1984.667002] [<ffff000008121424>] rcu_gp_kthread+0x4fc/0x748
> > > > [ 1984.672564] [<ffff0000080df0b4>] kthread+0xfc/0x128
> > > > [ 1984.677429] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
> > > >     
> > >   
> > 
> > _______________________________________________
> > linuxarm mailing list
> > linuxarm at huawei.com
> > http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm
> 


* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-27 16:52                                                         ` Paul E. McKenney
@ 2017-07-28  7:44                                                           ` Jonathan Cameron
  -1 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-28  7:44 UTC (permalink / raw)
  To: linux-arm-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="windows-1254", Size: 72756 bytes --]

On Thu, 27 Jul 2017 09:52:45 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Thu, Jul 27, 2017 at 05:39:23PM +0100, Jonathan Cameron wrote:
> > On Thu, 27 Jul 2017 14:49:03 +0100
> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> >   
> > > On Thu, 27 Jul 2017 05:49:13 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >   
> > > > On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:    
> > > > > On Wed, 26 Jul 2017 18:42:14 -0700
> > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > >       
> > > > > > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:      
> > > > >       
> > > > > > > Indeed, that really wouldn't explain how we end up with a RCU stall
> > > > > > > dump listing almost all of the cpus as having missed a grace period.        
> > > > > > 
> > > > > > I have seen stranger things, but admittedly not often.      
> > > > > 
> > > > > So the backtraces show the RCU gp thread in schedule_timeout.
> > > > > 
> > > > > Are you sure that its timeout has expired and it's not being scheduled,
> > > > > or could it be a bad (large) timeout (looks unlikely) or that it's being
> > > > > scheduled but not correctly noting gps on other CPUs?
> > > > > 
> > > > > It's not in R state, so if it's not being scheduled at all, then it's
> > > > > because the timer has not fired:      
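
(One way to tell those cases apart on a live system is to look at the
kthread directly; a rough sketch, assuming the grace-period kthread is
named rcu_preempt or rcu_sched:)

# Locate the RCU grace-period kthread and see how it is actually blocked.
pid=$(pgrep -x rcu_preempt || pgrep -x rcu_sched)
grep State: /proc/$pid/status     # S (sleeping) vs R (runnable)
cat /proc/$pid/wchan; echo        # symbol it is sleeping in, e.g. schedule_timeout
cat /proc/$pid/stack              # full kernel stack (needs CONFIG_STACKTRACE)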
> > > > 
> > > > Good point, Nick!
> > > > 
> > > > Jonathan, could you please reproduce collecting timer event tracing?    
> > > I'm a little new to tracing (I only started playing with it last week),
> > > so fingers crossed I've set it up right.  No splats yet.  I was getting
> > > splats when reading out the trace while running with the RCU stall timer
> > > set to 4, so I have increased that back to the default and am rerunning.
> > > 
> > > This may take a while.  To save time, correct me if I've gotten this wrong:
> > > 
> > > echo "timer:*" > /sys/kernel/debug/tracing/set_event
> > > 
> > > and when it dumps, I just send you the relevant part of what is in
> > > /sys/kernel/debug/tracing/trace?
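
A minimal sketch of the capture sequence being described here (the tracefs
paths are the standard ones; the buffer size is an arbitrary choice):

# Enable all timer events and give the per-CPU ring buffers some headroom,
# then let the workload run until the RCU stall splat appears.
echo 0 > /sys/kernel/debug/tracing/tracing_on
echo 10240 > /sys/kernel/debug/tracing/buffer_size_kb   # per-CPU size in KB, arbitrary
echo "timer:*" > /sys/kernel/debug/tracing/set_event
echo 1 > /sys/kernel/debug/tracing/tracing_on

# After the splat, snapshot the buffer; reading "trace" (unlike trace_pipe)
# does not drain it, so it can be read repeatedly.
cat /sys/kernel/debug/tracing/trace > /tmp/timer-trace.txt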
> > 
> > Interestingly, the only thing that can make it trip for me with tracing on
> > is peeking in the tracing buffers.  Not sure whether this is a valid case
> > or not.
> > 
> > Anyhow, all timer activity seems to stop around the area of interest.
> > 
> > 
> > [ 9442.413624] INFO: rcu_sched detected stalls on CPUs/tasks:
> > [ 9442.419107] 	1-...: (1 GPs behind) idle„4/0/0 softirq'747/27755 fqs=0 last_accelerate: dd6a/de80, nonlazy_posted: 0, L.
> > [ 9442.430224] 	3-...: (2 GPs behind) idle8/0/0 softirq2197/32198 fqs=0 last_accelerate: 29b1/de80, nonlazy_posted: 0, L.
> > [ 9442.441340] 	4-...: (7 GPs behind) idlet0/0/0 softirq"351/22352 fqs=0 last_accelerate: ca88/de80, nonlazy_posted: 0, L.
> > [ 9442.452456] 	5-...: (2 GPs behind) idle›0/0/0 softirq!315/21319 fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
> > [ 9442.463572] 	6-...: (2 GPs behind) idley4/0/0 softirq\x19699/19707 fqs=0 last_accelerate: ba62/de88, nonlazy_posted: 0, L.
> > [ 9442.474688] 	7-...: (2 GPs behind) idle¬4/0/0 softirq"547/22554 fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
> > [ 9442.485803] 	8-...: (9 GPs behind) idle\x118/0/0 softirq(1/291 fqs=0 last_accelerate: c3fe/de88, nonlazy_posted: 0, L.
> > [ 9442.496571] 	9-...: (9 GPs behind) idlec/0/0 softirq(4/292 fqs=0 last_accelerate: 6030/de88, nonlazy_posted: 0, L.
> > [ 9442.507339] 	10-...: (14 GPs behind) idle÷8/0/0 softirq%4/254 fqs=0 last_accelerate: 5487/de88, nonlazy_posted: 0, L.
> > [ 9442.518281] 	11-...: (9 GPs behind) idleÉc/0/0 softirq01/308 fqs=0 last_accelerate: 3d3e/de99, nonlazy_posted: 0, L.
> > [ 9442.529136] 	12-...: (9 GPs behind) idleJ4/0/0 softirqs5/737 fqs=0 last_accelerate: 6010/de99, nonlazy_posted: 0, L.
> > [ 9442.539992] 	13-...: (9 GPs behind) idle4c/0/0 softirq\x1121/1131 fqs=0 last_accelerate: b280/de99, nonlazy_posted: 0, L.
> > [ 9442.551020] 	14-...: (9 GPs behind) idle/4/0/0 softirqp7/713 fqs=0 last_accelerate: 6030/de99, nonlazy_posted: 0, L.
> > [ 9442.561875] 	15-...: (2 GPs behind) idle³0/0/0 softirq‚1/976 fqs=0 last_accelerate: c208/de99, nonlazy_posted: 0, L.
> > [ 9442.572730] 	17-...: (2 GPs behind) idleZ8/0/0 softirq\x1456/1565 fqs=0 last_accelerate: ca88/de99, nonlazy_posted: 0, L.
> > [ 9442.583759] 	18-...: (2 GPs behind) idle.4/0/0 softirq\x1923/1936 fqs=0 last_accelerate: ca88/dea7, nonlazy_posted: 0, L.
> > [ 9442.594787] 	19-...: (2 GPs behind) idle\x138/0/0 softirq\x1421/1432 fqs=0 last_accelerate: b280/dea7, nonlazy_posted: 0, L.
> > [ 9442.605816] 	20-...: (50 GPs behind) idlec4/0/0 softirq!7/219 fqs=0 last_accelerate: c96f/dea7, nonlazy_posted: 0, L.
> > [ 9442.616758] 	21-...: (2 GPs behind) idleë8/0/0 softirq\x1368/1369 fqs=0 last_accelerate: b599/deb2, nonlazy_posted: 0, L.
> > [ 9442.627786] 	22-...: (1 GPs behind) idleª8/0/0 softirq"9/232 fqs=0 last_accelerate: c604/deb2, nonlazy_posted: 0, L.
> > [ 9442.638641] 	23-...: (1 GPs behind) idleH8/0/0 softirq$7/248 fqs=0 last_accelerate: c600/deb2, nonlazy_posted: 0, L.
> > [ 9442.649496] 	24-...: (33 GPs behind) idle÷c/0/0 softirq19/319 fqs=0 last_accelerate: 5290/deb2, nonlazy_posted: 0, L.
> > [ 9442.660437] 	25-...: (33 GPs behind) idle”4/0/0 softirq08/308 fqs=0 last_accelerate: 52c0/deb2, nonlazy_posted: 0, L.
> > [ 9442.671379] 	26-...: (9 GPs behind) idlem4/0/0 softirq&5/275 fqs=0 last_accelerate: 6034/dec0, nonlazy_posted: 0, L.
> > [ 9442.682234] 	27-...: (115 GPs behind) idleãc/0/0 softirq!2/226 fqs=0 last_accelerate: 5420/dec0, nonlazy_posted: 0, L.
> > [ 9442.693263] 	28-...: (9 GPs behind) idleê4/0/0 softirqT0/552 fqs=0 last_accelerate: 603c/dec0, nonlazy_posted: 0, L.
> > [ 9442.704118] 	29-...: (115 GPs behind) idleƒc/0/0 softirq42/380 fqs=0 last_accelerate: 5420/dec0, nonlazy_posted: 0, L.
> > [ 9442.715147] 	30-...: (33 GPs behind) idleãc/0/0 softirqP9/509 fqs=0 last_accelerate: 52bc/dec0, nonlazy_posted: 0, L.
> > [ 9442.726088] 	31-...: (9 GPs behind) idleß4/0/0 softirqa9/641 fqs=0 last_accelerate: 603c/decb, nonlazy_posted: 0, L.
> > [ 9442.736944] 	32-...: (9 GPs behind) idleª4/0/0 softirq\x1841/1848 fqs=0 last_accelerate: 6030/decb, nonlazy_posted: 0, L.
> > [ 9442.747972] 	34-...: (9 GPs behind) idleæc/0/0 softirqP82/5086 fqs=0 last_accelerate: 6039/decb, nonlazy_posted: 0, L.
> > [ 9442.759001] 	35-...: (9 GPs behind) idle\x7fc/0/0 softirq\x1396/1406 fqs=0 last_accelerate: 603e/decb, nonlazy_posted: 0, L.
> > [ 9442.770030] 	36-...: (0 ticks this GP) idleò8/0/0 softirq%5/255 fqs=0 last_accelerate: c9fc/decb, nonlazy_posted: 0, L.
> > [ 9442.781145] 	37-...: (50 GPs behind) idleSc/0/0 softirq"7/230 fqs=0 last_accelerate: 45c0/decb, nonlazy_posted: 0, L.
> > [ 9442.792087] 	38-...: (9 GPs behind) idle•8/0/0 softirq\x185/192 fqs=0 last_accelerate: 6030/decb, nonlazy_posted: 0, L.
> > [ 9442.802942] 	40-...: (389 GPs behind) idleAc/0/0 softirq\x131/136 fqs=0 last_accelerate: 5800/decb, nonlazy_posted: 0, L.
> > [ 9442.813971] 	41-...: (389 GPs behind) idle%8/0/0 softirq\x133/138 fqs=0 last_accelerate: c00f/decb, nonlazy_posted: 0, L.
> > [ 9442.825000] 	43-...: (50 GPs behind) idle%4/0/0 softirq\x113/117 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
> > [ 9442.835942] 	44-...: (115 GPs behind) idle\x178/0/0 softirq\x1271/1276 fqs=0 last_accelerate: 68e9/dee5, nonlazy_posted: 0, L.
> > [ 9442.847144] 	45-...: (2 GPs behind) idle\x04a/1/0 softirq64/389 fqs=0 last_accelerate: dee5/dee5, nonlazy_posted: 0, L.
> > [ 9442.857999] 	46-...: (9 GPs behind) idleì4/0/0 softirq\x183/189 fqs=0 last_accelerate: 6030/dee5, nonlazy_posted: 0, L.
> > [ 9442.868854] 	47-...: (115 GPs behind) idle\b8/0/0 softirq\x135/149 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
> > [ 9442.879883] 	48-...: (389 GPs behind) idle 0/0/0 softirq\x103/110 fqs=0 last_accelerate: 58b0/dee5, nonlazy_posted: 0, L.
> > [ 9442.890911] 	49-...: (9 GPs behind) idle¢4/0/0 softirq 5/211 fqs=0 last_accelerate: 6030/dee5, nonlazy_posted: 0, L.
> > [ 9442.901766] 	50-...: (25 GPs behind) idle§4/0/0 softirq\x144/144 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
> > [ 9442.912708] 	51-...: (50 GPs behind) idleö8/0/0 softirq\x116/122 fqs=0 last_accelerate: 57bc/dee5, nonlazy_posted: 0, L.
> > [ 9442.923650] 	52-...: (9 GPs behind) idleà8/0/0 softirq 2/486 fqs=0 last_accelerate: c87f/defe, nonlazy_posted: 0, L.
> > [ 9442.934505] 	53-...: (2 GPs behind) idle\x128/0/0 softirq65/366 fqs=0 last_accelerate: ca88/defe, nonlazy_posted: 0, L.
> > [ 9442.945360] 	54-...: (9 GPs behind) idleÎ8/0/0 softirq\x126/373 fqs=0 last_accelerate: bef8/defe, nonlazy_posted: 0, L.
> > [ 9442.956215] 	56-...: (9 GPs behind) idle30/0/0 softirq!16/2126 fqs=0 last_accelerate: 6030/defe, nonlazy_posted: 0, L.
> > [ 9442.967243] 	57-...: (1 GPs behind) idle(8/0/0 softirq\x1707/1714 fqs=0 last_accelerate: c87c/defe, nonlazy_posted: 0, L.
> > [ 9442.978272] 	58-...: (37 GPs behind) idle90/0/0 softirq\x1716/1721 fqs=0 last_accelerate: 53f7/defe, nonlazy_posted: 0, L.
> > [ 9442.989387] 	59-...: (37 GPs behind) idleå4/0/0 softirq\x1700/1701 fqs=0 last_accelerate: 40a1/defe, nonlazy_posted: 0, L.
> > [ 9443.000502] 	60-...: (116 GPs behind) idle{4/0/0 softirq’/96 fqs=0 last_accelerate: 57d8/df10, nonlazy_posted: 0, L.
> > [ 9443.011357] 	61-...: (9 GPs behind) idle8/0/0 softirq\x161/170 fqs=0 last_accelerate: 6030/df10, nonlazy_posted: 0, L.
> > [ 9443.022212] 	62-...: (115 GPs behind) idleª8/0/0 softirq•/101 fqs=0 last_accelerate: 5420/df17, nonlazy_posted: 0, L.
> > [ 9443.033154] 	63-...: (50 GPs behind) idle•8/0/0 softirq/84 fqs=0 last_accelerate: 57b8/df17, nonlazy_posted: 0, L.
> > [ 9443.043920] 	(detected by 39, t=5403 jiffies, g=443, c=442, q=1)
> > [ 9443.049919] Task dump for CPU 1:
> > [ 9443.053134] swapper/1       R  running task        0     0      1 0x00000000
> > [ 9443.060173] Call trace:
> > [ 9443.062619] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.067744] [<          (null)>]           (null)
> > [ 9443.072434] Task dump for CPU 3:
> > [ 9443.075650] swapper/3       R  running task        0     0      1 0x00000000
> > [ 9443.082686] Call trace:
> > [ 9443.085121] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.090246] [<          (null)>]           (null)
> > [ 9443.094936] Task dump for CPU 4:
> > [ 9443.098152] swapper/4       R  running task        0     0      1 0x00000000
> > [ 9443.105188] Call trace:
> > [ 9443.107623] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.112752] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.118224] Task dump for CPU 5:
> > [ 9443.121440] swapper/5       R  running task        0     0      1 0x00000000
> > [ 9443.128476] Call trace:
> > [ 9443.130910] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.136035] [<          (null)>]           (null)
> > [ 9443.140725] Task dump for CPU 6:
> > [ 9443.143941] swapper/6       R  running task        0     0      1 0x00000000
> > [ 9443.150976] Call trace:
> > [ 9443.153411] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.158535] [<          (null)>]           (null)
> > [ 9443.163226] Task dump for CPU 7:
> > [ 9443.166442] swapper/7       R  running task        0     0      1 0x00000000
> > [ 9443.173478] Call trace:
> > [ 9443.175912] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.181037] [<          (null)>]           (null)
> > [ 9443.185727] Task dump for CPU 8:
> > [ 9443.188943] swapper/8       R  running task        0     0      1 0x00000000
> > [ 9443.195979] Call trace:
> > [ 9443.198412] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.203537] [<          (null)>]           (null)
> > [ 9443.208227] Task dump for CPU 9:
> > [ 9443.211443] swapper/9       R  running task        0     0      1 0x00000000
> > [ 9443.218479] Call trace:
> > [ 9443.220913] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.226039] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.231510] Task dump for CPU 10:
> > [ 9443.234812] swapper/10      R  running task        0     0      1 0x00000000
> > [ 9443.241848] Call trace:
> > [ 9443.244283] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.249408] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.254879] Task dump for CPU 11:
> > [ 9443.258182] swapper/11      R  running task        0     0      1 0x00000000
> > [ 9443.265218] Call trace:
> > [ 9443.267652] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.272776] [<          (null)>]           (null)
> > [ 9443.277467] Task dump for CPU 12:
> > [ 9443.280769] swapper/12      R  running task        0     0      1 0x00000000
> > [ 9443.287806] Call trace:
> > [ 9443.290240] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.295364] [<          (null)>]           (null)
> > [ 9443.300054] Task dump for CPU 13:
> > [ 9443.303357] swapper/13      R  running task        0     0      1 0x00000000
> > [ 9443.310394] Call trace:
> > [ 9443.312828] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.317953] [<          (null)>]           (null)
> > [ 9443.322643] Task dump for CPU 14:
> > [ 9443.325945] swapper/14      R  running task        0     0      1 0x00000000
> > [ 9443.332981] Call trace:
> > [ 9443.335416] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.340540] [<          (null)>]           (null)
> > [ 9443.345230] Task dump for CPU 15:
> > [ 9443.348533] swapper/15      R  running task        0     0      1 0x00000000
> > [ 9443.355568] Call trace:
> > [ 9443.358002] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.363128] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.368599] Task dump for CPU 17:
> > [ 9443.371901] swapper/17      R  running task        0     0      1 0x00000000
> > [ 9443.378937] Call trace:
> > [ 9443.381372] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.386497] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.391968] Task dump for CPU 18:
> > [ 9443.395270] swapper/18      R  running task        0     0      1 0x00000000
> > [ 9443.402306] Call trace:
> > [ 9443.404740] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.409865] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.415336] Task dump for CPU 19:
> > [ 9443.418639] swapper/19      R  running task        0     0      1 0x00000000
> > [ 9443.425675] Call trace:
> > [ 9443.428109] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.433234] [<          (null)>]           (null)
> > [ 9443.437924] Task dump for CPU 20:
> > [ 9443.441226] swapper/20      R  running task        0     0      1 0x00000000
> > [ 9443.448263] Call trace:
> > [ 9443.450697] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.455826] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9443.462600] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9443.467986] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.473458] Task dump for CPU 21:
> > [ 9443.476760] swapper/21      R  running task        0     0      1 0x00000000
> > [ 9443.483796] Call trace:
> > [ 9443.486230] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.491354] [<          (null)>]           (null)
> > [ 9443.496045] Task dump for CPU 22:
> > [ 9443.499347] swapper/22      R  running task        0     0      1 0x00000000
> > [ 9443.506383] Call trace:
> > [ 9443.508817] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.513943] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.519414] Task dump for CPU 23:
> > [ 9443.522716] swapper/23      R  running task        0     0      1 0x00000000
> > [ 9443.529752] Call trace:
> > [ 9443.532186] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.537312] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.542784] Task dump for CPU 24:
> > [ 9443.546086] swapper/24      R  running task        0     0      1 0x00000000
> > [ 9443.553122] Call trace:
> > [ 9443.555556] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.560681] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.566153] Task dump for CPU 25:
> > [ 9443.569455] swapper/25      R  running task        0     0      1 0x00000000
> > [ 9443.576491] Call trace:
> > [ 9443.578925] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.584051] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9443.590825] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9443.596211] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.601682] Task dump for CPU 26:
> > [ 9443.604985] swapper/26      R  running task        0     0      1 0x00000000
> > [ 9443.612021] Call trace:
> > [ 9443.614455] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.619581] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.625052] Task dump for CPU 27:
> > [ 9443.628355] swapper/27      R  running task        0     0      1 0x00000000
> > [ 9443.635390] Call trace:
> > [ 9443.637824] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.642949] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.648421] Task dump for CPU 28:
> > [ 9443.651723] swapper/28      R  running task        0     0      1 0x00000000
> > [ 9443.658759] Call trace:
> > [ 9443.661193] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.666318] [<          (null)>]           (null)
> > [ 9443.671008] Task dump for CPU 29:
> > [ 9443.674310] swapper/29      R  running task        0     0      1 0x00000000
> > [ 9443.681346] Call trace:
> > [ 9443.683780] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.688905] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.694377] Task dump for CPU 30:
> > [ 9443.697679] swapper/30      R  running task        0     0      1 0x00000000
> > [ 9443.704715] Call trace:
> > [ 9443.707150] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.712275] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9443.719050] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9443.724436] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.729907] Task dump for CPU 31:
> > [ 9443.733210] swapper/31      R  running task        0     0      1 0x00000000
> > [ 9443.740246] Call trace:
> > [ 9443.742680] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.747805] [<          (null)>]           (null)
> > [ 9443.752496] Task dump for CPU 32:
> > [ 9443.755798] swapper/32      R  running task        0     0      1 0x00000000
> > [ 9443.762833] Call trace:
> > [ 9443.765267] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.770392] [<          (null)>]           (null)
> > [ 9443.775082] Task dump for CPU 34:
> > [ 9443.778384] swapper/34      R  running task        0     0      1 0x00000000
> > [ 9443.785420] Call trace:
> > [ 9443.787854] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.792980] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.798451] Task dump for CPU 35:
> > [ 9443.801753] swapper/35      R  running task        0     0      1 0x00000000
> > [ 9443.808789] Call trace:
> > [ 9443.811224] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.816348] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.821820] Task dump for CPU 36:
> > [ 9443.825122] swapper/36      R  running task        0     0      1 0x00000000
> > [ 9443.832158] Call trace:
> > [ 9443.834592] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.839718] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9443.846493] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9443.851878] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.857350] Task dump for CPU 37:
> > [ 9443.860652] swapper/37      R  running task        0     0      1 0x00000000
> > [ 9443.867688] Call trace:
> > [ 9443.870122] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.875248] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9443.882022] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9443.887408] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.892880] Task dump for CPU 38:
> > [ 9443.896182] swapper/38      R  running task        0     0      1 0x00000000
> > [ 9443.903218] Call trace:
> > [ 9443.905652] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.910776] [<          (null)>]           (null)
> > [ 9443.915466] Task dump for CPU 40:
> > [ 9443.918769] swapper/40      R  running task        0     0      1 0x00000000
> > [ 9443.925805] Call trace:
> > [ 9443.928239] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.933365] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.938836] Task dump for CPU 41:
> > [ 9443.942138] swapper/41      R  running task        0     0      1 0x00000000
> > [ 9443.949174] Call trace:
> > [ 9443.951609] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.956733] [<          (null)>]           (null)
> > [ 9443.961423] Task dump for CPU 43:
> > [ 9443.964725] swapper/43      R  running task        0     0      1 0x00000000
> > [ 9443.971761] Call trace:
> > [ 9443.974195] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.979320] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.984791] Task dump for CPU 44:
> > [ 9443.988093] swapper/44      R  running task        0     0      1 0x00000000
> > [ 9443.995130] Call trace:
> > [ 9443.997564] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.002688] [<          (null)>]           (null)
> > [ 9444.007378] Task dump for CPU 45:
> > [ 9444.010680] swapper/45      R  running task        0     0      1 0x00000000
> > [ 9444.017716] Call trace:
> > [ 9444.020151] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.025275] [<          (null)>]           (null)
> > [ 9444.029965] Task dump for CPU 46:
> > [ 9444.033267] swapper/46      R  running task        0     0      1 0x00000000
> > [ 9444.040302] Call trace:
> > [ 9444.042737] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.047862] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.053333] Task dump for CPU 47:
> > [ 9444.056636] swapper/47      R  running task        0     0      1 0x00000000
> > [ 9444.063672] Call trace:
> > [ 9444.066106] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.071231] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.076702] Task dump for CPU 48:
> > [ 9444.080004] swapper/48      R  running task        0     0      1 0x00000000
> > [ 9444.087041] Call trace:
> > [ 9444.089475] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.094600] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.100071] Task dump for CPU 49:
> > [ 9444.103374] swapper/49      R  running task        0     0      1 0x00000000
> > [ 9444.110409] Call trace:
> > [ 9444.112844] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.117968] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.123440] Task dump for CPU 50:
> > [ 9444.126742] swapper/50      R  running task        0     0      1 0x00000000
> > [ 9444.133777] Call trace:
> > [ 9444.136211] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.141336] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.146807] Task dump for CPU 51:
> > [ 9444.150109] swapper/51      R  running task        0     0      1 0x00000000
> > [ 9444.157144] Call trace:
> > [ 9444.159578] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.164703] [<          (null)>]           (null)
> > [ 9444.169393] Task dump for CPU 52:
> > [ 9444.172695] swapper/52      R  running task        0     0      1 0x00000000
> > [ 9444.179731] Call trace:
> > [ 9444.182165] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.187290] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.192761] Task dump for CPU 53:
> > [ 9444.196063] swapper/53      R  running task        0     0      1 0x00000000
> > [ 9444.203099] Call trace:
> > [ 9444.205533] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.210658] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.216129] Task dump for CPU 54:
> > [ 9444.219431] swapper/54      R  running task        0     0      1 0x00000000
> > [ 9444.226467] Call trace:
> > [ 9444.228901] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.234026] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.239498] Task dump for CPU 56:
> > [ 9444.242801] swapper/56      R  running task        0     0      1 0x00000000
> > [ 9444.249837] Call trace:
> > [ 9444.252271] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.257396] [<          (null)>]           (null)
> > [ 9444.262086] Task dump for CPU 57:
> > [ 9444.265388] swapper/57      R  running task        0     0      1 0x00000000
> > [ 9444.272424] Call trace:
> > [ 9444.274858] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.279982] [<          (null)>]           (null)
> > [ 9444.284672] Task dump for CPU 58:
> > [ 9444.287975] swapper/58      R  running task        0     0      1 0x00000000
> > [ 9444.295011] Call trace:
> > [ 9444.297445] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.302570] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9444.309345] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9444.314731] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.320202] Task dump for CPU 59:
> > [ 9444.323504] swapper/59      R  running task        0     0      1 0x00000000
> > [ 9444.330540] Call trace:
> > [ 9444.332974] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.338100] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.343571] Task dump for CPU 60:
> > [ 9444.346873] swapper/60      R  running task        0     0      1 0x00000000
> > [ 9444.353909] Call trace:
> > [ 9444.356343] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.361469] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9444.368243] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9444.373629] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.379101] Task dump for CPU 61:
> > [ 9444.382402] swapper/61      R  running task        0     0      1 0x00000000
> > [ 9444.389438] Call trace:
> > [ 9444.391872] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.396997] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.402469] Task dump for CPU 62:
> > [ 9444.405771] swapper/62      R  running task        0     0      1 0x00000000
> > [ 9444.412808] Call trace:
> > [ 9444.415242] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.420367] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.425838] Task dump for CPU 63:
> > [ 9444.429141] swapper/63      R  running task        0     0      1 0x00000000
> > [ 9444.436177] Call trace:
> > [ 9444.438611] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.443736] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.449211] rcu_sched kthread starved for 5743 jiffies! g443 c442 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > [ 9444.458416] rcu_sched       S    0    10      2 0x00000000
> > [ 9444.463889] Call trace:
> > [ 9444.466324] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.471453] [<ffff000008ab70a4>] __schedule+0x1a4/0x720
> > [ 9444.476665] [<ffff000008ab7660>] schedule+0x40/0xa8
> > [ 9444.481530] [<ffff000008abac70>] schedule_timeout+0x178/0x358
> > [ 9444.487263] [<ffff00000813e694>] rcu_gp_kthread+0x534/0x7b8
> > [ 9444.492824] [<ffff0000080f33d0>] kthread+0x108/0x138
> > [ 9444.497775] [<ffff0000080836c0>] ret_from_fork+0x10/0x50
> > 
> > 
> > 
> > And the relevant chunk of trace is:
> > (I have a lot more.  There are other substantial pauses from time to time, but none this long.)
> > 
> > 
> >      rcu_preempt-9     [057] ....  9419.837631: timer_init: timer=ffff8017d5fcfda0
> >      rcu_preempt-9     [057] d..1  9419.837632: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297246837 [timeout=1] cpu=57 idx=0 flags=
> >           <idle>-0     [057] d..1  9419.837634: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [057] d..2  9419.837634: hrtimer_cancel: hrtimer=ffff8017db99e808
> >           <idle>-0     [057] d..2  9419.837635: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418164000000 softexpires=9418164000000
> >           <idle>-0     [057] d.h2  9419.845621: hrtimer_cancel: hrtimer=ffff8017db99e808
> >           <idle>-0     [057] d.h1  9419.845621: hrtimer_expire_entry: hrtimer=ffff8017db99e808 function=tick_sched_timer now=9418164001440
> >           <idle>-0     [057] d.h1  9419.845622: hrtimer_expire_exit: hrtimer=ffff8017db99e808
> >           <idle>-0     [057] d.s2  9419.845623: timer_cancel: timer=ffff8017d5fcfda0
> >           <idle>-0     [057] ..s1  9419.845623: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297246838
> >           <idle>-0     [057] .ns1  9419.845624: timer_expire_exit: timer=ffff8017d5fcfda0
> >           <idle>-0     [057] dn.2  9419.845628: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418168000000 softexpires=9418168000000
> >           <idle>-0     [057] d..1  9419.845635: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [057] d..2  9419.845636: hrtimer_cancel: hrtimer=ffff8017db99e808
> >           <idle>-0     [057] d..2  9419.845636: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418188000000 softexpires=9418188000000
> >           <idle>-0     [057] d.h2  9419.869621: hrtimer_cancel: hrtimer=ffff8017db99e808
> >           <idle>-0     [057] d.h1  9419.869621: hrtimer_expire_entry: hrtimer=ffff8017db99e808 function=tick_sched_timer now=9418188001420
> >           <idle>-0     [057] d.h1  9419.869622: hrtimer_expire_exit: hrtimer=ffff8017db99e808
> >           <idle>-0     [057] d..2  9419.869626: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9858983202655 softexpires=9858983202655
> >           <idle>-0     [016] d.h2  9419.885626: hrtimer_cancel: hrtimerÿff8017fbc3d808
> >           <idle>-0     [016] d.h1  9419.885627: hrtimer_expire_entry: hrtimerÿff8017fbc3d808 function=tick_sched_timer now”18204006760
> >           <idle>-0     [016] d.h1  9419.885629: hrtimer_expire_exit: hrtimerÿff8017fbc3d808
> >           <idle>-0     [016] d.s2  9419.885629: timer_cancel: timerÿff8017d37dbca0
> >           <idle>-0     [016] ..s1  9419.885630: timer_expire_entry: timerÿff8017d37dbca0 function=process_timeout nowB97246848
> >           <idle>-0     [016] .ns1  9419.885631: timer_expire_exit: timerÿff8017d37dbca0
> >           <idle>-0     [016] dn.2  9419.885636: hrtimer_start: hrtimerÿff8017fbc3d808 function=tick_sched_timer expires”18208000000 softexpires”18208000000
> >       khugepaged-778   [016] ....  9419.885668: timer_init: timerÿff8017d37dbca0
> >       khugepaged-778   [016] d..1  9419.885668: timer_start: timerÿff8017d37dbca0 function=process_timeout expiresB97249348 [timeout%00] cpu\x16 idx=0 flags> >           <idle>-0     [016] d..1  9419.885670: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [016] d..2  9419.885671: hrtimer_cancel: hrtimerÿff8017fbc3d808
> >           <idle>-0     [016] d..2  9419.885671: hrtimer_start: hrtimerÿff8017fbc3d808 function=tick_sched_timer expires”28444000000 softexpires”28444000000
> >           <idle>-0     [045] d.h2  9419.890839: hrtimer_cancel: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h1  9419.890839: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18209219940
> >           <idle>-0     [045] d.h3  9419.890844: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”18310221420 softexpires”18309221420
> >           <idle>-0     [045] d.h1  9419.890844: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
> >           <idle>-0     [000] d.h2  9419.917625: hrtimer_cancel: hrtimerÿff8017fbe40808
> >           <idle>-0     [000] d.h1  9419.917626: hrtimer_expire_entry: hrtimerÿff8017fbe40808 function=tick_sched_timer now”18236005860
> >           <idle>-0     [000] d.h1  9419.917628: hrtimer_expire_exit: hrtimerÿff8017fbe40808
> >           <idle>-0     [000] d.s2  9419.917628: timer_cancel: timerÿff80177fdc0840
> >           <idle>-0     [000] ..s1  9419.917629: timer_expire_entry: timerÿff80177fdc0840 function=link_timeout_disable_link nowB97246856
> >           <idle>-0     [000] d.s2  9419.917630: timer_start: timerÿff80177fdc0840 function=link_timeout_enable_link expiresB97246881 [timeout%] cpu=0 idx flags> >           <idle>-0     [000] ..s1  9419.917633: timer_expire_exit: timerÿff80177fdc0840
> >           <idle>-0     [000] d..2  9419.917648: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”18340000000 softexpires”18340000000
> >           <idle>-0     [045] d.h2  9419.991845: hrtimer_cancel: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h1  9419.991845: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18310225960
> >           <idle>-0     [045] d.h3  9419.991849: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”18411227320 softexpires”18410227320
> >           <idle>-0     [045] d.h1  9419.991850: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
> >           <idle>-0     [000] d.h2  9420.021625: hrtimer_cancel: hrtimerÿff8017fbe40808
> >           <idle>-0     [000] d.h1  9420.021625: hrtimer_expire_entry: hrtimerÿff8017fbe40808 function=tick_sched_timer now”18340005520
> >           <idle>-0     [000] d.h1  9420.021627: hrtimer_expire_exit: hrtimerÿff8017fbe40808
> >           <idle>-0     [000] d.s2  9420.021627: timer_cancel: timerÿff80177fdc0840
> >           <idle>-0     [000] ..s1  9420.021628: timer_expire_entry: timerÿff80177fdc0840 function=link_timeout_enable_link nowB97246882
> >           <idle>-0     [000] d.s2  9420.021629: timer_start: timerÿff80177fdc0840 function=link_timeout_disable_link expiresB97247107 [timeout"5] cpu=0 idx4 flags> >           <idle>-0     [000] ..s1  9420.021632: timer_expire_exit: timerÿff80177fdc0840
> >           <idle>-0     [000] d..2  9420.021639: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19260000000 softexpires”19260000000
> >           <idle>-0     [045] d.h2  9420.092851: hrtimer_cancel: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.092852: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18411231780
> >           <idle>-0     [045] d.h3  9420.092856: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”18512233720 softexpires”18511233720
> >           <idle>-0     [045] d.h1  9420.092856: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
> >           <idle>-0     [055] d.h2  9420.141622: hrtimer_cancel: hrtimerÿff8017db968808
> >           <idle>-0     [055] d.h1  9420.141623: hrtimer_expire_entry: hrtimerÿff8017db968808 function=tick_sched_timer now”18460002540
> >           <idle>-0     [055] d.h1  9420.141625: hrtimer_expire_exit: hrtimerÿff8017db968808
> >           <idle>-0     [055] d.s2  9420.141626: timer_cancel: timerÿff80177db6cc08
> >           <idle>-0     [055] d.s1  9420.141626: timer_expire_entry: timerÿff80177db6cc08 functionÞlayed_work_timer_fn nowB97246912
> >           <idle>-0     [055] dns1  9420.141628: timer_expire_exit: timerÿff80177db6cc08
> >           <idle>-0     [055] dn.2  9420.141632: hrtimer_start: hrtimerÿff8017db968808 function=tick_sched_timer expires”18464000000 softexpires”18464000000
> >     kworker/55:1-1246  [055] d..1  9420.141634: timer_start: timerÿff80177db6cc08 functionÞlayed_work_timer_fn expiresB97247162 [timeout%0] cpuU idxˆ flags=I
> >           <idle>-0     [055] d..1  9420.141637: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [055] d..2  9420.141637: hrtimer_cancel: hrtimerÿff8017db968808
> >           <idle>-0     [055] d..2  9420.141637: hrtimer_start: hrtimerÿff8017db968808 function=tick_sched_timer expires”19484000000 softexpires”19484000000
> >           <idle>-0     [045] d.h2  9420.193855: hrtimer_cancel: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.193855: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18512235660
> >           <idle>-0     [045] d.h3  9420.193859: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”18613237260 softexpires”18612237260
> >           <idle>-0     [045] d.h1  9420.193860: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h2  9420.294858: hrtimer_cancel: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.294858: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18613238380
> >           <idle>-0     [045] d.h3  9420.294862: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”18714240000 softexpires”18713240000
> >           <idle>-0     [045] d.h1  9420.294863: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h2  9420.395861: hrtimer_cancel: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.395861: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18714241380
> >           <idle>-0     [045] d.h3  9420.395865: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”18815242920 softexpires”18814242920
> >           <idle>-0     [045] d.h1  9420.395865: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
> >           <idle>-0     [042] d.h2  9420.461621: hrtimer_cancel: hrtimerÿff8017dbb69808
> >           <idle>-0     [042] d.h1  9420.461622: hrtimer_expire_entry: hrtimerÿff8017dbb69808 function=tick_sched_timer now”18780002180
> >           <idle>-0     [042] d.h1  9420.461623: hrtimer_expire_exit: hrtimerÿff8017dbb69808
> >           <idle>-0     [042] d.s2  9420.461624: timer_cancel: timerÿff80177db6d408
> >           <idle>-0     [042] d.s1  9420.461625: timer_expire_entry: timerÿff80177db6d408 functionÞlayed_work_timer_fn nowB97246992
> >           <idle>-0     [042] dns1  9420.461627: timer_expire_exit: timerÿff80177db6d408
> >           <idle>-0     [042] dns2  9420.461627: timer_cancel: timerÿff8017797d7868
> >           <idle>-0     [042] .ns1  9420.461628: timer_expire_entry: timerÿff8017797d7868 function=hns_nic_service_timer nowB97246992
> >           <idle>-0     [042] dns2  9420.461628: timer_start: timerÿff8017797d7868 function=hns_nic_service_timer expiresB97247242 [timeout%0] cpuB idx˜ flags> >           <idle>-0     [042] .ns1  9420.461629: timer_expire_exit: timerÿff8017797d7868
> >           <idle>-0     [042] dn.2  9420.461632: hrtimer_start: hrtimerÿff8017dbb69808 function=tick_sched_timer expires”18784000000 softexpires”18784000000
> >     kworker/42:1-1223  [042] d..1  9420.461773: timer_start: timerÿff80177db6d408 functionÞlayed_work_timer_fn expiresB97247242 [timeout%0] cpuB idx˜ flags=I
> >           <idle>-0     [042] d..1  9420.461866: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [042] d..2  9420.461867: hrtimer_cancel: hrtimerÿff8017dbb69808
> >           <idle>-0     [042] d..2  9420.461867: hrtimer_start: hrtimerÿff8017dbb69808 function=tick_sched_timer expires”19804000000 softexpires”19804000000
> >           <idle>-0     [045] d.h2  9420.496864: hrtimer_cancel: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.496864: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18815244580
> >           <idle>-0     [045] d.h3  9420.496868: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”18916246140 softexpires”18915246140
> >           <idle>-0     [045] d.h1  9420.496868: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h2  9420.597866: hrtimer_cancel: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.597867: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”18916247280
> >           <idle>-0     [045] d.h3  9420.597871: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19017248760 softexpires”19016248760
> >           <idle>-0     [045] d.h1  9420.597871: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
> >           <idle>-0     [033] d.h2  9420.621621: hrtimer_cancel: hrtimerÿff8017dba76808
> >           <idle>-0     [033] d.h1  9420.621622: hrtimer_expire_entry: hrtimerÿff8017dba76808 function=tick_sched_timer now”18940002160
> >           <idle>-0     [033] d.h1  9420.621623: hrtimer_expire_exit: hrtimerÿff8017dba76808
> >           <idle>-0     [033] d.s2  9420.621624: timer_cancel: timerÿff00000917be40
> >           <idle>-0     [033] d.s1  9420.621625: timer_expire_entry: timerÿff00000917be40 functionÞlayed_work_timer_fn nowB97247032
> >           <idle>-0     [033] dns1  9420.621626: timer_expire_exit: timerÿff00000917be40
> >           <idle>-0     [033] dn.2  9420.621630: hrtimer_start: hrtimerÿff8017dba76808 function=tick_sched_timer expires”18944000000 softexpires”18944000000
> >            <...>-1631  [033] d..1  9420.621636: timer_start: timerÿff00000917be40 functionÞlayed_work_timer_fn expiresB97247282 [timeout%0] cpu3 idx\x103 flags=I
> >           <idle>-0     [033] d..1  9420.621639: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [033] d..2  9420.621639: hrtimer_cancel: hrtimerÿff8017dba76808
> >           <idle>-0     [033] d..2  9420.621639: hrtimer_start: hrtimerÿff8017dba76808 function=tick_sched_timer expires”19964000000 softexpires”19964000000
> >           <idle>-0     [000] dn.2  9420.691401: hrtimer_cancel: hrtimerÿff8017fbe40808
> >           <idle>-0     [000] dn.2  9420.691401: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19012000000 softexpires”19012000000
> >           <idle>-0     [002] dn.2  9420.691408: hrtimer_cancel: hrtimerÿff8017fbe76808
> >           <idle>-0     [002] dn.2  9420.691408: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”19012000000 softexpires”19012000000
> >           <idle>-0     [000] d..1  9420.691409: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [000] d..2  9420.691409: hrtimer_cancel: hrtimerÿff8017fbe40808
> >           <idle>-0     [000] d..2  9420.691409: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19260000000 softexpires”19260000000
> >           <idle>-0     [002] d..1  9420.691423: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9420.691423: hrtimer_cancel: hrtimerÿff8017fbe76808
> >           <idle>-0     [002] d..2  9420.691424: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires˜59803202655 softexpires˜59803202655
> >           <idle>-0     [045] d.h2  9420.698872: hrtimer_cancel: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.698873: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19017253180
> >           <idle>-0     [045] d.h3  9420.698877: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19118254640 softexpires”19117254640
> >           <idle>-0     [045] d.h1  9420.698877: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h2  9420.799875: hrtimer_cancel: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.799875: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19118255760
> >           <idle>-0     [045] d.h3  9420.799879: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19219257140 softexpires”19218257140
> >           <idle>-0     [045] d.h1  9420.799880: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
> >           <idle>-0     [000] dn.2  9420.871369: hrtimer_cancel: hrtimerÿff8017fbe40808
> >           <idle>-0     [000] dn.2  9420.871370: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19192000000 softexpires”19192000000
> >           <idle>-0     [002] dn.2  9420.871375: hrtimer_cancel: hrtimerÿff8017fbe76808
> >           <idle>-0     [000] d..1  9420.871376: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] dn.2  9420.871376: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”19192000000 softexpires”19192000000
> >           <idle>-0     [000] d..2  9420.871376: hrtimer_cancel: hrtimerÿff8017fbe40808
> >           <idle>-0     [000] d..2  9420.871376: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19260000000 softexpires”19260000000
> >           <idle>-0     [002] d..1  9420.871398: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9420.871398: hrtimer_cancel: hrtimerÿff8017fbe76808
> >           <idle>-0     [002] d..2  9420.871398: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires˜59983202655 softexpires˜59983202655
> >           <idle>-0     [045] d.h2  9420.900881: hrtimer_cancel: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.900881: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19219261580
> >           <idle>-0     [045] d.h3  9420.900885: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19320263160 softexpires”19319263160
> >           <idle>-0     [045] d.h1  9420.900886: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
> >           <idle>-0     [001] d..2  9420.913601: hrtimer_cancel: hrtimerÿff8017fbe5b808
> >           <idle>-0     [001] d..2  9420.913601: hrtimer_start: hrtimerÿff8017fbe5b808 function=tick_sched_timer expires˜60023202655 softexpires˜60023202655
> >           <idle>-0     [000] d.h2  9420.941621: hrtimer_cancel: hrtimerÿff8017fbe40808
> >           <idle>-0     [000] d.h1  9420.941621: hrtimer_expire_entry: hrtimerÿff8017fbe40808 function=tick_sched_timer now”19260001400
> >           <idle>-0     [000] d.h1  9420.941623: hrtimer_expire_exit: hrtimerÿff8017fbe40808
> >           <idle>-0     [000] d.s2  9420.941623: timer_cancel: timerÿff80177fdc0840
> >           <idle>-0     [000] ..s1  9420.941624: timer_expire_entry: timerÿff80177fdc0840 function=link_timeout_disable_link nowB97247112
> >           <idle>-0     [000] d.s2  9420.941624: timer_start: timer=ffff80177fdc0840 function=link_timeout_enable_link expires=4297247137 [timeout=25] cpu=0 idx=113 flags=
> >           <idle>-0     [000] ..s1  9420.941628: timer_expire_exit: timer=ffff80177fdc0840
> >           <idle>-0     [000] d.s2  9420.941629: timer_cancel: timerÿff8017fbe42558
> >           <idle>-0     [000] d.s1  9420.941629: timer_expire_entry: timerÿff8017fbe42558 functionÞlayed_work_timer_fn nowB97247112
> >           <idle>-0     [000] dns1  9420.941630: timer_expire_exit: timerÿff8017fbe42558
> >           <idle>-0     [000] dns2  9420.941631: timer_cancel: timerÿff00000910a628
> >           <idle>-0     [000] dns1  9420.941631: timer_expire_entry: timerÿff00000910a628 functionÞlayed_work_timer_fn nowB97247112
> >           <idle>-0     [000] dns1  9420.941631: timer_expire_exit: timerÿff00000910a628
> >           <idle>-0     [000] dn.2  9420.941634: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19264000000 softexpires”19264000000
> >           <idle>-0     [002] dn.2  9420.941643: hrtimer_cancel: hrtimerÿff8017fbe76808
> >           <idle>-0     [002] dn.2  9420.941643: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”19264000000 softexpires”19264000000
> >      kworker/0:0-3     [000] d..1  9420.941650: timer_start: timerÿff00000910a628 functionÞlayed_work_timer_fn expiresB97247500 [timeout88] cpu=0 idx\x100 flags=D|I
> >      kworker/2:0-22    [002] d..1  9420.941651: timer_start: timerÿff8017fbe78558 functionÞlayed_work_timer_fn expiresB97247494 [timeout82] cpu=2 idx\x114 flags=D|I
> >           <idle>-0     [000] d..1  9420.941652: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [000] d..2  9420.941652: hrtimer_cancel: hrtimerÿff8017fbe40808
> >           <idle>-0     [000] d..2  9420.941653: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19364000000 softexpires”19364000000
> >           <idle>-0     [002] d..1  9420.941654: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9420.941654: hrtimer_cancel: hrtimerÿff8017fbe76808
> >           <idle>-0     [002] d..2  9420.941654: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires˜60055202655 softexpires˜60055202655
> >           <idle>-0     [045] d.h2  9421.001887: hrtimer_cancel: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h1  9421.001887: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19320267640
> >           <idle>-0     [045] d.h3  9421.001891: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19421269000 softexpires”19420269000
> >           <idle>-0     [045] d.h1  9421.001892: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
> >           <idle>-0     [000] d.h2  9421.045625: hrtimer_cancel: hrtimerÿff8017fbe40808
> >           <idle>-0     [000] d.h1  9421.045625: hrtimer_expire_entry: hrtimerÿff8017fbe40808 function=tick_sched_timer now”19364005380
> >           <idle>-0     [000] d.h1  9421.045626: hrtimer_expire_exit: hrtimerÿff8017fbe40808
> >           <idle>-0     [000] d.s2  9421.045627: timer_cancel: timerÿff80177fdc0840
> >           <idle>-0     [000] ..s1  9421.045628: timer_expire_entry: timerÿff80177fdc0840 function=link_timeout_enable_link nowB97247138
> >           <idle>-0     [000] d.s2  9421.045628: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297247363 [timeout=225] cpu=0 idx=34 flags=
> >           <idle>-0     [000] ..s1  9421.045631: timer_expire_exit: timer=ffff80177fdc0840
> >           <idle>-0     [000] d..2  9421.045644: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”20284000000 softexpires”20284000000
> >           <idle>-0     [045] d.h2  9421.102893: hrtimer_cancel: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h1  9421.102893: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19421273420
> >           <idle>-0     [045] d.h3  9421.102897: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19522275040 softexpires”19521275040
> >           <idle>-0     [045] d.h1  9421.102897: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
> >           <idle>-0     [055] d.h2  9421.165621: hrtimer_cancel: hrtimerÿff8017db968808
> >           <idle>-0     [055] d.h1  9421.165622: hrtimer_expire_entry: hrtimerÿff8017db968808 function=tick_sched_timer now”19484002280
> >           <idle>-0     [055] d.h1  9421.165624: hrtimer_expire_exit: hrtimerÿff8017db968808
> >           <idle>-0     [055] d.s2  9421.165624: timer_cancel: timerÿff80177db6cc08
> >           <idle>-0     [055] d.s1  9421.165625: timer_expire_entry: timerÿff80177db6cc08 functionÞlayed_work_timer_fn nowB97247168
> >           <idle>-0     [055] dns1  9421.165626: timer_expire_exit: timerÿff80177db6cc08
> >           <idle>-0     [055] dn.2  9421.165629: hrtimer_start: hrtimerÿff8017db968808 function=tick_sched_timer expires”19488000000 softexpires”19488000000
> >     kworker/55:1-1246  [055] d..1  9421.165632: timer_start: timerÿff80177db6cc08 functionÞlayed_work_timer_fn expiresB97247418 [timeout%0] cpuU idx\x120 flags=I
> >           <idle>-0     [055] d..1  9421.165634: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [055] d..2  9421.165634: hrtimer_cancel: hrtimerÿff8017db968808
> >           <idle>-0     [055] d..2  9421.165635: hrtimer_start: hrtimerÿff8017db968808 function=tick_sched_timer expires”20508000000 softexpires”20508000000
> >           <idle>-0     [045] d.h2  9421.203896: hrtimer_cancel: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h1  9421.203896: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19522276980
> >           <idle>-0     [045] d.h3  9421.203900: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19623278460 softexpires”19622278460
> >           <idle>-0     [045] d.h1  9421.203901: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h2  9421.304899: hrtimer_cancel: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h1  9421.304899: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”19623279580
> >           <idle>-0     [045] d.h3  9421.304903: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”19724281060 softexpires”19723281060
> >           <idle>-0     [045] d.h1  9421.304903: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
> >           <idle>-0     [000] dn.2  9421.381179: hrtimer_cancel: hrtimerÿff8017fbe40808
> >           <idle>-0     [000] dn.2  9421.381179: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”19700000000 softexpires”19700000000
> >           <idle>-0     [002] dn.2  9421.381185: hrtimer_cancel: hrtimerÿff8017fbe76808
> >           <idle>-0     [002] dn.2  9421.381185: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”19700000000 softexpires”19700000000
> >           <idle>-0     [000] d..1  9421.381185: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [000] d..2  9421.381186: hrtimer_cancel: hrtimerÿff8017fbe40808
> >           <idle>-0     [000] d..2  9421.381186: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”20284000000 softexpires”20284000000
> >               sh-2256  [002] ....  9421.381193: timer_init: timerÿff80176c26fb40
> >               sh-2256  [002] d..1  9421.381194: timer_start: timer=ffff80176c26fb40 function=process_timeout expires=4297247223 [timeout=2] cpu=2 idx=0 flags=
> >           <idle>-0     [002] d..1  9421.381196: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9421.381197: hrtimer_cancel: hrtimerÿff8017fbe76808
> >           <idle>-0     [002] d..2  9421.381197: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”19708000000 softexpires”19708000000
> >           <idle>-0     [002] d.h2  9421.389621: hrtimer_cancel: hrtimerÿff8017fbe76808
> >           <idle>-0     [002] d.h1  9421.389622: hrtimer_expire_entry: hrtimerÿff8017fbe76808 function=tick_sched_timer now”19708002000
> >           <idle>-0     [002] d.h1  9421.389623: hrtimer_expire_exit: hrtimerÿff8017fbe76808
> >           <idle>-0     [002] d.s2  9421.389624: timer_cancel: timerÿff80176c26fb40
> >           <idle>-0     [002] ..s1  9421.389624: timer_expire_entry: timerÿff80176c26fb40 function=process_timeout nowB97247224
> >           <idle>-0     [002] .ns1  9421.389626: timer_expire_exit: timerÿff80176c26fb40
> >           <idle>-0     [002] dn.2  9421.389629: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”19712000000 softexpires”19712000000
> >               sh-2256  [002] ...1  9421.389682: hrtimer_init: hrtimerÿff8017d4dde8a0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
> >               sh-2256  [002] ...1  9421.389682: hrtimer_init: hrtimerÿff8017d4dde8e0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
> >               sh-2256  [002] ....  9421.389690: hrtimer_init: hrtimerÿff80176cbb0088 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
> >           <idle>-0     [039] dn.2  9421.389814: hrtimer_start: hrtimerÿff8017dbb18808 function=tick_sched_timer expires”19712000000 softexpires”19712000000
> >           <idle>-0     [002] d..1  9421.389896: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9421.389897: hrtimer_cancel: hrtimerÿff8017fbe76808
> >           <idle>-0     [002] d..2  9421.389898: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”19724000000 softexpires”19724000000  
> 
> This being the gap?
> 
> Interesting in that I am not seeing any timeouts at all associated with
> the rcu_sched kthread...
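
For what it's worth, a quick way to double-check that against a saved
buffer is a simple filter along these lines (the kthread's
schedule_timeout shows up in the trace as timer events on a
process_timeout timer; the file name below is just a placeholder for
wherever the buffer was saved):

grep -E 'rcu_(sched|preempt)|process_timeout' /tmp/timer-trace.txt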


This only happened when saving out the trace.  It didn't happen at all
on an overnight run with no interference, which perhaps suggests the
tracing itself is changing the timing enough to hide the issue.

Oh goody.

I'm not familiar enough with the internals of event tracing to know,
but is there a reason why either clearing the buffer or reading it out
could result in this gap?
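
For concreteness, the operations I have in mind are roughly the
following.  These are the stock tracefs files under the usual debugfs
mount point; the destination file names are just placeholders:

cat /sys/kernel/debug/tracing/trace > /tmp/timer-trace.txt        # non-destructive read of the ring buffer
cat /sys/kernel/debug/tracing/trace_pipe > /tmp/timer-pipe.txt    # consuming read; events are drained as they are read
echo > /sys/kernel/debug/tracing/trace                            # writing to "trace" clears the per-cpu buffers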

Jonathan


> 
> 							Thanx, Paul
> 
> >           <idle>-0     [002] dn.2  9444.510766: hrtimer_cancel: hrtimerÿff8017fbe76808
> >           <idle>-0     [002] dn.2  9444.510767: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”42832000000 softexpires”42832000000
> >           <idle>-0     [036] d..1  9444.510812: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [036] d..2  9444.510814: hrtimer_cancel: hrtimerÿff8017dbac7808
> >           <idle>-0     [036] d..2  9444.510815: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42844000000 softexpires”42844000000
> >               sh-2256  [002] ....  9444.510857: timer_init: timerÿff80176c26fb40
> >               sh-2256  [002] d..1  9444.510857: timer_start: timer=ffff80176c26fb40 function=process_timeout expires=4297253006 [timeout=2] cpu=2 idx=0 flags=
> >           <idle>-0     [002] d..1  9444.510864: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9444.510865: hrtimer_cancel: hrtimerÿff8017fbe76808
> >           <idle>-0     [002] d..2  9444.510866: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”42844000000 softexpires”42844000000
> >           <idle>-0     [000] d.h2  9444.525625: hrtimer_cancel: hrtimerÿff8017fbe40808
> >           <idle>-0     [002] d.h2  9444.525625: hrtimer_cancel: hrtimerÿff8017fbe76808
> >           <idle>-0     [036] d.h2  9444.525625: hrtimer_cancel: hrtimerÿff8017dbac7808
> >           <idle>-0     [002] d.h1  9444.525625: hrtimer_expire_entry: hrtimerÿff8017fbe76808 function=tick_sched_timer now”42844005600
> >           <idle>-0     [036] d.h1  9444.525625: hrtimer_expire_entry: hrtimerÿff8017dbac7808 function=tick_sched_timer now”42844005460
> >           <idle>-0     [000] d.h1  9444.525627: hrtimer_expire_entry: hrtimerÿff8017fbe40808 function=tick_sched_timer now”42844005300
> >           <idle>-0     [002] d.h1  9444.525627: hrtimer_expire_exit: hrtimerÿff8017fbe76808
> >           <idle>-0     [002] d.s2  9444.525629: timer_cancel: timerÿff8017fbe78558
> >           <idle>-0     [036] d.h1  9444.525629: hrtimer_expire_exit: hrtimerÿff8017dbac7808
> >           <idle>-0     [002] d.s1  9444.525629: timer_expire_entry: timerÿff8017fbe78558 functionÞlayed_work_timer_fn nowB97253008
> >           <idle>-0     [000] d.h1  9444.525629: hrtimer_expire_exit: hrtimerÿff8017fbe40808
> >           <idle>-0     [000] d.s2  9444.525631: timer_cancel: timerÿff80177fdc0840
> >           <idle>-0     [000] ..s1  9444.525631: timer_expire_entry: timerÿff80177fdc0840 function=link_timeout_disable_link nowB97253008
> >           <idle>-0     [002] dns1  9444.525631: timer_expire_exit: timerÿff8017fbe78558
> >           <idle>-0     [000] d.s2  9444.525632: timer_start: timer=ffff80177fdc0840 function=link_timeout_enable_link expires=4297253033 [timeout=25] cpu=0 idx=82 flags=
> >           <idle>-0     [000] ..s1  9444.525633: timer_expire_exit: timer=ffff80177fdc0840
> >           <idle>-0     [000] d.s2  9444.525634: timer_cancel: timerÿff00000910a628
> >           <idle>-0     [000] d.s1  9444.525634: timer_expire_entry: timerÿff00000910a628 functionÞlayed_work_timer_fn nowB97253008
> >           <idle>-0     [000] dns1  9444.525636: timer_expire_exit: timerÿff00000910a628
> >           <idle>-0     [036] dn.2  9444.525639: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42848000000 softexpires”42848000000
> >           <idle>-0     [000] dn.2  9444.525640: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”42848000000 softexpires”42848000000
> >           <idle>-0     [002] dn.2  9444.525640: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”42848000000 softexpires”42848000000
> >      rcu_preempt-9     [036] ....  9444.525648: timer_init: timerÿff8017d5fcfda0
> >           <idle>-0     [002] d..1  9444.525648: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9444.525648: hrtimer_cancel: hrtimerÿff8017fbe76808
> >           <idle>-0     [002] d..2  9444.525649: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”42860000000 softexpires”42860000000
> >      rcu_preempt-9     [036] d..1  9444.525649: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297253009 [timeout=1] cpu=36 idx=0 flags=
> >      kworker/0:0-3     [000] d..1  9444.525650: timer_start: timer=ffff00000910a628 function=delayed_work_timer_fn expires=4297253250 [timeout=242] cpu=0 idx=82 flags=D|I
> >           <idle>-0     [000] d..1  9444.525652: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [000] d..2  9444.525652: hrtimer_cancel: hrtimerÿff8017fbe40808
> >           <idle>-0     [000] d..2  9444.525653: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”42948000000 softexpires”42948000000
> >           <idle>-0     [036] d..1  9444.525653: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [036] d..2  9444.525654: hrtimer_cancel: hrtimerÿff8017dbac7808
> >           <idle>-0     [036] d..2  9444.525654: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42852000000 softexpires”42852000000
> >           <idle>-0     [036] d.h2  9444.533624: hrtimer_cancel: hrtimerÿff8017dbac7808
> >           <idle>-0     [036] d.h1  9444.533625: hrtimer_expire_entry: hrtimerÿff8017dbac7808 function=tick_sched_timer now”42852004760
> >           <idle>-0     [036] d.h1  9444.533626: hrtimer_expire_exit: hrtimerÿff8017dbac7808
> >           <idle>-0     [036] d.s2  9444.533627: timer_cancel: timerÿff8017d5fcfda0
> >           <idle>-0     [036] ..s1  9444.533628: timer_expire_entry: timerÿff8017d5fcfda0 function=process_timeout nowB97253010
> >           <idle>-0     [036] .ns1  9444.533629: timer_expire_exit: timerÿff8017d5fcfda0
> >           <idle>-0     [036] dn.2  9444.533634: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42856000000 softexpires”42856000000
> >           <idle>-0     [036] d..1  9444.533668: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [036] d..2  9444.533668: hrtimer_cancel: hrtimerÿff8017dbac7808
> >           <idle>-0     [036] d..2  9444.533669: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42876000000 softexpires”42876000000
> >           <idle>-0     [002] dnh2  9444.541626: hrtimer_cancel: hrtimerÿff8017fbe76808
> >           <idle>-0     [002] dnh1  9444.541627: hrtimer_expire_entry: hrtimerÿff8017fbe76808 function=tick_sched_timer now”42860007120
> >           <idle>-0     [002] dnh1  9444.541629: hrtimer_expire_exit: hrtimerÿff8017fbe76808
> >           <idle>-0     [002] dn.2  9444.541630: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”42864000000 softexpires”42864000000
> >           <idle>-0     [002] d..1  9444.541640: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9444.541640: hrtimer_cancel: hrtimerÿff8017fbe76808
> >           <idle>-0     [002] d..2  9444.541640: hrtimer_start: hrtimerÿff8017fbe76808 function=tick_sched_timer expires”44316000000 softexpires”44316000000
> >           <idle>-0     [036] dnh2  9444.557627: hrtimer_cancel: hrtimerÿff8017dbac7808
> >           <idle>-0     [036] dnh1  9444.557628: hrtimer_expire_entry: hrtimerÿff8017dbac7808 function=tick_sched_timer now”42876008220
> >           <idle>-0     [036] dnh1  9444.557630: hrtimer_expire_exit: hrtimerÿff8017dbac7808
> >           <idle>-0     [036] dn.2  9444.557631: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42880000000 softexpires”42880000000
> >           <idle>-0     [036] d..1  9444.557644: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [036] d..2  9444.557645: hrtimer_cancel: hrtimerÿff8017dbac7808
> >           <idle>-0     [036] d..2  9444.557645: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42892000000 softexpires”42892000000
> >           <idle>-0     [036] d.h2  9444.573621: hrtimer_cancel: hrtimerÿff8017dbac7808
> >           <idle>-0     [036] d.h1  9444.573621: hrtimer_expire_entry: hrtimerÿff8017dbac7808 function=tick_sched_timer now”42892001340
> >           <idle>-0     [036] d.h1  9444.573622: hrtimer_expire_exit: hrtimerÿff8017dbac7808
> >           <idle>-0     [036] dn.2  9444.573628: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42896000000 softexpires”42896000000
> >      rcu_preempt-9     [036] ....  9444.573631: timer_init: timerÿff8017d5fcfda0
> >      rcu_preempt-9     [036] d..1  9444.573632: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297253021 [timeout=1] cpu=36 idx=0 flags=
> >           <idle>-0     [036] d..1  9444.573634: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [036] d..2  9444.573635: hrtimer_cancel: hrtimerÿff8017dbac7808
> >           <idle>-0     [036] d..2  9444.573635: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42900000000 softexpires”42900000000
> >           <idle>-0     [036] d.h2  9444.581621: hrtimer_cancel: hrtimerÿff8017dbac7808
> >           <idle>-0     [036] d.h1  9444.581621: hrtimer_expire_entry: hrtimerÿff8017dbac7808 function=tick_sched_timer now”42900001400
> >           <idle>-0     [036] d.h1  9444.581622: hrtimer_expire_exit: hrtimerÿff8017dbac7808
> >           <idle>-0     [036] d.s2  9444.581623: timer_cancel: timerÿff8017d5fcfda0
> >           <idle>-0     [036] ..s1  9444.581623: timer_expire_entry: timerÿff8017d5fcfda0 function=process_timeout nowB97253022
> >           <idle>-0     [036] .ns1  9444.581625: timer_expire_exit: timerÿff8017d5fcfda0
> >           <idle>-0     [036] dn.2  9444.581628: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42904000000 softexpires”42904000000
> >           <idle>-0     [036] d..1  9444.581636: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [036] d..2  9444.581636: hrtimer_cancel: hrtimerÿff8017dbac7808
> >           <idle>-0     [036] d..2  9444.581637: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires”42924000000 softexpires”42924000000
> >           <idle>-0     [045] d.h2  9444.581718: hrtimer_cancel: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h1  9444.581719: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”42900098200
> >           <idle>-0     [045] d.h3  9444.581724: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”43001101380 softexpires”43000101380
> >           <idle>-0     [045] d.h1  9444.581725: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
> >           <idle>-0     [036] d.h2  9444.605621: hrtimer_cancel: hrtimerÿff8017dbac7808
> >           <idle>-0     [036] d.h1  9444.605621: hrtimer_expire_entry: hrtimerÿff8017dbac7808 function=tick_sched_timer now”42924001600
> >           <idle>-0     [036] d.h1  9444.605622: hrtimer_expire_exit: hrtimerÿff8017dbac7808
> >           <idle>-0     [036] d..2  9444.605629: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires˜83719202655 softexpires˜83719202655
> >           <idle>-0     [000] d.h2  9444.629625: hrtimer_cancel: hrtimerÿff8017fbe40808
> >           <idle>-0     [000] d.h1  9444.629625: hrtimer_expire_entry: hrtimerÿff8017fbe40808 function=tick_sched_timer now”42948005580
> >           <idle>-0     [000] d.h1  9444.629627: hrtimer_expire_exit: hrtimerÿff8017fbe40808
> >           <idle>-0     [000] d.s2  9444.629628: timer_cancel: timerÿff80177fdc0840
> >           <idle>-0     [000] ..s1  9444.629628: timer_expire_entry: timerÿff80177fdc0840 function=link_timeout_enable_link nowB97253034
> >           <idle>-0     [000] d.s2  9444.629629: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297253259 [timeout=225] cpu=0 idx=42 flags=
> >           <idle>-0     [000] ..s1  9444.629638: timer_expire_exit: timer=ffff80177fdc0840
> >           <idle>-0     [000] d..2  9444.629661: hrtimer_start: hrtimerÿff8017fbe40808 function=tick_sched_timer expires”43868000000 softexpires”43868000000
> >           <idle>-0     [045] d.h2  9444.682725: hrtimer_cancel: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h1  9444.682725: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”43001105940
> >           <idle>-0     [045] d.h3  9444.682730: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”43102107440 softexpires”43101107440
> >           <idle>-0     [045] d.h1  9444.682730: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
> >           <idle>-0     [055] d.h2  9444.717626: hrtimer_cancel: hrtimerÿff8017db968808
> >           <idle>-0     [055] d.h1  9444.717627: hrtimer_expire_entry: hrtimerÿff8017db968808 function=tick_sched_timer now”43036006240
> >           <idle>-0     [055] d.h1  9444.717629: hrtimer_expire_exit: hrtimerÿff8017db968808
> >           <idle>-0     [055] d.s2  9444.717630: timer_cancel: timerÿff80177db6cc08
> >           <idle>-0     [055] d.s1  9444.717630: timer_expire_entry: timerÿff80177db6cc08 functionÞlayed_work_timer_fn nowB97253056
> >           <idle>-0     [055] dns1  9444.717633: timer_expire_exit: timerÿff80177db6cc08
> >           <idle>-0     [055] dn.2  9444.717637: hrtimer_start: hrtimerÿff8017db968808 function=tick_sched_timer expires”43040000000 softexpires”43040000000
> >     kworker/55:1-1246  [055] d..1  9444.717640: timer_start: timerÿff80177db6cc08 functionÞlayed_work_timer_fn expiresB97253306 [timeout%0] cpuU idxˆ flags=I
> >           <idle>-0     [055] d..1  9444.717643: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [055] d..2  9444.717643: hrtimer_cancel: hrtimerÿff8017db968808
> >           <idle>-0     [055] d..2  9444.717644: hrtimer_start: hrtimerÿff8017db968808 function=tick_sched_timer expires”44060000000 softexpires”44060000000
> >           <idle>-0     [045] d.h2  9444.783729: hrtimer_cancel: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h1  9444.783729: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”43102109380
> >           <idle>-0     [045] d.h3  9444.783733: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”43203110880 softexpires”43202110880
> >           <idle>-0     [045] d.h1  9444.783733: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h2  9444.884731: hrtimer_cancel: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h1  9444.884731: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”43203112000
> >           <idle>-0     [045] d.h3  9444.884735: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”43304113380 softexpires”43303113380
> >           <idle>-0     [045] d.h1  9444.884736: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h2  9444.985734: hrtimer_cancel: hrtimerÿff80176cb7ca90
> >           <idle>-0     [045] d.h1  9444.985735: hrtimer_expire_entry: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func now”43304114500
> >           <idle>-0     [045] d.h3  9444.985738: hrtimer_start: hrtimerÿff80176cb7ca90 function=ehci_hrtimer_func expires”43405116440 softexpires”43404116440
> >           <idle>-0     [045] d.h1  9444.985739: hrtimer_expire_exit: hrtimerÿff80176cb7ca90
> >           <idle>-0     [042] d.h2  9445.037622: hrtimer_cancel: hrtimerÿff8017dbb69808  
> > > 
> > > Thanks,
> > > 
> > > Jonathan  
> > > > 
> > > > 							Thanx, Paul
> > > >     
> > > > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > [ 1984.638153] rcu_preempt     S    0     9      2 0x00000000
> > > > > [ 1984.643626] Call trace:
> > > > > [ 1984.646059] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> > > > > [ 1984.651189] [<ffff000008962274>] __schedule+0x19c/0x5d8
> > > > > [ 1984.656400] [<ffff0000089626e8>] schedule+0x38/0xa0
> > > > > [ 1984.661266] [<ffff000008965844>] schedule_timeout+0x124/0x218
> > > > > [ 1984.667002] [<ffff000008121424>] rcu_gp_kthread+0x4fc/0x748
> > > > > [ 1984.672564] [<ffff0000080df0b4>] kthread+0xfc/0x128
> > > > > [ 1984.677429] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
> > > > >       
> > > >     
> > > 
> > > _______________________________________________
> > > linuxarm mailing list
> > > linuxarm@huawei.com
> > > http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm  
> >   
> 



* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-28  7:44                                                           ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-28  7:44 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: dzickus, sfr, linuxarm, Nicholas Piggin, abdhalee, sparclinux,
	akpm, linuxppc-dev, David Miller, linux-arm-kernel

On Thu, 27 Jul 2017 09:52:45 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Thu, Jul 27, 2017 at 05:39:23PM +0100, Jonathan Cameron wrote:
> > On Thu, 27 Jul 2017 14:49:03 +0100
> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> >   
> > > On Thu, 27 Jul 2017 05:49:13 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >   
> > > > On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:    
> > > > > On Wed, 26 Jul 2017 18:42:14 -0700
> > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > >       
> > > > > > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:      
> > > > >       
> > > > > > > Indeed, that really wouldn't explain how we end up with a RCU stall
> > > > > > > dump listing almost all of the cpus as having missed a grace period.        
> > > > > > 
> > > > > > I have seen stranger things, but admittedly not often.      
> > > > > 
> > > > > So the backtraces show the RCU gp thread in schedule_timeout.
> > > > > 
> > > > > Are you sure that its timeout has expired and it's not being scheduled,
> > > > > or could it be a bad (large) timeout (looks unlikely) or that it's being
> > > > > scheduled but not correctly noting GPs on other CPUs?
> > > > > 
> > > > > It's not in R state, so if it's not being scheduled at all, then it's
> > > > > because the timer has not fired:      
> > > > 
> > > > Good point, Nick!
> > > > 
> > > > Jonathan, could you please reproduce this while collecting timer event tracing?
> > > I'm a little new to tracing (I only started playing with it last week),
> > > so fingers crossed I've set it up right.  No splats yet.  I was getting
> > > splats on reading out the trace when running with the RCU stall timeout
> > > set to 4, so I have increased that back to the default and am rerunning.
> > > 
> > > This may take a while.  To save time, correct me if I've gotten this wrong:
> > > 
> > > echo "timer:*" > /sys/kernel/debug/tracing/set_event
> > > 
> > > and then, when it dumps, I just send you the relevant part of what is in
> > > /sys/kernel/debug/tracing/trace?
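
For reference, the sort of sequence in question looks roughly like the
following (only the set_event write is the one quoted above; the
surrounding tracing_on / clear steps are the usual housekeeping and may
not match exactly what I did):

echo 0 > /sys/kernel/debug/tracing/tracing_on           # stop tracing while configuring
echo > /sys/kernel/debug/tracing/trace                  # start from an empty ring buffer
echo "timer:*" > /sys/kernel/debug/tracing/set_event    # enable all timer trace events
echo 1 > /sys/kernel/debug/tracing/tracing_on           # run until the stall warning appears
cat /sys/kernel/debug/tracing/trace > /tmp/timer-trace.txt   # then save the buffer out (placeholder path)
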
> > 
> > Interestingly, the only thing that can make it trip for me with tracing
> > on is peeking in the tracing buffers.  Not sure whether this is a valid
> > case or not.
> > 
> > Anyhow, all timer activity seems to stop around the area of interest.
> > 
> > 
> > [ 9442.413624] INFO: rcu_sched detected stalls on CPUs/tasks:
> > [ 9442.419107] 	1-...: (1 GPs behind) idle=844/0/0 softirq=27747/27755 fqs=0 last_accelerate: dd6a/de80, nonlazy_posted: 0, L.
> > [ 9442.430224] 	3-...: (2 GPs behind) idle=8f8/0/0 softirq=32197/32198 fqs=0 last_accelerate: 29b1/de80, nonlazy_posted: 0, L.
> > [ 9442.441340] 	4-...: (7 GPs behind) idle=740/0/0 softirq=22351/22352 fqs=0 last_accelerate: ca88/de80, nonlazy_posted: 0, L.
> > [ 9442.452456] 	5-...: (2 GPs behind) idle=9b0/0/0 softirq=21315/21319 fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
> > [ 9442.463572] 	6-...: (2 GPs behind) idle=794/0/0 softirq=19699/19707 fqs=0 last_accelerate: ba62/de88, nonlazy_posted: 0, L.
> > [ 9442.474688] 	7-...: (2 GPs behind) idle=ac4/0/0 softirq=22547/22554 fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
> > [ 9442.485803] 	8-...: (9 GPs behind) idle=118/0/0 softirq=281/291 fqs=0 last_accelerate: c3fe/de88, nonlazy_posted: 0, L.
> > [ 9442.496571] 	9-...: (9 GPs behind) idle=8fc/0/0 softirq=284/292 fqs=0 last_accelerate: 6030/de88, nonlazy_posted: 0, L.
> > [ 9442.507339] 	10-...: (14 GPs behind) idle=f78/0/0 softirq=254/254 fqs=0 last_accelerate: 5487/de88, nonlazy_posted: 0, L.
> > [ 9442.518281] 	11-...: (9 GPs behind) idle=c9c/0/0 softirq=301/308 fqs=0 last_accelerate: 3d3e/de99, nonlazy_posted: 0, L.
> > [ 9442.529136] 	12-...: (9 GPs behind) idle=4a4/0/0 softirq=735/737 fqs=0 last_accelerate: 6010/de99, nonlazy_posted: 0, L.
> > [ 9442.539992] 	13-...: (9 GPs behind) idle=34c/0/0 softirq=1121/1131 fqs=0 last_accelerate: b280/de99, nonlazy_posted: 0, L.
> > [ 9442.551020] 	14-...: (9 GPs behind) idle=2f4/0/0 softirq=707/713 fqs=0 last_accelerate: 6030/de99, nonlazy_posted: 0, L.
> > [ 9442.561875] 	15-...: (2 GPs behind) idle=b30/0/0 softirq=821/976 fqs=0 last_accelerate: c208/de99, nonlazy_posted: 0, L.
> > [ 9442.572730] 	17-...: (2 GPs behind) idle=5a8/0/0 softirq=1456/1565 fqs=0 last_accelerate: ca88/de99, nonlazy_posted: 0, L.
> > [ 9442.583759] 	18-...: (2 GPs behind) idle=2e4/0/0 softirq=1923/1936 fqs=0 last_accelerate: ca88/dea7, nonlazy_posted: 0, L.
> > [ 9442.594787] 	19-...: (2 GPs behind) idle=138/0/0 softirq=1421/1432 fqs=0 last_accelerate: b280/dea7, nonlazy_posted: 0, L.
> > [ 9442.605816] 	20-...: (50 GPs behind) idle=634/0/0 softirq=217/219 fqs=0 last_accelerate: c96f/dea7, nonlazy_posted: 0, L.
> > [ 9442.616758] 	21-...: (2 GPs behind) idle=eb8/0/0 softirq=1368/1369 fqs=0 last_accelerate: b599/deb2, nonlazy_posted: 0, L.
> > [ 9442.627786] 	22-...: (1 GPs behind) idle=aa8/0/0 softirq=229/232 fqs=0 last_accelerate: c604/deb2, nonlazy_posted: 0, L.
> > [ 9442.638641] 	23-...: (1 GPs behind) idle=488/0/0 softirq=247/248 fqs=0 last_accelerate: c600/deb2, nonlazy_posted: 0, L.
> > [ 9442.649496] 	24-...: (33 GPs behind) idle=f7c/0/0 softirq=319/319 fqs=0 last_accelerate: 5290/deb2, nonlazy_posted: 0, L.
> > [ 9442.660437] 	25-...: (33 GPs behind) idle=944/0/0 softirq=308/308 fqs=0 last_accelerate: 52c0/deb2, nonlazy_posted: 0, L.
> > [ 9442.671379] 	26-...: (9 GPs behind) idle=6d4/0/0 softirq=265/275 fqs=0 last_accelerate: 6034/dec0, nonlazy_posted: 0, L.
> > [ 9442.682234] 	27-...: (115 GPs behind) idle=e3c/0/0 softirq=212/226 fqs=0 last_accelerate: 5420/dec0, nonlazy_posted: 0, L.
> > [ 9442.693263] 	28-...: (9 GPs behind) idle=ea4/0/0 softirq=540/552 fqs=0 last_accelerate: 603c/dec0, nonlazy_posted: 0, L.
> > [ 9442.704118] 	29-...: (115 GPs behind) idle=83c/0/0 softirq=342/380 fqs=0 last_accelerate: 5420/dec0, nonlazy_posted: 0, L.
> > [ 9442.715147] 	30-...: (33 GPs behind) idle=e3c/0/0 softirq=509/509 fqs=0 last_accelerate: 52bc/dec0, nonlazy_posted: 0, L.
> > [ 9442.726088] 	31-...: (9 GPs behind) idle=df4/0/0 softirq=619/641 fqs=0 last_accelerate: 603c/decb, nonlazy_posted: 0, L.
> > [ 9442.736944] 	32-...: (9 GPs behind) idle=aa4/0/0 softirq=1841/1848 fqs=0 last_accelerate: 6030/decb, nonlazy_posted: 0, L.
> > [ 9442.747972] 	34-...: (9 GPs behind) idle=e6c/0/0 softirq=5082/5086 fqs=0 last_accelerate: 6039/decb, nonlazy_posted: 0, L.
> > [ 9442.759001] 	35-...: (9 GPs behind) idle=7fc/0/0 softirq=1396/1406 fqs=0 last_accelerate: 603e/decb, nonlazy_posted: 0, L.
> > [ 9442.770030] 	36-...: (0 ticks this GP) idle=f28/0/0 softirq=255/255 fqs=0 last_accelerate: c9fc/decb, nonlazy_posted: 0, L.
> > [ 9442.781145] 	37-...: (50 GPs behind) idle=53c/0/0 softirq=227/230 fqs=0 last_accelerate: 45c0/decb, nonlazy_posted: 0, L.
> > [ 9442.792087] 	38-...: (9 GPs behind) idle=958/0/0 softirq=185/192 fqs=0 last_accelerate: 6030/decb, nonlazy_posted: 0, L.
> > [ 9442.802942] 	40-...: (389 GPs behind) idle=41c/0/0 softirq=131/136 fqs=0 last_accelerate: 5800/decb, nonlazy_posted: 0, L.
> > [ 9442.813971] 	41-...: (389 GPs behind) idle=258/0/0 softirq=133/138 fqs=0 last_accelerate: c00f/decb, nonlazy_posted: 0, L.
> > [ 9442.825000] 	43-...: (50 GPs behind) idle=254/0/0 softirq=113/117 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
> > [ 9442.835942] 	44-...: (115 GPs behind) idle=178/0/0 softirq=1271/1276 fqs=0 last_accelerate: 68e9/dee5, nonlazy_posted: 0, L.
> > [ 9442.847144] 	45-...: (2 GPs behind) idle=04a/1/0 softirq=364/389 fqs=0 last_accelerate: dee5/dee5, nonlazy_posted: 0, L.
> > [ 9442.857999] 	46-...: (9 GPs behind) idle=ec4/0/0 softirq=183/189 fqs=0 last_accelerate: 6030/dee5, nonlazy_posted: 0, L.
> > [ 9442.868854] 	47-...: (115 GPs behind) idle=088/0/0 softirq=135/149 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
> > [ 9442.879883] 	48-...: (389 GPs behind) idle=200/0/0 softirq=103/110 fqs=0 last_accelerate: 58b0/dee5, nonlazy_posted: 0, L.
> > [ 9442.890911] 	49-...: (9 GPs behind) idle=a24/0/0 softirq=205/211 fqs=0 last_accelerate: 6030/dee5, nonlazy_posted: 0, L.
> > [ 9442.901766] 	50-...: (25 GPs behind) idle=a74/0/0 softirq=144/144 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
> > [ 9442.912708] 	51-...: (50 GPs behind) idle=f68/0/0 softirq=116/122 fqs=0 last_accelerate: 57bc/dee5, nonlazy_posted: 0, L.
> > [ 9442.923650] 	52-...: (9 GPs behind) idle=e08/0/0 softirq=202/486 fqs=0 last_accelerate: c87f/defe, nonlazy_posted: 0, L.
> > [ 9442.934505] 	53-...: (2 GPs behind) idle=128/0/0 softirq=365/366 fqs=0 last_accelerate: ca88/defe, nonlazy_posted: 0, L.
> > [ 9442.945360] 	54-...: (9 GPs behind) idle=ce8/0/0 softirq=126/373 fqs=0 last_accelerate: bef8/defe, nonlazy_posted: 0, L.
> > [ 9442.956215] 	56-...: (9 GPs behind) idle=330/0/0 softirq=2116/2126 fqs=0 last_accelerate: 6030/defe, nonlazy_posted: 0, L.
> > [ 9442.967243] 	57-...: (1 GPs behind) idle=288/0/0 softirq=1707/1714 fqs=0 last_accelerate: c87c/defe, nonlazy_posted: 0, L.
> > [ 9442.978272] 	58-...: (37 GPs behind) idle=390/0/0 softirq=1716/1721 fqs=0 last_accelerate: 53f7/defe, nonlazy_posted: 0, L.
> > [ 9442.989387] 	59-...: (37 GPs behind) idle=e54/0/0 softirq=1700/1701 fqs=0 last_accelerate: 40a1/defe, nonlazy_posted: 0, L.
> > [ 9443.000502] 	60-...: (116 GPs behind) idle=7b4/0/0 softirq=92/96 fqs=0 last_accelerate: 57d8/df10, nonlazy_posted: 0, L.
> > [ 9443.011357] 	61-...: (9 GPs behind) idle=9d8/0/0 softirq=161/170 fqs=0 last_accelerate: 6030/df10, nonlazy_posted: 0, L.
> > [ 9443.022212] 	62-...: (115 GPs behind) idle=aa8/0/0 softirq=95/101 fqs=0 last_accelerate: 5420/df17, nonlazy_posted: 0, L.
> > [ 9443.033154] 	63-...: (50 GPs behind) idle=958/0/0 softirq=81/84 fqs=0 last_accelerate: 57b8/df17, nonlazy_posted: 0, L.
> > [ 9443.043920] 	(detected by 39, t=5403 jiffies, g=443, c=442, q=1)
> > [ 9443.049919] Task dump for CPU 1:
> > [ 9443.053134] swapper/1       R  running task        0     0      1 0x00000000
> > [ 9443.060173] Call trace:
> > [ 9443.062619] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.067744] [<          (null)>]           (null)
> > [ 9443.072434] Task dump for CPU 3:
> > [ 9443.075650] swapper/3       R  running task        0     0      1 0x00000000
> > [ 9443.082686] Call trace:
> > [ 9443.085121] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.090246] [<          (null)>]           (null)
> > [ 9443.094936] Task dump for CPU 4:
> > [ 9443.098152] swapper/4       R  running task        0     0      1 0x00000000
> > [ 9443.105188] Call trace:
> > [ 9443.107623] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.112752] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.118224] Task dump for CPU 5:
> > [ 9443.121440] swapper/5       R  running task        0     0      1 0x00000000
> > [ 9443.128476] Call trace:
> > [ 9443.130910] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.136035] [<          (null)>]           (null)
> > [ 9443.140725] Task dump for CPU 6:
> > [ 9443.143941] swapper/6       R  running task        0     0      1 0x00000000
> > [ 9443.150976] Call trace:
> > [ 9443.153411] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.158535] [<          (null)>]           (null)
> > [ 9443.163226] Task dump for CPU 7:
> > [ 9443.166442] swapper/7       R  running task        0     0      1 0x00000000
> > [ 9443.173478] Call trace:
> > [ 9443.175912] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.181037] [<          (null)>]           (null)
> > [ 9443.185727] Task dump for CPU 8:
> > [ 9443.188943] swapper/8       R  running task        0     0      1 0x00000000
> > [ 9443.195979] Call trace:
> > [ 9443.198412] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.203537] [<          (null)>]           (null)
> > [ 9443.208227] Task dump for CPU 9:
> > [ 9443.211443] swapper/9       R  running task        0     0      1 0x00000000
> > [ 9443.218479] Call trace:
> > [ 9443.220913] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.226039] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.231510] Task dump for CPU 10:
> > [ 9443.234812] swapper/10      R  running task        0     0      1 0x00000000
> > [ 9443.241848] Call trace:
> > [ 9443.244283] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.249408] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.254879] Task dump for CPU 11:
> > [ 9443.258182] swapper/11      R  running task        0     0      1 0x00000000
> > [ 9443.265218] Call trace:
> > [ 9443.267652] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.272776] [<          (null)>]           (null)
> > [ 9443.277467] Task dump for CPU 12:
> > [ 9443.280769] swapper/12      R  running task        0     0      1 0x00000000
> > [ 9443.287806] Call trace:
> > [ 9443.290240] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.295364] [<          (null)>]           (null)
> > [ 9443.300054] Task dump for CPU 13:
> > [ 9443.303357] swapper/13      R  running task        0     0      1 0x00000000
> > [ 9443.310394] Call trace:
> > [ 9443.312828] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.317953] [<          (null)>]           (null)
> > [ 9443.322643] Task dump for CPU 14:
> > [ 9443.325945] swapper/14      R  running task        0     0      1 0x00000000
> > [ 9443.332981] Call trace:
> > [ 9443.335416] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.340540] [<          (null)>]           (null)
> > [ 9443.345230] Task dump for CPU 15:
> > [ 9443.348533] swapper/15      R  running task        0     0      1 0x00000000
> > [ 9443.355568] Call trace:
> > [ 9443.358002] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.363128] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.368599] Task dump for CPU 17:
> > [ 9443.371901] swapper/17      R  running task        0     0      1 0x00000000
> > [ 9443.378937] Call trace:
> > [ 9443.381372] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.386497] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.391968] Task dump for CPU 18:
> > [ 9443.395270] swapper/18      R  running task        0     0      1 0x00000000
> > [ 9443.402306] Call trace:
> > [ 9443.404740] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.409865] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.415336] Task dump for CPU 19:
> > [ 9443.418639] swapper/19      R  running task        0     0      1 0x00000000
> > [ 9443.425675] Call trace:
> > [ 9443.428109] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.433234] [<          (null)>]           (null)
> > [ 9443.437924] Task dump for CPU 20:
> > [ 9443.441226] swapper/20      R  running task        0     0      1 0x00000000
> > [ 9443.448263] Call trace:
> > [ 9443.450697] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.455826] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9443.462600] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9443.467986] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.473458] Task dump for CPU 21:
> > [ 9443.476760] swapper/21      R  running task        0     0      1 0x00000000
> > [ 9443.483796] Call trace:
> > [ 9443.486230] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.491354] [<          (null)>]           (null)
> > [ 9443.496045] Task dump for CPU 22:
> > [ 9443.499347] swapper/22      R  running task        0     0      1 0x00000000
> > [ 9443.506383] Call trace:
> > [ 9443.508817] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.513943] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.519414] Task dump for CPU 23:
> > [ 9443.522716] swapper/23      R  running task        0     0      1 0x00000000
> > [ 9443.529752] Call trace:
> > [ 9443.532186] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.537312] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.542784] Task dump for CPU 24:
> > [ 9443.546086] swapper/24      R  running task        0     0      1 0x00000000
> > [ 9443.553122] Call trace:
> > [ 9443.555556] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.560681] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.566153] Task dump for CPU 25:
> > [ 9443.569455] swapper/25      R  running task        0     0      1 0x00000000
> > [ 9443.576491] Call trace:
> > [ 9443.578925] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.584051] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9443.590825] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9443.596211] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.601682] Task dump for CPU 26:
> > [ 9443.604985] swapper/26      R  running task        0     0      1 0x00000000
> > [ 9443.612021] Call trace:
> > [ 9443.614455] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.619581] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.625052] Task dump for CPU 27:
> > [ 9443.628355] swapper/27      R  running task        0     0      1 0x00000000
> > [ 9443.635390] Call trace:
> > [ 9443.637824] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.642949] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.648421] Task dump for CPU 28:
> > [ 9443.651723] swapper/28      R  running task        0     0      1 0x00000000
> > [ 9443.658759] Call trace:
> > [ 9443.661193] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.666318] [<          (null)>]           (null)
> > [ 9443.671008] Task dump for CPU 29:
> > [ 9443.674310] swapper/29      R  running task        0     0      1 0x00000000
> > [ 9443.681346] Call trace:
> > [ 9443.683780] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.688905] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.694377] Task dump for CPU 30:
> > [ 9443.697679] swapper/30      R  running task        0     0      1 0x00000000
> > [ 9443.704715] Call trace:
> > [ 9443.707150] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.712275] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9443.719050] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9443.724436] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.729907] Task dump for CPU 31:
> > [ 9443.733210] swapper/31      R  running task        0     0      1 0x00000000
> > [ 9443.740246] Call trace:
> > [ 9443.742680] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.747805] [<          (null)>]           (null)
> > [ 9443.752496] Task dump for CPU 32:
> > [ 9443.755798] swapper/32      R  running task        0     0      1 0x00000000
> > [ 9443.762833] Call trace:
> > [ 9443.765267] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.770392] [<          (null)>]           (null)
> > [ 9443.775082] Task dump for CPU 34:
> > [ 9443.778384] swapper/34      R  running task        0     0      1 0x00000000
> > [ 9443.785420] Call trace:
> > [ 9443.787854] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.792980] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.798451] Task dump for CPU 35:
> > [ 9443.801753] swapper/35      R  running task        0     0      1 0x00000000
> > [ 9443.808789] Call trace:
> > [ 9443.811224] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.816348] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.821820] Task dump for CPU 36:
> > [ 9443.825122] swapper/36      R  running task        0     0      1 0x00000000
> > [ 9443.832158] Call trace:
> > [ 9443.834592] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.839718] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9443.846493] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9443.851878] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.857350] Task dump for CPU 37:
> > [ 9443.860652] swapper/37      R  running task        0     0      1 0x00000000
> > [ 9443.867688] Call trace:
> > [ 9443.870122] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.875248] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9443.882022] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9443.887408] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.892880] Task dump for CPU 38:
> > [ 9443.896182] swapper/38      R  running task        0     0      1 0x00000000
> > [ 9443.903218] Call trace:
> > [ 9443.905652] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.910776] [<          (null)>]           (null)
> > [ 9443.915466] Task dump for CPU 40:
> > [ 9443.918769] swapper/40      R  running task        0     0      1 0x00000000
> > [ 9443.925805] Call trace:
> > [ 9443.928239] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.933365] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.938836] Task dump for CPU 41:
> > [ 9443.942138] swapper/41      R  running task        0     0      1 0x00000000
> > [ 9443.949174] Call trace:
> > [ 9443.951609] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.956733] [<          (null)>]           (null)
> > [ 9443.961423] Task dump for CPU 43:
> > [ 9443.964725] swapper/43      R  running task        0     0      1 0x00000000
> > [ 9443.971761] Call trace:
> > [ 9443.974195] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.979320] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.984791] Task dump for CPU 44:
> > [ 9443.988093] swapper/44      R  running task        0     0      1 0x00000000
> > [ 9443.995130] Call trace:
> > [ 9443.997564] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.002688] [<          (null)>]           (null)
> > [ 9444.007378] Task dump for CPU 45:
> > [ 9444.010680] swapper/45      R  running task        0     0      1 0x00000000
> > [ 9444.017716] Call trace:
> > [ 9444.020151] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.025275] [<          (null)>]           (null)
> > [ 9444.029965] Task dump for CPU 46:
> > [ 9444.033267] swapper/46      R  running task        0     0      1 0x00000000
> > [ 9444.040302] Call trace:
> > [ 9444.042737] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.047862] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.053333] Task dump for CPU 47:
> > [ 9444.056636] swapper/47      R  running task        0     0      1 0x00000000
> > [ 9444.063672] Call trace:
> > [ 9444.066106] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.071231] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.076702] Task dump for CPU 48:
> > [ 9444.080004] swapper/48      R  running task        0     0      1 0x00000000
> > [ 9444.087041] Call trace:
> > [ 9444.089475] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.094600] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.100071] Task dump for CPU 49:
> > [ 9444.103374] swapper/49      R  running task        0     0      1 0x00000000
> > [ 9444.110409] Call trace:
> > [ 9444.112844] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.117968] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.123440] Task dump for CPU 50:
> > [ 9444.126742] swapper/50      R  running task        0     0      1 0x00000000
> > [ 9444.133777] Call trace:
> > [ 9444.136211] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.141336] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.146807] Task dump for CPU 51:
> > [ 9444.150109] swapper/51      R  running task        0     0      1 0x00000000
> > [ 9444.157144] Call trace:
> > [ 9444.159578] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.164703] [<          (null)>]           (null)
> > [ 9444.169393] Task dump for CPU 52:
> > [ 9444.172695] swapper/52      R  running task        0     0      1 0x00000000
> > [ 9444.179731] Call trace:
> > [ 9444.182165] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.187290] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.192761] Task dump for CPU 53:
> > [ 9444.196063] swapper/53      R  running task        0     0      1 0x00000000
> > [ 9444.203099] Call trace:
> > [ 9444.205533] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.210658] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.216129] Task dump for CPU 54:
> > [ 9444.219431] swapper/54      R  running task        0     0      1 0x00000000
> > [ 9444.226467] Call trace:
> > [ 9444.228901] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.234026] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.239498] Task dump for CPU 56:
> > [ 9444.242801] swapper/56      R  running task        0     0      1 0x00000000
> > [ 9444.249837] Call trace:
> > [ 9444.252271] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.257396] [<          (null)>]           (null)
> > [ 9444.262086] Task dump for CPU 57:
> > [ 9444.265388] swapper/57      R  running task        0     0      1 0x00000000
> > [ 9444.272424] Call trace:
> > [ 9444.274858] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.279982] [<          (null)>]           (null)
> > [ 9444.284672] Task dump for CPU 58:
> > [ 9444.287975] swapper/58      R  running task        0     0      1 0x00000000
> > [ 9444.295011] Call trace:
> > [ 9444.297445] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.302570] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9444.309345] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9444.314731] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.320202] Task dump for CPU 59:
> > [ 9444.323504] swapper/59      R  running task        0     0      1 0x00000000
> > [ 9444.330540] Call trace:
> > [ 9444.332974] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.338100] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.343571] Task dump for CPU 60:
> > [ 9444.346873] swapper/60      R  running task        0     0      1 0x00000000
> > [ 9444.353909] Call trace:
> > [ 9444.356343] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.361469] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9444.368243] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9444.373629] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.379101] Task dump for CPU 61:
> > [ 9444.382402] swapper/61      R  running task        0     0      1 0x00000000
> > [ 9444.389438] Call trace:
> > [ 9444.391872] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.396997] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.402469] Task dump for CPU 62:
> > [ 9444.405771] swapper/62      R  running task        0     0      1 0x00000000
> > [ 9444.412808] Call trace:
> > [ 9444.415242] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.420367] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.425838] Task dump for CPU 63:
> > [ 9444.429141] swapper/63      R  running task        0     0      1 0x00000000
> > [ 9444.436177] Call trace:
> > [ 9444.438611] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.443736] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.449211] rcu_sched kthread starved for 5743 jiffies! g443 c442 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > [ 9444.458416] rcu_sched       S    0    10      2 0x00000000
> > [ 9444.463889] Call trace:
> > [ 9444.466324] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.471453] [<ffff000008ab70a4>] __schedule+0x1a4/0x720
> > [ 9444.476665] [<ffff000008ab7660>] schedule+0x40/0xa8
> > [ 9444.481530] [<ffff000008abac70>] schedule_timeout+0x178/0x358
> > [ 9444.487263] [<ffff00000813e694>] rcu_gp_kthread+0x534/0x7b8
> > [ 9444.492824] [<ffff0000080f33d0>] kthread+0x108/0x138
> > [ 9444.497775] [<ffff0000080836c0>] ret_from_fork+0x10/0x50
> > 
> > 
> > 
> > And the relevant chunk of trace is:
> > (I have a lot more.  There are substantial other pauses from time to time, but not this long.)
> > 
> > 
> >    rcu_preempt-9     [057] ....  9419.837631: timer_init: timer=ffff8017d5fcfda0
> >      rcu_preempt-9     [057] d..1  9419.837632: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297246837 [timeout=1] cpu=57 idx=0 flags=
> >           <idle>-0     [057] d..1  9419.837634: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [057] d..2  9419.837634: hrtimer_cancel: hrtimer=ffff8017db99e808
> >           <idle>-0     [057] d..2  9419.837635: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418164000000 softexpires=9418164000000
> >           <idle>-0     [057] d.h2  9419.845621: hrtimer_cancel: hrtimer=ffff8017db99e808
> >           <idle>-0     [057] d.h1  9419.845621: hrtimer_expire_entry: hrtimer=ffff8017db99e808 function=tick_sched_timer now=9418164001440
> >           <idle>-0     [057] d.h1  9419.845622: hrtimer_expire_exit: hrtimer=ffff8017db99e808
> >           <idle>-0     [057] d.s2  9419.845623: timer_cancel: timer=ffff8017d5fcfda0
> >           <idle>-0     [057] ..s1  9419.845623: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297246838
> >           <idle>-0     [057] .ns1  9419.845624: timer_expire_exit: timer=ffff8017d5fcfda0
> >           <idle>-0     [057] dn.2  9419.845628: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418168000000 softexpires=9418168000000
> >           <idle>-0     [057] d..1  9419.845635: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [057] d..2  9419.845636: hrtimer_cancel: hrtimer=ffff8017db99e808
> >           <idle>-0     [057] d..2  9419.845636: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418188000000 softexpires=9418188000000
> >           <idle>-0     [057] d.h2  9419.869621: hrtimer_cancel: hrtimer=ffff8017db99e808
> >           <idle>-0     [057] d.h1  9419.869621: hrtimer_expire_entry: hrtimer=ffff8017db99e808 function=tick_sched_timer now=9418188001420
> >           <idle>-0     [057] d.h1  9419.869622: hrtimer_expire_exit: hrtimer=ffff8017db99e808
> >           <idle>-0     [057] d..2  9419.869626: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9858983202655 softexpires=9858983202655
> >           <idle>-0     [016] d.h2  9419.885626: hrtimer_cancel: hrtimer=ffff8017fbc3d808
> >           <idle>-0     [016] d.h1  9419.885627: hrtimer_expire_entry: hrtimer=ffff8017fbc3d808 function=tick_sched_timer now=9418204006760
> >           <idle>-0     [016] d.h1  9419.885629: hrtimer_expire_exit: hrtimer=ffff8017fbc3d808
> >           <idle>-0     [016] d.s2  9419.885629: timer_cancel: timer=ffff8017d37dbca0
> >           <idle>-0     [016] ..s1  9419.885630: timer_expire_entry: timer=ffff8017d37dbca0 function=process_timeout now=4297246848
> >           <idle>-0     [016] .ns1  9419.885631: timer_expire_exit: timer=ffff8017d37dbca0
> >           <idle>-0     [016] dn.2  9419.885636: hrtimer_start: hrtimer=ffff8017fbc3d808 function=tick_sched_timer expires=9418208000000 softexpires=9418208000000
> >       khugepaged-778   [016] ....  9419.885668: timer_init: timer=ffff8017d37dbca0
> >       khugepaged-778   [016] d..1  9419.885668: timer_start: timer=ffff8017d37dbca0 function=process_timeout expires=4297249348 [timeout=2500] cpu=16 idx=0 flags=
> >           <idle>-0     [016] d..1  9419.885670: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [016] d..2  9419.885671: hrtimer_cancel: hrtimer=ffff8017fbc3d808
> >           <idle>-0     [016] d..2  9419.885671: hrtimer_start: hrtimer=ffff8017fbc3d808 function=tick_sched_timer expires=9428444000000 softexpires=9428444000000
> >           <idle>-0     [045] d.h2  9419.890839: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9419.890839: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418209219940
> >           <idle>-0     [045] d.h3  9419.890844: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418310221420 softexpires=9418309221420
> >           <idle>-0     [045] d.h1  9419.890844: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [000] d.h2  9419.917625: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.h1  9419.917626: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9418236005860
> >           <idle>-0     [000] d.h1  9419.917628: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.s2  9419.917628: timer_cancel: timer=ffff80177fdc0840
> >           <idle>-0     [000] ..s1  9419.917629: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_disable_link now=4297246856
> >           <idle>-0     [000] d.s2  9419.917630: timer_start: timer=ffff80177fdc0840 function=link_timeout_enable_link expires=4297246881 [timeout=25] cpu=0 idx=81 flags=
> >           <idle>-0     [000] ..s1  9419.917633: timer_expire_exit: timer=ffff80177fdc0840
> >           <idle>-0     [000] d..2  9419.917648: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9418340000000 softexpires=9418340000000
> >           <idle>-0     [045] d.h2  9419.991845: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9419.991845: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418310225960
> >           <idle>-0     [045] d.h3  9419.991849: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418411227320 softexpires=9418410227320
> >           <idle>-0     [045] d.h1  9419.991850: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [000] d.h2  9420.021625: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.h1  9420.021625: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9418340005520
> >           <idle>-0     [000] d.h1  9420.021627: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.s2  9420.021627: timer_cancel: timer=ffff80177fdc0840
> >           <idle>-0     [000] ..s1  9420.021628: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_enable_link now=4297246882
> >           <idle>-0     [000] d.s2  9420.021629: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297247107 [timeout=225] cpu=0 idx=34 flags=
> >           <idle>-0     [000] ..s1  9420.021632: timer_expire_exit: timer=ffff80177fdc0840
> >           <idle>-0     [000] d..2  9420.021639: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419260000000 softexpires=9419260000000
> >           <idle>-0     [045] d.h2  9420.092851: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.092852: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418411231780
> >           <idle>-0     [045] d.h3  9420.092856: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418512233720 softexpires=9418511233720
> >           <idle>-0     [045] d.h1  9420.092856: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [055] d.h2  9420.141622: hrtimer_cancel: hrtimer=ffff8017db968808
> >           <idle>-0     [055] d.h1  9420.141623: hrtimer_expire_entry: hrtimer=ffff8017db968808 function=tick_sched_timer now=9418460002540
> >           <idle>-0     [055] d.h1  9420.141625: hrtimer_expire_exit: hrtimer=ffff8017db968808
> >           <idle>-0     [055] d.s2  9420.141626: timer_cancel: timer=ffff80177db6cc08
> >           <idle>-0     [055] d.s1  9420.141626: timer_expire_entry: timer=ffff80177db6cc08 function=delayed_work_timer_fn now=4297246912
> >           <idle>-0     [055] dns1  9420.141628: timer_expire_exit: timer=ffff80177db6cc08
> >           <idle>-0     [055] dn.2  9420.141632: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9418464000000 softexpires=9418464000000
> >     kworker/55:1-1246  [055] d..1  9420.141634: timer_start: timer=ffff80177db6cc08 function=delayed_work_timer_fn expires=4297247162 [timeout=250] cpu=55 idx=88 flags=I
> >           <idle>-0     [055] d..1  9420.141637: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [055] d..2  9420.141637: hrtimer_cancel: hrtimer=ffff8017db968808
> >           <idle>-0     [055] d..2  9420.141637: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9419484000000 softexpires=9419484000000
> >           <idle>-0     [045] d.h2  9420.193855: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.193855: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418512235660
> >           <idle>-0     [045] d.h3  9420.193859: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418613237260 softexpires=9418612237260
> >           <idle>-0     [045] d.h1  9420.193860: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h2  9420.294858: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.294858: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418613238380
> >           <idle>-0     [045] d.h3  9420.294862: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418714240000 softexpires=9418713240000
> >           <idle>-0     [045] d.h1  9420.294863: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h2  9420.395861: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.395861: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418714241380
> >           <idle>-0     [045] d.h3  9420.395865: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418815242920 softexpires=9418814242920
> >           <idle>-0     [045] d.h1  9420.395865: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [042] d.h2  9420.461621: hrtimer_cancel: hrtimer=ffff8017dbb69808
> >           <idle>-0     [042] d.h1  9420.461622: hrtimer_expire_entry: hrtimer=ffff8017dbb69808 function=tick_sched_timer now=9418780002180
> >           <idle>-0     [042] d.h1  9420.461623: hrtimer_expire_exit: hrtimer=ffff8017dbb69808
> >           <idle>-0     [042] d.s2  9420.461624: timer_cancel: timer=ffff80177db6d408
> >           <idle>-0     [042] d.s1  9420.461625: timer_expire_entry: timer=ffff80177db6d408 function=delayed_work_timer_fn now=4297246992
> >           <idle>-0     [042] dns1  9420.461627: timer_expire_exit: timer=ffff80177db6d408
> >           <idle>-0     [042] dns2  9420.461627: timer_cancel: timer=ffff8017797d7868
> >           <idle>-0     [042] .ns1  9420.461628: timer_expire_entry: timer=ffff8017797d7868 function=hns_nic_service_timer now=4297246992
> >           <idle>-0     [042] dns2  9420.461628: timer_start: timer=ffff8017797d7868 function=hns_nic_service_timer expires=4297247242 [timeout=250] cpu=42 idx=98 flags=
> >           <idle>-0     [042] .ns1  9420.461629: timer_expire_exit: timer=ffff8017797d7868
> >           <idle>-0     [042] dn.2  9420.461632: hrtimer_start: hrtimer=ffff8017dbb69808 function=tick_sched_timer expires=9418784000000 softexpires=9418784000000
> >     kworker/42:1-1223  [042] d..1  9420.461773: timer_start: timer=ffff80177db6d408 function=delayed_work_timer_fn expires=4297247242 [timeout=250] cpu=42 idx=98 flags=I
> >           <idle>-0     [042] d..1  9420.461866: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [042] d..2  9420.461867: hrtimer_cancel: hrtimer=ffff8017dbb69808
> >           <idle>-0     [042] d..2  9420.461867: hrtimer_start: hrtimer=ffff8017dbb69808 function=tick_sched_timer expires=9419804000000 softexpires=9419804000000
> >           <idle>-0     [045] d.h2  9420.496864: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.496864: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418815244580
> >           <idle>-0     [045] d.h3  9420.496868: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418916246140 softexpires=9418915246140
> >           <idle>-0     [045] d.h1  9420.496868: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h2  9420.597866: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.597867: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418916247280
> >           <idle>-0     [045] d.h3  9420.597871: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419017248760 softexpires=9419016248760
> >           <idle>-0     [045] d.h1  9420.597871: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [033] d.h2  9420.621621: hrtimer_cancel: hrtimer=ffff8017dba76808
> >           <idle>-0     [033] d.h1  9420.621622: hrtimer_expire_entry: hrtimer=ffff8017dba76808 function=tick_sched_timer now=9418940002160
> >           <idle>-0     [033] d.h1  9420.621623: hrtimer_expire_exit: hrtimer=ffff8017dba76808
> >           <idle>-0     [033] d.s2  9420.621624: timer_cancel: timer=ffff00000917be40
> >           <idle>-0     [033] d.s1  9420.621625: timer_expire_entry: timer=ffff00000917be40 function=delayed_work_timer_fn now=4297247032
> >           <idle>-0     [033] dns1  9420.621626: timer_expire_exit: timer=ffff00000917be40
> >           <idle>-0     [033] dn.2  9420.621630: hrtimer_start: hrtimer=ffff8017dba76808 function=tick_sched_timer expires=9418944000000 softexpires=9418944000000
> >            <...>-1631  [033] d..1  9420.621636: timer_start: timer=ffff00000917be40 function=delayed_work_timer_fn expires=4297247282 [timeout=250] cpu=33 idx=103 flags=I
> >           <idle>-0     [033] d..1  9420.621639: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [033] d..2  9420.621639: hrtimer_cancel: hrtimer=ffff8017dba76808
> >           <idle>-0     [033] d..2  9420.621639: hrtimer_start: hrtimer=ffff8017dba76808 function=tick_sched_timer expires=9419964000000 softexpires=9419964000000
> >           <idle>-0     [000] dn.2  9420.691401: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] dn.2  9420.691401: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419012000000 softexpires=9419012000000
> >           <idle>-0     [002] dn.2  9420.691408: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] dn.2  9420.691408: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419012000000 softexpires=9419012000000
> >           <idle>-0     [000] d..1  9420.691409: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [000] d..2  9420.691409: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d..2  9420.691409: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419260000000 softexpires=9419260000000
> >           <idle>-0     [002] d..1  9420.691423: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9420.691423: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d..2  9420.691424: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9859803202655 softexpires=9859803202655
> >           <idle>-0     [045] d.h2  9420.698872: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.698873: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419017253180
> >           <idle>-0     [045] d.h3  9420.698877: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419118254640 softexpires=9419117254640
> >           <idle>-0     [045] d.h1  9420.698877: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h2  9420.799875: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.799875: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419118255760
> >           <idle>-0     [045] d.h3  9420.799879: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419219257140 softexpires=9419218257140
> >           <idle>-0     [045] d.h1  9420.799880: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [000] dn.2  9420.871369: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] dn.2  9420.871370: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419192000000 softexpires=9419192000000
> >           <idle>-0     [002] dn.2  9420.871375: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [000] d..1  9420.871376: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] dn.2  9420.871376: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419192000000 softexpires=9419192000000
> >           <idle>-0     [000] d..2  9420.871376: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d..2  9420.871376: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419260000000 softexpires=9419260000000
> >           <idle>-0     [002] d..1  9420.871398: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9420.871398: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d..2  9420.871398: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9859983202655 softexpires=9859983202655
> >           <idle>-0     [045] d.h2  9420.900881: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.900881: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419219261580
> >           <idle>-0     [045] d.h3  9420.900885: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419320263160 softexpires=9419319263160
> >           <idle>-0     [045] d.h1  9420.900886: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [001] d..2  9420.913601: hrtimer_cancel: hrtimer=ffff8017fbe5b808
> >           <idle>-0     [001] d..2  9420.913601: hrtimer_start: hrtimer=ffff8017fbe5b808 function=tick_sched_timer expires=9860023202655 softexpires=9860023202655
> >           <idle>-0     [000] d.h2  9420.941621: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.h1  9420.941621: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9419260001400
> >           <idle>-0     [000] d.h1  9420.941623: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.s2  9420.941623: timer_cancel: timer=ffff80177fdc0840
> >           <idle>-0     [000] ..s1  9420.941624: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_disable_link now=4297247112
> >           <idle>-0     [000] d.s2  9420.941624: timer_start: timer=ffff80177fdc0840 function=link_timeout_enable_link expires=4297247137 [timeout=25] cpu=0 idx=113 flags=
> >           <idle>-0     [000] ..s1  9420.941628: timer_expire_exit: timer=ffff80177fdc0840
> >           <idle>-0     [000] d.s2  9420.941629: timer_cancel: timer=ffff8017fbe42558
> >           <idle>-0     [000] d.s1  9420.941629: timer_expire_entry: timer=ffff8017fbe42558 function=delayed_work_timer_fn now=4297247112
> >           <idle>-0     [000] dns1  9420.941630: timer_expire_exit: timer=ffff8017fbe42558
> >           <idle>-0     [000] dns2  9420.941631: timer_cancel: timer=ffff00000910a628
> >           <idle>-0     [000] dns1  9420.941631: timer_expire_entry: timer=ffff00000910a628 function=delayed_work_timer_fn now=4297247112
> >           <idle>-0     [000] dns1  9420.941631: timer_expire_exit: timer=ffff00000910a628
> >           <idle>-0     [000] dn.2  9420.941634: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419264000000 softexpires=9419264000000
> >           <idle>-0     [002] dn.2  9420.941643: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] dn.2  9420.941643: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419264000000 softexpires=9419264000000
> >      kworker/0:0-3     [000] d..1  9420.941650: timer_start: timer=ffff00000910a628 function=delayed_work_timer_fn expires=4297247500 [timeout=388] cpu=0 idx=100 flags=D|I
> >      kworker/2:0-22    [002] d..1  9420.941651: timer_start: timer=ffff8017fbe78558 function=delayed_work_timer_fn expires=4297247494 [timeout=382] cpu=2 idx=114 flags=D|I
> >           <idle>-0     [000] d..1  9420.941652: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [000] d..2  9420.941652: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d..2  9420.941653: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419364000000 softexpires=9419364000000
> >           <idle>-0     [002] d..1  9420.941654: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9420.941654: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d..2  9420.941654: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9860055202655 softexpires=9860055202655
> >           <idle>-0     [045] d.h2  9421.001887: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9421.001887: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419320267640
> >           <idle>-0     [045] d.h3  9421.001891: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419421269000 softexpires=9419420269000
> >           <idle>-0     [045] d.h1  9421.001892: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [000] d.h2  9421.045625: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.h1  9421.045625: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9419364005380
> >           <idle>-0     [000] d.h1  9421.045626: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.s2  9421.045627: timer_cancel: timer=ffff80177fdc0840
> >           <idle>-0     [000] ..s1  9421.045628: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_enable_link now=4297247138
> >           <idle>-0     [000] d.s2  9421.045628: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297247363 [timeout=225] cpu=0 idx=34 flags=
> >           <idle>-0     [000] ..s1  9421.045631: timer_expire_exit: timer=ffff80177fdc0840
> >           <idle>-0     [000] d..2  9421.045644: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9420284000000 softexpires=9420284000000
> >           <idle>-0     [045] d.h2  9421.102893: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9421.102893: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419421273420
> >           <idle>-0     [045] d.h3  9421.102897: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419522275040 softexpires=9419521275040
> >           <idle>-0     [045] d.h1  9421.102897: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [055] d.h2  9421.165621: hrtimer_cancel: hrtimer=ffff8017db968808
> >           <idle>-0     [055] d.h1  9421.165622: hrtimer_expire_entry: hrtimer=ffff8017db968808 function=tick_sched_timer now=9419484002280
> >           <idle>-0     [055] d.h1  9421.165624: hrtimer_expire_exit: hrtimer=ffff8017db968808
> >           <idle>-0     [055] d.s2  9421.165624: timer_cancel: timer=ffff80177db6cc08
> >           <idle>-0     [055] d.s1  9421.165625: timer_expire_entry: timer=ffff80177db6cc08 function=delayed_work_timer_fn now=4297247168
> >           <idle>-0     [055] dns1  9421.165626: timer_expire_exit: timer=ffff80177db6cc08
> >           <idle>-0     [055] dn.2  9421.165629: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9419488000000 softexpires=9419488000000
> >     kworker/55:1-1246  [055] d..1  9421.165632: timer_start: timer=ffff80177db6cc08 function=delayed_work_timer_fn expires=4297247418 [timeout=250] cpu=55 idx=120 flags=I
> >           <idle>-0     [055] d..1  9421.165634: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [055] d..2  9421.165634: hrtimer_cancel: hrtimer=ffff8017db968808
> >           <idle>-0     [055] d..2  9421.165635: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9420508000000 softexpires=9420508000000
> >           <idle>-0     [045] d.h2  9421.203896: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9421.203896: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419522276980
> >           <idle>-0     [045] d.h3  9421.203900: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419623278460 softexpires=9419622278460
> >           <idle>-0     [045] d.h1  9421.203901: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h2  9421.304899: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9421.304899: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419623279580
> >           <idle>-0     [045] d.h3  9421.304903: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419724281060 softexpires=9419723281060
> >           <idle>-0     [045] d.h1  9421.304903: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [000] dn.2  9421.381179: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] dn.2  9421.381179: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419700000000 softexpires=9419700000000
> >           <idle>-0     [002] dn.2  9421.381185: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] dn.2  9421.381185: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419700000000 softexpires=9419700000000
> >           <idle>-0     [000] d..1  9421.381185: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [000] d..2  9421.381186: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d..2  9421.381186: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9420284000000 softexpires=9420284000000
> >               sh-2256  [002] ....  9421.381193: timer_init: timer=ffff80176c26fb40
> >               sh-2256  [002] d..1  9421.381194: timer_start: timer=ffff80176c26fb40 function=process_timeout expires=4297247223 [timeout=2] cpu=2 idx=0 flags=
> >           <idle>-0     [002] d..1  9421.381196: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9421.381197: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d..2  9421.381197: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419708000000 softexpires=9419708000000
> >           <idle>-0     [002] d.h2  9421.389621: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d.h1  9421.389622: hrtimer_expire_entry: hrtimer=ffff8017fbe76808 function=tick_sched_timer now=9419708002000
> >           <idle>-0     [002] d.h1  9421.389623: hrtimer_expire_exit: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d.s2  9421.389624: timer_cancel: timer=ffff80176c26fb40
> >           <idle>-0     [002] ..s1  9421.389624: timer_expire_entry: timer=ffff80176c26fb40 function=process_timeout now=4297247224
> >           <idle>-0     [002] .ns1  9421.389626: timer_expire_exit: timer=ffff80176c26fb40
> >           <idle>-0     [002] dn.2  9421.389629: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419712000000 softexpires=9419712000000
> >               sh-2256  [002] ...1  9421.389682: hrtimer_init: hrtimer=ffff8017d4dde8a0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
> >               sh-2256  [002] ...1  9421.389682: hrtimer_init: hrtimer=ffff8017d4dde8e0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
> >               sh-2256  [002] ....  9421.389690: hrtimer_init: hrtimer=ffff80176cbb0088 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
> >           <idle>-0     [039] dn.2  9421.389814: hrtimer_start: hrtimer=ffff8017dbb18808 function=tick_sched_timer expires=9419712000000 softexpires=9419712000000
> >           <idle>-0     [002] d..1  9421.389896: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9421.389897: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d..2  9421.389898: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419724000000 softexpires=9419724000000  
> 
> This being the gap?
> 
> Interesting in that I am not seeing any timeouts at all associated with
> the rcu_sched kthread...


This only happened when saving out the trace.  It didn't happen at all
on an overnight run with no interference, which perhaps suggests that
the tracing itself is changing the timing enough to hide the issue.

Oh goody.
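
One thing I might try next (a rough sketch only, assuming debugfs is
mounted at /sys/kernel/debug and that the grep pattern below roughly
matches the stall message) is to leave the buffers completely alone
while the test runs and only freeze them once a stall is reported,
rather than peeking at them interactively:

  # freeze the ring buffer as soon as an RCU stall is reported, so the
  # trace is only read out after the event of interest
  cd /sys/kernel/debug/tracing
  echo "timer:*" > set_event
  echo 1 > tracing_on
  dmesg -w | grep -m1 "detected stalls" > /dev/null
  echo 0 > tracing_on    # stop recording; read 'trace' at leisure afterwards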

I'm not familiar enough with the internals of event tracing to know,
but is there a reason that either clearing the buffer or outputting
it could result in this gap?
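
If it is the readout itself that matters, perhaps the snapshot buffer
would let me separate capture from readout.  A minimal sketch, assuming
the kernel here has CONFIG_TRACER_SNAPSHOT enabled:

  cd /sys/kernel/debug/tracing
  echo 1 > snapshot    # allocate the spare buffer (if needed) and swap it
                       # with the live one, i.e. take a snapshot
  cat snapshot         # read the frozen copy; the live buffer keeps recording

That way nothing reads or clears the buffer that is actually recording,
which might tell us whether the gap is an artefact of the readout.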

Jonathan


> 
> 							Thanx, Paul
> 
> >           <idle>-0     [002] dn.2  9444.510766: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] dn.2  9444.510767: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442832000000 softexpires=9442832000000
> >           <idle>-0     [036] d..1  9444.510812: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [036] d..2  9444.510814: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d..2  9444.510815: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442844000000 softexpires=9442844000000
> >               sh-2256  [002] ....  9444.510857: timer_init: timer=ffff80176c26fb40
> >               sh-2256  [002] d..1  9444.510857: timer_start: timer=ffff80176c26fb40 function=process_timeout expires=4297253006 [timeout=2] cpu=2 idx=0 flags=
> >           <idle>-0     [002] d..1  9444.510864: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9444.510865: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d..2  9444.510866: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442844000000 softexpires=9442844000000
> >           <idle>-0     [000] d.h2  9444.525625: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [002] d.h2  9444.525625: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [036] d.h2  9444.525625: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [002] d.h1  9444.525625: hrtimer_expire_entry: hrtimer=ffff8017fbe76808 function=tick_sched_timer now=9442844005600
> >           <idle>-0     [036] d.h1  9444.525625: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442844005460
> >           <idle>-0     [000] d.h1  9444.525627: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9442844005300
> >           <idle>-0     [002] d.h1  9444.525627: hrtimer_expire_exit: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d.s2  9444.525629: timer_cancel: timer=ffff8017fbe78558
> >           <idle>-0     [036] d.h1  9444.525629: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
> >           <idle>-0     [002] d.s1  9444.525629: timer_expire_entry: timer=ffff8017fbe78558 function=delayed_work_timer_fn now=4297253008
> >           <idle>-0     [000] d.h1  9444.525629: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.s2  9444.525631: timer_cancel: timer=ffff80177fdc0840
> >           <idle>-0     [000] ..s1  9444.525631: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_disable_link now=4297253008
> >           <idle>-0     [002] dns1  9444.525631: timer_expire_exit: timer=ffff8017fbe78558
> >           <idle>-0     [000] d.s2  9444.525632: timer_start: timer=ffff80177fdc0840 function=link_timeout_enable_link expires=4297253033 [timeout=25] cpu=0 idx=82 flags=
> >           <idle>-0     [000] ..s1  9444.525633: timer_expire_exit: timer=ffff80177fdc0840
> >           <idle>-0     [000] d.s2  9444.525634: timer_cancel: timer=ffff00000910a628
> >           <idle>-0     [000] d.s1  9444.525634: timer_expire_entry: timer=ffff00000910a628 function=delayed_work_timer_fn now=4297253008
> >           <idle>-0     [000] dns1  9444.525636: timer_expire_exit: timer=ffff00000910a628
> >           <idle>-0     [036] dn.2  9444.525639: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442848000000 softexpires=9442848000000
> >           <idle>-0     [000] dn.2  9444.525640: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9442848000000 softexpires=9442848000000
> >           <idle>-0     [002] dn.2  9444.525640: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442848000000 softexpires=9442848000000
> >      rcu_preempt-9     [036] ....  9444.525648: timer_init: timer=ffff8017d5fcfda0
> >           <idle>-0     [002] d..1  9444.525648: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9444.525648: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d..2  9444.525649: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442860000000 softexpires=9442860000000
> >      rcu_preempt-9     [036] d..1  9444.525649: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297253009 [timeout=1] cpu=36 idx=0 flags=
> >      kworker/0:0-3     [000] d..1  9444.525650: timer_start: timer=ffff00000910a628 function=delayed_work_timer_fn expires=4297253250 [timeout=242] cpu=0 idx=82 flags=D|I
> >           <idle>-0     [000] d..1  9444.525652: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [000] d..2  9444.525652: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d..2  9444.525653: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9442948000000 softexpires=9442948000000
> >           <idle>-0     [036] d..1  9444.525653: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [036] d..2  9444.525654: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d..2  9444.525654: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442852000000 softexpires=9442852000000
> >           <idle>-0     [036] d.h2  9444.533624: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d.h1  9444.533625: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442852004760
> >           <idle>-0     [036] d.h1  9444.533626: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d.s2  9444.533627: timer_cancel: timer=ffff8017d5fcfda0
> >           <idle>-0     [036] ..s1  9444.533628: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297253010
> >           <idle>-0     [036] .ns1  9444.533629: timer_expire_exit: timer=ffff8017d5fcfda0
> >           <idle>-0     [036] dn.2  9444.533634: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442856000000 softexpires=9442856000000
> >           <idle>-0     [036] d..1  9444.533668: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [036] d..2  9444.533668: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d..2  9444.533669: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442876000000 softexpires=9442876000000
> >           <idle>-0     [002] dnh2  9444.541626: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] dnh1  9444.541627: hrtimer_expire_entry: hrtimer=ffff8017fbe76808 function=tick_sched_timer now=9442860007120
> >           <idle>-0     [002] dnh1  9444.541629: hrtimer_expire_exit: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] dn.2  9444.541630: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442864000000 softexpires=9442864000000
> >           <idle>-0     [002] d..1  9444.541640: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9444.541640: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d..2  9444.541640: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9444316000000 softexpires=9444316000000
> >           <idle>-0     [036] dnh2  9444.557627: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] dnh1  9444.557628: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442876008220
> >           <idle>-0     [036] dnh1  9444.557630: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] dn.2  9444.557631: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442880000000 softexpires=9442880000000
> >           <idle>-0     [036] d..1  9444.557644: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [036] d..2  9444.557645: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d..2  9444.557645: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442892000000 softexpires=9442892000000
> >           <idle>-0     [036] d.h2  9444.573621: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d.h1  9444.573621: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442892001340
> >           <idle>-0     [036] d.h1  9444.573622: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] dn.2  9444.573628: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442896000000 softexpires=9442896000000
> >      rcu_preempt-9     [036] ....  9444.573631: timer_init: timer=ffff8017d5fcfda0
> >      rcu_preempt-9     [036] d..1  9444.573632: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297253021 [timeout=1] cpu=36 idx=0 flags=
> >           <idle>-0     [036] d..1  9444.573634: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [036] d..2  9444.573635: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d..2  9444.573635: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442900000000 softexpires=9442900000000
> >           <idle>-0     [036] d.h2  9444.581621: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d.h1  9444.581621: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442900001400
> >           <idle>-0     [036] d.h1  9444.581622: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d.s2  9444.581623: timer_cancel: timer=ffff8017d5fcfda0
> >           <idle>-0     [036] ..s1  9444.581623: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297253022
> >           <idle>-0     [036] .ns1  9444.581625: timer_expire_exit: timer=ffff8017d5fcfda0
> >           <idle>-0     [036] dn.2  9444.581628: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442904000000 softexpires=9442904000000
> >           <idle>-0     [036] d..1  9444.581636: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [036] d..2  9444.581636: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d..2  9444.581637: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442924000000 softexpires=9442924000000
> >           <idle>-0     [045] d.h2  9444.581718: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9444.581719: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9442900098200
> >           <idle>-0     [045] d.h3  9444.581724: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443001101380 softexpires=9443000101380
> >           <idle>-0     [045] d.h1  9444.581725: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [036] d.h2  9444.605621: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d.h1  9444.605621: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442924001600
> >           <idle>-0     [036] d.h1  9444.605622: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d..2  9444.605629: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9883719202655 softexpires=9883719202655
> >           <idle>-0     [000] d.h2  9444.629625: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.h1  9444.629625: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9442948005580
> >           <idle>-0     [000] d.h1  9444.629627: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.s2  9444.629628: timer_cancel: timer=ffff80177fdc0840
> >           <idle>-0     [000] ..s1  9444.629628: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_enable_link now=4297253034
> >           <idle>-0     [000] d.s2  9444.629629: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297253259 [timeout=225] cpu=0 idx=42 flags=
> >           <idle>-0     [000] ..s1  9444.629638: timer_expire_exit: timer=ffff80177fdc0840
> >           <idle>-0     [000] d..2  9444.629661: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9443868000000 softexpires=9443868000000
> >           <idle>-0     [045] d.h2  9444.682725: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9444.682725: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443001105940
> >           <idle>-0     [045] d.h3  9444.682730: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443102107440 softexpires=9443101107440
> >           <idle>-0     [045] d.h1  9444.682730: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [055] d.h2  9444.717626: hrtimer_cancel: hrtimer=ffff8017db968808
> >           <idle>-0     [055] d.h1  9444.717627: hrtimer_expire_entry: hrtimer=ffff8017db968808 function=tick_sched_timer now=9443036006240
> >           <idle>-0     [055] d.h1  9444.717629: hrtimer_expire_exit: hrtimer=ffff8017db968808
> >           <idle>-0     [055] d.s2  9444.717630: timer_cancel: timer=ffff80177db6cc08
> >           <idle>-0     [055] d.s1  9444.717630: timer_expire_entry: timer=ffff80177db6cc08 function=delayed_work_timer_fn now=4297253056
> >           <idle>-0     [055] dns1  9444.717633: timer_expire_exit: timer=ffff80177db6cc08
> >           <idle>-0     [055] dn.2  9444.717637: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9443040000000 softexpires=9443040000000
> >     kworker/55:1-1246  [055] d..1  9444.717640: timer_start: timer=ffff80177db6cc08 function=delayed_work_timer_fn expires=4297253306 [timeout=250] cpu=55 idx=88 flags=I
> >           <idle>-0     [055] d..1  9444.717643: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [055] d..2  9444.717643: hrtimer_cancel: hrtimer=ffff8017db968808
> >           <idle>-0     [055] d..2  9444.717644: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9444060000000 softexpires=9444060000000
> >           <idle>-0     [045] d.h2  9444.783729: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9444.783729: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443102109380
> >           <idle>-0     [045] d.h3  9444.783733: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443203110880 softexpires=9443202110880
> >           <idle>-0     [045] d.h1  9444.783733: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h2  9444.884731: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9444.884731: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443203112000
> >           <idle>-0     [045] d.h3  9444.884735: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443304113380 softexpires=9443303113380
> >           <idle>-0     [045] d.h1  9444.884736: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h2  9444.985734: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9444.985735: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443304114500
> >           <idle>-0     [045] d.h3  9444.985738: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443405116440 softexpires=9443404116440
> >           <idle>-0     [045] d.h1  9444.985739: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [042] d.h2  9445.037622: hrtimer_cancel: hrtimer=ffff8017dbb69808  
> > > 
> > > Thanks,
> > > 
> > > Jonathan  
> > > > 
> > > > 							Thanx, Paul
> > > >     
> > > > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > [ 1984.638153] rcu_preempt     S    0     9      2 0x00000000
> > > > > [ 1984.643626] Call trace:
> > > > > [ 1984.646059] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> > > > > [ 1984.651189] [<ffff000008962274>] __schedule+0x19c/0x5d8
> > > > > [ 1984.656400] [<ffff0000089626e8>] schedule+0x38/0xa0
> > > > > [ 1984.661266] [<ffff000008965844>] schedule_timeout+0x124/0x218
> > > > > [ 1984.667002] [<ffff000008121424>] rcu_gp_kthread+0x4fc/0x748
> > > > > [ 1984.672564] [<ffff0000080df0b4>] kthread+0xfc/0x128
> > > > > [ 1984.677429] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
> > > > >       
> > > >     
> > > 
> > > _______________________________________________
> > > linuxarm mailing list
> > > linuxarm@huawei.com
> > > http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm  
> >   
> 


* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-28  7:44                                                           ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-28  7:44 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 27 Jul 2017 09:52:45 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Thu, Jul 27, 2017 at 05:39:23PM +0100, Jonathan Cameron wrote:
> > On Thu, 27 Jul 2017 14:49:03 +0100
> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> >   
> > > On Thu, 27 Jul 2017 05:49:13 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >   
> > > > On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:    
> > > > > On Wed, 26 Jul 2017 18:42:14 -0700
> > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > >       
> > > > > > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:      
> > > > >       
> > > > > > > Indeed, that really wouldn't explain how we end up with a RCU stall
> > > > > > > dump listing almost all of the cpus as having missed a grace period.        
> > > > > > 
> > > > > > I have seen stranger things, but admittedly not often.      
> > > > > 
> > > > > So the backtraces show the RCU gp thread in schedule_timeout.
> > > > > 
> > > > > Are you sure that its timeout has expired and it's not being scheduled,
> > > > > or could it be a bad (large) timeout (looks unlikely) or that it's being
> > > > > scheduled but not correctly noting gps on other CPUs?
> > > > > 
> > > > > It's not in R state, so if it's not being scheduled at all, then it's
> > > > > because the timer has not fired:      
> > > > 
> > > > Good point, Nick!
> > > > 
> > > > Jonathan, could you please reproduce this while collecting timer event tracing?    
> > > I'm a little new to tracing (only started playing with it last week)
> > > so fingers crossed I've set it up right.  No splats yet.  Was getting
> > > splats on reading out the trace when running with the RCU stall timer
> > > set to 4 so have increased that back to the default and am rerunning.
> > > 
> > > This may take a while.  Correct me if I've gotten this wrong to save time
> > > 
> > > echo "timer:*" > /sys/kernel/debug/tracing/set_event
> > > 
> > > when it dumps, just send you the relevant part of what is in
> > > /sys/kernel/debug/tracing/trace?  
> > 
> > Interestingly, the only thing that can make it trip for me with tracing on
> > is peeking in the tracing buffers.  Not sure whether that is a valid case
> > or not.
> > 
> > Anyhow all timer activity seems to stop around the area of interest.
> > 
> > 
> > [ 9442.413624] INFO: rcu_sched detected stalls on CPUs/tasks:
> > [ 9442.419107] 	1-...: (1 GPs behind) idle=844/0/0 softirq=27747/27755 fqs=0 last_accelerate: dd6a/de80, nonlazy_posted: 0, L.
> > [ 9442.430224] 	3-...: (2 GPs behind) idle=8f8/0/0 softirq=32197/32198 fqs=0 last_accelerate: 29b1/de80, nonlazy_posted: 0, L.
> > [ 9442.441340] 	4-...: (7 GPs behind) idle=740/0/0 softirq=22351/22352 fqs=0 last_accelerate: ca88/de80, nonlazy_posted: 0, L.
> > [ 9442.452456] 	5-...: (2 GPs behind) idle=9b0/0/0 softirq=21315/21319 fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
> > [ 9442.463572] 	6-...: (2 GPs behind) idle=794/0/0 softirq=19699/19707 fqs=0 last_accelerate: ba62/de88, nonlazy_posted: 0, L.
> > [ 9442.474688] 	7-...: (2 GPs behind) idle=ac4/0/0 softirq=22547/22554 fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
> > [ 9442.485803] 	8-...: (9 GPs behind) idle=118/0/0 softirq=281/291 fqs=0 last_accelerate: c3fe/de88, nonlazy_posted: 0, L.
> > [ 9442.496571] 	9-...: (9 GPs behind) idle=8fc/0/0 softirq=284/292 fqs=0 last_accelerate: 6030/de88, nonlazy_posted: 0, L.
> > [ 9442.507339] 	10-...: (14 GPs behind) idle=f78/0/0 softirq=254/254 fqs=0 last_accelerate: 5487/de88, nonlazy_posted: 0, L.
> > [ 9442.518281] 	11-...: (9 GPs behind) idle=c9c/0/0 softirq=301/308 fqs=0 last_accelerate: 3d3e/de99, nonlazy_posted: 0, L.
> > [ 9442.529136] 	12-...: (9 GPs behind) idle=4a4/0/0 softirq=735/737 fqs=0 last_accelerate: 6010/de99, nonlazy_posted: 0, L.
> > [ 9442.539992] 	13-...: (9 GPs behind) idle=34c/0/0 softirq=1121/1131 fqs=0 last_accelerate: b280/de99, nonlazy_posted: 0, L.
> > [ 9442.551020] 	14-...: (9 GPs behind) idle=2f4/0/0 softirq=707/713 fqs=0 last_accelerate: 6030/de99, nonlazy_posted: 0, L.
> > [ 9442.561875] 	15-...: (2 GPs behind) idle=b30/0/0 softirq=821/976 fqs=0 last_accelerate: c208/de99, nonlazy_posted: 0, L.
> > [ 9442.572730] 	17-...: (2 GPs behind) idle=5a8/0/0 softirq=1456/1565 fqs=0 last_accelerate: ca88/de99, nonlazy_posted: 0, L.
> > [ 9442.583759] 	18-...: (2 GPs behind) idle=2e4/0/0 softirq=1923/1936 fqs=0 last_accelerate: ca88/dea7, nonlazy_posted: 0, L.
> > [ 9442.594787] 	19-...: (2 GPs behind) idle=138/0/0 softirq=1421/1432 fqs=0 last_accelerate: b280/dea7, nonlazy_posted: 0, L.
> > [ 9442.605816] 	20-...: (50 GPs behind) idle=634/0/0 softirq=217/219 fqs=0 last_accelerate: c96f/dea7, nonlazy_posted: 0, L.
> > [ 9442.616758] 	21-...: (2 GPs behind) idle=eb8/0/0 softirq=1368/1369 fqs=0 last_accelerate: b599/deb2, nonlazy_posted: 0, L.
> > [ 9442.627786] 	22-...: (1 GPs behind) idle=aa8/0/0 softirq=229/232 fqs=0 last_accelerate: c604/deb2, nonlazy_posted: 0, L.
> > [ 9442.638641] 	23-...: (1 GPs behind) idle=488/0/0 softirq=247/248 fqs=0 last_accelerate: c600/deb2, nonlazy_posted: 0, L.
> > [ 9442.649496] 	24-...: (33 GPs behind) idle=f7c/0/0 softirq=319/319 fqs=0 last_accelerate: 5290/deb2, nonlazy_posted: 0, L.
> > [ 9442.660437] 	25-...: (33 GPs behind) idle=944/0/0 softirq=308/308 fqs=0 last_accelerate: 52c0/deb2, nonlazy_posted: 0, L.
> > [ 9442.671379] 	26-...: (9 GPs behind) idle=6d4/0/0 softirq=265/275 fqs=0 last_accelerate: 6034/dec0, nonlazy_posted: 0, L.
> > [ 9442.682234] 	27-...: (115 GPs behind) idle=e3c/0/0 softirq=212/226 fqs=0 last_accelerate: 5420/dec0, nonlazy_posted: 0, L.
> > [ 9442.693263] 	28-...: (9 GPs behind) idle=ea4/0/0 softirq=540/552 fqs=0 last_accelerate: 603c/dec0, nonlazy_posted: 0, L.
> > [ 9442.704118] 	29-...: (115 GPs behind) idle=83c/0/0 softirq=342/380 fqs=0 last_accelerate: 5420/dec0, nonlazy_posted: 0, L.
> > [ 9442.715147] 	30-...: (33 GPs behind) idle=e3c/0/0 softirq=509/509 fqs=0 last_accelerate: 52bc/dec0, nonlazy_posted: 0, L.
> > [ 9442.726088] 	31-...: (9 GPs behind) idle=df4/0/0 softirq=619/641 fqs=0 last_accelerate: 603c/decb, nonlazy_posted: 0, L.
> > [ 9442.736944] 	32-...: (9 GPs behind) idle=aa4/0/0 softirq=1841/1848 fqs=0 last_accelerate: 6030/decb, nonlazy_posted: 0, L.
> > [ 9442.747972] 	34-...: (9 GPs behind) idle=e6c/0/0 softirq=5082/5086 fqs=0 last_accelerate: 6039/decb, nonlazy_posted: 0, L.
> > [ 9442.759001] 	35-...: (9 GPs behind) idle=7fc/0/0 softirq=1396/1406 fqs=0 last_accelerate: 603e/decb, nonlazy_posted: 0, L.
> > [ 9442.770030] 	36-...: (0 ticks this GP) idle=f28/0/0 softirq=255/255 fqs=0 last_accelerate: c9fc/decb, nonlazy_posted: 0, L.
> > [ 9442.781145] 	37-...: (50 GPs behind) idle=53c/0/0 softirq=227/230 fqs=0 last_accelerate: 45c0/decb, nonlazy_posted: 0, L.
> > [ 9442.792087] 	38-...: (9 GPs behind) idle=958/0/0 softirq=185/192 fqs=0 last_accelerate: 6030/decb, nonlazy_posted: 0, L.
> > [ 9442.802942] 	40-...: (389 GPs behind) idle=41c/0/0 softirq=131/136 fqs=0 last_accelerate: 5800/decb, nonlazy_posted: 0, L.
> > [ 9442.813971] 	41-...: (389 GPs behind) idle=258/0/0 softirq=133/138 fqs=0 last_accelerate: c00f/decb, nonlazy_posted: 0, L.
> > [ 9442.825000] 	43-...: (50 GPs behind) idle=254/0/0 softirq=113/117 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
> > [ 9442.835942] 	44-...: (115 GPs behind) idle=178/0/0 softirq=1271/1276 fqs=0 last_accelerate: 68e9/dee5, nonlazy_posted: 0, L.
> > [ 9442.847144] 	45-...: (2 GPs behind) idle=04a/1/0 softirq=364/389 fqs=0 last_accelerate: dee5/dee5, nonlazy_posted: 0, L.
> > [ 9442.857999] 	46-...: (9 GPs behind) idle=ec4/0/0 softirq=183/189 fqs=0 last_accelerate: 6030/dee5, nonlazy_posted: 0, L.
> > [ 9442.868854] 	47-...: (115 GPs behind) idle=088/0/0 softirq=135/149 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
> > [ 9442.879883] 	48-...: (389 GPs behind) idle=200/0/0 softirq=103/110 fqs=0 last_accelerate: 58b0/dee5, nonlazy_posted: 0, L.
> > [ 9442.890911] 	49-...: (9 GPs behind) idle=a24/0/0 softirq=205/211 fqs=0 last_accelerate: 6030/dee5, nonlazy_posted: 0, L.
> > [ 9442.901766] 	50-...: (25 GPs behind) idle=a74/0/0 softirq=144/144 fqs=0 last_accelerate: 5420/dee5, nonlazy_posted: 0, L.
> > [ 9442.912708] 	51-...: (50 GPs behind) idle=f68/0/0 softirq=116/122 fqs=0 last_accelerate: 57bc/dee5, nonlazy_posted: 0, L.
> > [ 9442.923650] 	52-...: (9 GPs behind) idle=e08/0/0 softirq=202/486 fqs=0 last_accelerate: c87f/defe, nonlazy_posted: 0, L.
> > [ 9442.934505] 	53-...: (2 GPs behind) idle=128/0/0 softirq=365/366 fqs=0 last_accelerate: ca88/defe, nonlazy_posted: 0, L.
> > [ 9442.945360] 	54-...: (9 GPs behind) idle=ce8/0/0 softirq=126/373 fqs=0 last_accelerate: bef8/defe, nonlazy_posted: 0, L.
> > [ 9442.956215] 	56-...: (9 GPs behind) idle=330/0/0 softirq=2116/2126 fqs=0 last_accelerate: 6030/defe, nonlazy_posted: 0, L.
> > [ 9442.967243] 	57-...: (1 GPs behind) idle=288/0/0 softirq=1707/1714 fqs=0 last_accelerate: c87c/defe, nonlazy_posted: 0, L.
> > [ 9442.978272] 	58-...: (37 GPs behind) idle=390/0/0 softirq=1716/1721 fqs=0 last_accelerate: 53f7/defe, nonlazy_posted: 0, L.
> > [ 9442.989387] 	59-...: (37 GPs behind) idle=e54/0/0 softirq=1700/1701 fqs=0 last_accelerate: 40a1/defe, nonlazy_posted: 0, L.
> > [ 9443.000502] 	60-...: (116 GPs behind) idle=7b4/0/0 softirq=92/96 fqs=0 last_accelerate: 57d8/df10, nonlazy_posted: 0, L.
> > [ 9443.011357] 	61-...: (9 GPs behind) idle=9d8/0/0 softirq=161/170 fqs=0 last_accelerate: 6030/df10, nonlazy_posted: 0, L.
> > [ 9443.022212] 	62-...: (115 GPs behind) idle=aa8/0/0 softirq=95/101 fqs=0 last_accelerate: 5420/df17, nonlazy_posted: 0, L.
> > [ 9443.033154] 	63-...: (50 GPs behind) idle=958/0/0 softirq=81/84 fqs=0 last_accelerate: 57b8/df17, nonlazy_posted: 0, L.
> > [ 9443.043920] 	(detected by 39, t=5403 jiffies, g=443, c=442, q=1)
> > [ 9443.049919] Task dump for CPU 1:
> > [ 9443.053134] swapper/1       R  running task        0     0      1 0x00000000
> > [ 9443.060173] Call trace:
> > [ 9443.062619] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.067744] [<          (null)>]           (null)
> > [ 9443.072434] Task dump for CPU 3:
> > [ 9443.075650] swapper/3       R  running task        0     0      1 0x00000000
> > [ 9443.082686] Call trace:
> > [ 9443.085121] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.090246] [<          (null)>]           (null)
> > [ 9443.094936] Task dump for CPU 4:
> > [ 9443.098152] swapper/4       R  running task        0     0      1 0x00000000
> > [ 9443.105188] Call trace:
> > [ 9443.107623] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.112752] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.118224] Task dump for CPU 5:
> > [ 9443.121440] swapper/5       R  running task        0     0      1 0x00000000
> > [ 9443.128476] Call trace:
> > [ 9443.130910] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.136035] [<          (null)>]           (null)
> > [ 9443.140725] Task dump for CPU 6:
> > [ 9443.143941] swapper/6       R  running task        0     0      1 0x00000000
> > [ 9443.150976] Call trace:
> > [ 9443.153411] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.158535] [<          (null)>]           (null)
> > [ 9443.163226] Task dump for CPU 7:
> > [ 9443.166442] swapper/7       R  running task        0     0      1 0x00000000
> > [ 9443.173478] Call trace:
> > [ 9443.175912] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.181037] [<          (null)>]           (null)
> > [ 9443.185727] Task dump for CPU 8:
> > [ 9443.188943] swapper/8       R  running task        0     0      1 0x00000000
> > [ 9443.195979] Call trace:
> > [ 9443.198412] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.203537] [<          (null)>]           (null)
> > [ 9443.208227] Task dump for CPU 9:
> > [ 9443.211443] swapper/9       R  running task        0     0      1 0x00000000
> > [ 9443.218479] Call trace:
> > [ 9443.220913] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.226039] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.231510] Task dump for CPU 10:
> > [ 9443.234812] swapper/10      R  running task        0     0      1 0x00000000
> > [ 9443.241848] Call trace:
> > [ 9443.244283] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.249408] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.254879] Task dump for CPU 11:
> > [ 9443.258182] swapper/11      R  running task        0     0      1 0x00000000
> > [ 9443.265218] Call trace:
> > [ 9443.267652] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.272776] [<          (null)>]           (null)
> > [ 9443.277467] Task dump for CPU 12:
> > [ 9443.280769] swapper/12      R  running task        0     0      1 0x00000000
> > [ 9443.287806] Call trace:
> > [ 9443.290240] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.295364] [<          (null)>]           (null)
> > [ 9443.300054] Task dump for CPU 13:
> > [ 9443.303357] swapper/13      R  running task        0     0      1 0x00000000
> > [ 9443.310394] Call trace:
> > [ 9443.312828] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.317953] [<          (null)>]           (null)
> > [ 9443.322643] Task dump for CPU 14:
> > [ 9443.325945] swapper/14      R  running task        0     0      1 0x00000000
> > [ 9443.332981] Call trace:
> > [ 9443.335416] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.340540] [<          (null)>]           (null)
> > [ 9443.345230] Task dump for CPU 15:
> > [ 9443.348533] swapper/15      R  running task        0     0      1 0x00000000
> > [ 9443.355568] Call trace:
> > [ 9443.358002] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.363128] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.368599] Task dump for CPU 17:
> > [ 9443.371901] swapper/17      R  running task        0     0      1 0x00000000
> > [ 9443.378937] Call trace:
> > [ 9443.381372] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.386497] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.391968] Task dump for CPU 18:
> > [ 9443.395270] swapper/18      R  running task        0     0      1 0x00000000
> > [ 9443.402306] Call trace:
> > [ 9443.404740] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.409865] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.415336] Task dump for CPU 19:
> > [ 9443.418639] swapper/19      R  running task        0     0      1 0x00000000
> > [ 9443.425675] Call trace:
> > [ 9443.428109] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.433234] [<          (null)>]           (null)
> > [ 9443.437924] Task dump for CPU 20:
> > [ 9443.441226] swapper/20      R  running task        0     0      1 0x00000000
> > [ 9443.448263] Call trace:
> > [ 9443.450697] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.455826] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9443.462600] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9443.467986] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.473458] Task dump for CPU 21:
> > [ 9443.476760] swapper/21      R  running task        0     0      1 0x00000000
> > [ 9443.483796] Call trace:
> > [ 9443.486230] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.491354] [<          (null)>]           (null)
> > [ 9443.496045] Task dump for CPU 22:
> > [ 9443.499347] swapper/22      R  running task        0     0      1 0x00000000
> > [ 9443.506383] Call trace:
> > [ 9443.508817] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.513943] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.519414] Task dump for CPU 23:
> > [ 9443.522716] swapper/23      R  running task        0     0      1 0x00000000
> > [ 9443.529752] Call trace:
> > [ 9443.532186] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.537312] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.542784] Task dump for CPU 24:
> > [ 9443.546086] swapper/24      R  running task        0     0      1 0x00000000
> > [ 9443.553122] Call trace:
> > [ 9443.555556] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.560681] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.566153] Task dump for CPU 25:
> > [ 9443.569455] swapper/25      R  running task        0     0      1 0x00000000
> > [ 9443.576491] Call trace:
> > [ 9443.578925] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.584051] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9443.590825] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9443.596211] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.601682] Task dump for CPU 26:
> > [ 9443.604985] swapper/26      R  running task        0     0      1 0x00000000
> > [ 9443.612021] Call trace:
> > [ 9443.614455] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.619581] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.625052] Task dump for CPU 27:
> > [ 9443.628355] swapper/27      R  running task        0     0      1 0x00000000
> > [ 9443.635390] Call trace:
> > [ 9443.637824] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.642949] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.648421] Task dump for CPU 28:
> > [ 9443.651723] swapper/28      R  running task        0     0      1 0x00000000
> > [ 9443.658759] Call trace:
> > [ 9443.661193] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.666318] [<          (null)>]           (null)
> > [ 9443.671008] Task dump for CPU 29:
> > [ 9443.674310] swapper/29      R  running task        0     0      1 0x00000000
> > [ 9443.681346] Call trace:
> > [ 9443.683780] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.688905] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.694377] Task dump for CPU 30:
> > [ 9443.697679] swapper/30      R  running task        0     0      1 0x00000000
> > [ 9443.704715] Call trace:
> > [ 9443.707150] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.712275] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9443.719050] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9443.724436] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.729907] Task dump for CPU 31:
> > [ 9443.733210] swapper/31      R  running task        0     0      1 0x00000000
> > [ 9443.740246] Call trace:
> > [ 9443.742680] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.747805] [<          (null)>]           (null)
> > [ 9443.752496] Task dump for CPU 32:
> > [ 9443.755798] swapper/32      R  running task        0     0      1 0x00000000
> > [ 9443.762833] Call trace:
> > [ 9443.765267] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.770392] [<          (null)>]           (null)
> > [ 9443.775082] Task dump for CPU 34:
> > [ 9443.778384] swapper/34      R  running task        0     0      1 0x00000000
> > [ 9443.785420] Call trace:
> > [ 9443.787854] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.792980] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.798451] Task dump for CPU 35:
> > [ 9443.801753] swapper/35      R  running task        0     0      1 0x00000000
> > [ 9443.808789] Call trace:
> > [ 9443.811224] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.816348] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.821820] Task dump for CPU 36:
> > [ 9443.825122] swapper/36      R  running task        0     0      1 0x00000000
> > [ 9443.832158] Call trace:
> > [ 9443.834592] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.839718] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9443.846493] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9443.851878] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.857350] Task dump for CPU 37:
> > [ 9443.860652] swapper/37      R  running task        0     0      1 0x00000000
> > [ 9443.867688] Call trace:
> > [ 9443.870122] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.875248] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9443.882022] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9443.887408] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.892880] Task dump for CPU 38:
> > [ 9443.896182] swapper/38      R  running task        0     0      1 0x00000000
> > [ 9443.903218] Call trace:
> > [ 9443.905652] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.910776] [<          (null)>]           (null)
> > [ 9443.915466] Task dump for CPU 40:
> > [ 9443.918769] swapper/40      R  running task        0     0      1 0x00000000
> > [ 9443.925805] Call trace:
> > [ 9443.928239] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.933365] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.938836] Task dump for CPU 41:
> > [ 9443.942138] swapper/41      R  running task        0     0      1 0x00000000
> > [ 9443.949174] Call trace:
> > [ 9443.951609] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.956733] [<          (null)>]           (null)
> > [ 9443.961423] Task dump for CPU 43:
> > [ 9443.964725] swapper/43      R  running task        0     0      1 0x00000000
> > [ 9443.971761] Call trace:
> > [ 9443.974195] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9443.979320] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9443.984791] Task dump for CPU 44:
> > [ 9443.988093] swapper/44      R  running task        0     0      1 0x00000000
> > [ 9443.995130] Call trace:
> > [ 9443.997564] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.002688] [<          (null)>]           (null)
> > [ 9444.007378] Task dump for CPU 45:
> > [ 9444.010680] swapper/45      R  running task        0     0      1 0x00000000
> > [ 9444.017716] Call trace:
> > [ 9444.020151] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.025275] [<          (null)>]           (null)
> > [ 9444.029965] Task dump for CPU 46:
> > [ 9444.033267] swapper/46      R  running task        0     0      1 0x00000000
> > [ 9444.040302] Call trace:
> > [ 9444.042737] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.047862] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.053333] Task dump for CPU 47:
> > [ 9444.056636] swapper/47      R  running task        0     0      1 0x00000000
> > [ 9444.063672] Call trace:
> > [ 9444.066106] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.071231] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.076702] Task dump for CPU 48:
> > [ 9444.080004] swapper/48      R  running task        0     0      1 0x00000000
> > [ 9444.087041] Call trace:
> > [ 9444.089475] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.094600] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.100071] Task dump for CPU 49:
> > [ 9444.103374] swapper/49      R  running task        0     0      1 0x00000000
> > [ 9444.110409] Call trace:
> > [ 9444.112844] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.117968] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.123440] Task dump for CPU 50:
> > [ 9444.126742] swapper/50      R  running task        0     0      1 0x00000000
> > [ 9444.133777] Call trace:
> > [ 9444.136211] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.141336] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.146807] Task dump for CPU 51:
> > [ 9444.150109] swapper/51      R  running task        0     0      1 0x00000000
> > [ 9444.157144] Call trace:
> > [ 9444.159578] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.164703] [<          (null)>]           (null)
> > [ 9444.169393] Task dump for CPU 52:
> > [ 9444.172695] swapper/52      R  running task        0     0      1 0x00000000
> > [ 9444.179731] Call trace:
> > [ 9444.182165] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.187290] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.192761] Task dump for CPU 53:
> > [ 9444.196063] swapper/53      R  running task        0     0      1 0x00000000
> > [ 9444.203099] Call trace:
> > [ 9444.205533] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.210658] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.216129] Task dump for CPU 54:
> > [ 9444.219431] swapper/54      R  running task        0     0      1 0x00000000
> > [ 9444.226467] Call trace:
> > [ 9444.228901] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.234026] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.239498] Task dump for CPU 56:
> > [ 9444.242801] swapper/56      R  running task        0     0      1 0x00000000
> > [ 9444.249837] Call trace:
> > [ 9444.252271] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.257396] [<          (null)>]           (null)
> > [ 9444.262086] Task dump for CPU 57:
> > [ 9444.265388] swapper/57      R  running task        0     0      1 0x00000000
> > [ 9444.272424] Call trace:
> > [ 9444.274858] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.279982] [<          (null)>]           (null)
> > [ 9444.284672] Task dump for CPU 58:
> > [ 9444.287975] swapper/58      R  running task        0     0      1 0x00000000
> > [ 9444.295011] Call trace:
> > [ 9444.297445] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.302570] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9444.309345] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9444.314731] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.320202] Task dump for CPU 59:
> > [ 9444.323504] swapper/59      R  running task        0     0      1 0x00000000
> > [ 9444.330540] Call trace:
> > [ 9444.332974] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.338100] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.343571] Task dump for CPU 60:
> > [ 9444.346873] swapper/60      R  running task        0     0      1 0x00000000
> > [ 9444.353909] Call trace:
> > [ 9444.356343] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.361469] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
> > [ 9444.368243] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
> > [ 9444.373629] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.379101] Task dump for CPU 61:
> > [ 9444.382402] swapper/61      R  running task        0     0      1 0x00000000
> > [ 9444.389438] Call trace:
> > [ 9444.391872] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.396997] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.402469] Task dump for CPU 62:
> > [ 9444.405771] swapper/62      R  running task        0     0      1 0x00000000
> > [ 9444.412808] Call trace:
> > [ 9444.415242] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.420367] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.425838] Task dump for CPU 63:
> > [ 9444.429141] swapper/63      R  running task        0     0      1 0x00000000
> > [ 9444.436177] Call trace:
> > [ 9444.438611] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.443736] [<ffff0000090d9df0>] __cpu_online_mask+0x0/0x8
> > [ 9444.449211] rcu_sched kthread starved for 5743 jiffies! g443 c442 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > [ 9444.458416] rcu_sched       S    0    10      2 0x00000000
> > [ 9444.463889] Call trace:
> > [ 9444.466324] [<ffff000008085cb8>] __switch_to+0x90/0xa8
> > [ 9444.471453] [<ffff000008ab70a4>] __schedule+0x1a4/0x720
> > [ 9444.476665] [<ffff000008ab7660>] schedule+0x40/0xa8
> > [ 9444.481530] [<ffff000008abac70>] schedule_timeout+0x178/0x358
> > [ 9444.487263] [<ffff00000813e694>] rcu_gp_kthread+0x534/0x7b8
> > [ 9444.492824] [<ffff0000080f33d0>] kthread+0x108/0x138
> > [ 9444.497775] [<ffff0000080836c0>] ret_from_fork+0x10/0x50
> > 
> > 
> > 
> > And the relevant chunk of trace is:
> > (I have a lot more.  There are other substantial pauses from time to time, but none this long.)
> > 
> > 
> >    rcu_preempt-9     [057] ....  9419.837631: timer_init: timer=ffff8017d5fcfda0
> >      rcu_preempt-9     [057] d..1  9419.837632: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297246837 [timeout=1] cpu=57 idx=0 flags=
> >           <idle>-0     [057] d..1  9419.837634: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [057] d..2  9419.837634: hrtimer_cancel: hrtimer=ffff8017db99e808
> >           <idle>-0     [057] d..2  9419.837635: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418164000000 softexpires=9418164000000
> >           <idle>-0     [057] d.h2  9419.845621: hrtimer_cancel: hrtimer=ffff8017db99e808
> >           <idle>-0     [057] d.h1  9419.845621: hrtimer_expire_entry: hrtimer=ffff8017db99e808 function=tick_sched_timer now=9418164001440
> >           <idle>-0     [057] d.h1  9419.845622: hrtimer_expire_exit: hrtimer=ffff8017db99e808
> >           <idle>-0     [057] d.s2  9419.845623: timer_cancel: timer=ffff8017d5fcfda0
> >           <idle>-0     [057] ..s1  9419.845623: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297246838
> >           <idle>-0     [057] .ns1  9419.845624: timer_expire_exit: timer=ffff8017d5fcfda0
> >           <idle>-0     [057] dn.2  9419.845628: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418168000000 softexpires=9418168000000
> >           <idle>-0     [057] d..1  9419.845635: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [057] d..2  9419.845636: hrtimer_cancel: hrtimer=ffff8017db99e808
> >           <idle>-0     [057] d..2  9419.845636: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9418188000000 softexpires=9418188000000
> >           <idle>-0     [057] d.h2  9419.869621: hrtimer_cancel: hrtimer=ffff8017db99e808
> >           <idle>-0     [057] d.h1  9419.869621: hrtimer_expire_entry: hrtimer=ffff8017db99e808 function=tick_sched_timer now=9418188001420
> >           <idle>-0     [057] d.h1  9419.869622: hrtimer_expire_exit: hrtimer=ffff8017db99e808
> >           <idle>-0     [057] d..2  9419.869626: hrtimer_start: hrtimer=ffff8017db99e808 function=tick_sched_timer expires=9858983202655 softexpires=9858983202655
> >           <idle>-0     [016] d.h2  9419.885626: hrtimer_cancel: hrtimer=ffff8017fbc3d808
> >           <idle>-0     [016] d.h1  9419.885627: hrtimer_expire_entry: hrtimer=ffff8017fbc3d808 function=tick_sched_timer now=9418204006760
> >           <idle>-0     [016] d.h1  9419.885629: hrtimer_expire_exit: hrtimer=ffff8017fbc3d808
> >           <idle>-0     [016] d.s2  9419.885629: timer_cancel: timer=ffff8017d37dbca0
> >           <idle>-0     [016] ..s1  9419.885630: timer_expire_entry: timer=ffff8017d37dbca0 function=process_timeout now=4297246848
> >           <idle>-0     [016] .ns1  9419.885631: timer_expire_exit: timer=ffff8017d37dbca0
> >           <idle>-0     [016] dn.2  9419.885636: hrtimer_start: hrtimer=ffff8017fbc3d808 function=tick_sched_timer expires=9418208000000 softexpires=9418208000000
> >       khugepaged-778   [016] ....  9419.885668: timer_init: timer=ffff8017d37dbca0
> >       khugepaged-778   [016] d..1  9419.885668: timer_start: timer=ffff8017d37dbca0 function=process_timeout expires=4297249348 [timeout=2500] cpu=16 idx=0 flags=
> >           <idle>-0     [016] d..1  9419.885670: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [016] d..2  9419.885671: hrtimer_cancel: hrtimer=ffff8017fbc3d808
> >           <idle>-0     [016] d..2  9419.885671: hrtimer_start: hrtimer=ffff8017fbc3d808 function=tick_sched_timer expires=9428444000000 softexpires=9428444000000
> >           <idle>-0     [045] d.h2  9419.890839: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9419.890839: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418209219940
> >           <idle>-0     [045] d.h3  9419.890844: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418310221420 softexpires=9418309221420
> >           <idle>-0     [045] d.h1  9419.890844: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [000] d.h2  9419.917625: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.h1  9419.917626: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9418236005860
> >           <idle>-0     [000] d.h1  9419.917628: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.s2  9419.917628: timer_cancel: timer=ffff80177fdc0840
> >           <idle>-0     [000] ..s1  9419.917629: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_disable_link now=4297246856
> >           <idle>-0     [000] d.s2  9419.917630: timer_start: timer=ffff80177fdc0840 function=link_timeout_enable_link expires=4297246881 [timeout=25] cpu=0 idx=81 flags=
> >           <idle>-0     [000] ..s1  9419.917633: timer_expire_exit: timer=ffff80177fdc0840
> >           <idle>-0     [000] d..2  9419.917648: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9418340000000 softexpires=9418340000000
> >           <idle>-0     [045] d.h2  9419.991845: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9419.991845: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418310225960
> >           <idle>-0     [045] d.h3  9419.991849: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418411227320 softexpires=9418410227320
> >           <idle>-0     [045] d.h1  9419.991850: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [000] d.h2  9420.021625: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.h1  9420.021625: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9418340005520
> >           <idle>-0     [000] d.h1  9420.021627: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.s2  9420.021627: timer_cancel: timer=ffff80177fdc0840
> >           <idle>-0     [000] ..s1  9420.021628: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_enable_link now=4297246882
> >           <idle>-0     [000] d.s2  9420.021629: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297247107 [timeout=225] cpu=0 idx=34 flags=
> >           <idle>-0     [000] ..s1  9420.021632: timer_expire_exit: timer=ffff80177fdc0840
> >           <idle>-0     [000] d..2  9420.021639: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419260000000 softexpires=9419260000000
> >           <idle>-0     [045] d.h2  9420.092851: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.092852: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418411231780
> >           <idle>-0     [045] d.h3  9420.092856: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418512233720 softexpires=9418511233720
> >           <idle>-0     [045] d.h1  9420.092856: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [055] d.h2  9420.141622: hrtimer_cancel: hrtimer=ffff8017db968808
> >           <idle>-0     [055] d.h1  9420.141623: hrtimer_expire_entry: hrtimer=ffff8017db968808 function=tick_sched_timer now=9418460002540
> >           <idle>-0     [055] d.h1  9420.141625: hrtimer_expire_exit: hrtimer=ffff8017db968808
> >           <idle>-0     [055] d.s2  9420.141626: timer_cancel: timer=ffff80177db6cc08
> >           <idle>-0     [055] d.s1  9420.141626: timer_expire_entry: timer=ffff80177db6cc08 function=delayed_work_timer_fn now=4297246912
> >           <idle>-0     [055] dns1  9420.141628: timer_expire_exit: timer=ffff80177db6cc08
> >           <idle>-0     [055] dn.2  9420.141632: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9418464000000 softexpires=9418464000000
> >     kworker/55:1-1246  [055] d..1  9420.141634: timer_start: timer=ffff80177db6cc08 function=delayed_work_timer_fn expires=4297247162 [timeout=250] cpu=55 idx=88 flags=I
> >           <idle>-0     [055] d..1  9420.141637: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [055] d..2  9420.141637: hrtimer_cancel: hrtimer=ffff8017db968808
> >           <idle>-0     [055] d..2  9420.141637: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9419484000000 softexpires=9419484000000
> >           <idle>-0     [045] d.h2  9420.193855: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.193855: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418512235660
> >           <idle>-0     [045] d.h3  9420.193859: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418613237260 softexpires=9418612237260
> >           <idle>-0     [045] d.h1  9420.193860: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h2  9420.294858: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.294858: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418613238380
> >           <idle>-0     [045] d.h3  9420.294862: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418714240000 softexpires=9418713240000
> >           <idle>-0     [045] d.h1  9420.294863: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h2  9420.395861: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.395861: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418714241380
> >           <idle>-0     [045] d.h3  9420.395865: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418815242920 softexpires=9418814242920
> >           <idle>-0     [045] d.h1  9420.395865: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [042] d.h2  9420.461621: hrtimer_cancel: hrtimer=ffff8017dbb69808
> >           <idle>-0     [042] d.h1  9420.461622: hrtimer_expire_entry: hrtimer=ffff8017dbb69808 function=tick_sched_timer now=9418780002180
> >           <idle>-0     [042] d.h1  9420.461623: hrtimer_expire_exit: hrtimer=ffff8017dbb69808
> >           <idle>-0     [042] d.s2  9420.461624: timer_cancel: timer=ffff80177db6d408
> >           <idle>-0     [042] d.s1  9420.461625: timer_expire_entry: timer=ffff80177db6d408 function=delayed_work_timer_fn now=4297246992
> >           <idle>-0     [042] dns1  9420.461627: timer_expire_exit: timer=ffff80177db6d408
> >           <idle>-0     [042] dns2  9420.461627: timer_cancel: timer=ffff8017797d7868
> >           <idle>-0     [042] .ns1  9420.461628: timer_expire_entry: timer=ffff8017797d7868 function=hns_nic_service_timer now=4297246992
> >           <idle>-0     [042] dns2  9420.461628: timer_start: timer=ffff8017797d7868 function=hns_nic_service_timer expires=4297247242 [timeout=250] cpu=42 idx=98 flags=
> >           <idle>-0     [042] .ns1  9420.461629: timer_expire_exit: timer=ffff8017797d7868
> >           <idle>-0     [042] dn.2  9420.461632: hrtimer_start: hrtimer=ffff8017dbb69808 function=tick_sched_timer expires=9418784000000 softexpires=9418784000000
> >     kworker/42:1-1223  [042] d..1  9420.461773: timer_start: timer=ffff80177db6d408 function=delayed_work_timer_fn expires=4297247242 [timeout=250] cpu=42 idx=98 flags=I
> >           <idle>-0     [042] d..1  9420.461866: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [042] d..2  9420.461867: hrtimer_cancel: hrtimer=ffff8017dbb69808
> >           <idle>-0     [042] d..2  9420.461867: hrtimer_start: hrtimer=ffff8017dbb69808 function=tick_sched_timer expires=9419804000000 softexpires=9419804000000
> >           <idle>-0     [045] d.h2  9420.496864: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.496864: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418815244580
> >           <idle>-0     [045] d.h3  9420.496868: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9418916246140 softexpires=9418915246140
> >           <idle>-0     [045] d.h1  9420.496868: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h2  9420.597866: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.597867: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9418916247280
> >           <idle>-0     [045] d.h3  9420.597871: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419017248760 softexpires=9419016248760
> >           <idle>-0     [045] d.h1  9420.597871: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [033] d.h2  9420.621621: hrtimer_cancel: hrtimer=ffff8017dba76808
> >           <idle>-0     [033] d.h1  9420.621622: hrtimer_expire_entry: hrtimer=ffff8017dba76808 function=tick_sched_timer now=9418940002160
> >           <idle>-0     [033] d.h1  9420.621623: hrtimer_expire_exit: hrtimer=ffff8017dba76808
> >           <idle>-0     [033] d.s2  9420.621624: timer_cancel: timer=ffff00000917be40
> >           <idle>-0     [033] d.s1  9420.621625: timer_expire_entry: timer=ffff00000917be40 function=delayed_work_timer_fn now=4297247032
> >           <idle>-0     [033] dns1  9420.621626: timer_expire_exit: timer=ffff00000917be40
> >           <idle>-0     [033] dn.2  9420.621630: hrtimer_start: hrtimer=ffff8017dba76808 function=tick_sched_timer expires=9418944000000 softexpires=9418944000000
> >            <...>-1631  [033] d..1  9420.621636: timer_start: timer=ffff00000917be40 function=delayed_work_timer_fn expires=4297247282 [timeout=250] cpu=33 idx=103 flags=I
> >           <idle>-0     [033] d..1  9420.621639: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [033] d..2  9420.621639: hrtimer_cancel: hrtimer=ffff8017dba76808
> >           <idle>-0     [033] d..2  9420.621639: hrtimer_start: hrtimer=ffff8017dba76808 function=tick_sched_timer expires=9419964000000 softexpires=9419964000000
> >           <idle>-0     [000] dn.2  9420.691401: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] dn.2  9420.691401: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419012000000 softexpires=9419012000000
> >           <idle>-0     [002] dn.2  9420.691408: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] dn.2  9420.691408: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419012000000 softexpires=9419012000000
> >           <idle>-0     [000] d..1  9420.691409: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [000] d..2  9420.691409: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d..2  9420.691409: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419260000000 softexpires=9419260000000
> >           <idle>-0     [002] d..1  9420.691423: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9420.691423: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d..2  9420.691424: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9859803202655 softexpires=9859803202655
> >           <idle>-0     [045] d.h2  9420.698872: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.698873: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419017253180
> >           <idle>-0     [045] d.h3  9420.698877: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419118254640 softexpires=9419117254640
> >           <idle>-0     [045] d.h1  9420.698877: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h2  9420.799875: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.799875: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419118255760
> >           <idle>-0     [045] d.h3  9420.799879: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419219257140 softexpires=9419218257140
> >           <idle>-0     [045] d.h1  9420.799880: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [000] dn.2  9420.871369: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] dn.2  9420.871370: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419192000000 softexpires=9419192000000
> >           <idle>-0     [002] dn.2  9420.871375: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [000] d..1  9420.871376: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] dn.2  9420.871376: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419192000000 softexpires=9419192000000
> >           <idle>-0     [000] d..2  9420.871376: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d..2  9420.871376: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419260000000 softexpires=9419260000000
> >           <idle>-0     [002] d..1  9420.871398: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9420.871398: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d..2  9420.871398: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9859983202655 softexpires=9859983202655
> >           <idle>-0     [045] d.h2  9420.900881: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9420.900881: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419219261580
> >           <idle>-0     [045] d.h3  9420.900885: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419320263160 softexpires=9419319263160
> >           <idle>-0     [045] d.h1  9420.900886: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [001] d..2  9420.913601: hrtimer_cancel: hrtimer=ffff8017fbe5b808
> >           <idle>-0     [001] d..2  9420.913601: hrtimer_start: hrtimer=ffff8017fbe5b808 function=tick_sched_timer expires=9860023202655 softexpires=9860023202655
> >           <idle>-0     [000] d.h2  9420.941621: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.h1  9420.941621: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9419260001400
> >           <idle>-0     [000] d.h1  9420.941623: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.s2  9420.941623: timer_cancel: timer=ffff80177fdc0840
> >           <idle>-0     [000] ..s1  9420.941624: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_disable_link now=4297247112
> >           <idle>-0     [000] d.s2  9420.941624: timer_start: timer=ffff80177fdc0840 function=link_timeout_enable_link expires=4297247137 [timeout=25] cpu=0 idx=113 flags=
> >           <idle>-0     [000] ..s1  9420.941628: timer_expire_exit: timer=ffff80177fdc0840
> >           <idle>-0     [000] d.s2  9420.941629: timer_cancel: timer=ffff8017fbe42558
> >           <idle>-0     [000] d.s1  9420.941629: timer_expire_entry: timer=ffff8017fbe42558 function=delayed_work_timer_fn now=4297247112
> >           <idle>-0     [000] dns1  9420.941630: timer_expire_exit: timer=ffff8017fbe42558
> >           <idle>-0     [000] dns2  9420.941631: timer_cancel: timer=ffff00000910a628
> >           <idle>-0     [000] dns1  9420.941631: timer_expire_entry: timer=ffff00000910a628 function=delayed_work_timer_fn now=4297247112
> >           <idle>-0     [000] dns1  9420.941631: timer_expire_exit: timer=ffff00000910a628
> >           <idle>-0     [000] dn.2  9420.941634: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419264000000 softexpires=9419264000000
> >           <idle>-0     [002] dn.2  9420.941643: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] dn.2  9420.941643: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419264000000 softexpires=9419264000000
> >      kworker/0:0-3     [000] d..1  9420.941650: timer_start: timer=ffff00000910a628 function=delayed_work_timer_fn expires=4297247500 [timeout=388] cpu=0 idx=100 flags=D|I
> >      kworker/2:0-22    [002] d..1  9420.941651: timer_start: timer=ffff8017fbe78558 function=delayed_work_timer_fn expires=4297247494 [timeout=382] cpu=2 idx=114 flags=D|I
> >           <idle>-0     [000] d..1  9420.941652: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [000] d..2  9420.941652: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d..2  9420.941653: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419364000000 softexpires=9419364000000
> >           <idle>-0     [002] d..1  9420.941654: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9420.941654: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d..2  9420.941654: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9860055202655 softexpires=9860055202655
> >           <idle>-0     [045] d.h2  9421.001887: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9421.001887: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419320267640
> >           <idle>-0     [045] d.h3  9421.001891: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419421269000 softexpires=9419420269000
> >           <idle>-0     [045] d.h1  9421.001892: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [000] d.h2  9421.045625: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.h1  9421.045625: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9419364005380
> >           <idle>-0     [000] d.h1  9421.045626: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.s2  9421.045627: timer_cancel: timer=ffff80177fdc0840
> >           <idle>-0     [000] ..s1  9421.045628: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_enable_link now=4297247138
> >           <idle>-0     [000] d.s2  9421.045628: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297247363 [timeout=225] cpu=0 idx=34 flags=
> >           <idle>-0     [000] ..s1  9421.045631: timer_expire_exit: timer=ffff80177fdc0840
> >           <idle>-0     [000] d..2  9421.045644: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9420284000000 softexpires=9420284000000
> >           <idle>-0     [045] d.h2  9421.102893: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9421.102893: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419421273420
> >           <idle>-0     [045] d.h3  9421.102897: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419522275040 softexpires=9419521275040
> >           <idle>-0     [045] d.h1  9421.102897: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [055] d.h2  9421.165621: hrtimer_cancel: hrtimer=ffff8017db968808
> >           <idle>-0     [055] d.h1  9421.165622: hrtimer_expire_entry: hrtimer=ffff8017db968808 function=tick_sched_timer now=9419484002280
> >           <idle>-0     [055] d.h1  9421.165624: hrtimer_expire_exit: hrtimer=ffff8017db968808
> >           <idle>-0     [055] d.s2  9421.165624: timer_cancel: timer=ffff80177db6cc08
> >           <idle>-0     [055] d.s1  9421.165625: timer_expire_entry: timer=ffff80177db6cc08 function=delayed_work_timer_fn now=4297247168
> >           <idle>-0     [055] dns1  9421.165626: timer_expire_exit: timer=ffff80177db6cc08
> >           <idle>-0     [055] dn.2  9421.165629: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9419488000000 softexpires=9419488000000
> >     kworker/55:1-1246  [055] d..1  9421.165632: timer_start: timer=ffff80177db6cc08 function=delayed_work_timer_fn expires=4297247418 [timeout=250] cpu=55 idx=120 flags=I
> >           <idle>-0     [055] d..1  9421.165634: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [055] d..2  9421.165634: hrtimer_cancel: hrtimer=ffff8017db968808
> >           <idle>-0     [055] d..2  9421.165635: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9420508000000 softexpires=9420508000000
> >           <idle>-0     [045] d.h2  9421.203896: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9421.203896: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419522276980
> >           <idle>-0     [045] d.h3  9421.203900: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419623278460 softexpires=9419622278460
> >           <idle>-0     [045] d.h1  9421.203901: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h2  9421.304899: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9421.304899: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9419623279580
> >           <idle>-0     [045] d.h3  9421.304903: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9419724281060 softexpires=9419723281060
> >           <idle>-0     [045] d.h1  9421.304903: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [000] dn.2  9421.381179: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] dn.2  9421.381179: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9419700000000 softexpires=9419700000000
> >           <idle>-0     [002] dn.2  9421.381185: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] dn.2  9421.381185: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419700000000 softexpires=9419700000000
> >           <idle>-0     [000] d..1  9421.381185: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [000] d..2  9421.381186: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d..2  9421.381186: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9420284000000 softexpires=9420284000000
> >               sh-2256  [002] ....  9421.381193: timer_init: timer=ffff80176c26fb40
> >               sh-2256  [002] d..1  9421.381194: timer_start: timer=ffff80176c26fb40 function=process_timeout expires=4297247223 [timeout=2] cpu=2 idx=0 flags=
> >           <idle>-0     [002] d..1  9421.381196: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9421.381197: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d..2  9421.381197: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419708000000 softexpires=9419708000000
> >           <idle>-0     [002] d.h2  9421.389621: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d.h1  9421.389622: hrtimer_expire_entry: hrtimer=ffff8017fbe76808 function=tick_sched_timer now=9419708002000
> >           <idle>-0     [002] d.h1  9421.389623: hrtimer_expire_exit: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d.s2  9421.389624: timer_cancel: timer=ffff80176c26fb40
> >           <idle>-0     [002] ..s1  9421.389624: timer_expire_entry: timer=ffff80176c26fb40 function=process_timeout now=4297247224
> >           <idle>-0     [002] .ns1  9421.389626: timer_expire_exit: timer=ffff80176c26fb40
> >           <idle>-0     [002] dn.2  9421.389629: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419712000000 softexpires=9419712000000
> >               sh-2256  [002] ...1  9421.389682: hrtimer_init: hrtimer=ffff8017d4dde8a0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
> >               sh-2256  [002] ...1  9421.389682: hrtimer_init: hrtimer=ffff8017d4dde8e0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
> >               sh-2256  [002] ....  9421.389690: hrtimer_init: hrtimer=ffff80176cbb0088 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
> >           <idle>-0     [039] dn.2  9421.389814: hrtimer_start: hrtimer=ffff8017dbb18808 function=tick_sched_timer expires=9419712000000 softexpires=9419712000000
> >           <idle>-0     [002] d..1  9421.389896: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9421.389897: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d..2  9421.389898: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9419724000000 softexpires=9419724000000  
> 
> This being the gap?
> 
> Interesting in that I am not seeing any timeouts at all associated with
> the rcu_sched kthread...


This only happened when saving out the trace.  It didn't happen at all
on an overnight run with no interference, which perhaps suggests that
the tracing itself is changing the timing enough to hide the issue.

Oh goody.

I'm not familiar enough with the internals of event tracing to know,
but is there a reason that either clearing the buffer or outputting
it could result in this gap?
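
(For my own notes on what we are actually looking for in that trace: the
rcu_sched kthread is blocked in schedule_timeout(), per the stack above, and
if I'm reading kernel/time/timer.c correctly that just arms an ordinary timer
whose callback wakes the sleeping task back up, roughly (paraphrasing, not
verbatim):

    /* timer callback: wake the task that armed the timer */
    static void process_timeout(unsigned long __data)
    {
            wake_up_process((struct task_struct *)__data);
    }

    /* what the rcu_sched kthread is sleeping in, per the stack above */
    signed long __sched schedule_timeout(signed long timeout)
    {
            struct timer_list timer;
            unsigned long expire = timeout + jiffies;

            setup_timer_on_stack(&timer, process_timeout,
                                 (unsigned long)current);
            mod_timer(&timer, expire);
            schedule();
            del_timer_sync(&timer);
            destroy_timer_on_stack(&timer);

            timeout = expire - jiffies;
            return timeout < 0 ? 0 : timeout;
    }

So the event I'd expect to see waking the starved kthread is a
timer_expire_entry with function=process_timeout on whichever CPU armed the
timer, which fits with your observation that there are no timeouts associated
with the rcu_sched kthread at all.)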

Jonathan


> 
> 							Thanx, Paul
> 
> >           <idle>-0     [002] dn.2  9444.510766: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] dn.2  9444.510767: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442832000000 softexpires=9442832000000
> >           <idle>-0     [036] d..1  9444.510812: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [036] d..2  9444.510814: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d..2  9444.510815: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442844000000 softexpires=9442844000000
> >               sh-2256  [002] ....  9444.510857: timer_init: timer=ffff80176c26fb40
> >               sh-2256  [002] d..1  9444.510857: timer_start: timer=ffff80176c26fb40 function=process_timeout expires=4297253006 [timeout=2] cpu=2 idx=0 flags=
> >           <idle>-0     [002] d..1  9444.510864: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9444.510865: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d..2  9444.510866: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442844000000 softexpires=9442844000000
> >           <idle>-0     [000] d.h2  9444.525625: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [002] d.h2  9444.525625: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [036] d.h2  9444.525625: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [002] d.h1  9444.525625: hrtimer_expire_entry: hrtimer=ffff8017fbe76808 function=tick_sched_timer now=9442844005600
> >           <idle>-0     [036] d.h1  9444.525625: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442844005460
> >           <idle>-0     [000] d.h1  9444.525627: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9442844005300
> >           <idle>-0     [002] d.h1  9444.525627: hrtimer_expire_exit: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d.s2  9444.525629: timer_cancel: timer=ffff8017fbe78558
> >           <idle>-0     [036] d.h1  9444.525629: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
> >           <idle>-0     [002] d.s1  9444.525629: timer_expire_entry: timer=ffff8017fbe78558 function=delayed_work_timer_fn now=4297253008
> >           <idle>-0     [000] d.h1  9444.525629: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.s2  9444.525631: timer_cancel: timer=ffff80177fdc0840
> >           <idle>-0     [000] ..s1  9444.525631: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_disable_link now=4297253008
> >           <idle>-0     [002] dns1  9444.525631: timer_expire_exit: timer=ffff8017fbe78558
> >           <idle>-0     [000] d.s2  9444.525632: timer_start: timer=ffff80177fdc0840 function=link_timeout_enable_link expires=4297253033 [timeout=25] cpu=0 idx=82 flags=
> >           <idle>-0     [000] ..s1  9444.525633: timer_expire_exit: timer=ffff80177fdc0840
> >           <idle>-0     [000] d.s2  9444.525634: timer_cancel: timer=ffff00000910a628
> >           <idle>-0     [000] d.s1  9444.525634: timer_expire_entry: timer=ffff00000910a628 function=delayed_work_timer_fn now=4297253008
> >           <idle>-0     [000] dns1  9444.525636: timer_expire_exit: timer=ffff00000910a628
> >           <idle>-0     [036] dn.2  9444.525639: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442848000000 softexpires=9442848000000
> >           <idle>-0     [000] dn.2  9444.525640: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9442848000000 softexpires=9442848000000
> >           <idle>-0     [002] dn.2  9444.525640: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442848000000 softexpires=9442848000000
> >      rcu_preempt-9     [036] ....  9444.525648: timer_init: timer=ffff8017d5fcfda0
> >           <idle>-0     [002] d..1  9444.525648: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9444.525648: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d..2  9444.525649: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442860000000 softexpires=9442860000000
> >      rcu_preempt-9     [036] d..1  9444.525649: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297253009 [timeout=1] cpu=36 idx=0 flags=
> >      kworker/0:0-3     [000] d..1  9444.525650: timer_start: timer=ffff00000910a628 function=delayed_work_timer_fn expires=4297253250 [timeout=242] cpu=0 idx=82 flags=D|I
> >           <idle>-0     [000] d..1  9444.525652: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [000] d..2  9444.525652: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d..2  9444.525653: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9442948000000 softexpires=9442948000000
> >           <idle>-0     [036] d..1  9444.525653: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [036] d..2  9444.525654: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d..2  9444.525654: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442852000000 softexpires=9442852000000
> >           <idle>-0     [036] d.h2  9444.533624: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d.h1  9444.533625: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442852004760
> >           <idle>-0     [036] d.h1  9444.533626: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d.s2  9444.533627: timer_cancel: timer=ffff8017d5fcfda0
> >           <idle>-0     [036] ..s1  9444.533628: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297253010
> >           <idle>-0     [036] .ns1  9444.533629: timer_expire_exit: timer=ffff8017d5fcfda0
> >           <idle>-0     [036] dn.2  9444.533634: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442856000000 softexpires=9442856000000
> >           <idle>-0     [036] d..1  9444.533668: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [036] d..2  9444.533668: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d..2  9444.533669: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442876000000 softexpires=9442876000000
> >           <idle>-0     [002] dnh2  9444.541626: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] dnh1  9444.541627: hrtimer_expire_entry: hrtimer=ffff8017fbe76808 function=tick_sched_timer now=9442860007120
> >           <idle>-0     [002] dnh1  9444.541629: hrtimer_expire_exit: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] dn.2  9444.541630: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9442864000000 softexpires=9442864000000
> >           <idle>-0     [002] d..1  9444.541640: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [002] d..2  9444.541640: hrtimer_cancel: hrtimer=ffff8017fbe76808
> >           <idle>-0     [002] d..2  9444.541640: hrtimer_start: hrtimer=ffff8017fbe76808 function=tick_sched_timer expires=9444316000000 softexpires=9444316000000
> >           <idle>-0     [036] dnh2  9444.557627: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] dnh1  9444.557628: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442876008220
> >           <idle>-0     [036] dnh1  9444.557630: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] dn.2  9444.557631: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442880000000 softexpires=9442880000000
> >           <idle>-0     [036] d..1  9444.557644: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [036] d..2  9444.557645: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d..2  9444.557645: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442892000000 softexpires=9442892000000
> >           <idle>-0     [036] d.h2  9444.573621: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d.h1  9444.573621: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442892001340
> >           <idle>-0     [036] d.h1  9444.573622: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] dn.2  9444.573628: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442896000000 softexpires=9442896000000
> >      rcu_preempt-9     [036] ....  9444.573631: timer_init: timer=ffff8017d5fcfda0
> >      rcu_preempt-9     [036] d..1  9444.573632: timer_start: timer=ffff8017d5fcfda0 function=process_timeout expires=4297253021 [timeout=1] cpu=36 idx=0 flags=
> >           <idle>-0     [036] d..1  9444.573634: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [036] d..2  9444.573635: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d..2  9444.573635: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442900000000 softexpires=9442900000000
> >           <idle>-0     [036] d.h2  9444.581621: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d.h1  9444.581621: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442900001400
> >           <idle>-0     [036] d.h1  9444.581622: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d.s2  9444.581623: timer_cancel: timer=ffff8017d5fcfda0
> >           <idle>-0     [036] ..s1  9444.581623: timer_expire_entry: timer=ffff8017d5fcfda0 function=process_timeout now=4297253022
> >           <idle>-0     [036] .ns1  9444.581625: timer_expire_exit: timer=ffff8017d5fcfda0
> >           <idle>-0     [036] dn.2  9444.581628: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442904000000 softexpires=9442904000000
> >           <idle>-0     [036] d..1  9444.581636: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [036] d..2  9444.581636: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d..2  9444.581637: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9442924000000 softexpires=9442924000000
> >           <idle>-0     [045] d.h2  9444.581718: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9444.581719: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9442900098200
> >           <idle>-0     [045] d.h3  9444.581724: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443001101380 softexpires=9443000101380
> >           <idle>-0     [045] d.h1  9444.581725: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [036] d.h2  9444.605621: hrtimer_cancel: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d.h1  9444.605621: hrtimer_expire_entry: hrtimer=ffff8017dbac7808 function=tick_sched_timer now=9442924001600
> >           <idle>-0     [036] d.h1  9444.605622: hrtimer_expire_exit: hrtimer=ffff8017dbac7808
> >           <idle>-0     [036] d..2  9444.605629: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=9883719202655 softexpires=9883719202655
> >           <idle>-0     [000] d.h2  9444.629625: hrtimer_cancel: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.h1  9444.629625: hrtimer_expire_entry: hrtimer=ffff8017fbe40808 function=tick_sched_timer now=9442948005580
> >           <idle>-0     [000] d.h1  9444.629627: hrtimer_expire_exit: hrtimer=ffff8017fbe40808
> >           <idle>-0     [000] d.s2  9444.629628: timer_cancel: timer=ffff80177fdc0840
> >           <idle>-0     [000] ..s1  9444.629628: timer_expire_entry: timer=ffff80177fdc0840 function=link_timeout_enable_link now=4297253034
> >           <idle>-0     [000] d.s2  9444.629629: timer_start: timer=ffff80177fdc0840 function=link_timeout_disable_link expires=4297253259 [timeout=225] cpu=0 idx=42 flags=
> >           <idle>-0     [000] ..s1  9444.629638: timer_expire_exit: timer=ffff80177fdc0840
> >           <idle>-0     [000] d..2  9444.629661: hrtimer_start: hrtimer=ffff8017fbe40808 function=tick_sched_timer expires=9443868000000 softexpires=9443868000000
> >           <idle>-0     [045] d.h2  9444.682725: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9444.682725: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443001105940
> >           <idle>-0     [045] d.h3  9444.682730: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443102107440 softexpires=9443101107440
> >           <idle>-0     [045] d.h1  9444.682730: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [055] d.h2  9444.717626: hrtimer_cancel: hrtimer=ffff8017db968808
> >           <idle>-0     [055] d.h1  9444.717627: hrtimer_expire_entry: hrtimer=ffff8017db968808 function=tick_sched_timer now=9443036006240
> >           <idle>-0     [055] d.h1  9444.717629: hrtimer_expire_exit: hrtimer=ffff8017db968808
> >           <idle>-0     [055] d.s2  9444.717630: timer_cancel: timer=ffff80177db6cc08
> >           <idle>-0     [055] d.s1  9444.717630: timer_expire_entry: timer=ffff80177db6cc08 function=delayed_work_timer_fn now=4297253056
> >           <idle>-0     [055] dns1  9444.717633: timer_expire_exit: timer=ffff80177db6cc08
> >           <idle>-0     [055] dn.2  9444.717637: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9443040000000 softexpires=9443040000000
> >     kworker/55:1-1246  [055] d..1  9444.717640: timer_start: timer=ffff80177db6cc08 function=delayed_work_timer_fn expires=4297253306 [timeout=250] cpu=55 idx=88 flags=I
> >           <idle>-0     [055] d..1  9444.717643: tick_stop: success=1 dependency=NONE
> >           <idle>-0     [055] d..2  9444.717643: hrtimer_cancel: hrtimer=ffff8017db968808
> >           <idle>-0     [055] d..2  9444.717644: hrtimer_start: hrtimer=ffff8017db968808 function=tick_sched_timer expires=9444060000000 softexpires=9444060000000
> >           <idle>-0     [045] d.h2  9444.783729: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9444.783729: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443102109380
> >           <idle>-0     [045] d.h3  9444.783733: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443203110880 softexpires=9443202110880
> >           <idle>-0     [045] d.h1  9444.783733: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h2  9444.884731: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9444.884731: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443203112000
> >           <idle>-0     [045] d.h3  9444.884735: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443304113380 softexpires=9443303113380
> >           <idle>-0     [045] d.h1  9444.884736: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h2  9444.985734: hrtimer_cancel: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [045] d.h1  9444.985735: hrtimer_expire_entry: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func now=9443304114500
> >           <idle>-0     [045] d.h3  9444.985738: hrtimer_start: hrtimer=ffff80176cb7ca90 function=ehci_hrtimer_func expires=9443405116440 softexpires=9443404116440
> >           <idle>-0     [045] d.h1  9444.985739: hrtimer_expire_exit: hrtimer=ffff80176cb7ca90
> >           <idle>-0     [042] d.h2  9445.037622: hrtimer_cancel: hrtimer=ffff8017dbb69808  
> > > 
> > > Thanks,
> > > 
> > > Jonathan  
> > > > 
> > > > 							Thanx, Paul
> > > >     
> > > > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > [ 1984.638153] rcu_preempt     S    0     9      2 0x00000000
> > > > > [ 1984.643626] Call trace:
> > > > > [ 1984.646059] [<ffff000008084fb0>] __switch_to+0x90/0xa8
> > > > > [ 1984.651189] [<ffff000008962274>] __schedule+0x19c/0x5d8
> > > > > [ 1984.656400] [<ffff0000089626e8>] schedule+0x38/0xa0
> > > > > [ 1984.661266] [<ffff000008965844>] schedule_timeout+0x124/0x218
> > > > > [ 1984.667002] [<ffff000008121424>] rcu_gp_kthread+0x4fc/0x748
> > > > > [ 1984.672564] [<ffff0000080df0b4>] kthread+0xfc/0x128
> > > > > [ 1984.677429] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
> > > > >       
> > > >     
> > > 
> > > _______________________________________________
> > > linuxarm mailing list
> > > linuxarm at huawei.com
> > > http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm  
> >   
> 

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-28  7:44                                                           ` Jonathan Cameron
  (?)
@ 2017-07-28 12:54                                                             ` Boqun Feng
  -1 siblings, 0 replies; 241+ messages in thread
From: Boqun Feng @ 2017-07-28 12:54 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Jonathan,

FWIW, there is a missed-wakeup issue in swake_up() and swake_up_all():

	https://marc.info/?l=linux-kernel&m=149750022019663

and RCU began to use swait/wake last year, so I thought this could be
relevant.

Could you try the following patch and see if it works? Thanks.

Regards,
Boqun

------------------>8
Subject: [PATCH] swait: Remove the lockless swait_active() check in
 swake_up*()

Steven Rostedt reported a potential race in RCU core because of
swake_up():

        CPU0                            CPU1
        ----                            ----
                                __call_rcu_core() {

                                 spin_lock(rnp_root)
                                 need_wake = __rcu_start_gp() {
                                  rcu_start_gp_advanced() {
                                   gp_flags = FLAG_INIT
                                  }
                                 }

 rcu_gp_kthread() {
   swait_event_interruptible(wq,
        gp_flags & FLAG_INIT) {
   spin_lock(q->lock)

                                *fetch wq->task_list here! *

   list_add(wq->task_list, q->task_list)
   spin_unlock(q->lock);

   *fetch old value of gp_flags here *

                                 spin_unlock(rnp_root)

                                 rcu_gp_kthread_wake() {
                                  swake_up(wq) {
                                   swait_active(wq) {
                                    list_empty(wq->task_list)

                                   } * return false *

  if (condition) * false *
    schedule();

In this case, a wakeup is missed, which could cause the rcu_gp_kthread
to wait for a long time.

The reason for this is that we do a lockless swait_active() check in
swake_up(). To fix this, we can either 1) add an smp_mb() in swake_up()
before swait_active() to provide the proper ordering, or 2) simply remove
the swait_active() check in swake_up().

Solution 2 not only fixes this problem but also keeps the swait and
wait APIs as close as possible, as wake_up() doesn't provide a full
barrier and doesn't do a lockless check of the wait queue either.
Moreover, there are already users calling swait_active() to do their own
quick checks of the wait queues, so it makes less sense for swake_up() and
swake_up_all() to do this on their own.

This patch then removes the lockless swait_active() check in swake_up()
and swake_up_all().

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/sched/swait.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c
index 3d5610dcce11..2227e183e202 100644
--- a/kernel/sched/swait.c
+++ b/kernel/sched/swait.c
@@ -33,9 +33,6 @@ void swake_up(struct swait_queue_head *q)
 {
 	unsigned long flags;
 
-	if (!swait_active(q))
-		return;
-
 	raw_spin_lock_irqsave(&q->lock, flags);
 	swake_up_locked(q);
 	raw_spin_unlock_irqrestore(&q->lock, flags);
@@ -51,9 +48,6 @@ void swake_up_all(struct swait_queue_head *q)
 	struct swait_queue *curr;
 	LIST_HEAD(tmp);
 
-	if (!swait_active(q))
-		return;
-
 	raw_spin_lock_irq(&q->lock);
 	list_splice_init(&q->task_list, &tmp);
 	while (!list_empty(&tmp)) {
-- 
2.13.0
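
As a side note on option 1 above (keep the lockless fast path but put a
full barrier in front of it): a plain userspace analogue of the race and of
that barrier is sketched below.  This is an illustration only: it uses
pthreads and C11 atomics rather than the kernel's swait machinery, and the
names in it are invented for the example.

/* Userspace analogue of the race in the diagram above: the waker's
 * lockless "any waiters?" check can miss a waiter that is concurrently
 * queueing itself and re-checking the condition.  The seq_cst fence below
 * plays the role of the smp_mb() from option 1; deleting the early return
 * altogether is what the patch (option 2) does. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static atomic_bool waiter_queued;	/* stands in for wq->task_list being non-empty */
static atomic_bool condition_flag;	/* stands in for gp_flags & FLAG_INIT */

static void waker(void)			/* the CPU1 side: set the flag, then "swake_up()" */
{
	atomic_store(&condition_flag, 1);		/* __rcu_start_gp() sets the flag */
	atomic_thread_fence(memory_order_seq_cst);	/* option 1: the full barrier */
	if (!atomic_load(&waiter_queued))		/* lockless swait_active()-style check */
		return;					/* without the fence this can miss the waiter */

	pthread_mutex_lock(&lock);
	pthread_cond_signal(&cond);
	pthread_mutex_unlock(&lock);
}

static void *waiter(void *arg)		/* the CPU0 side: the rcu_gp_kthread()-style wait */
{
	(void)arg;
	pthread_mutex_lock(&lock);
	atomic_store(&waiter_queued, 1);		/* list_add() under q->lock */
	while (!atomic_load(&condition_flag))		/* re-check condition, then sleep */
		pthread_cond_wait(&cond, &lock);
	atomic_store(&waiter_queued, 0);
	pthread_mutex_unlock(&lock);
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, waiter, NULL);
	waker();
	pthread_join(t, NULL);
	puts("waiter saw the condition and woke up");
	return 0;
}

With the fence in place (or with the early return removed entirely, as the
patch does), either the waker sees the queued waiter or the waiter sees the
already-set condition, so the wakeup cannot be lost.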


^ permalink raw reply related	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-28  7:44                                                           ` Jonathan Cameron
@ 2017-07-28 13:08                                                             ` Jonathan Cameron
  -1 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-28 13:08 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: dzickus, sfr, linuxarm, Nicholas Piggin, abdhalee, sparclinux,
	akpm, linuxppc-dev, David Miller, linux-arm-kernel


On Fri, 28 Jul 2017 08:44:11 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Thu, 27 Jul 2017 09:52:45 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Thu, Jul 27, 2017 at 05:39:23PM +0100, Jonathan Cameron wrote:  
> > > On Thu, 27 Jul 2017 14:49:03 +0100
> > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> > >     
> > > > On Thu, 27 Jul 2017 05:49:13 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > >     
> > > > > On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:      
> > > > > > On Wed, 26 Jul 2017 18:42:14 -0700
> > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > > >         
> > > > > > > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:        
> > > > > >         
> > > > > > > > Indeed, that really wouldn't explain how we end up with a RCU stall
> > > > > > > > dump listing almost all of the cpus as having missed a grace period.          
> > > > > > > 
> > > > > > > I have seen stranger things, but admittedly not often.        
> > > > > > 
> > > > > > So the backtraces show the RCU gp thread in schedule_timeout.
> > > > > > 
> > > > > > Are you sure that its timeout has expired and it's not being scheduled,
> > > > > > or could it be a bad (large) timeout (looks unlikely) or that it's being
> > > > > > scheduled but not correctly noting gps on other CPUs?
> > > > > > 
> > > > > > It's not in R state, so if it's not being scheduled at all, then it's
> > > > > > because the timer has not fired:        
> > > > > 
> > > > > Good point, Nick!
> > > > > 
> > > > > Jonathan, could you please reproduce collecting timer event tracing?      
> > > > I'm a little new to tracing (only started playing with it last week)
> > > > so fingers crossed I've set it up right.  No splats yet.  Was getting
> > > > splats on reading out the trace when running with the RCU stall timer
> > > > set to 4 so have increased that back to the default and am rerunning.
> > > > 
> > > > This may take a while.  Correct me if I've gotten this wrong to save time
> > > > 
> > > > echo "timer:*" > /sys/kernel/debug/tracing/set_event
> > > > 
> > > > when it dumps, just send you the relevant part of what is in
> > > > /sys/kernel/debug/tracing/trace?    
> > > 
> > > Interestingly the only thing that can make it trip for me with tracing on
> > > is peeking in the tracing buffers.  Not sure whether this is a valid case or
> > > not.
> > > 
> > > Anyhow all timer activity seems to stop around the area of interest.
> > > 

In the interests of tidier traces with less other stuff in them I disabled
usb and the sas driver (which were responsible for most of the hrtimer events).
The problem just became a whole lot easier to trigger, happening every few
minutes if I do pretty much anything on the machine at all.

Running with CONFIG_RCU_CPU_STALL_TIMEOUT=8 (left over from trying
to get it to trigger more often before disabling usb and sas).

No sign of the massive gap. I'm thinking that was related to the
trace buffer readout.

More long logs I'm afraid:

[  775.760469] random: crng init done
[  835.595087] INFO: rcu_preempt detected stalls on CPUs/tasks:
[  835.600740] 	2-...: (11 GPs behind) idle=d60/0/0 softirq=368/369 fqs=0 last_accelerate: fae9/0968, nonlazy_posted: 0, L.
[  835.611595] 	3-...: (14 GPs behind) idle=0b0/0/0 softirq=238/238 fqs=0 last_accelerate: f91c/096c, nonlazy_posted: 0, L.
[  835.622450] 	4-...: (17 GPs behind) idle=464/0/0 softirq=230/234 fqs=0 last_accelerate: 01f5/096c, nonlazy_posted: 0, L.
[  835.633305] 	5-...: (17 GPs behind) idle=f4c/0/0 softirq=210/212 fqs=0 last_accelerate: 01f3/0970, nonlazy_posted: 0, L.
[  835.644159] 	6-...: (128 GPs behind) idle=e7c/0/0 softirq=218/219 fqs=0 last_accelerate: 587f/0974, nonlazy_posted: 0, L.
[  835.655099] 	7-...: (128 GPs behind) idle=e40/0/0 softirq=214/217 fqs=0 last_accelerate: 587f/0974, nonlazy_posted: 0, L.
[  835.666040] 	8-...: (109 GPs behind) idle=e20/0/0 softirq=212/212 fqs=0 last_accelerate: 587f/0978, nonlazy_posted: 0, L.
[  835.676981] 	9-...: (128 GPs behind) idle=d70/0/0 softirq=195/196 fqs=0 last_accelerate: 587f/097c, nonlazy_posted: 0, L.
[  835.687921] 	10-...: (130 GPs behind) idle=b98/0/0 softirq=429/431 fqs=0 last_accelerate: 587f/097c, nonlazy_posted: 0, L.
[  835.698949] 	11-...: (128 GPs behind) idle=d10/0/0 softirq=190/190 fqs=0 last_accelerate: 587f/0980, nonlazy_posted: 0, L.
[  835.709976] 	12-...: (108 GPs behind) idle=d60/0/0 softirq=201/202 fqs=0 last_accelerate: 587f/0984, nonlazy_posted: 0, L.
[  835.721003] 	13-...: (103 GPs behind) idle=5b4/0/0 softirq=200/200 fqs=0 last_accelerate: 587f/0984, nonlazy_posted: 0, L.
[  835.732031] 	14-...: (128 GPs behind) idle=b44/0/0 softirq=185/185 fqs=0 last_accelerate: 587f/0988, nonlazy_posted: 0, L.
[  835.743058] 	15-...: (128 GPs behind) idle=c1c/0/0 softirq=196/197 fqs=0 last_accelerate: 587f/098c, nonlazy_posted: 0, L.
[  835.754086] 	16-...: (29 GPs behind) idle=37c/0/0 softirq=205/205 fqs=0 last_accelerate: dff1/098c, nonlazy_posted: 0, L.
[  835.765026] 	17-...: (8 GPs behind) idle=b08/0/0 softirq=201/203 fqs=0 last_accelerate: fabc/0990, nonlazy_posted: 0, L.
[  835.775880] 	19-...: (0 ticks this GP) idle=994/0/0 softirq=175/175 fqs=0 last_accelerate: 0194/0994, nonlazy_posted: 0, L.
[  835.786994] 	20-...: (59 GPs behind) idle=984/0/0 softirq=185/185 fqs=0 last_accelerate: e5dc/0994, nonlazy_posted: 0, L.
[  835.797935] 	21-...: (6 GPs behind) idle=8e4/0/0 softirq=176/178 fqs=0 last_accelerate: fd08/0998, nonlazy_posted: 0, L.
[  835.808789] 	23-...: (112 GPs behind) idle=7e0/0/0 softirq=166/168 fqs=0 last_accelerate: 587f/099c, nonlazy_posted: 0, L.
[  835.819816] 	24-...: (103 GPs behind) idle=e4c/0/0 softirq=168/168 fqs=0 last_accelerate: 587f/09a0, nonlazy_posted: 0, L.
[  835.830843] 	25-...: (20 GPs behind) idle=930/0/0 softirq=1303/1303 fqs=0 last_accelerate: edf1/09a0, nonlazy_posted: 0, L.
[  835.841958] 	26-...: (4 GPs behind) idle=6d8/0/0 softirq=178/180 fqs=0 last_accelerate: fd08/09a4, nonlazy_posted: 0, L.
[  835.852812] 	27-...: (60 GPs behind) idle=8a8/0/0 softirq=183/183 fqs=0 last_accelerate: e5dc/09a8, nonlazy_posted: 0, L.
[  835.863752] 	28-...: (3 GPs behind) idle=6c4/0/0 softirq=167/168 fqs=0 last_accelerate: fe00/09a8, nonlazy_posted: 0, L.
[  835.874606] 	29-...: (57 GPs behind) idle=588/0/0 softirq=152/153 fqs=0 last_accelerate: e7d4/09ac, nonlazy_posted: 0, L.
[  835.885547] 	30-...: (10 GPs behind) idle=5cc/0/0 softirq=161/163 fqs=0 last_accelerate: f9f8/09b0, nonlazy_posted: 0, L.
[  835.896488] 	31-...: (1 GPs behind) idle=5f8/0/0 softirq=246/248 fqs=0 last_accelerate: fe18/09b0, nonlazy_posted: 0, L.
[  835.907342] 	32-...: (102 GPs behind) idle=998/0/0 softirq=158/159 fqs=0 last_accelerate: 587f/09b4, nonlazy_posted: 0, L.
[  835.918369] 	33-...: (102 GPs behind) idle=7f4/0/0 softirq=141/141 fqs=0 last_accelerate: 587f/09b8, nonlazy_posted: 0, L.
[  835.929397] 	35-...: (146 GPs behind) idle=400/0/0 softirq=142/143 fqs=0 last_accelerate: 587f/09b8, nonlazy_posted: 0, L.
[  835.940425] 	36-...: (39 GPs behind) idle=424/0/0 softirq=115/115 fqs=0 last_accelerate: 01f5/09bc, nonlazy_posted: 0, L.
[  835.951366] 	37-...: (38 GPs behind) idle=400/0/0 softirq=126/128 fqs=0 last_accelerate: fdd4/09c0, nonlazy_posted: 0, L.
[  835.962307] 	38-...: (17 GPs behind) idle=5b4/0/0 softirq=139/142 fqs=0 last_accelerate: d804/09c0, nonlazy_posted: 0, L.
[  835.973247] 	39-...: (17 GPs behind) idle=104/0/0 softirq=110/110 fqs=0 last_accelerate: 01f8/09c4, nonlazy_posted: 0, L.
[  835.984188] 	40-...: (150 GPs behind) idle=098/0/0 softirq=107/110 fqs=0 last_accelerate: 587f/09c8, nonlazy_posted: 0, L.
[  835.995216] 	41-...: (145 GPs behind) idle=038/0/0 softirq=102/103 fqs=0 last_accelerate: 587f/09cc, nonlazy_posted: 0, L.
[  836.006243] 	42-...: (147 GPs behind) idle=fa8/0/0 softirq=94/94 fqs=0 last_accelerate: 587f/09cc, nonlazy_posted: 0, L.
[  836.017097] 	43-...: (145 GPs behind) idle=f68/0/0 softirq=98/98 fqs=0 last_accelerate: 587f/09d0, nonlazy_posted: 0, L.
[  836.027951] 	44-...: (147 GPs behind) idle=f98/0/0 softirq=90/90 fqs=0 last_accelerate: 587f/09d4, nonlazy_posted: 0, L.
[  836.038805] 	45-...: (147 GPs behind) idle=f44/0/0 softirq=76/77 fqs=0 last_accelerate: 587f/09d4, nonlazy_posted: 0, L.
[  836.049658] 	46-...: (147 GPs behind) idle=f3c/0/0 softirq=77/77 fqs=0 last_accelerate: 587f/09d8, nonlazy_posted: 0, L.
[  836.060513] 	47-...: (146 GPs behind) idle=e60/0/0 softirq=77/78 fqs=0 last_accelerate: 587f/09dc, nonlazy_posted: 0, L.
[  836.071367] 	48-...: (49 GPs behind) idle=e10/0/0 softirq=78/80 fqs=0 last_accelerate: ef2c/09dc, nonlazy_posted: 0, L.
[  836.082134] 	49-...: (34 GPs behind) idle=dd4/0/0 softirq=72/72 fqs=0 last_accelerate: d805/09e0, nonlazy_posted: 0, L.
[  836.092902] 	50-...: (147 GPs behind) idle=c20/0/0 softirq=58/60 fqs=0 last_accelerate: 587f/09e4, nonlazy_posted: 0, L.
[  836.103756] 	51-...: (145 GPs behind) idle=bf4/0/0 softirq=63/63 fqs=0 last_accelerate: 587f/09e4, nonlazy_posted: 0, L.
[  836.114610] 	52-...: (47 GPs behind) idle=bf0/0/0 softirq=55/56 fqs=0 last_accelerate: f25c/09e8, nonlazy_posted: 0, L.
[  836.125377] 	53-...: (31 GPs behind) idle=ca8/0/0 softirq=51/53 fqs=0 last_accelerate: 03ea/09ec, nonlazy_posted: 0, L.
[  836.136144] 	54-...: (33 GPs behind) idle=c5c/0/0 softirq=54/54 fqs=0 last_accelerate: aff0/09ec, nonlazy_posted: 0, L.
[  836.146912] 	55-...: (146 GPs behind) idle=ab8/0/0 softirq=58/59 fqs=0 last_accelerate: 587f/09f0, nonlazy_posted: 0, L.
[  836.157766] 	56-...: (43 GPs behind) idle=f2c/0/0 softirq=67/67 fqs=0 last_accelerate: f8dc/09f4, nonlazy_posted: 0, L.
[  836.168533] 	57-...: (42 GPs behind) idle=a74/0/0 softirq=48/49 fqs=0 last_accelerate: 7e00/09f4, nonlazy_posted: 0, L.
[  836.179300] 	58-...: (29 GPs behind) idle=f7c/0/0 softirq=60/62 fqs=0 last_accelerate: 05ea/09f8, nonlazy_posted: 0, L.
[  836.190068] 	59-...: (103 GPs behind) idle=cbc/0/0 softirq=48/49 fqs=0 last_accelerate: 587f/09fc, nonlazy_posted: 0, L.
[  836.200921] 	60-...: (40 GPs behind) idle=97c/0/0 softirq=39/39 fqs=0 last_accelerate: fcdc/09fc, nonlazy_posted: 0, L.
[  836.211688] 	61-...: (20 GPs behind) idle=870/0/0 softirq=27/31 fqs=0 last_accelerate: edf5/0a00, nonlazy_posted: 0, L.
[  836.222456] 	62-...: (17 GPs behind) idle=7a8/0/0 softirq=33/34 fqs=0 last_accelerate: 01f3/0a04, nonlazy_posted: 0, L.
[  836.233223] 	63-...: (147 GPs behind) idle=6fc/0/0 softirq=33/35 fqs=0 last_accelerate: 587f/0a04, nonlazy_posted: 0, L.
[  836.244075] 	(detected by 1, t=2164 jiffies, g=112, c=111, q=2706)
[  836.250246] Task dump for CPU 2:
[  836.253460] swapper/2       R  running task        0     0      1 0x00000000
[  836.260496] Call trace:
[  836.262935] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.268062] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[  836.274836] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[  836.280223] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  836.285694] Task dump for CPU 3:
[  836.288908] swapper/3       R  running task        0     0      1 0x00000000
[  836.295944] Call trace:
[  836.298378] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.303502] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  836.308973] Task dump for CPU 4:
[  836.312187] swapper/4       R  running task        0     0      1 0x00000000
[  836.319223] Call trace:
[  836.321657] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.326781] [<          (null)>]           (null)
[  836.331470] Task dump for CPU 5:
[  836.334685] swapper/5       R  running task        0     0      1 0x00000000
[  836.341719] Call trace:
[  836.344153] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.349277] [<          (null)>]           (null)
[  836.353967] Task dump for CPU 6:
[  836.357181] swapper/6       R  running task        0     0      1 0x00000000
[  836.364216] Call trace:
[  836.366649] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.371773] [<          (null)>]           (null)
[  836.376463] Task dump for CPU 7:
[  836.379678] swapper/7       R  running task        0     0      1 0x00000000
[  836.386713] Call trace:
[  836.389146] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.394270] [<          (null)>]           (null)
[  836.398959] Task dump for CPU 8:
[  836.402173] swapper/8       R  running task        0     0      1 0x00000000
[  836.409209] Call trace:
[  836.411642] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.416766] [<          (null)>]           (null)
[  836.421455] Task dump for CPU 9:
[  836.424669] swapper/9       R  running task        0     0      1 0x00000000
[  836.431704] Call trace:
[  836.434137] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.439261] [<          (null)>]           (null)
[  836.443951] Task dump for CPU 10:
[  836.447251] swapper/10      R  running task        0     0      1 0x00000000
[  836.454287] Call trace:
[  836.456720] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.461844] [<          (null)>]           (null)
[  836.466533] Task dump for CPU 11:
[  836.469834] swapper/11      R  running task        0     0      1 0x00000000
[  836.476869] Call trace:
[  836.479303] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.484426] [<          (null)>]           (null)
[  836.489116] Task dump for CPU 12:
[  836.492417] swapper/12      R  running task        0     0      1 0x00000000
[  836.499451] Call trace:
[  836.501885] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.507009] [<          (null)>]           (null)
[  836.511698] Task dump for CPU 13:
[  836.514999] swapper/13      R  running task        0     0      1 0x00000000
[  836.522033] Call trace:
[  836.524467] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.529591] [<          (null)>]           (null)
[  836.534280] Task dump for CPU 14:
[  836.537581] swapper/14      R  running task        0     0      1 0x00000000
[  836.544616] Call trace:
[  836.547049] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.552173] [<          (null)>]           (null)
[  836.556863] Task dump for CPU 15:
[  836.560163] swapper/15      R  running task        0     0      1 0x00000000
[  836.567198] Call trace:
[  836.569632] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.574755] [<          (null)>]           (null)
[  836.579445] Task dump for CPU 16:
[  836.582746] swapper/16      R  running task        0     0      1 0x00000000
[  836.589780] Call trace:
[  836.592214] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.597338] [<          (null)>]           (null)
[  836.602027] Task dump for CPU 17:
[  836.605328] swapper/17      R  running task        0     0      1 0x00000000
[  836.612362] Call trace:
[  836.614796] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.619920] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  836.625391] Task dump for CPU 19:
[  836.628692] swapper/19      R  running task        0     0      1 0x00000000
[  836.635727] Call trace:
[  836.638160] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.643285] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[  836.650059] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[  836.655444] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  836.660915] Task dump for CPU 20:
[  836.664216] swapper/20      R  running task        0     0      1 0x00000000
[  836.671251] Call trace:
[  836.673684] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.678808] [<          (null)>]           (null)
[  836.683497] Task dump for CPU 21:
[  836.686798] swapper/21      R  running task        0     0      1 0x00000000
[  836.693833] Call trace:
[  836.696266] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.701391] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  836.706862] Task dump for CPU 23:
[  836.710162] swapper/23      R  running task        0     0      1 0x00000000
[  836.717197] Call trace:
[  836.719631] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.724754] [<          (null)>]           (null)
[  836.729444] Task dump for CPU 24:
[  836.732745] swapper/24      R  running task        0     0      1 0x00000000
[  836.739779] Call trace:
[  836.742213] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.747337] [<          (null)>]           (null)
[  836.752026] Task dump for CPU 25:
[  836.755327] swapper/25      R  running task        0     0      1 0x00000000
[  836.762362] Call trace:
[  836.764795] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.769919] [<          (null)>]           (null)
[  836.774608] Task dump for CPU 26:
[  836.777909] swapper/26      R  running task        0     0      1 0x00000000
[  836.784944] Call trace:
[  836.787377] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.792501] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  836.797972] Task dump for CPU 27:
[  836.801273] swapper/27      R  running task        0     0      1 0x00000000
[  836.808308] Call trace:
[  836.810741] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.815865] [<          (null)>]           (null)
[  836.820555] Task dump for CPU 28:
[  836.823855] swapper/28      R  running task        0     0      1 0x00000000
[  836.830890] Call trace:
[  836.833323] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.838448] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  836.843918] Task dump for CPU 29:
[  836.847219] swapper/29      R  running task        0     0      1 0x00000000
[  836.854254] Call trace:
[  836.856687] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.861811] [<          (null)>]           (null)
[  836.866500] Task dump for CPU 30:
[  836.869801] swapper/30      R  running task        0     0      1 0x00000000
[  836.876836] Call trace:
[  836.879269] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.884394] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  836.889865] Task dump for CPU 31:
[  836.893166] swapper/31      R  running task        0     0      1 0x00000000
[  836.900200] Call trace:
[  836.902634] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.907758] [<          (null)>]           (null)
[  836.912447] Task dump for CPU 32:
[  836.915749] swapper/32      R  running task        0     0      1 0x00000000
[  836.922784] Call trace:
[  836.925217] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.930341] [<          (null)>]           (null)
[  836.935031] Task dump for CPU 33:
[  836.938332] swapper/33      R  running task        0     0      1 0x00000000
[  836.945367] Call trace:
[  836.947801] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.952925] [<          (null)>]           (null)
[  836.957614] Task dump for CPU 35:
[  836.960916] swapper/35      R  running task        0     0      1 0x00000000
[  836.967951] Call trace:
[  836.970384] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.975508] [<          (null)>]           (null)
[  836.980198] Task dump for CPU 36:
[  836.983499] swapper/36      R  running task        0     0      1 0x00000000
[  836.990534] Call trace:
[  836.992967] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.998091] [<          (null)>]           (null)
[  837.002781] Task dump for CPU 37:
[  837.006082] swapper/37      R  running task        0     0      1 0x00000000
[  837.013117] Call trace:
[  837.015551] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.020675] [<          (null)>]           (null)
[  837.025365] Task dump for CPU 38:
[  837.028666] swapper/38      R  running task        0     0      1 0x00000000
[  837.035701] Call trace:
[  837.038135] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.043259] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  837.048730] Task dump for CPU 39:
[  837.052031] swapper/39      R  running task        0     0      1 0x00000000
[  837.059067] Call trace:
[  837.061500] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.066624] [<          (null)>]           (null)
[  837.071314] Task dump for CPU 40:
[  837.074615] swapper/40      R  running task        0     0      1 0x00000000
[  837.081650] Call trace:
[  837.084083] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.089207] [<          (null)>]           (null)
[  837.093897] Task dump for CPU 41:
[  837.097198] swapper/41      R  running task        0     0      1 0x00000000
[  837.104234] Call trace:
[  837.106667] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.111791] [<          (null)>]           (null)
[  837.116480] Task dump for CPU 42:
[  837.119781] swapper/42      R  running task        0     0      1 0x00000000
[  837.126817] Call trace:
[  837.129250] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.134374] [<          (null)>]           (null)
[  837.139064] Task dump for CPU 43:
[  837.142365] swapper/43      R  running task        0     0      1 0x00000000
[  837.149400] Call trace:
[  837.151833] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.156957] [<          (null)>]           (null)
[  837.161647] Task dump for CPU 44:
[  837.164948] swapper/44      R  running task        0     0      1 0x00000000
[  837.171984] Call trace:
[  837.174417] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.179541] [<          (null)>]           (null)
[  837.184231] Task dump for CPU 45:
[  837.187532] swapper/45      R  running task        0     0      1 0x00000000
[  837.194567] Call trace:
[  837.197000] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.202124] [<          (null)>]           (null)
[  837.206813] Task dump for CPU 46:
[  837.210114] swapper/46      R  running task        0     0      1 0x00000000
[  837.217150] Call trace:
[  837.219583] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.224707] [<          (null)>]           (null)
[  837.229397] Task dump for CPU 47:
[  837.232698] swapper/47      R  running task        0     0      1 0x00000000
[  837.239733] Call trace:
[  837.242166] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.247290] [<          (null)>]           (null)
[  837.251980] Task dump for CPU 48:
[  837.255281] swapper/48      R  running task        0     0      1 0x00000000
[  837.262316] Call trace:
[  837.264749] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.269873] [<          (null)>]           (null)
[  837.274563] Task dump for CPU 49:
[  837.277864] swapper/49      R  running task        0     0      1 0x00000000
[  837.284899] Call trace:
[  837.287333] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.292457] [<          (null)>]           (null)
[  837.297146] Task dump for CPU 50:
[  837.300448] swapper/50      R  running task        0     0      1 0x00000000
[  837.307483] Call trace:
[  837.309916] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.315040] [<          (null)>]           (null)
[  837.319729] Task dump for CPU 51:
[  837.323031] swapper/51      R  running task        0     0      1 0x00000000
[  837.330065] Call trace:
[  837.332499] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.337623] [<          (null)>]           (null)
[  837.342313] Task dump for CPU 52:
[  837.345614] swapper/52      R  running task        0     0      1 0x00000000
[  837.352649] Call trace:
[  837.355083] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.360207] [<          (null)>]           (null)
[  837.364896] Task dump for CPU 53:
[  837.368198] swapper/53      R  running task        0     0      1 0x00000000
[  837.375233] Call trace:
[  837.377666] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.382790] [<          (null)>]           (null)
[  837.387480] Task dump for CPU 54:
[  837.390781] swapper/54      R  running task        0     0      1 0x00000000
[  837.397817] Call trace:
[  837.400250] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.405375] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[  837.412149] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[  837.417534] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  837.423005] Task dump for CPU 55:
[  837.426305] swapper/55      R  running task        0     0      1 0x00000000
[  837.433341] Call trace:
[  837.435774] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.440898] [<          (null)>]           (null)
[  837.445588] Task dump for CPU 56:
[  837.448889] swapper/56      R  running task        0     0      1 0x00000000
[  837.455924] Call trace:
[  837.458358] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.463482] [<          (null)>]           (null)
[  837.468172] Task dump for CPU 57:
[  837.471473] swapper/57      R  running task        0     0      1 0x00000000
[  837.478508] Call trace:
[  837.480942] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.486066] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  837.491536] Task dump for CPU 58:
[  837.494837] swapper/58      R  running task        0     0      1 0x00000000
[  837.501873] Call trace:
[  837.504306] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.509430] [<          (null)>]           (null)
[  837.514120] Task dump for CPU 59:
[  837.517421] swapper/59      R  running task        0     0      1 0x00000000
[  837.524456] Call trace:
[  837.526889] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.532013] [<          (null)>]           (null)
[  837.536703] Task dump for CPU 60:
[  837.540004] swapper/60      R  running task        0     0      1 0x00000000
[  837.547039] Call trace:
[  837.549473] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.554597] [<          (null)>]           (null)
[  837.559287] Task dump for CPU 61:
[  837.562588] swapper/61      R  running task        0     0      1 0x00000000
[  837.569623] Call trace:
[  837.572056] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.577180] [<          (null)>]           (null)
[  837.581870] Task dump for CPU 62:
[  837.585171] swapper/62      R  running task        0     0      1 0x00000000
[  837.592206] Call trace:
[  837.594640] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.599764] [<          (null)>]           (null)
[  837.604453] Task dump for CPU 63:
[  837.607754] swapper/63      R  running task        0     0      1 0x00000000
[  837.614790] Call trace:
[  837.617223] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.622347] [<          (null)>]           (null)
[  837.627039] rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
[  837.636416] rcu_preempt     S    0     9      2 0x00000000
[  837.641888] Call trace:
[  837.644321] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.649448] [<ffff000008a16724>] __schedule+0x1a4/0x720
[  837.654659] [<ffff000008a16ce0>] schedule+0x40/0xa8
[  837.659524] [<ffff000008a1a2f0>] schedule_timeout+0x178/0x358
[  837.665257] [<ffff00000813e694>] rcu_gp_kthread+0x534/0x7b8
[  837.670817] [<ffff0000080f33d0>] kthread+0x108/0x138
[  837.675767] [<ffff0000080836c0>] ret_from_fork+0x10/0x50


And from 8ish seconds before, where I presume the problem would lie.
If more is needed then ping me and I'll get them uploaded somewhere.


          <idle>-0     [000] d.h2   825.003085: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d.h1   825.003085: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=823324002340
          <idle>-0     [000] d.h1   825.003086: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d.s2   825.003087: timer_cancel: timer=ffff8017d3923c08
          <idle>-0     [000] d.s1   825.003087: timer_expire_entry: timer=ffff8017d3923c08 function=delayed_work_timer_fn now=4295098128
          <idle>-0     [000] dns1   825.003089: timer_expire_exit: timer=ffff8017d3923c08
          <idle>-0     [000] dn.2   825.003091: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=823328000000 softexpires=823328000000
     kworker/0:0-3     [000] d..1   825.003093: timer_start: timer=ffff8017d3923c08 function=delayed_work_timer_fn expires=4295098378 [timeout=250] cpu=0 idx=98 flags=I
          <idle>-0     [000] d..1   825.003095: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   825.003095: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   825.003096: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=824316000000 softexpires=824316000000
          <idle>-0     [000] dn.2   825.234230: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   825.234230: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=823556000000 softexpires=823556000000
          <idle>-0     [001] dn.2   825.234235: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [000] d..1   825.234235: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] dn.2   825.234235: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=823556000000 softexpires=823556000000
          <idle>-0     [000] d..2   825.234236: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   825.234236: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=824316000000 softexpires=824316000000
          <idle>-0     [001] d..1   825.234244: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   825.234244: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   825.234245: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=1264347202655 softexpires=1264347202655
          <idle>-0     [000] d.h2   825.995084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [034] d.h2   825.995085: hrtimer_cancel: hrtimer=ffff8017dba91808
          <idle>-0     [034] d.h1   825.995085: hrtimer_expire_entry: hrtimer=ffff8017dba91808 function=tick_sched_timer now=824316002180
          <idle>-0     [000] d.h1   825.995085: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=824316002240
          <idle>-0     [000] d.h1   825.995086: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [034] d.h1   825.995087: hrtimer_expire_exit: hrtimer=ffff8017dba91808
          <idle>-0     [000] d.s2   825.995087: timer_cancel: timer=ffff8017d3923408
          <idle>-0     [000] d.s1   825.995087: timer_expire_entry: timer=ffff8017d3923408 function=delayed_work_timer_fn now=4295098376
          <idle>-0     [034] d.s2   825.995087: timer_cancel: timer=ffff00000909a9a0
          <idle>-0     [034] d.s1   825.995088: timer_expire_entry: timer=ffff00000909a9a0 function=delayed_work_timer_fn now=4295098376
          <idle>-0     [000] dns1   825.995089: timer_expire_exit: timer=ffff8017d3923408
          <idle>-0     [034] dns1   825.995089: timer_expire_exit: timer=ffff00000909a9a0
          <idle>-0     [000] dn.2   825.995091: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=824320000000 softexpires=824320000000
          <idle>-0     [034] dn.2   825.995092: hrtimer_start: hrtimer=ffff8017dba91808 function=tick_sched_timer expires=824320000000 softexpires=824320000000
     kworker/0:0-3     [000] d..1   825.995093: timer_start: timer=ffff8017d3923408 function=delayed_work_timer_fn expires=4295098626 [timeout=250] cpu=0 idx=65 flags=I
          <idle>-0     [000] d..1   825.995096: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   825.995096: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   825.995096: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=824348000000 softexpires=824348000000
    kworker/34:2-1587  [034] d..1   825.995097: timer_start: timer=ffff00000909a9a0 function=delayed_work_timer_fn expires=4295098626 [timeout=250] cpu=34 idx=65 flags=I
          <idle>-0     [034] d..1   825.995100: tick_stop: success=1 dependency=NONE
          <idle>-0     [034] d..2   825.995101: hrtimer_cancel: hrtimer=ffff8017dba91808
          <idle>-0     [034] d..2   825.995101: hrtimer_start: hrtimer=ffff8017dba91808 function=tick_sched_timer expires=825340000000 softexpires=825340000000
          <idle>-0     [000] d.h2   826.027085: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d.h1   826.027085: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=824348002340
          <idle>-0     [000] d.h1   826.027086: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d.s2   826.027087: timer_cancel: timer=ffff8017d3923c08
          <idle>-0     [000] d.s1   826.027087: timer_expire_entry: timer=ffff8017d3923c08 function=delayed_work_timer_fn now=4295098384
          <idle>-0     [000] dns1   826.027088: timer_expire_exit: timer=ffff8017d3923c08
          <idle>-0     [000] dn.2   826.027091: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=824352000000 softexpires=824352000000
     kworker/0:0-3     [000] d..1   826.027093: timer_start: timer=ffff8017d3923c08 function=delayed_work_timer_fn expires=4295098634 [timeout=250] cpu=0 idx=66 flags=I
          <idle>-0     [000] d..1   826.027095: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   826.027095: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   826.027096: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=825340000000 softexpires=825340000000
          <idle>-0     [000] dn.2   826.185331: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   826.185331: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=824508000000 softexpires=824508000000
          <idle>-0     [001] dn.2   826.185336: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [000] d..1   826.185336: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] dn.2   826.185336: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=824508000000 softexpires=824508000000
          <idle>-0     [000] d..2   826.185337: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   826.185337: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=825340000000 softexpires=825340000000
          <idle>-0     [001] d..1   826.185349: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   826.185349: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   826.185350: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=1265299202655 softexpires=1265299202655
          <idle>-0     [000] dn.2   826.382948: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   826.382949: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=824704000000 softexpires=824704000000
          <idle>-0     [001] dn.2   826.382954: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [000] d..1   826.382954: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] dn.2   826.382954: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=824704000000 softexpires=824704000000
          <idle>-0     [000] d..2   826.382955: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   826.382955: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=825340000000 softexpires=825340000000
          <idle>-0     [001] d..1   826.382963: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   826.382963: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   826.382964: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=1265495202655 softexpires=1265495202655
          <idle>-0     [000] dn.2   826.863667: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   826.863667: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=825188000000 softexpires=825188000000
          <idle>-0     [001] dn.2   826.863672: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [000] d..1   826.863672: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] dn.2   826.863672: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=825188000000 softexpires=825188000000
          <idle>-0     [000] d..2   826.863673: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   826.863673: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=825340000000 softexpires=825340000000
          <idle>-0     [001] d..1   826.863680: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   826.863681: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   826.863681: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=1265979202655 softexpires=1265979202655
The point 8 seconds before the issue is detected is somewhere around here.
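
For anyone wanting to slice the full dump down to just that window, something along
these lines should do it (a rough sketch only -- the trace file name and the
detection timestamp below are placeholders, not values taken from this capture):

#!/usr/bin/env python3
# Hypothetical helper, not part of the capture above: keep only the trace
# lines from the WINDOW seconds leading up to the point the stall was reported.
import re
import sys

TRACE_FILE = "trace.txt"   # assumed: the ftrace dump saved to a file
DETECT_TS  = 835.0         # assumed: timestamp (s) at which the stall was reported
WINDOW     = 8.0           # seconds of history to keep, per the note above

ts_re = re.compile(r"\s(\d+\.\d+):")   # matches the "827.019085:" timestamp field

with open(TRACE_FILE) as f:
    for line in f:
        m = ts_re.search(line)
        if m and DETECT_TS - WINDOW <= float(m.group(1)) <= DETECT_TS:
            sys.stdout.write(line)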

          <idle>-0     [000] d.h2   827.019085: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [034] d.h2   827.019085: hrtimer_cancel: hrtimerÿff8017dba91808
          <idle>-0     [034] d.h1   827.019085: hrtimer_expire_entry: hrtimerÿff8017dba91808 function=tick_sched_timer now‚5340002280
          <idle>-0     [000] d.h1   827.019085: hrtimer_expire_entry: hrtimerÿff8017fbe41808 function=tick_sched_timer now‚5340002260
          <idle>-0     [034] d.h1   827.019086: hrtimer_expire_exit: hrtimerÿff8017dba91808
          <idle>-0     [000] d.h1   827.019087: hrtimer_expire_exit: hrtimerÿff8017fbe41808
          <idle>-0     [034] d.s2   827.019087: timer_cancel: timerÿff00000909a9a0
          <idle>-0     [000] d.s2   827.019088: timer_cancel: timerÿff8017d3923408
          <idle>-0     [034] d.s1   827.019088: timer_expire_entry: timerÿff00000909a9a0 functionÞlayed_work_timer_fn nowB95098632
          <idle>-0     [000] d.s1   827.019088: timer_expire_entry: timerÿff8017d3923408 functionÞlayed_work_timer_fn nowB95098632
          <idle>-0     [034] dns1   827.019089: timer_expire_exit: timerÿff00000909a9a0
          <idle>-0     [000] dns1   827.019089: timer_expire_exit: timerÿff8017d3923408
          <idle>-0     [000] dns2   827.019090: timer_cancel: timerÿff0000090295a8
          <idle>-0     [000] dns1   827.019090: timer_expire_entry: timerÿff0000090295a8 functionÞlayed_work_timer_fn nowB95098632
          <idle>-0     [000] dns1   827.019090: timer_expire_exit: timerÿff0000090295a8
          <idle>-0     [034] dn.2   827.019092: hrtimer_start: hrtimerÿff8017dba91808 function=tick_sched_timer expires‚5344000000 softexpires‚5344000000
          <idle>-0     [000] dn.2   827.019093: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚5344000000 softexpires‚5344000000
     kworker/0:0-3     [000] d..1   827.019095: timer_start: timerÿff8017d3923408 functionÞlayed_work_timer_fn expiresB95098882 [timeout%0] cpu=0 idx— flags=I
    kworker/34:2-1587  [034] d..1   827.019098: timer_start: timerÿff00000909a9a0 functionÞlayed_work_timer_fn expiresB95098882 [timeout%0] cpu4 idx— flags=I
          <idle>-0     [034] d..1   827.019101: tick_stop: success=1 dependency=NONE
          <idle>-0     [034] d..2   827.019101: hrtimer_cancel: hrtimerÿff8017dba91808
     kworker/0:0-3     [000] d..1   827.019102: timer_start: timerÿff0000090295a8 functionÞlayed_work_timer_fn expiresB95099000 [timeout68] cpu=0 idx flags=D|I
          <idle>-0     [034] d..2   827.019102: hrtimer_start: hrtimerÿff8017dba91808 function=tick_sched_timer expires‚6364000000 softexpires‚6364000000
          <idle>-0     [000] d..1   827.019104: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   827.019104: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   827.019105: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚5372000000 softexpires‚5372000000
          <idle>-0     [000] d.h2   827.051085: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d.h1   827.051085: hrtimer_expire_entry: hrtimerÿff8017fbe41808 function=tick_sched_timer now‚5372002340
          <idle>-0     [000] d.h1   827.051086: hrtimer_expire_exit: hrtimerÿff8017fbe41808
          <idle>-0     [000] d.s2   827.051087: timer_cancel: timerÿff8017d3923c08
          <idle>-0     [000] d.s1   827.051087: timer_expire_entry: timerÿff8017d3923c08 functionÞlayed_work_timer_fn nowB95098640
          <idle>-0     [000] dns1   827.051088: timer_expire_exit: timerÿff8017d3923c08
          <idle>-0     [000] dn.2   827.051091: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚5376000000 softexpires‚5376000000
     kworker/0:0-3     [000] d..1   827.051093: timer_start: timerÿff8017d3923c08 functionÞlayed_work_timer_fn expiresB95098890 [timeout%0] cpu=0 idx˜ flags=I
          <idle>-0     [000] d..1   827.051095: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   827.051095: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   827.051096: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚6364000000 softexpires‚6364000000
          <idle>-0     [000] dn.2   827.153486: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   827.153486: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚5476000000 softexpires‚5476000000
          <idle>-0     [001] dn.2   827.153491: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [000] d..1   827.153491: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] dn.2   827.153491: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚5476000000 softexpires‚5476000000
          <idle>-0     [000] d..2   827.153491: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   827.153492: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚6364000000 softexpires‚6364000000
          <idle>-0     [001] d..1   827.153499: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   827.153500: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d..2   827.153500: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires\x1266267202655 softexpires\x1266267202655
          <idle>-0     [000] dn.2   827.463397: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   827.463398: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚5788000000 softexpires‚5788000000
          <idle>-0     [001] dn.2   827.463402: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [000] d..1   827.463403: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] dn.2   827.463403: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚5788000000 softexpires‚5788000000
          <idle>-0     [000] d..2   827.463403: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   827.463403: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚6364000000 softexpires‚6364000000
          <idle>-0     [001] d..1   827.463411: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   827.463411: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d..2   827.463412: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires\x1266579202655 softexpires\x1266579202655
          <idle>-0     [000] dn.2   827.563461: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   827.563462: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚5888000000 softexpires‚5888000000
          <idle>-0     [001] dn.2   827.563466: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [000] d..1   827.563467: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] dn.2   827.563467: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚5888000000 softexpires‚5888000000
          <idle>-0     [000] d..2   827.563467: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   827.563468: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚6364000000 softexpires‚6364000000
              sh-1993  [001] ....   827.563474: timer_init: timerÿff8017db00fb40
              sh-1993  [001] d..1   827.563475: timer_start: timerÿff8017db00fb40 function=process_timeout expiresB95098770 [timeout=2] cpu=1 idx=0 flags=
          <idle>-0     [001] d..1   827.563477: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   827.563478: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d..2   827.563478: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚5896000000 softexpires‚5896000000
          <idle>-0     [001] d.h2   827.575084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d.h1   827.575084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚5896001620
          <idle>-0     [001] d.h1   827.575086: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d.s2   827.575087: timer_cancel: timerÿff8017db00fb40
          <idle>-0     [001] ..s1   827.575087: timer_expire_entry: timerÿff8017db00fb40 function=process_timeout nowB95098771
          <idle>-0     [001] .ns1   827.575089: timer_expire_exit: timerÿff8017db00fb40
          <idle>-0     [001] dns2   827.575089: timer_cancel: timerÿff8017fbe5e558
          <idle>-0     [001] dns1   827.575089: timer_expire_entry: timerÿff8017fbe5e558 functionÞlayed_work_timer_fn nowB95098771
          <idle>-0     [001] dns1   827.575091: timer_expire_exit: timerÿff8017fbe5e558
          <idle>-0     [001] dn.2   827.575094: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚5900000000 softexpires‚5900000000
              sh-1993  [001] ...1   827.575136: hrtimer_init: hrtimerÿff8017d4f1e8a0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
              sh-1993  [001] ...1   827.575137: hrtimer_init: hrtimerÿff8017d4f1e8e0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
              sh-1993  [001] ....   827.575145: hrtimer_init: hrtimerÿff80176f273888 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
          <idle>-0     [018] dn.2   827.575215: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [018] dn.2   827.575216: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚5900000000 softexpires‚5900000000
          <idle>-0     [001] d..1   827.575297: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   827.575298: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d..2   827.575299: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚5916000000 softexpires‚5916000000
              ps-2012  [018] ....   827.576609: timer_init: timerÿff801764b98188
              ps-2012  [018] ....   827.576614: timer_init: timerÿff801764b98608
              ps-2012  [018] ....   827.576633: timer_init: timerÿff801764b98188
              ps-2012  [018] ....   827.576635: timer_init: timerÿff801764b98608
              ps-2012  [018] d.h2   827.579085: hrtimer_cancel: hrtimerÿff8017fbc74808
              ps-2012  [018] d.h1   827.579086: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚5900002120
              ps-2012  [018] d.h1   827.579093: hrtimer_expire_exit: hrtimerÿff8017fbc74808
              ps-2012  [018] d.h2   827.579094: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚5904000000 softexpires‚5904000000
          <idle>-0     [019] dn.2   827.579107: hrtimer_start: hrtimerÿff8017fbc8f808 function=tick_sched_timer expires‚5904000000 softexpires‚5904000000
     rcu_preempt-9     [019] ....   827.579114: timer_init: timerÿff8017d5fc7da0
     rcu_preempt-9     [019] d..1   827.579115: timer_start: timerÿff8017d5fc7da0 function=process_timeout expiresB95098773 [timeout=1] cpu\x19 idx=0 flags=
          <idle>-0     [019] d..1   827.579119: tick_stop: success=1 dependency=NONE
          <idle>-0     [019] d..2   827.579119: hrtimer_cancel: hrtimerÿff8017fbc8f808
          <idle>-0     [019] d..2   827.579119: hrtimer_start: hrtimerÿff8017fbc8f808 function=tick_sched_timer expires„0668000000 softexpires„0668000000
          <idle>-0     [018] d..1   827.580189: tick_stop: success=1 dependency=NONE
          <idle>-0     [018] d..2   827.580190: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [018] d..2   827.580191: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚5916000000 softexpires‚5916000000
          <idle>-0     [018] d.h2   827.595084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.595084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.595085: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚5916001860
          <idle>-0     [001] d.h1   827.595085: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚5916001960
          <idle>-0     [001] d.h1   827.595086: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.595087: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [018] d..2   827.595089: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚5932000000 softexpires‚5932000000
          <idle>-0     [001] d..2   827.595090: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚5932000000 softexpires‚5932000000
          <idle>-0     [018] d.h2   827.611084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.611084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.611084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚5932001300
          <idle>-0     [001] d.h1   827.611084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚5932001520
          <idle>-0     [018] d.h1   827.611085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.611085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d..2   827.611088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚5948000000 softexpires‚5948000000
          <idle>-0     [018] d..2   827.611090: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚5948000000 softexpires‚5948000000
          <idle>-0     [018] d.h2   827.627084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.627084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.627084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚5948001320
          <idle>-0     [001] d.h1   827.627084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚5948001560
          <idle>-0     [001] d.h1   827.627085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.627086: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [018] d..2   827.627087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚5964000000 softexpires‚5964000000
          <idle>-0     [001] d..2   827.627088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚5964000000 softexpires‚5964000000
          <idle>-0     [018] d.h2   827.643084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.643084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.643084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚5964001260
          <idle>-0     [001] d.h1   827.643084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚5964001440
          <idle>-0     [018] d.h1   827.643085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.643085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d..2   827.643087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚5980000000 softexpires‚5980000000
          <idle>-0     [018] d..2   827.643088: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚5980000000 softexpires‚5980000000
          <idle>-0     [018] d.h2   827.659084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.659084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.659084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚5980001300
          <idle>-0     [001] d.h1   827.659084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚5980001540
          <idle>-0     [018] d.h1   827.659085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.659085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   827.659086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚5996000000 softexpires‚5996000000
          <idle>-0     [001] d..2   827.659088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚5996000000 softexpires‚5996000000
          <idle>-0     [018] d.h2   827.675084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.675084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.675084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚5996001320
          <idle>-0     [001] d.h1   827.675084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚5996001520
          <idle>-0     [018] d.h1   827.675085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.675085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d..2   827.675087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6012000000 softexpires‚6012000000
          <idle>-0     [018] d..2   827.675088: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6012000000 softexpires‚6012000000
          <idle>-0     [018] d.h2   827.691084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.691084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.691084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6012001280
          <idle>-0     [001] d.h1   827.691084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6012001480
          <idle>-0     [018] d.h1   827.691085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.691085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   827.691086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6028000000 softexpires‚6028000000
          <idle>-0     [001] d..2   827.691088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6028000000 softexpires‚6028000000
          <idle>-0     [018] d.h2   827.707084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.707084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.707084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6028001300
          <idle>-0     [001] d.h1   827.707084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6028001520
          <idle>-0     [018] d.h1   827.707085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.707085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d..2   827.707087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6044000000 softexpires‚6044000000
          <idle>-0     [018] d..2   827.707087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6044000000 softexpires‚6044000000
          <idle>-0     [018] d.h2   827.723084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [018] d.h1   827.723084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6044001280
          <idle>-0     [001] d.h2   827.723084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d.h1   827.723085: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6044001480
          <idle>-0     [018] d.h1   827.723085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.723086: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   827.723086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6060000000 softexpires‚6060000000
          <idle>-0     [001] d..2   827.723088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6060000000 softexpires‚6060000000
          <idle>-0     [018] d.h2   827.739084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.739084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.739084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6060001280
          <idle>-0     [001] d.h1   827.739084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6060001460
          <idle>-0     [018] d.h1   827.739085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.739085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   827.739087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6076000000 softexpires‚6076000000
          <idle>-0     [001] d..2   827.739087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6076000000 softexpires‚6076000000
          <idle>-0     [018] d.h2   827.755084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.755084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.755084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6076001280
          <idle>-0     [001] d.h1   827.755084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6076001460
          <idle>-0     [018] d.h1   827.755085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.755085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   827.755086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6092000000 softexpires‚6092000000
          <idle>-0     [001] d..2   827.755088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6092000000 softexpires‚6092000000
          <idle>-0     [018] d.h2   827.771084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.771084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.771084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6092001240
          <idle>-0     [001] d.h1   827.771084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6092001460
          <idle>-0     [018] d.h1   827.771085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.771085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   827.771087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6108000000 softexpires‚6108000000
          <idle>-0     [001] d..2   827.771087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6108000000 softexpires‚6108000000
          <idle>-0     [018] d.h2   827.787084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.787084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.787084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6108001280
          <idle>-0     [001] d.h1   827.787084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6108001500
          <idle>-0     [018] d.h1   827.787085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.787085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   827.787086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6124000000 softexpires‚6124000000
          <idle>-0     [001] d..2   827.787087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6124000000 softexpires‚6124000000
          <idle>-0     [018] d.h2   827.803084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.803084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.803084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6124001280
          <idle>-0     [001] d.h1   827.803084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6124001480
          <idle>-0     [018] d.h1   827.803085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.803085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d..2   827.803087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6140000000 softexpires‚6140000000
          <idle>-0     [018] d..2   827.803087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6140000000 softexpires‚6140000000
          <idle>-0     [018] d.h2   827.819084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.819084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.819084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6140001280
          <idle>-0     [001] d.h1   827.819084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6140001480
          <idle>-0     [018] d.h1   827.819085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.819085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   827.819086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6156000000 softexpires‚6156000000
          <idle>-0     [001] d..2   827.819088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6156000000 softexpires‚6156000000
          <idle>-0     [018] d.h2   827.835084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.835084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.835084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6156001260
          <idle>-0     [001] d.h1   827.835084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6156001460
          <idle>-0     [018] d.h1   827.835085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.835085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d..2   827.835087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6172000000 softexpires‚6172000000
          <idle>-0     [018] d..2   827.835087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6172000000 softexpires‚6172000000
          <idle>-0     [018] d.h2   827.851084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.851084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.851084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6172001260
          <idle>-0     [001] d.h1   827.851084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6172001440
          <idle>-0     [018] d.h1   827.851085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.851085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   827.851086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6188000000 softexpires‚6188000000
          <idle>-0     [001] d..2   827.851087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6188000000 softexpires‚6188000000
          <idle>-0     [018] d.h2   827.867084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.867084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.867084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6188001280
          <idle>-0     [001] d.h1   827.867084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6188001500
          <idle>-0     [018] d.h1   827.867085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.867085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   827.867087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6204000000 softexpires‚6204000000
          <idle>-0     [001] d..2   827.867087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6204000000 softexpires‚6204000000
          <idle>-0     [018] d.h2   827.883084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.883084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.883084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6204001260
          <idle>-0     [001] d.h1   827.883084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6204001460
          <idle>-0     [018] d.h1   827.883085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.883085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   827.883086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6220000000 softexpires‚6220000000
          <idle>-0     [001] d..2   827.883088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6220000000 softexpires‚6220000000
          <idle>-0     [018] d.h2   827.899084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.899084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.899084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6220001280
          <idle>-0     [001] d.h1   827.899084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6220001440
          <idle>-0     [018] d.h1   827.899085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.899085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   827.899087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6236000000 softexpires‚6236000000
          <idle>-0     [001] d..2   827.899087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6236000000 softexpires‚6236000000
          <idle>-0     [000] dn.2   827.913426: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   827.913427: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚6236000000 softexpires‚6236000000
          <idle>-0     [018] d.h2   827.915084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.915084: hrtimer_cancel: hrtimerÿff8017fbe5c808
              ps-2012  [000] d.h2   827.915084: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [018] d.h1   827.915084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6236001200
          <idle>-0     [001] d.h1   827.915084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6236001400
              ps-2012  [000] d.h1   827.915085: hrtimer_expire_entry: hrtimerÿff8017fbe41808 function=tick_sched_timer now‚6236001280
          <idle>-0     [018] d.h1   827.915085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.915085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   827.915086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6252000000 softexpires‚6252000000
              ps-2012  [000] d.h1   827.915088: hrtimer_expire_exit: hrtimerÿff8017fbe41808
              ps-2012  [000] d.h2   827.915088: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚6240000000 softexpires‚6240000000
          <idle>-0     [001] d..2   827.915090: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6252000000 softexpires‚6252000000
          <idle>-0     [000] d..1   827.916071: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   827.916072: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   827.916073: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚6252000000 softexpires‚6252000000
          <idle>-0     [000] d..2   827.920194: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   827.920195: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚6268000000 softexpires‚6268000000
          <idle>-0     [018] d.h2   827.931084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.931084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.931084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6252001280
          <idle>-0     [001] d.h1   827.931084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6252001500
          <idle>-0     [018] d.h1   827.931085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.931085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d..2   827.931088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6268000000 softexpires‚6268000000
          <idle>-0     [018] d..2   827.931088: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6268000000 softexpires‚6268000000
          <idle>-0     [000] d..2   827.935471: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   827.935471: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚6284000000 softexpires‚6284000000
          <idle>-0     [018] d.h2   827.947084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.947084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.947084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6268001260
          <idle>-0     [001] d.h1   827.947084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6268001480
          <idle>-0     [018] d.h1   827.947085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.947085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   827.947086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6284000000 softexpires‚6284000000
          <idle>-0     [001] d..2   827.947088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6284000000 softexpires‚6284000000
          <idle>-0     [000] d..2   827.952136: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   827.952137: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚6300000000 softexpires‚6300000000
          <idle>-0     [005] d..2   827.960777: hrtimer_cancel: hrtimerÿff8017fbec8808
          <idle>-0     [005] d..2   827.960779: hrtimer_start: hrtimerÿff8017fbec8808 function=tick_sched_timer expires\x1267075202655 softexpires\x1267075202655
          <idle>-0     [062] d..2   827.960781: hrtimer_cancel: hrtimerÿff8017dba25808
          <idle>-0     [062] d..2   827.960783: hrtimer_start: hrtimerÿff8017dba25808 function=tick_sched_timer expires\x1267075202655 softexpires\x1267075202655
          <idle>-0     [018] d.h2   827.963085: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.963085: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.963085: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6284002420
          <idle>-0     [001] d.h1   827.963085: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6284002560
          <idle>-0     [018] d.h1   827.963086: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.963087: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   827.963089: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6300000000 softexpires‚6300000000
          <idle>-0     [001] d..2   827.963089: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6300000000 softexpires‚6300000000
          <idle>-0     [000] d..2   827.967413: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   827.967414: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚6316000000 softexpires‚6316000000
          <idle>-0     [004] d..2   827.968778: hrtimer_cancel: hrtimerÿff8017fbead808
          <idle>-0     [004] d..2   827.968779: hrtimer_start: hrtimerÿff8017fbead808 function=tick_sched_timer expires\x1267083202655 softexpires\x1267083202655
          <idle>-0     [036] d..2   827.969900: hrtimer_cancel: hrtimerÿff8017dbac7808
          <idle>-0     [036] d..2   827.969902: hrtimer_start: hrtimerÿff8017dbac7808 function=tick_sched_timer expires\x1267083202655 softexpires\x1267083202655
          <idle>-0     [018] d.h2   827.979084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.979084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.979084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6300001260
          <idle>-0     [001] d.h1   827.979084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6300001480
          <idle>-0     [018] d.h1   827.979085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.979085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   827.979087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6316000000 softexpires‚6316000000
          <idle>-0     [001] d..2   827.979088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6316000000 softexpires‚6316000000
          <idle>-0     [039] d..2   827.980320: hrtimer_cancel: hrtimerÿff8017dbb18808
          <idle>-0     [039] d..2   827.980321: hrtimer_start: hrtimerÿff8017dbb18808 function=tick_sched_timer expires\x1267095202655 softexpires\x1267095202655
          <idle>-0     [000] d..2   827.984080: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   827.984080: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚6332000000 softexpires‚6332000000
          <idle>-0     [018] d.h2   827.995084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   827.995084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   827.995084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6316001280
          <idle>-0     [001] d.h1   827.995084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6316001460
          <idle>-0     [018] d.h1   827.995085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   827.995085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   827.995087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6332000000 softexpires‚6332000000
          <idle>-0     [001] d..2   827.995088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6332000000 softexpires‚6332000000
          <idle>-0     [000] d..2   827.999356: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   827.999356: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚6348000000 softexpires‚6348000000
          <idle>-0     [018] d.h2   828.011084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   828.011084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d.h1   828.011084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6332001500
          <idle>-0     [018] d.h1   828.011084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6332001300
          <idle>-0     [001] d.h1   828.011085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   828.011085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [018] d..2   828.011087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6348000000 softexpires‚6348000000
          <idle>-0     [001] d..2   828.011088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6348000000 softexpires‚6348000000
          <idle>-0     [000] d..2   828.016021: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   828.016022: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚6364000000 softexpires‚6364000000
          <idle>-0     [018] d.h2   828.027084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   828.027084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   828.027084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6348001300
          <idle>-0     [001] d.h1   828.027084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6348001520
          <idle>-0     [018] d.h1   828.027085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   828.027085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d..2   828.027087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6364000000 softexpires‚6364000000
          <idle>-0     [018] d..2   828.027087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6364000000 softexpires‚6364000000
          <idle>-0     [000] d.h2   828.043085: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [018] d.h2   828.043085: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   828.043085: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   828.043085: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6364002620
          <idle>-0     [034] d.h2   828.043085: hrtimer_cancel: hrtimerÿff8017dba91808
          <idle>-0     [001] d.h1   828.043085: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6364002780
          <idle>-0     [000] d.h1   828.043086: hrtimer_expire_entry: hrtimerÿff8017fbe41808 function=tick_sched_timer now‚6364002620
          <idle>-0     [034] d.h1   828.043086: hrtimer_expire_entry: hrtimerÿff8017dba91808 function=tick_sched_timer now‚6364002780
          <idle>-0     [018] d.h1   828.043086: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   828.043087: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [000] d.h1   828.043087: hrtimer_expire_exit: hrtimerÿff8017fbe41808
          <idle>-0     [018] d..2   828.043088: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6380000000 softexpires‚6380000000
          <idle>-0     [034] d.h1   828.043088: hrtimer_expire_exit: hrtimerÿff8017dba91808
          <idle>-0     [000] d.s2   828.043088: timer_cancel: timerÿff8017d3923408
          <idle>-0     [000] d.s1   828.043088: timer_expire_entry: timerÿff8017d3923408 functionÞlayed_work_timer_fn nowB95098888
          <idle>-0     [034] d.s2   828.043089: timer_cancel: timerÿff00000909a9a0
          <idle>-0     [034] d.s1   828.043089: timer_expire_entry: timerÿff00000909a9a0 functionÞlayed_work_timer_fn nowB95098888
          <idle>-0     [001] d..2   828.043089: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6380000000 softexpires‚6380000000
          <idle>-0     [034] dns1   828.043090: timer_expire_exit: timerÿff00000909a9a0
          <idle>-0     [000] dns1   828.043090: timer_expire_exit: timerÿff8017d3923408
          <idle>-0     [034] dn.2   828.043094: hrtimer_start: hrtimerÿff8017dba91808 function=tick_sched_timer expires‚6368000000 softexpires‚6368000000
          <idle>-0     [000] dn.2   828.043095: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚6368000000 softexpires‚6368000000
     kworker/0:0-3     [000] d..1   828.043098: timer_start: timerÿff8017d3923408 functionÞlayed_work_timer_fn expiresB95099138 [timeout%0] cpu=0 idxe flags=I
          <idle>-0     [000] d..1   828.043101: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   828.043101: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   828.043101: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚6380000000 softexpires‚6380000000
    kworker/34:2-1587  [034] d..1   828.043102: timer_start: timerÿff00000909a9a0 functionÞlayed_work_timer_fn expiresB95099138 [timeout%0] cpu4 idxe flags=I
          <idle>-0     [034] d..1   828.043104: tick_stop: success=1 dependency=NONE
          <idle>-0     [034] d..2   828.043105: hrtimer_cancel: hrtimerÿff8017dba91808
          <idle>-0     [034] d..2   828.043105: hrtimer_start: hrtimerÿff8017dba91808 function=tick_sched_timer expires‚7388000000 softexpires‚7388000000
          <idle>-0     [000] d..2   828.047965: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   828.047965: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚6396000000 softexpires‚6396000000
          <idle>-0     [018] d.h2   828.059084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   828.059084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   828.059084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6380001280
          <idle>-0     [001] d.h1   828.059084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6380001500
          <idle>-0     [018] d.h1   828.059085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   828.059085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   828.059087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6396000000 softexpires‚6396000000
          <idle>-0     [001] d..2   828.059088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6396000000 softexpires‚6396000000
          <idle>-0     [000] d.h2   828.075084: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [018] d.h2   828.075084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   828.075084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   828.075084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6396001280
          <idle>-0     [000] d.h1   828.075084: hrtimer_expire_entry: hrtimerÿff8017fbe41808 function=tick_sched_timer now‚6396001280
          <idle>-0     [001] d.h1   828.075084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6396001480
          <idle>-0     [018] d.h1   828.075085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   828.075085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [000] d.h1   828.075085: hrtimer_expire_exit: hrtimerÿff8017fbe41808
          <idle>-0     [000] d.s2   828.075086: timer_cancel: timerÿff8017d3923c08
          <idle>-0     [018] d..2   828.075086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6412000000 softexpires‚6412000000
          <idle>-0     [000] d.s1   828.075086: timer_expire_entry: timerÿff8017d3923c08 functionÞlayed_work_timer_fn nowB95098896
          <idle>-0     [000] dns1   828.075088: timer_expire_exit: timerÿff8017d3923c08
          <idle>-0     [001] d..2   828.075088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6412000000 softexpires‚6412000000
          <idle>-0     [000] dn.2   828.075091: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚6400000000 softexpires‚6400000000
     kworker/0:0-3     [000] d..1   828.075093: timer_start: timerÿff8017d3923c08 functionÞlayed_work_timer_fn expiresB95099146 [timeout%0] cpu=0 idxf flags=I
          <idle>-0     [000] d..1   828.075095: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   828.075095: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   828.075096: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚6412000000 softexpires‚6412000000
          <idle>-0     [000] d..2   828.079906: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   828.079907: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚6428000000 softexpires‚6428000000
          <idle>-0     [018] d.h2   828.091084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   828.091084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   828.091084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚6412001280
          <idle>-0     [001] d.h1   828.091084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚6412001460
          <idle>-0     [018] d.h1   828.091085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   828.091085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   828.091087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚6428000000 softexpires‚6428000000
          <idle>-0     [001] d..2   828.091087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚6428000000 softexpires‚6428000000
          <idle>-0     [000] d..2   828.095183: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.095183: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826444000000 softexpires=826444000000
          <idle>-0     [018] d.h2   828.107084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.107084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.107084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826428001260
          <idle>-0     [001] d.h1   828.107084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826428001480
          <idle>-0     [018] d.h1   828.107085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.107085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.107086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826444000000 softexpires=826444000000
          <idle>-0     [001] d..2   828.107088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826444000000 softexpires=826444000000
          <idle>-0     [000] d..2   828.111848: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.111849: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826460000000 softexpires=826460000000
          <idle>-0     [018] d.h2   828.123084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.123084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.123084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826444001280
          <idle>-0     [001] d.h1   828.123084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826444001460
          <idle>-0     [018] d.h1   828.123085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.123085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.123087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826460000000 softexpires=826460000000
          <idle>-0     [001] d..2   828.123087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826460000000 softexpires=826460000000
          <idle>-0     [000] d..2   828.127125: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.127126: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826476000000 softexpires=826476000000
          <idle>-0     [018] d.h2   828.139084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.139084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.139084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826460001260
          <idle>-0     [001] d.h1   828.139084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826460001460
          <idle>-0     [018] d.h1   828.139085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.139085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.139086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826476000000 softexpires=826476000000
          <idle>-0     [001] d..2   828.139088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826476000000 softexpires=826476000000
          <idle>-0     [000] d..2   828.143791: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.143791: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826492000000 softexpires=826492000000
          <idle>-0     [018] d.h2   828.155084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.155084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.155084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826476001280
          <idle>-0     [001] d.h1   828.155084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826476001480
          <idle>-0     [018] d.h1   828.155085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.155085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   828.155087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826492000000 softexpires=826492000000
          <idle>-0     [018] d..2   828.155087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826492000000 softexpires=826492000000
          <idle>-0     [000] d..2   828.160457: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.160457: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826508000000 softexpires=826508000000
          <idle>-0     [018] d.h2   828.171084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.171084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.171084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826492001300
          <idle>-0     [001] d.h1   828.171084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826492001500
          <idle>-0     [018] d.h1   828.171085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.171085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.171086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826508000000 softexpires=826508000000
          <idle>-0     [001] d..2   828.171087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826508000000 softexpires=826508000000
          <idle>-0     [000] d..2   828.175733: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.175734: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826524000000 softexpires=826524000000
          <idle>-0     [018] d.h2   828.187084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.187084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.187084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826508001300
          <idle>-0     [001] d.h1   828.187084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826508001480
          <idle>-0     [018] d.h1   828.187085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.187085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.187087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826524000000 softexpires=826524000000
          <idle>-0     [001] d..2   828.187087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826524000000 softexpires=826524000000
          <idle>-0     [000] d..2   828.192399: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.192399: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826540000000 softexpires=826540000000
          <idle>-0     [018] d.h2   828.203084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.203084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.203084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826524001260
          <idle>-0     [001] d.h1   828.203084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826524001460
          <idle>-0     [018] d.h1   828.203085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.203085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.203086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826540000000 softexpires=826540000000
          <idle>-0     [001] d..2   828.203087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826540000000 softexpires=826540000000
          <idle>-0     [000] d..2   828.207676: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.207676: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826556000000 softexpires=826556000000
          <idle>-0     [018] d.h2   828.219084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.219084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.219084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826540001260
          <idle>-0     [001] d.h1   828.219084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826540001420
          <idle>-0     [018] d.h1   828.219085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.219085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.219087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826556000000 softexpires=826556000000
          <idle>-0     [001] d..2   828.219087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826556000000 softexpires=826556000000
          <idle>-0     [000] d..2   828.224341: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.224342: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826572000000 softexpires=826572000000
          <idle>-0     [018] d.h2   828.235084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.235084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.235084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826556001260
          <idle>-0     [001] d.h1   828.235084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826556001440
          <idle>-0     [018] d.h1   828.235085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.235085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.235086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826572000000 softexpires=826572000000
          <idle>-0     [001] d..2   828.235087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826572000000 softexpires=826572000000
          <idle>-0     [000] d..2   828.239618: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.239619: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826588000000 softexpires=826588000000
          <idle>-0     [000] dn.2   828.249341: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   828.249342: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826572000000 softexpires=826572000000
          <idle>-0     [018] d.h2   828.251084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.251084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
              ps-2012  [000] d.h2   828.251084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d.h1   828.251084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826572001200
          <idle>-0     [001] d.h1   828.251084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826572001420
              ps-2012  [000] d.h1   828.251085: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=826572001280
          <idle>-0     [018] d.h1   828.251085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.251085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.251087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826588000000 softexpires=826588000000
              ps-2012  [000] d.h1   828.251088: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
              ps-2012  [000] d.h2   828.251088: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826576000000 softexpires=826576000000
          <idle>-0     [001] d..2   828.251093: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826588000000 softexpires=826588000000
          <idle>-0     [000] d..1   828.251937: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   828.251937: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.251938: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826588000000 softexpires=826588000000
          <idle>-0     [000] d..2   828.256023: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.256024: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826604000000 softexpires=826604000000
          <idle>-0     [018] d.h2   828.267084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [018] d.h1   828.267084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826588001240
          <idle>-0     [001] d.h2   828.267084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d.h1   828.267085: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826588001440
          <idle>-0     [018] d.h1   828.267085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.267086: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.267086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826604000000 softexpires=826604000000
          <idle>-0     [001] d..2   828.267088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826604000000 softexpires=826604000000
          <idle>-0     [000] d..2   828.271300: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.271301: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826620000000 softexpires=826620000000
          <idle>-0     [018] d.h2   828.283084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.283084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.283084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826604001240
          <idle>-0     [001] d.h1   828.283084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826604001420
          <idle>-0     [018] d.h1   828.283085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.283085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.283087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826620000000 softexpires=826620000000
          <idle>-0     [001] d..2   828.283087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826620000000 softexpires=826620000000
          <idle>-0     [000] d..2   828.287966: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.287967: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826636000000 softexpires=826636000000
          <idle>-0     [018] d.h2   828.299084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.299084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.299084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826620001260
          <idle>-0     [001] d.h1   828.299084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826620001460
          <idle>-0     [018] d.h1   828.299085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.299085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.299086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826636000000 softexpires=826636000000
          <idle>-0     [001] d..2   828.299087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826636000000 softexpires=826636000000
          <idle>-0     [000] d..2   828.303242: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.303243: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826652000000 softexpires=826652000000
          <idle>-0     [018] d.h2   828.315084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.315084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.315084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826636001260
          <idle>-0     [001] d.h1   828.315084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826636001460
          <idle>-0     [018] d.h1   828.315085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.315085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.315087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826652000000 softexpires=826652000000
          <idle>-0     [001] d..2   828.315087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826652000000 softexpires=826652000000
          <idle>-0     [000] d..2   828.319908: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.319909: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826668000000 softexpires=826668000000
          <idle>-0     [018] d.h2   828.331084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.331084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.331084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826652001240
          <idle>-0     [001] d.h1   828.331084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826652001440
          <idle>-0     [018] d.h1   828.331085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.331085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.331086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826668000000 softexpires=826668000000
          <idle>-0     [001] d..2   828.331087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826668000000 softexpires=826668000000
          <idle>-0     [000] d..2   828.335185: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.335185: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826684000000 softexpires=826684000000
          <idle>-0     [018] d.h2   828.347084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.347084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.347084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826668001260
          <idle>-0     [001] d.h1   828.347084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826668001460
          <idle>-0     [018] d.h1   828.347085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.347085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.347087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826684000000 softexpires=826684000000
          <idle>-0     [001] d..2   828.347087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826684000000 softexpires=826684000000
          <idle>-0     [000] d..2   828.351850: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.351851: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826700000000 softexpires=826700000000
          <idle>-0     [018] d.h2   828.363084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.363084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.363084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826684001260
          <idle>-0     [001] d.h1   828.363084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826684001460
          <idle>-0     [018] d.h1   828.363085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.363085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.363086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826700000000 softexpires=826700000000
          <idle>-0     [001] d..2   828.363087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826700000000 softexpires=826700000000
          <idle>-0     [000] d..2   828.367127: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.367128: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826716000000 softexpires=826716000000
          <idle>-0     [018] d.h2   828.379084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.379084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.379084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826700001260
          <idle>-0     [001] d.h1   828.379084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826700001440
          <idle>-0     [018] d.h1   828.379085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.379085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   828.379087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826716000000 softexpires=826716000000
          <idle>-0     [018] d..2   828.379087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826716000000 softexpires=826716000000
          <idle>-0     [000] d..2   828.383793: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.383794: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826732000000 softexpires=826732000000
          <idle>-0     [018] d.h2   828.395084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.395084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.395084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826716001260
          <idle>-0     [001] d.h1   828.395084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826716001480
          <idle>-0     [018] d.h1   828.395085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.395085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.395086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826732000000 softexpires=826732000000
          <idle>-0     [001] d..2   828.395087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826732000000 softexpires=826732000000
          <idle>-0     [000] d..2   828.400458: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.400459: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826748000000 softexpires=826748000000
          <idle>-0     [018] d.h2   828.411084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.411084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.411084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826732001300
          <idle>-0     [001] d.h1   828.411084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826732001500
          <idle>-0     [018] d.h1   828.411085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.411085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   828.411087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826748000000 softexpires=826748000000
          <idle>-0     [018] d..2   828.411087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826748000000 softexpires=826748000000
          <idle>-0     [000] d..2   828.415735: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.415736: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826764000000 softexpires=826764000000
          <idle>-0     [018] d.h2   828.427084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.427084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.427084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826748001320
          <idle>-0     [001] d.h1   828.427084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826748001520
          <idle>-0     [018] d.h1   828.427085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.427085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.427086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826764000000 softexpires=826764000000
          <idle>-0     [001] d..2   828.427087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826764000000 softexpires=826764000000
          <idle>-0     [000] d..2   828.432401: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.432401: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826780000000 softexpires=826780000000
          <idle>-0     [018] d.h2   828.443084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.443084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.443084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826764001260
          <idle>-0     [001] d.h1   828.443084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826764001480
          <idle>-0     [018] d.h1   828.443085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.443085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.443087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826780000000 softexpires=826780000000
          <idle>-0     [001] d..2   828.443087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826780000000 softexpires=826780000000
          <idle>-0     [000] d..2   828.447678: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.447678: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826796000000 softexpires=826796000000
          <idle>-0     [018] d.h2   828.459084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.459084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.459084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826780001260
          <idle>-0     [001] d.h1   828.459084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826780001460
          <idle>-0     [018] d.h1   828.459085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.459085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.459086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826796000000 softexpires=826796000000
          <idle>-0     [001] d..2   828.459087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826796000000 softexpires=826796000000
          <idle>-0     [000] d..2   828.464343: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.464344: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826812000000 softexpires=826812000000
          <idle>-0     [018] d.h2   828.475084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.475084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.475084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826796001240
          <idle>-0     [001] d.h1   828.475084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826796001460
          <idle>-0     [018] d.h1   828.475085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.475085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.475087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826812000000 softexpires=826812000000
          <idle>-0     [001] d..2   828.475087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826812000000 softexpires=826812000000
          <idle>-0     [000] d..2   828.479620: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.479621: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826828000000 softexpires=826828000000
          <idle>-0     [018] d.h2   828.491084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.491084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.491084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826812001260
          <idle>-0     [001] d.h1   828.491084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826812001460
          <idle>-0     [018] d.h1   828.491085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.491085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.491086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826828000000 softexpires=826828000000
          <idle>-0     [001] d..2   828.491087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826828000000 softexpires=826828000000
          <idle>-0     [000] d..2   828.496286: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.496286: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826844000000 softexpires=826844000000
          <idle>-0     [018] d.h2   828.507084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.507084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.507084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826828001260
          <idle>-0     [001] d.h1   828.507084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826828001480
          <idle>-0     [018] d.h1   828.507085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.507085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   828.507087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826844000000 softexpires=826844000000
          <idle>-0     [018] d..2   828.507087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826844000000 softexpires=826844000000
          <idle>-0     [000] d..2   828.511562: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.511563: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826860000000 softexpires=826860000000
          <idle>-0     [018] d.h2   828.523084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.523084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.523084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826844001300
          <idle>-0     [001] d.h1   828.523084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826844001480
          <idle>-0     [018] d.h1   828.523085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.523085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.523087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826860000000 softexpires=826860000000
          <idle>-0     [001] d..2   828.523087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826860000000 softexpires=826860000000
          <idle>-0     [000] d..2   828.528228: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.528229: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826876000000 softexpires=826876000000
          <idle>-0     [018] d.h2   828.539084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.539084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.539084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826860001260
          <idle>-0     [001] d.h1   828.539084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826860001460
          <idle>-0     [018] d.h1   828.539085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.539085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.539087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826876000000 softexpires=826876000000
          <idle>-0     [001] d..2   828.539087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826876000000 softexpires=826876000000
          <idle>-0     [000] d..2   828.543505: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.543505: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826892000000 softexpires=826892000000
          <idle>-0     [018] d.h2   828.555084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.555084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.555084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826876001240
          <idle>-0     [001] d.h1   828.555084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826876001420
          <idle>-0     [018] d.h1   828.555085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.555085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.555087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826892000000 softexpires=826892000000
          <idle>-0     [001] d..2   828.555088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826892000000 softexpires=826892000000
          <idle>-0     [000] d..2   828.560170: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.560171: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826908000000 softexpires=826908000000
          <idle>-0     [018] d.h2   828.571084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.571084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.571084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826892001320
          <idle>-0     [001] d.h1   828.571084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826892001520
          <idle>-0     [018] d.h1   828.571085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.571085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.571087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826908000000 softexpires=826908000000
          <idle>-0     [001] d..2   828.571088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826908000000 softexpires=826908000000
          <idle>-0     [000] d..2   828.575447: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.575448: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826924000000 softexpires=826924000000
          <idle>-0     [000] dn.2   828.585170: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   828.585171: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826908000000 softexpires=826908000000
          <idle>-0     [018] d.h2   828.587084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.587084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
              ps-2012  [000] d.h1   828.587084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d.h1   828.587084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826908001200
          <idle>-0     [001] d.h1   828.587084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826908001420
              ps-2012  [000] d.h.   828.587085: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=826908001320
          <idle>-0     [018] d.h1   828.587085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.587085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.587086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826924000000 softexpires=826924000000
              ps-2012  [000] d.h.   828.587088: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
              ps-2012  [000] d.h1   828.587088: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826912000000 softexpires=826912000000
              ps-2012  [000] d.s1   828.587090: timer_cancel: timer=ffff0000090295a8
              ps-2012  [000] d.s.   828.587090: timer_expire_entry: timer=ffff0000090295a8 function=delayed_work_timer_fn now=4295099024
              ps-2012  [000] dns.   828.587092: timer_expire_exit: timer=ffff0000090295a8
          <idle>-0     [001] d..2   828.587095: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826924000000 softexpires=826924000000
          <idle>-0     [001] dn.2   828.587101: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] dn.2   828.587101: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826912000000 softexpires=826912000000
          <idle>-0     [018] dn.2   828.587106: hrtimer_cancel: hrtimer=ffff8017fbc74808
      kworker/1:1-1158  [001] d..1   828.587107: timer_start: timer=ffff8017fbe5e558 function=delayed_work_timer_fn expires=4295099247 [timeout=223] cpu=1 idx=131 flags=D|I
          <idle>-0     [018] dn.2   828.587107: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826912000000 softexpires=826912000000
      kworker/0:0-3     [000] d..1   828.587108: timer_start: timer=ffff0000090295a8 function=delayed_work_timer_fn expires=4295099250 [timeout=226] cpu=0 idx=80 flags=D|I
          <idle>-0     [001] d..1   828.587110: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   828.587110: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   828.587110: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826924000000 softexpires=826924000000
      kworker/0:0-3     [000] d..1   828.587112: timer_start: timer=ffff8017fbe43558 function=delayed_work_timer_fn expires=4295099250 [timeout=226] cpu=0 idx=101 flags=D|I
     kworker/18:1-1177  [018] d..1   828.587115: timer_start: timer=ffff8017fbc76558 function=delayed_work_timer_fn expires=4295099446 [timeout=422] cpu=18 idx=70 flags=D|I
          <idle>-0     [018] d..1   828.587119: tick_stop: success=1 dependency=NONE
          <idle>-0     [018] d..2   828.587119: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [018] d..2   828.587120: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826924000000 softexpires=826924000000
          <idle>-0     [000] d..1   828.587808: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   828.587809: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.587809: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826924000000 softexpires=826924000000
          <idle>-0     [000] d..2   828.591939: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.591940: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826940000000 softexpires=826940000000
          <idle>-0     [018] d.h2   828.603084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.603084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.603084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826924001400
          <idle>-0     [001] d.h1   828.603084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826924001640
          <idle>-0     [018] d.h1   828.603085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.603086: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.603088: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826940000000 softexpires=826940000000
          <idle>-0     [001] d..2   828.603088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826940000000 softexpires=826940000000
          <idle>-0     [000] d..2   828.607216: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.607217: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826956000000 softexpires=826956000000
          <idle>-0     [018] d.h2   828.619084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.619084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.619084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826940001300
          <idle>-0     [001] d.h1   828.619084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826940001520
          <idle>-0     [018] d.h1   828.619085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.619085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.619087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826956000000 softexpires=826956000000
          <idle>-0     [001] d..2   828.619088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826956000000 softexpires=826956000000
          <idle>-0     [000] d..2   828.623882: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.623882: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826972000000 softexpires=826972000000
          <idle>-0     [018] d.h2   828.635084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.635084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.635084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826956001340
          <idle>-0     [001] d.h1   828.635084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826956001540
          <idle>-0     [018] d.h1   828.635085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.635085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   828.635087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826972000000 softexpires=826972000000
          <idle>-0     [018] d..2   828.635087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826972000000 softexpires=826972000000
          <idle>-0     [000] d..2   828.639158: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.639159: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826988000000 softexpires=826988000000
          <idle>-0     [018] d.h2   828.651084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.651084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.651084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826972001300
          <idle>-0     [001] d.h1   828.651084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826972001520
          <idle>-0     [018] d.h1   828.651085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.651085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.651086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826988000000 softexpires=826988000000
          <idle>-0     [001] d..2   828.651088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826988000000 softexpires=826988000000
          <idle>-0     [000] d..2   828.655824: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.655825: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827004000000 softexpires=827004000000
          <idle>-0     [018] d.h2   828.667084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.667084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.667084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826988001320
          <idle>-0     [001] d.h1   828.667084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826988001540
          <idle>-0     [018] d.h1   828.667085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.667085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.667087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827004000000 softexpires=827004000000
          <idle>-0     [001] d..2   828.667087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827004000000 softexpires=827004000000
          <idle>-0     [000] d..2   828.671101: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.671101: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827020000000 softexpires=827020000000
          <idle>-0     [018] d.h2   828.683084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.683084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.683084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827004001340
          <idle>-0     [001] d.h1   828.683084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827004001520
          <idle>-0     [018] d.h1   828.683085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.683085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.683086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827020000000 softexpires=827020000000
          <idle>-0     [001] d..2   828.683088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827020000000 softexpires=827020000000
          <idle>-0     [000] d..2   828.687766: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.687767: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827036000000 softexpires=827036000000
          <idle>-0     [018] d.h2   828.699084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.699084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.699084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827020001320
          <idle>-0     [001] d.h1   828.699084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827020001520
          <idle>-0     [018] d.h1   828.699085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.699085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.699087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827036000000 softexpires=827036000000
          <idle>-0     [001] d..2   828.699088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827036000000 softexpires=827036000000
          <idle>-0     [000] d..2   828.704432: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.704433: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827052000000 softexpires=827052000000
          <idle>-0     [018] d.h2   828.715084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.715084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.715084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827036001320
          <idle>-0     [001] d.h1   828.715084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827036001520
          <idle>-0     [018] d.h1   828.715085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.715085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.715086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827052000000 softexpires=827052000000
          <idle>-0     [001] d..2   828.715088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827052000000 softexpires=827052000000
          <idle>-0     [000] d..2   828.719709: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.719709: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827068000000 softexpires=827068000000
          <idle>-0     [018] d.h2   828.731084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.731084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.731084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827052001320
          <idle>-0     [001] d.h1   828.731084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827052001520
          <idle>-0     [018] d.h1   828.731085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.731085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.731087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827068000000 softexpires=827068000000
          <idle>-0     [001] d..2   828.731087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827068000000 softexpires=827068000000
          <idle>-0     [000] d..2   828.736374: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.736375: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827084000000 softexpires=827084000000
          <idle>-0     [018] d.h2   828.747084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.747084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.747084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827068001320
          <idle>-0     [001] d.h1   828.747084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827068001500
          <idle>-0     [018] d.h1   828.747085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.747085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.747086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827084000000 softexpires=827084000000
          <idle>-0     [001] d..2   828.747088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827084000000 softexpires=827084000000
          <idle>-0     [000] d..2   828.751651: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.751652: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827100000000 softexpires=827100000000
          <idle>-0     [018] d.h2   828.763084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.763084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.763084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827084001340
          <idle>-0     [001] d.h1   828.763084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827084001560
          <idle>-0     [018] d.h1   828.763085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.763085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.763087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827100000000 softexpires=827100000000
          <idle>-0     [001] d..2   828.763087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827100000000 softexpires=827100000000
          <idle>-0     [000] d..2   828.768317: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.768317: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827116000000 softexpires=827116000000
          <idle>-0     [018] d.h2   828.779084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.779084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.779084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827100001320
          <idle>-0     [001] d.h1   828.779085: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827100001520
          <idle>-0     [018] d.h1   828.779085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.779086: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.779086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827116000000 softexpires=827116000000
          <idle>-0     [001] d..2   828.779088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827116000000 softexpires=827116000000
          <idle>-0     [000] d..2   828.783594: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.783594: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827132000000 softexpires=827132000000
          <idle>-0     [018] d.h2   828.795084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.795084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.795084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827116001320
          <idle>-0     [001] d.h1   828.795084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827116001520
          <idle>-0     [018] d.h1   828.795085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.795085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.795087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827132000000 softexpires=827132000000
          <idle>-0     [001] d..2   828.795087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827132000000 softexpires=827132000000
          <idle>-0     [000] d..2   828.800259: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.800260: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827148000000 softexpires=827148000000
          <idle>-0     [018] d.h2   828.811084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.811084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.811084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827132001320
          <idle>-0     [001] d.h1   828.811084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827132001520
          <idle>-0     [018] d.h1   828.811085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.811085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.811086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827148000000 softexpires=827148000000
          <idle>-0     [001] d..2   828.811088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827148000000 softexpires=827148000000
          <idle>-0     [000] d..2   828.815536: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.815537: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827164000000 softexpires=827164000000
          <idle>-0     [018] d.h2   828.827084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.827084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.827084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827148001300
          <idle>-0     [001] d.h1   828.827084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827148001500
          <idle>-0     [018] d.h1   828.827085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.827085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.827087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827164000000 softexpires=827164000000
          <idle>-0     [001] d..2   828.827087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827164000000 softexpires=827164000000
          <idle>-0     [000] d..2   828.832202: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.832202: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827180000000 softexpires=827180000000
          <idle>-0     [018] d.h2   828.843084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.843084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.843084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827164001320
          <idle>-0     [001] d.h1   828.843084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827164001520
          <idle>-0     [018] d.h1   828.843085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.843085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.843086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827180000000 softexpires=827180000000
          <idle>-0     [001] d..2   828.843088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827180000000 softexpires=827180000000
          <idle>-0     [000] d..2   828.847478: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.847479: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827196000000 softexpires=827196000000
          <idle>-0     [018] d.h2   828.859084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.859084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.859084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827180001300
          <idle>-0     [001] d.h1   828.859084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827180001500
          <idle>-0     [018] d.h1   828.859085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.859085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.859087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827196000000 softexpires=827196000000
          <idle>-0     [001] d..2   828.859087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827196000000 softexpires=827196000000
          <idle>-0     [000] d..2   828.864144: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.864145: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827212000000 softexpires=827212000000
          <idle>-0     [018] d.h2   828.875084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.875084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.875084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827196001320
          <idle>-0     [001] d.h1   828.875084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827196001500
          <idle>-0     [018] d.h1   828.875085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.875085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.875086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827212000000 softexpires=827212000000
          <idle>-0     [001] d..2   828.875087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827212000000 softexpires=827212000000
          <idle>-0     [000] d..2   828.879421: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.879421: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827228000000 softexpires=827228000000
          <idle>-0     [018] d.h2   828.891084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.891084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.891084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827212001340
          <idle>-0     [001] d.h1   828.891084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827212001560
          <idle>-0     [018] d.h1   828.891085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.891085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.891087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827228000000 softexpires=827228000000
          <idle>-0     [001] d..2   828.891087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827228000000 softexpires=827228000000
          <idle>-0     [000] d..2   828.896086: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.896087: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827244000000 softexpires=827244000000
          <idle>-0     [018] d.h2   828.907084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.907084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.907084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827228001300
          <idle>-0     [001] d.h1   828.907084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827228001500
          <idle>-0     [018] d.h1   828.907085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.907085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.907086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827244000000 softexpires=827244000000
          <idle>-0     [001] d..2   828.907088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827244000000 softexpires=827244000000
          <idle>-0     [000] d..2   828.911363: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.911364: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7260000000 softexpires‚7260000000
          <idle>-0     [000] dn.2   828.921086: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   828.921087: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7244000000 softexpires‚7244000000
          <idle>-0     [018] d.h2   828.923084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   828.923084: hrtimer_cancel: hrtimerÿff8017fbe5c808
              ps-2012  [000] d.h1   828.923084: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [018] d.h1   828.923084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7244001340
          <idle>-0     [001] d.h1   828.923084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7244001520
              ps-2012  [000] d.h.   828.923085: hrtimer_expire_entry: hrtimerÿff8017fbe41808 function=tick_sched_timer now‚7244001400
          <idle>-0     [018] d.h1   828.923085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   828.923085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   828.923087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7260000000 softexpires‚7260000000
              ps-2012  [000] d.h.   828.923088: hrtimer_expire_exit: hrtimerÿff8017fbe41808
              ps-2012  [000] d.h1   828.923088: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7248000000 softexpires‚7248000000
          <idle>-0     [001] d..2   828.923089: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7260000000 softexpires‚7260000000
          <idle>-0     [000] d..1   828.923707: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   828.923708: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   828.923708: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7260000000 softexpires‚7260000000
          <idle>-0     [000] d..2   828.927855: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   828.927856: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7276000000 softexpires‚7276000000
          <idle>-0     [018] d.h2   828.939084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   828.939084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   828.939084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7260001320
          <idle>-0     [001] d.h1   828.939084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7260001560
          <idle>-0     [018] d.h1   828.939085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   828.939085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   828.939086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7276000000 softexpires‚7276000000
          <idle>-0     [001] d..2   828.939088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7276000000 softexpires‚7276000000
          <idle>-0     [000] d..2   828.943132: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   828.943133: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7292000000 softexpires‚7292000000
          <idle>-0     [018] d.h2   828.955084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   828.955084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   828.955084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7276001340
          <idle>-0     [001] d.h1   828.955084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7276001560
          <idle>-0     [018] d.h1   828.955085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   828.955085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   828.955087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7292000000 softexpires‚7292000000
          <idle>-0     [001] d..2   828.955087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7292000000 softexpires‚7292000000
          <idle>-0     [000] d..2   828.959798: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   828.959798: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7308000000 softexpires‚7308000000
          <idle>-0     [018] d.h2   828.971084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   828.971084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   828.971084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7292001300
          <idle>-0     [001] d.h1   828.971084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7292001480
          <idle>-0     [018] d.h1   828.971085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   828.971085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   828.971086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7308000000 softexpires‚7308000000
          <idle>-0     [001] d..2   828.971088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7308000000 softexpires‚7308000000
          <idle>-0     [000] d..2   828.976463: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   828.976464: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7324000000 softexpires‚7324000000
          <idle>-0     [018] d.h2   828.987084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   828.987084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   828.987084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7308001320
          <idle>-0     [001] d.h1   828.987084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7308001520
          <idle>-0     [018] d.h1   828.987085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   828.987085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   828.987087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7324000000 softexpires‚7324000000
          <idle>-0     [001] d..2   828.987087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7324000000 softexpires‚7324000000
          <idle>-0     [000] d..2   828.991740: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   828.991741: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7340000000 softexpires‚7340000000
          <idle>-0     [018] d.h2   829.003084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.003084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.003084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7324001300
          <idle>-0     [001] d.h1   829.003084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7324001520
          <idle>-0     [018] d.h1   829.003085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.003085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.003086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7340000000 softexpires‚7340000000
          <idle>-0     [001] d..2   829.003088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7340000000 softexpires‚7340000000
          <idle>-0     [000] d..2   829.008406: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.008406: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7356000000 softexpires‚7356000000
          <idle>-0     [018] d.h2   829.019084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.019084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.019084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7340001340
          <idle>-0     [001] d.h1   829.019084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7340001580
          <idle>-0     [018] d.h1   829.019085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.019085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.019087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7356000000 softexpires‚7356000000
          <idle>-0     [001] d..2   829.019087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7356000000 softexpires‚7356000000
          <idle>-0     [000] d..2   829.023682: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.023683: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7372000000 softexpires‚7372000000
          <idle>-0     [018] d.h2   829.035084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.035084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.035084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7356001300
          <idle>-0     [001] d.h1   829.035084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7356001520
          <idle>-0     [018] d.h1   829.035085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.035085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.035087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7372000000 softexpires‚7372000000
          <idle>-0     [001] d..2   829.035088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7372000000 softexpires‚7372000000
          <idle>-0     [000] d..2   829.040348: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.040349: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7388000000 softexpires‚7388000000
          <idle>-0     [018] d.h2   829.051084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.051084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.051084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7372001300
          <idle>-0     [001] d.h1   829.051084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7372001480
          <idle>-0     [018] d.h1   829.051085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.051085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d..2   829.051087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7388000000 softexpires‚7388000000
          <idle>-0     [018] d..2   829.051087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7388000000 softexpires‚7388000000
          <idle>-0     [001] d.h2   829.067085: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [000] d.h2   829.067085: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [018] d.h2   829.067085: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.067086: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7388002960
          <idle>-0     [018] d.h1   829.067086: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7388003120
          <idle>-0     [034] d.h2   829.067086: hrtimer_cancel: hrtimerÿff8017dba91808
          <idle>-0     [000] d.h1   829.067086: hrtimer_expire_entry: hrtimerÿff8017fbe41808 function=tick_sched_timer now‚7388002980
          <idle>-0     [034] d.h1   829.067086: hrtimer_expire_entry: hrtimerÿff8017dba91808 function=tick_sched_timer now‚7388003140
          <idle>-0     [018] d.h1   829.067087: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.067087: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [000] d.h1   829.067088: hrtimer_expire_exit: hrtimerÿff8017fbe41808
          <idle>-0     [034] d.h1   829.067088: hrtimer_expire_exit: hrtimerÿff8017dba91808
          <idle>-0     [018] d..2   829.067088: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7404000000 softexpires‚7404000000
          <idle>-0     [034] d.s2   829.067089: timer_cancel: timerÿff00000909a9a0
          <idle>-0     [000] d.s2   829.067089: timer_cancel: timerÿff8017d3923408
          <idle>-0     [034] d.s1   829.067089: timer_expire_entry: timerÿff00000909a9a0 functionÞlayed_work_timer_fn nowB95099144
          <idle>-0     [000] d.s1   829.067089: timer_expire_entry: timerÿff8017d3923408 functionÞlayed_work_timer_fn nowB95099144
          <idle>-0     [001] d..2   829.067089: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7404000000 softexpires‚7404000000
          <idle>-0     [034] dns1   829.067091: timer_expire_exit: timerÿff00000909a9a0
          <idle>-0     [000] dns1   829.067091: timer_expire_exit: timerÿff8017d3923408
          <idle>-0     [034] dn.2   829.067094: hrtimer_start: hrtimerÿff8017dba91808 function=tick_sched_timer expires‚7392000000 softexpires‚7392000000
          <idle>-0     [000] dn.2   829.067094: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7392000000 softexpires‚7392000000
     kworker/0:0-3     [000] d..1   829.067097: timer_start: timerÿff8017d3923408 functionÞlayed_work_timer_fn expiresB95099394 [timeout%0] cpu=0 idx— flags=I
          <idle>-0     [000] d..1   829.067100: tick_stop: success=1 dependency=NONE
    kworker/34:2-1587  [034] d..1   829.067101: timer_start: timerÿff00000909a9a0 functionÞlayed_work_timer_fn expiresB95099394 [timeout%0] cpu4 idx— flags=I
          <idle>-0     [000] d..2   829.067101: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.067102: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7404000000 softexpires‚7404000000
          <idle>-0     [034] d..1   829.067104: tick_stop: success=1 dependency=NONE
          <idle>-0     [034] d..2   829.067104: hrtimer_cancel: hrtimerÿff8017dba91808
          <idle>-0     [034] d..2   829.067104: hrtimer_start: hrtimerÿff8017dba91808 function=tick_sched_timer expires‚8412000000 softexpires‚8412000000
          <idle>-0     [000] d..2   829.072292: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.072292: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7420000000 softexpires‚7420000000
          <idle>-0     [018] d.h2   829.083084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.083084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.083084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7404001380
          <idle>-0     [001] d.h1   829.083084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7404001600
          <idle>-0     [018] d.h1   829.083085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.083086: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.083087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7420000000 softexpires‚7420000000
          <idle>-0     [001] d..2   829.083088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7420000000 softexpires‚7420000000
          <idle>-0     [000] d.h2   829.099084: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [018] d.h2   829.099084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.099084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.099084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7420001460
          <idle>-0     [000] d.h1   829.099084: hrtimer_expire_entry: hrtimerÿff8017fbe41808 function=tick_sched_timer now‚7420001460
          <idle>-0     [001] d.h1   829.099084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7420001660
          <idle>-0     [018] d.h1   829.099085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.099085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [000] d.h1   829.099086: hrtimer_expire_exit: hrtimerÿff8017fbe41808
          <idle>-0     [000] d.s2   829.099086: timer_cancel: timerÿff8017d3923c08
          <idle>-0     [018] d..2   829.099087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7436000000 softexpires‚7436000000
          <idle>-0     [000] d.s1   829.099087: timer_expire_entry: timerÿff8017d3923c08 functionÞlayed_work_timer_fn nowB95099152
          <idle>-0     [001] d..2   829.099088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7436000000 softexpires‚7436000000
          <idle>-0     [000] dns1   829.099088: timer_expire_exit: timerÿff8017d3923c08
          <idle>-0     [000] dn.2   829.099092: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7424000000 softexpires‚7424000000
     kworker/0:0-3     [000] d..1   829.099094: timer_start: timerÿff8017d3923c08 functionÞlayed_work_timer_fn expiresB95099402 [timeout%0] cpu=0 idx˜ flags=I
          <idle>-0     [000] d..1   829.099096: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.099097: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.099097: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7436000000 softexpires‚7436000000
          <idle>-0     [000] d..2   829.104233: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.104233: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7452000000 softexpires‚7452000000
          <idle>-0     [018] d.h2   829.115084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.115084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.115084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7436001340
          <idle>-0     [001] d.h1   829.115084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7436001540
          <idle>-0     [018] d.h1   829.115085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.115085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.115087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7452000000 softexpires‚7452000000
          <idle>-0     [001] d..2   829.115087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7452000000 softexpires‚7452000000
          <idle>-0     [000] d..2   829.119510: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.119510: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7468000000 softexpires‚7468000000
          <idle>-0     [018] d.h2   829.131084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.131084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.131084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7452001360
          <idle>-0     [001] d.h1   829.131084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7452001580
          <idle>-0     [018] d.h1   829.131085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.131085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.131086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7468000000 softexpires‚7468000000
          <idle>-0     [001] d..2   829.131087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7468000000 softexpires‚7468000000
          <idle>-0     [000] d..2   829.136175: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.136176: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7484000000 softexpires‚7484000000
          <idle>-0     [018] d.h2   829.147084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.147084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.147084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7468001400
          <idle>-0     [001] d.h1   829.147084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7468001620
          <idle>-0     [018] d.h1   829.147085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.147085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.147087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7484000000 softexpires‚7484000000
          <idle>-0     [001] d..2   829.147087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7484000000 softexpires‚7484000000
          <idle>-0     [000] d..2   829.151452: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.151453: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7500000000 softexpires‚7500000000
          <idle>-0     [018] d.h2   829.163084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.163084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.163084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7484001300
          <idle>-0     [001] d.h1   829.163084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7484001480
          <idle>-0     [018] d.h1   829.163085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.163085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.163086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7500000000 softexpires‚7500000000
          <idle>-0     [001] d..2   829.163087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7500000000 softexpires‚7500000000
          <idle>-0     [000] d..2   829.168118: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.168118: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7516000000 softexpires‚7516000000
          <idle>-0     [018] d.h2   829.179084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.179084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d.h1   829.179084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7500001540
          <idle>-0     [018] d.h1   829.179084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7500001340
          <idle>-0     [001] d.h1   829.179085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.179085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d..2   829.179087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7516000000 softexpires‚7516000000
          <idle>-0     [018] d..2   829.179088: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7516000000 softexpires‚7516000000
          <idle>-0     [000] d..2   829.183394: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.183395: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7532000000 softexpires‚7532000000
          <idle>-0     [018] d.h2   829.195084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.195084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.195084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7516001340
          <idle>-0     [001] d.h1   829.195084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7516001540
          <idle>-0     [018] d.h1   829.195085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.195085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.195086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7532000000 softexpires‚7532000000
          <idle>-0     [001] d..2   829.195088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7532000000 softexpires‚7532000000
          <idle>-0     [000] d..2   829.200060: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.200061: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7548000000 softexpires‚7548000000
          <idle>-0     [018] d.h2   829.211084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.211084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.211084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7532001380
          <idle>-0     [001] d.h1   829.211084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7532001560
          <idle>-0     [018] d.h1   829.211085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.211085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d..2   829.211087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7548000000 softexpires‚7548000000
          <idle>-0     [018] d..2   829.211087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7548000000 softexpires‚7548000000
          <idle>-0     [000] d..2   829.215337: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.215337: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7564000000 softexpires‚7564000000
          <idle>-0     [018] d.h2   829.227084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.227084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.227084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7548001340
          <idle>-0     [001] d.h1   829.227084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7548001560
          <idle>-0     [018] d.h1   829.227085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.227085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.227086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7564000000 softexpires‚7564000000
          <idle>-0     [001] d..2   829.227088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7564000000 softexpires‚7564000000
          <idle>-0     [000] d..2   829.232002: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.232003: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7580000000 softexpires‚7580000000
          <idle>-0     [018] d.h2   829.243084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.243084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.243084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7564001360
          <idle>-0     [001] d.h1   829.243084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7564001580
          <idle>-0     [018] d.h1   829.243085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.243085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.243087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7580000000 softexpires‚7580000000
          <idle>-0     [001] d..2   829.243087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7580000000 softexpires‚7580000000
          <idle>-0     [000] d..2   829.247279: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.247280: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7596000000 softexpires‚7596000000
          <idle>-0     [000] dn.2   829.257002: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   829.257003: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7580000000 softexpires‚7580000000
          <idle>-0     [018] d.h2   829.259084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.259084: hrtimer_cancel: hrtimerÿff8017fbe5c808
              ps-2012  [000] d.h1   829.259084: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [018] d.h1   829.259084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7580001360
          <idle>-0     [001] d.h1   829.259084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7580001540
              ps-2012  [000] d.h.   829.259085: hrtimer_expire_entry: hrtimerÿff8017fbe41808 function=tick_sched_timer now‚7580001460
          <idle>-0     [018] d.h1   829.259085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.259085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.259086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7596000000 softexpires‚7596000000
              ps-2012  [000] d.h.   829.259088: hrtimer_expire_exit: hrtimerÿff8017fbe41808
              ps-2012  [000] d.h1   829.259089: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7584000000 softexpires‚7584000000
          <idle>-0     [001] d..2   829.259094: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7596000000 softexpires‚7596000000
          <idle>-0     [000] d..1   829.259641: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.259642: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.259642: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7596000000 softexpires‚7596000000
          <idle>-0     [000] d..2   829.263771: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.263772: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7612000000 softexpires‚7612000000
          <idle>-0     [018] d.h2   829.275084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.275084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.275084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7596001380
          <idle>-0     [001] d.h1   829.275084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7596001620
          <idle>-0     [018] d.h1   829.275085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.275085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.275087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7612000000 softexpires‚7612000000
          <idle>-0     [001] d..2   829.275087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7612000000 softexpires‚7612000000
          <idle>-0     [000] d..2   829.280437: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.280437: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7628000000 softexpires‚7628000000
          <idle>-0     [018] d.h2   829.291084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.291084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.291084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7612001360
          <idle>-0     [001] d.h1   829.291084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7612001580
          <idle>-0     [018] d.h1   829.291085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.291085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.291086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7628000000 softexpires‚7628000000
          <idle>-0     [001] d..2   829.291088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7628000000 softexpires‚7628000000
          <idle>-0     [000] d..2   829.295714: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.295714: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7644000000 softexpires‚7644000000
          <idle>-0     [018] d.h2   829.307084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.307084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.307084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7628001420
          <idle>-0     [001] d.h1   829.307084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7628001640
          <idle>-0     [018] d.h1   829.307085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.307085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.307087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7644000000 softexpires‚7644000000
          <idle>-0     [001] d..2   829.307087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7644000000 softexpires‚7644000000
          <idle>-0     [000] d..2   829.312379: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.312380: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7660000000 softexpires‚7660000000
          <idle>-0     [018] d.h2   829.323084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.323084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.323084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7644001340
          <idle>-0     [001] d.h1   829.323085: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7644001560
          <idle>-0     [018] d.h1   829.323085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.323086: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.323086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7660000000 softexpires‚7660000000
          <idle>-0     [001] d..2   829.323088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7660000000 softexpires‚7660000000
          <idle>-0     [000] d..2   829.327656: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.327657: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7676000000 softexpires‚7676000000
          <idle>-0     [018] d.h2   829.339084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.339084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.339084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7660001460
          <idle>-0     [001] d.h1   829.339084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7660001660
          <idle>-0     [018] d.h1   829.339085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.339085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.339087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7676000000 softexpires‚7676000000
          <idle>-0     [001] d..2   829.339087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7676000000 softexpires‚7676000000
          <idle>-0     [000] d..2   829.344322: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.344322: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7692000000 softexpires‚7692000000
          <idle>-0     [018] d.h2   829.355084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.355084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.355084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7676001340
          <idle>-0     [001] d.h1   829.355084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7676001560
          <idle>-0     [018] d.h1   829.355085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.355085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.355086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7692000000 softexpires‚7692000000
          <idle>-0     [001] d..2   829.355088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7692000000 softexpires‚7692000000
          <idle>-0     [000] d..2   829.359599: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.359599: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7708000000 softexpires‚7708000000
          <idle>-0     [018] d.h2   829.371084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.371084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.371084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7692001360
          <idle>-0     [001] d.h1   829.371084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7692001580
          <idle>-0     [018] d.h1   829.371085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.371085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.371087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7708000000 softexpires‚7708000000
          <idle>-0     [001] d..2   829.371087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7708000000 softexpires‚7708000000
          <idle>-0     [000] d..2   829.376264: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.376265: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7724000000 softexpires‚7724000000
          <idle>-0     [018] d.h2   829.387084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.387084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.387084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7708001360
          <idle>-0     [001] d.h1   829.387084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7708001560
          <idle>-0     [018] d.h1   829.387085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.387085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.387086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7724000000 softexpires‚7724000000
          <idle>-0     [001] d..2   829.387088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7724000000 softexpires‚7724000000
          <idle>-0     [000] d..2   829.391541: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.391541: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7740000000 softexpires‚7740000000
          <idle>-0     [018] d.h2   829.403084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.403084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.403084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7724001380
          <idle>-0     [001] d.h1   829.403084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7724001580
          <idle>-0     [018] d.h1   829.403085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.403085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.403087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7740000000 softexpires‚7740000000
          <idle>-0     [001] d..2   829.403087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7740000000 softexpires‚7740000000
          <idle>-0     [000] d..2   829.408206: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.408207: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7756000000 softexpires‚7756000000
          <idle>-0     [018] d.h2   829.419084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.419084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.419084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7740001360
          <idle>-0     [001] d.h1   829.419084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7740001560
          <idle>-0     [018] d.h1   829.419085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.419085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.419086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7756000000 softexpires‚7756000000
          <idle>-0     [001] d..2   829.419088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7756000000 softexpires‚7756000000
          <idle>-0     [000] d..2   829.423483: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.423484: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7772000000 softexpires‚7772000000
          <idle>-0     [018] d.h2   829.435084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.435084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.435084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7756001340
          <idle>-0     [001] d.h1   829.435084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7756001560
          <idle>-0     [018] d.h1   829.435085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.435085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.435087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7772000000 softexpires‚7772000000
          <idle>-0     [001] d..2   829.435087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7772000000 softexpires‚7772000000
          <idle>-0     [000] d..2   829.440149: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.440149: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7788000000 softexpires‚7788000000
          <idle>-0     [018] d.h2   829.451084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.451084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.451084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7772001360
          <idle>-0     [001] d.h1   829.451084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7772001560
          <idle>-0     [018] d.h1   829.451085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.451085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.451087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7788000000 softexpires‚7788000000
          <idle>-0     [001] d..2   829.451088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7788000000 softexpires‚7788000000
          <idle>-0     [000] d..2   829.455426: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.455426: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7804000000 softexpires‚7804000000
          <idle>-0     [018] d.h2   829.467084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.467084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.467084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7788001380
          <idle>-0     [001] d.h1   829.467084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7788001560
          <idle>-0     [018] d.h1   829.467085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.467085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d..2   829.467087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7804000000 softexpires‚7804000000
          <idle>-0     [018] d..2   829.467087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7804000000 softexpires‚7804000000
          <idle>-0     [000] d..2   829.472091: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.472092: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7820000000 softexpires‚7820000000
          <idle>-0     [018] d.h2   829.483084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.483084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.483084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7804001360
          <idle>-0     [001] d.h1   829.483084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7804001580
          <idle>-0     [018] d.h1   829.483085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.483085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d.s2   829.483086: timer_cancel: timerÿff8017fbe5e558
          <idle>-0     [018] d..2   829.483086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7820000000 softexpires‚7820000000
          <idle>-0     [001] d.s1   829.483087: timer_expire_entry: timerÿff8017fbe5e558 functionÞlayed_work_timer_fn nowB95099248
          <idle>-0     [001] dns1   829.483089: timer_expire_exit: timerÿff8017fbe5e558
          <idle>-0     [001] dn.2   829.483091: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7808000000 softexpires‚7808000000
          <idle>-0     [001] d..1   829.483097: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   829.483098: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d..2   829.483098: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7820000000 softexpires‚7820000000
          <idle>-0     [000] d..2   829.487368: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.487369: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7836000000 softexpires‚7836000000
          <idle>-0     [018] d.h2   829.499084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.499084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.499084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7820001420
          <idle>-0     [001] d.h1   829.499084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7820001660
          <idle>-0     [018] d.h1   829.499085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.499085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.499087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7836000000 softexpires‚7836000000
          <idle>-0     [001] d..2   829.499088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7836000000 softexpires‚7836000000
          <idle>-0     [000] d..2   829.504034: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.504034: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7852000000 softexpires‚7852000000
          <idle>-0     [018] d.h2   829.515084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.515084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.515084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7836001640
          <idle>-0     [001] d.h1   829.515084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7836001780
          <idle>-0     [018] d.h1   829.515085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.515085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.515087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7852000000 softexpires‚7852000000
          <idle>-0     [001] d..2   829.515088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7852000000 softexpires‚7852000000
          <idle>-0     [000] d..2   829.519310: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.519311: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7868000000 softexpires‚7868000000
          <idle>-0     [018] d.h2   829.531084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.531084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.531084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7852001340
          <idle>-0     [001] d.h1   829.531084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7852001540
          <idle>-0     [018] d.h1   829.531085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.531085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.531087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7868000000 softexpires‚7868000000
          <idle>-0     [001] d..2   829.531087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7868000000 softexpires‚7868000000
          <idle>-0     [000] d..2   829.535976: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.535977: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7884000000 softexpires‚7884000000
          <idle>-0     [018] d.h2   829.547084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.547084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.547084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7868001360
          <idle>-0     [001] d.h1   829.547084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7868001560
          <idle>-0     [018] d.h1   829.547085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.547085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.547086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7884000000 softexpires‚7884000000
          <idle>-0     [001] d..2   829.547088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7884000000 softexpires‚7884000000
          <idle>-0     [000] d..2   829.551253: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.551253: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7900000000 softexpires‚7900000000
          <idle>-0     [018] d.h2   829.563084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.563084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.563084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7884001380
          <idle>-0     [001] d.h1   829.563084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7884001580
          <idle>-0     [018] d.h1   829.563085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.563085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.563087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7900000000 softexpires‚7900000000
          <idle>-0     [001] d..2   829.563087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7900000000 softexpires‚7900000000
          <idle>-0     [000] d..2   829.567918: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.567919: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7916000000 softexpires‚7916000000
          <idle>-0     [018] d.h2   829.579084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.579084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.579084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7900001360
          <idle>-0     [001] d.h1   829.579084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7900001600
          <idle>-0     [018] d.h1   829.579085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.579085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.579087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7916000000 softexpires‚7916000000
          <idle>-0     [001] d..2   829.579088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7916000000 softexpires‚7916000000
          <idle>-0     [000] d..2   829.583195: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.583196: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7932000000 softexpires‚7932000000
          <idle>-0     [000] dn.2   829.592918: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   829.592919: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7916000000 softexpires‚7916000000
          <idle>-0     [018] d.h2   829.595084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.595084: hrtimer_cancel: hrtimerÿff8017fbe5c808
              ps-2012  [000] d.h1   829.595084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d.h1   829.595084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827916001380
          <idle>-0     [001] d.h1   829.595084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827916001600
              ps-2012  [000] d.h.   829.595085: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=827916001380
          <idle>-0     [018] d.h1   829.595085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.595086: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.595087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827932000000 softexpires=827932000000
              ps-2012  [000] d.h.   829.595088: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
              ps-2012  [000] d.h1   829.595088: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827920000000 softexpires=827920000000
              ps-2012  [000] d.s1   829.595090: timer_cancel: timer=ffff8017fbe43558
              ps-2012  [000] d.s.   829.595090: timer_expire_entry: timer=ffff8017fbe43558 function=delayed_work_timer_fn now=4295099276
              ps-2012  [000] dns.   829.595092: timer_expire_exit: timer=ffff8017fbe43558
              ps-2012  [000] dns1   829.595092: timer_cancel: timer=ffff0000090295a8
              ps-2012  [000] dns.   829.595093: timer_expire_entry: timer=ffff0000090295a8 function=delayed_work_timer_fn now=4295099276
              ps-2012  [000] dns.   829.595093: timer_expire_exit: timer=ffff0000090295a8
          <idle>-0     [001] d..2   829.595094: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827932000000 softexpires=827932000000
     kworker/0:0-3     [000] d..1   829.595102: timer_start: timer=ffff8017fbe43558 function=delayed_work_timer_fn expires=4295099500 [timeout=224] cpu=0 idx=111 flags=D|I
     kworker/0:0-3     [000] d..1   829.595109: timer_start: timer=ffff0000090295a8 function=delayed_work_timer_fn expires=4295099500 [timeout=224] cpu=0 idx=111 flags=D|I
          <idle>-0     [001] dn.2   829.595344: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [001] dn.2   829.595344: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7920000000 softexpires‚7920000000
          <idle>-0     [000] d..1   829.595363: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.595364: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.595365: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7932000000 softexpires‚7932000000
          <idle>-0     [001] d..1   829.595418: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   829.595419: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d..2   829.595419: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7932000000 softexpires‚7932000000
          <idle>-0     [000] d..2   829.599340: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.599341: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7948000000 softexpires‚7948000000
          <idle>-0     [018] d.h2   829.611084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.611084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.611085: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7932001760
          <idle>-0     [001] d.h1   829.611085: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7932002100
          <idle>-0     [018] d.h1   829.611086: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.611086: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.611087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7948000000 softexpires‚7948000000
          <idle>-0     [001] d..2   829.611090: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7948000000 softexpires‚7948000000
          <idle>-0     [000] d..2   829.616006: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.616006: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7964000000 softexpires‚7964000000
          <idle>-0     [018] d.h2   829.627084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.627084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.627084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7948001600
          <idle>-0     [001] d.h1   829.627084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7948001860
          <idle>-0     [018] d.h1   829.627085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.627086: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.627087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7964000000 softexpires‚7964000000
          <idle>-0     [001] d..2   829.627088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7964000000 softexpires‚7964000000
          <idle>-0     [000] d..2   829.631282: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.631283: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7980000000 softexpires‚7980000000
          <idle>-0     [018] d.h2   829.643084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.643084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.643084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7964001500
          <idle>-0     [001] d.h1   829.643084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7964001720
          <idle>-0     [018] d.h1   829.643085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.643085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.643087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7980000000 softexpires‚7980000000
          <idle>-0     [001] d..2   829.643088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7980000000 softexpires‚7980000000
          <idle>-0     [000] d..2   829.647948: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.647949: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚7996000000 softexpires‚7996000000
          <idle>-0     [018] d.h2   829.659084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.659084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.659084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7980001460
          <idle>-0     [001] d.h1   829.659084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7980001680
          <idle>-0     [018] d.h1   829.659085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.659085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.659087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚7996000000 softexpires‚7996000000
          <idle>-0     [001] d..2   829.659088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚7996000000 softexpires‚7996000000
          <idle>-0     [000] d..2   829.663225: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.663225: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8012000000 softexpires‚8012000000
          <idle>-0     [018] d.h2   829.675084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.675084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.675084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚7996001340
          <idle>-0     [001] d.h1   829.675084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚7996001560
          <idle>-0     [018] d.h1   829.675085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.675085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.675086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8012000000 softexpires‚8012000000
          <idle>-0     [001] d..2   829.675088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8012000000 softexpires‚8012000000
          <idle>-0     [000] d..2   829.679890: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.679891: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8028000000 softexpires‚8028000000
          <idle>-0     [018] d.h2   829.691084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.691084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.691084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8012001360
          <idle>-0     [001] d.h1   829.691084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8012001560
          <idle>-0     [018] d.h1   829.691085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.691085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.691087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8028000000 softexpires‚8028000000
          <idle>-0     [001] d..2   829.691087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8028000000 softexpires‚8028000000
          <idle>-0     [000] d..2   829.695167: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.695168: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8044000000 softexpires‚8044000000
          <idle>-0     [018] d.h2   829.707084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.707084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.707084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8028001380
          <idle>-0     [001] d.h1   829.707084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8028001600
          <idle>-0     [018] d.h1   829.707085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.707085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.707086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8044000000 softexpires‚8044000000
          <idle>-0     [001] d..2   829.707088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8044000000 softexpires‚8044000000
          <idle>-0     [000] d..2   829.711833: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.711833: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8060000000 softexpires‚8060000000
          <idle>-0     [018] d.h2   829.723084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.723084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [001] d.h1   829.723084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8044001580
          <idle>-0     [018] d.h1   829.723084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8044001380
          <idle>-0     [001] d.h1   829.723085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.723085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d..2   829.723087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8060000000 softexpires‚8060000000
          <idle>-0     [018] d..2   829.723088: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8060000000 softexpires‚8060000000
          <idle>-0     [000] d..2   829.727110: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.727110: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8076000000 softexpires‚8076000000
          <idle>-0     [018] d.h2   829.739084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.739084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.739084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8060001360
          <idle>-0     [001] d.h1   829.739084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8060001580
          <idle>-0     [018] d.h1   829.739085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.739085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.739086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8076000000 softexpires‚8076000000
          <idle>-0     [001] d..2   829.739088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8076000000 softexpires‚8076000000
          <idle>-0     [000] d..2   829.743775: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.743776: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8092000000 softexpires‚8092000000
          <idle>-0     [018] d.h2   829.755084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.755084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.755084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8076001360
          <idle>-0     [001] d.h1   829.755084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8076001540
          <idle>-0     [001] d.h1   829.755085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.755085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [018] d..2   829.755087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8092000000 softexpires‚8092000000
          <idle>-0     [001] d..2   829.755087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8092000000 softexpires‚8092000000
          <idle>-0     [000] d..2   829.760441: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.760441: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8108000000 softexpires‚8108000000
          <idle>-0     [018] d.h2   829.771084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.771084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.771084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8092001360
          <idle>-0     [001] d.h1   829.771084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8092001540
          <idle>-0     [018] d.h1   829.771085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.771085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.771086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8108000000 softexpires‚8108000000
          <idle>-0     [001] d..2   829.771088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8108000000 softexpires‚8108000000
          <idle>-0     [000] d..2   829.775718: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.775718: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8124000000 softexpires‚8124000000
          <idle>-0     [018] d.h2   829.787084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.787084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.787084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8108001380
          <idle>-0     [001] d.h1   829.787084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8108001600
          <idle>-0     [018] d.h1   829.787085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.787085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.787087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8124000000 softexpires‚8124000000
          <idle>-0     [001] d..2   829.787088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8124000000 softexpires‚8124000000
          <idle>-0     [000] d..2   829.792383: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.792384: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8140000000 softexpires‚8140000000
          <idle>-0     [018] d.h2   829.803084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.803084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.803084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8124001360
          <idle>-0     [001] d.h1   829.803084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8124001540
          <idle>-0     [018] d.h1   829.803085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.803085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.803086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8140000000 softexpires‚8140000000
          <idle>-0     [001] d..2   829.803088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8140000000 softexpires‚8140000000
          <idle>-0     [000] d..2   829.807660: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.807661: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8156000000 softexpires‚8156000000
          <idle>-0     [018] d.h2   829.819084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.819084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.819084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8140001340
          <idle>-0     [001] d.h1   829.819084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8140001520
          <idle>-0     [018] d.h1   829.819085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.819086: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.819087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8156000000 softexpires‚8156000000
          <idle>-0     [001] d..2   829.819088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8156000000 softexpires‚8156000000
          <idle>-0     [000] d..2   829.824326: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.824326: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8172000000 softexpires‚8172000000
          <idle>-0     [018] d.h2   829.835084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.835084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.835084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8156001340
          <idle>-0     [001] d.h1   829.835084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8156001540
          <idle>-0     [018] d.h1   829.835085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.835085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.835086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8172000000 softexpires‚8172000000
          <idle>-0     [001] d..2   829.835088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8172000000 softexpires‚8172000000
          <idle>-0     [000] d..2   829.839602: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.839603: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8188000000 softexpires‚8188000000
          <idle>-0     [018] d.h2   829.851084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.851084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.851084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8172001380
          <idle>-0     [001] d.h1   829.851084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8172001580
          <idle>-0     [018] d.h1   829.851085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.851085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.851087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8188000000 softexpires‚8188000000
          <idle>-0     [001] d..2   829.851088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8188000000 softexpires‚8188000000
          <idle>-0     [000] d..2   829.856268: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.856269: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8204000000 softexpires‚8204000000
          <idle>-0     [018] d.h2   829.867084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.867084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.867084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8188001420
          <idle>-0     [001] d.h1   829.867084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8188001640
          <idle>-0     [018] d.h1   829.867085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.867085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.867086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8204000000 softexpires‚8204000000
          <idle>-0     [001] d..2   829.867088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8204000000 softexpires‚8204000000
          <idle>-0     [000] d..2   829.871545: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.871545: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8220000000 softexpires‚8220000000
          <idle>-0     [000] dn.2   829.872935: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   829.872936: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8196000000 softexpires‚8196000000
          <idle>-0     [000] d..1   829.872940: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.872941: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.872941: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8220000000 softexpires‚8220000000
          <idle>-0     [000] dn.2   829.874323: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   829.874324: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8196000000 softexpires‚8196000000
          <idle>-0     [000] d..1   829.874327: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.874327: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.874328: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8220000000 softexpires‚8220000000
          <idle>-0     [000] dn.2   829.875713: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   829.875713: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8200000000 softexpires‚8200000000
          <idle>-0     [000] d..1   829.875717: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.875717: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.875718: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8220000000 softexpires‚8220000000
          <idle>-0     [000] dn.2   829.877101: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   829.877101: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8200000000 softexpires‚8200000000
          <idle>-0     [000] d..1   829.877104: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.877105: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.877105: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8220000000 softexpires‚8220000000
          <idle>-0     [000] dn.2   829.878489: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   829.878490: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8200000000 softexpires‚8200000000
          <idle>-0     [000] d..1   829.878493: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.878493: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.878494: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8220000000 softexpires‚8220000000
          <idle>-0     [000] dn.2   829.879879: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   829.879880: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8204000000 softexpires‚8204000000
          <idle>-0     [000] d..1   829.879883: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.879883: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.879884: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8220000000 softexpires‚8220000000
          <idle>-0     [000] dn.2   829.881267: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   829.881268: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8204000000 softexpires‚8204000000
          <idle>-0     [000] d..1   829.881271: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.881271: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.881271: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8220000000 softexpires‚8220000000
          <idle>-0     [000] dn.2   829.882656: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   829.882656: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8204000000 softexpires‚8204000000
          <idle>-0     [000] d..1   829.882659: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.882660: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.882660: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8220000000 softexpires‚8220000000
          <idle>-0     [018] d.h2   829.883084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.883084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.883084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8204001360
          <idle>-0     [001] d.h1   829.883084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8204001580
          <idle>-0     [018] d.h1   829.883085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.883085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.883087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8220000000 softexpires‚8220000000
          <idle>-0     [001] d..2   829.883087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8220000000 softexpires‚8220000000
          <idle>-0     [000] dn.2   829.884045: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   829.884046: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8208000000 softexpires‚8208000000
          <idle>-0     [000] d..1   829.884049: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.884049: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.884049: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8220000000 softexpires‚8220000000
          <idle>-0     [000] dn.2   829.885433: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   829.885434: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8208000000 softexpires‚8208000000
          <idle>-0     [000] d..1   829.885437: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.885437: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.885438: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8220000000 softexpires‚8220000000
          <idle>-0     [000] dn.2   829.886822: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   829.886823: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8208000000 softexpires‚8208000000
          <idle>-0     [000] d..1   829.886826: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.886826: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.886826: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8220000000 softexpires‚8220000000
          <idle>-0     [000] dn.2   829.888212: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   829.888213: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8212000000 softexpires‚8212000000
          <idle>-0     [000] d..1   829.888216: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.888216: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.888217: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8236000000 softexpires‚8236000000
          <idle>-0     [000] dn.2   829.889600: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   829.889600: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8212000000 softexpires‚8212000000
          <idle>-0     [000] d..1   829.889603: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.889604: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.889604: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8236000000 softexpires‚8236000000
          <idle>-0     [000] dn.2   829.890989: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   829.890989: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8212000000 softexpires‚8212000000
          <idle>-0     [000] d..1   829.890992: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.890992: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.890993: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8236000000 softexpires‚8236000000
          <idle>-0     [000] dn.2   829.892378: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   829.892379: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8216000000 softexpires‚8216000000
          <idle>-0     [000] d..1   829.892382: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.892382: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.892383: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8236000000 softexpires‚8236000000
          <idle>-0     [000] dn.2   829.893766: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   829.893767: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8216000000 softexpires‚8216000000
          <idle>-0     [000] d..1   829.893770: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.893770: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.893771: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8236000000 softexpires‚8236000000
          <idle>-0     [000] dn.2   829.895155: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] dn.2   829.895156: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8220000000 softexpires‚8220000000
              sh-1993  [000] ....   829.895158: timer_init: timer=ffff8017db00fb40
              sh-1993  [000] d..1   829.895158: timer_start: timer=ffff8017db00fb40 function=process_timeout expires=4295099353 [timeout=2] cpu=0 idx=0 flags=
          <idle>-0     [000] d..1   829.895161: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.895161: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [000] d..2   829.895161: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8228000000 softexpires‚8228000000
          <idle>-0     [018] d.h2   829.899084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.899084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.899084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8220001340
          <idle>-0     [001] d.h1   829.899084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8220001540
          <idle>-0     [018] d.h1   829.899085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.899085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.899086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8236000000 softexpires‚8236000000
          <idle>-0     [001] d..2   829.899088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8236000000 softexpires‚8236000000
          <idle>-0     [000] d.h2   829.907084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d.h1   829.907084: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=828228001680
          <idle>-0     [000] d.h1   829.907086: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d.s2   829.907087: timer_cancel: timer=ffff8017db00fb40
          <idle>-0     [000] ..s1   829.907087: timer_expire_entry: timer=ffff8017db00fb40 function=process_timeout now=4295099354
          <idle>-0     [000] .ns1   829.907089: timer_expire_exit: timer=ffff8017db00fb40
          <idle>-0     [000] dn.2   829.907092: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828232000000 softexpires=828232000000
          <idle>-0     [000] d..1   829.907120: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.907120: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.907121: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828252000000 softexpires=828252000000
          <idle>-0     [018] d.h2   829.915084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.915084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.915084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8236001260
          <idle>-0     [001] d.h1   829.915084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8236001480
          <idle>-0     [018] d.h1   829.915085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.915085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.915087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8252000000 softexpires‚8252000000
          <idle>-0     [001] d..2   829.915087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8252000000 softexpires‚8252000000
          <idle>-0     [018] d.h2   829.931084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [000] d.h2   829.931084: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [001] d.h2   829.931084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.931084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8252001300
          <idle>-0     [000] d.h1   829.931084: hrtimer_expire_entry: hrtimerÿff8017fbe41808 function=tick_sched_timer now‚8252001300
          <idle>-0     [001] d.h1   829.931084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8252001460
          <idle>-0     [018] d.h1   829.931085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.931085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [000] d.h1   829.931085: hrtimer_expire_exit: hrtimerÿff8017fbe41808
          <idle>-0     [018] d..2   829.931086: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8268000000 softexpires‚8268000000
          <idle>-0     [001] d..2   829.931088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8268000000 softexpires‚8268000000
          <idle>-0     [000] d..2   829.931098: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8268000000 softexpires‚8268000000
          <idle>-0     [001] d.h2   829.947084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h2   829.947084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [000] d.h2   829.947084: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [001] d.h1   829.947084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8268001500
          <idle>-0     [018] d.h1   829.947084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8268001480
          <idle>-0     [000] d.h1   829.947084: hrtimer_expire_entry: hrtimerÿff8017fbe41808 function=tick_sched_timer now‚8268001740
          <idle>-0     [001] d.h1   829.947085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.947085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [000] d.h1   829.947086: hrtimer_expire_exit: hrtimerÿff8017fbe41808
          <idle>-0     [018] d..2   829.947087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8284000000 softexpires‚8284000000
          <idle>-0     [001] d..2   829.947087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8284000000 softexpires‚8284000000
          <idle>-0     [000] d..2   829.947089: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8284000000 softexpires‚8284000000
          <idle>-0     [000] d.h2   829.963084: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [018] d.h2   829.963084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   829.963084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   829.963084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8284001460
          <idle>-0     [000] d.h1   829.963084: hrtimer_expire_entry: hrtimerÿff8017fbe41808 function=tick_sched_timer now‚8284001480
          <idle>-0     [001] d.h1   829.963084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8284001600
          <idle>-0     [018] d.h1   829.963085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [000] d.h1   829.963085: hrtimer_expire_exit: hrtimerÿff8017fbe41808
          <idle>-0     [001] d.h1   829.963085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.963087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8300000000 softexpires‚8300000000
          <idle>-0     [001] d..2   829.963088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8300000000 softexpires‚8300000000
          <idle>-0     [000] d..2   829.963088: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8300000000 softexpires‚8300000000
          <idle>-0     [018] d.h2   829.979085: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [018] d.h1   829.979085: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8300002260
          <idle>-0     [000] d.h2   829.979087: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [001] d.h2   829.979088: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [000] d.h1   829.979088: hrtimer_expire_entry: hrtimerÿff8017fbe41808 function=tick_sched_timer now‚8300005100
          <idle>-0     [001] d.h1   829.979088: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8300005260
          <idle>-0     [018] d.h1   829.979088: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [000] d.h1   829.979089: hrtimer_expire_exit: hrtimerÿff8017fbe41808
          <idle>-0     [001] d.h1   829.979089: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d..2   829.979090: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8316000000 softexpires‚8316000000
          <idle>-0     [001] d..2   829.979091: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8316000000 softexpires‚8316000000
          <idle>-0     [000] d..2   829.979092: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8316000000 softexpires‚8316000000
          <idle>-0     [001] d.h2   829.995084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h2   829.995084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [000] d.h2   829.995084: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [018] d.h1   829.995084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8316001520
          <idle>-0     [000] d.h1   829.995084: hrtimer_expire_entry: hrtimerÿff8017fbe41808 function=tick_sched_timer now‚8316001680
          <idle>-0     [001] d.h1   829.995084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8316001460
          <idle>-0     [018] d.h1   829.995085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h1   829.995085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [000] d.h1   829.995085: hrtimer_expire_exit: hrtimerÿff8017fbe41808
          <idle>-0     [018] d..2   829.995087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8332000000 softexpires‚8332000000
          <idle>-0     [001] d..2   829.995088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8332000000 softexpires‚8332000000
          <idle>-0     [000] d..2   829.995089: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8332000000 softexpires‚8332000000
          <idle>-0     [000] d.h2   830.011084: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [018] d.h2   830.011084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [001] d.h2   830.011084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   830.011084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8332001440
          <idle>-0     [000] d.h1   830.011084: hrtimer_expire_entry: hrtimerÿff8017fbe41808 function=tick_sched_timer now‚8332001480
          <idle>-0     [001] d.h1   830.011084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8332001640
          <idle>-0     [000] d.h1   830.011085: hrtimer_expire_exit: hrtimerÿff8017fbe41808
          <idle>-0     [001] d.h1   830.011085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   830.011085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [018] d..2   830.011088: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8348000000 softexpires‚8348000000
          <idle>-0     [001] d..2   830.011088: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8348000000 softexpires‚8348000000
          <idle>-0     [000] d..2   830.011088: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8348000000 softexpires‚8348000000
          <idle>-0     [001] d.h2   830.027084: hrtimer_cancel: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h2   830.027084: hrtimer_cancel: hrtimerÿff8017fbc74808
          <idle>-0     [000] d.h2   830.027084: hrtimer_cancel: hrtimerÿff8017fbe41808
          <idle>-0     [001] d.h1   830.027084: hrtimer_expire_entry: hrtimerÿff8017fbe5c808 function=tick_sched_timer now‚8348001500
          <idle>-0     [018] d.h1   830.027084: hrtimer_expire_entry: hrtimerÿff8017fbc74808 function=tick_sched_timer now‚8348001500
          <idle>-0     [000] d.h1   830.027084: hrtimer_expire_entry: hrtimerÿff8017fbe41808 function=tick_sched_timer now‚8348001660
          <idle>-0     [001] d.h1   830.027085: hrtimer_expire_exit: hrtimerÿff8017fbe5c808
          <idle>-0     [018] d.h1   830.027085: hrtimer_expire_exit: hrtimerÿff8017fbc74808
          <idle>-0     [000] d.h1   830.027085: hrtimer_expire_exit: hrtimerÿff8017fbe41808
          <idle>-0     [018] d..2   830.027087: hrtimer_start: hrtimerÿff8017fbc74808 function=tick_sched_timer expires‚8364000000 softexpires‚8364000000
          <idle>-0     [001] d..2   830.027087: hrtimer_start: hrtimerÿff8017fbe5c808 function=tick_sched_timer expires‚8364000000 softexpires‚8364000000
          <idle>-0     [000] d..2   830.027091: hrtimer_start: hrtimerÿff8017fbe41808 function=tick_sched_timer expires‚8364000000 softexpires‚8364000000
 

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-28 13:08                                                             ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-28 13:08 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: dzickus, sfr, linuxarm, Nicholas Piggin, abdhalee, sparclinux,
	akpm, linuxppc-dev, David Miller, linux-arm-kernel

On Fri, 28 Jul 2017 08:44:11 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Thu, 27 Jul 2017 09:52:45 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Thu, Jul 27, 2017 at 05:39:23PM +0100, Jonathan Cameron wrote:  
> > > On Thu, 27 Jul 2017 14:49:03 +0100
> > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> > >     
> > > > On Thu, 27 Jul 2017 05:49:13 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > >     
> > > > > On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:      
> > > > > > On Wed, 26 Jul 2017 18:42:14 -0700
> > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > > >         
> > > > > > > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:        
> > > > > >         
> > > > > > > > Indeed, that really wouldn't explain how we end up with a RCU stall
> > > > > > > > dump listing almost all of the cpus as having missed a grace period.          
> > > > > > > 
> > > > > > > I have seen stranger things, but admittedly not often.        
> > > > > > 
> > > > > > So the backtraces show the RCU gp thread in schedule_timeout.
> > > > > > 
> > > > > > Are you sure that its timeout has expired and it's not being scheduled,
> > > > > > or could it be a bad (large) timeout (looks unlikely) or that it's being
> > > > > > scheduled but not correctly noting gps on other CPUs?
> > > > > > 
> > > > > > It's not in R state, so if it's not being scheduled at all, then it's
> > > > > > because the timer has not fired:        
> > > > > 
> > > > > Good point, Nick!
> > > > > 
> > > > > Jonathan, could you please reproduce collecting timer event tracing?      
> > > > I'm a little new to tracing (only started playing with it last week)
> > > > so fingers crossed I've set it up right.  No splats yet.  Was getting
> > > > splats on reading out the trace when running with the RCU stall timer
> > > > set to 4 so have increased that back to the default and am rerunning.
> > > > 
> > > > This may take a while.  Correct me if I've gotten this wrong to save time
> > > > 
> > > > echo "timer:*" > /sys/kernel/debug/tracing/set_event
> > > > 
> > > > when it dumps, just send you the relevant part of what is in
> > > > /sys/kernel/debug/tracing/trace?    
> > > 
> > > Interestingly the only thing that can make it trip for me with tracing on
> > > is peeking in the tracing buffers.  Not sure this is a valid case or
> > > not.
> > > 
> > > Anyhow all timer activity seems to stop around the area of interest.
> > > 

In the interests of tidier traces with less unrelated stuff in them, I disabled
usb and the sas driver (which were responsible for most of the hrtimer events).
The problem just became a whole lot easier to trigger: it now happens every few
minutes if I do pretty much anything on the machine at all.

Running with CONFIG_RCU_CPU_STALL_TIMEOUT=8 (left over from trying
to get it to trigger more often before disabling usb and sas).
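
For reference, the timer event tracing behind the logs below boils down to
something like the following (a minimal sketch only; the tracing_on toggling
and the output filename are illustrative, the set_event line is the one from
above):

  cd /sys/kernel/debug/tracing
  echo 0 > tracing_on               # stop tracing while configuring
  echo "timer:*" > set_event        # enable all timer/hrtimer trace events
  echo 1 > tracing_on               # collect until the stall warning fires
  # once the splat appears:
  cat trace > /tmp/timer-trace.txt  # save the window around the stall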

No sign of the massive gap. I'm thinking that was related to the
trace buffer readout.

More long logs I'm afraid:

[  775.760469] random: crng init done
[  835.595087] INFO: rcu_preempt detected stalls on CPUs/tasks:
[  835.600740] 	2-...: (11 GPs behind) idle=d60/0/0 softirq=368/369 fqs=0 last_accelerate: fae9/0968, nonlazy_posted: 0, L.
[  835.611595] 	3-...: (14 GPs behind) idle=0b0/0/0 softirq=238/238 fqs=0 last_accelerate: f91c/096c, nonlazy_posted: 0, L.
[  835.622450] 	4-...: (17 GPs behind) idle=464/0/0 softirq=230/234 fqs=0 last_accelerate: 01f5/096c, nonlazy_posted: 0, L.
[  835.633305] 	5-...: (17 GPs behind) idle=f4c/0/0 softirq=210/212 fqs=0 last_accelerate: 01f3/0970, nonlazy_posted: 0, L.
[  835.644159] 	6-...: (128 GPs behind) idle=e7c/0/0 softirq=218/219 fqs=0 last_accelerate: 587f/0974, nonlazy_posted: 0, L.
[  835.655099] 	7-...: (128 GPs behind) idle=e40/0/0 softirq=214/217 fqs=0 last_accelerate: 587f/0974, nonlazy_posted: 0, L.
[  835.666040] 	8-...: (109 GPs behind) idle=e20/0/0 softirq=212/212 fqs=0 last_accelerate: 587f/0978, nonlazy_posted: 0, L.
[  835.676981] 	9-...: (128 GPs behind) idle=d70/0/0 softirq=195/196 fqs=0 last_accelerate: 587f/097c, nonlazy_posted: 0, L.
[  835.687921] 	10-...: (130 GPs behind) idle=b98/0/0 softirq=429/431 fqs=0 last_accelerate: 587f/097c, nonlazy_posted: 0, L.
[  835.698949] 	11-...: (128 GPs behind) idle=d10/0/0 softirq=190/190 fqs=0 last_accelerate: 587f/0980, nonlazy_posted: 0, L.
[  835.709976] 	12-...: (108 GPs behind) idle=d60/0/0 softirq=201/202 fqs=0 last_accelerate: 587f/0984, nonlazy_posted: 0, L.
[  835.721003] 	13-...: (103 GPs behind) idle=5b4/0/0 softirq=200/200 fqs=0 last_accelerate: 587f/0984, nonlazy_posted: 0, L.
[  835.732031] 	14-...: (128 GPs behind) idle=b44/0/0 softirq=185/185 fqs=0 last_accelerate: 587f/0988, nonlazy_posted: 0, L.
[  835.743058] 	15-...: (128 GPs behind) idle=c1c/0/0 softirq=196/197 fqs=0 last_accelerate: 587f/098c, nonlazy_posted: 0, L.
[  835.754086] 	16-...: (29 GPs behind) idle=37c/0/0 softirq=205/205 fqs=0 last_accelerate: dff1/098c, nonlazy_posted: 0, L.
[  835.765026] 	17-...: (8 GPs behind) idle=b08/0/0 softirq=201/203 fqs=0 last_accelerate: fabc/0990, nonlazy_posted: 0, L.
[  835.775880] 	19-...: (0 ticks this GP) idle=994/0/0 softirq=175/175 fqs=0 last_accelerate: 0194/0994, nonlazy_posted: 0, L.
[  835.786994] 	20-...: (59 GPs behind) idle=984/0/0 softirq=185/185 fqs=0 last_accelerate: e5dc/0994, nonlazy_posted: 0, L.
[  835.797935] 	21-...: (6 GPs behind) idle=8e4/0/0 softirq=176/178 fqs=0 last_accelerate: fd08/0998, nonlazy_posted: 0, L.
[  835.808789] 	23-...: (112 GPs behind) idle=7e0/0/0 softirq=166/168 fqs=0 last_accelerate: 587f/099c, nonlazy_posted: 0, L.
[  835.819816] 	24-...: (103 GPs behind) idle=e4c/0/0 softirq=168/168 fqs=0 last_accelerate: 587f/09a0, nonlazy_posted: 0, L.
[  835.830843] 	25-...: (20 GPs behind) idle=930/0/0 softirq=1303/1303 fqs=0 last_accelerate: edf1/09a0, nonlazy_posted: 0, L.
[  835.841958] 	26-...: (4 GPs behind) idle=6d8/0/0 softirq=178/180 fqs=0 last_accelerate: fd08/09a4, nonlazy_posted: 0, L.
[  835.852812] 	27-...: (60 GPs behind) idle=8a8/0/0 softirq=183/183 fqs=0 last_accelerate: e5dc/09a8, nonlazy_posted: 0, L.
[  835.863752] 	28-...: (3 GPs behind) idle=6c4/0/0 softirq=167/168 fqs=0 last_accelerate: fe00/09a8, nonlazy_posted: 0, L.
[  835.874606] 	29-...: (57 GPs behind) idle=588/0/0 softirq=152/153 fqs=0 last_accelerate: e7d4/09ac, nonlazy_posted: 0, L.
[  835.885547] 	30-...: (10 GPs behind) idle=5cc/0/0 softirq=161/163 fqs=0 last_accelerate: f9f8/09b0, nonlazy_posted: 0, L.
[  835.896488] 	31-...: (1 GPs behind) idle=5f8/0/0 softirq=246/248 fqs=0 last_accelerate: fe18/09b0, nonlazy_posted: 0, L.
[  835.907342] 	32-...: (102 GPs behind) idle=998/0/0 softirq=158/159 fqs=0 last_accelerate: 587f/09b4, nonlazy_posted: 0, L.
[  835.918369] 	33-...: (102 GPs behind) idle=7f4/0/0 softirq=141/141 fqs=0 last_accelerate: 587f/09b8, nonlazy_posted: 0, L.
[  835.929397] 	35-...: (146 GPs behind) idle=400/0/0 softirq=142/143 fqs=0 last_accelerate: 587f/09b8, nonlazy_posted: 0, L.
[  835.940425] 	36-...: (39 GPs behind) idle=424/0/0 softirq=115/115 fqs=0 last_accelerate: 01f5/09bc, nonlazy_posted: 0, L.
[  835.951366] 	37-...: (38 GPs behind) idle=400/0/0 softirq=126/128 fqs=0 last_accelerate: fdd4/09c0, nonlazy_posted: 0, L.
[  835.962307] 	38-...: (17 GPs behind) idle=5b4/0/0 softirq=139/142 fqs=0 last_accelerate: d804/09c0, nonlazy_posted: 0, L.
[  835.973247] 	39-...: (17 GPs behind) idle=104/0/0 softirq=110/110 fqs=0 last_accelerate: 01f8/09c4, nonlazy_posted: 0, L.
[  835.984188] 	40-...: (150 GPs behind) idle=098/0/0 softirq=107/110 fqs=0 last_accelerate: 587f/09c8, nonlazy_posted: 0, L.
[  835.995216] 	41-...: (145 GPs behind) idle=038/0/0 softirq=102/103 fqs=0 last_accelerate: 587f/09cc, nonlazy_posted: 0, L.
[  836.006243] 	42-...: (147 GPs behind) idle=fa8/0/0 softirq=94/94 fqs=0 last_accelerate: 587f/09cc, nonlazy_posted: 0, L.
[  836.017097] 	43-...: (145 GPs behind) idle=f68/0/0 softirq=98/98 fqs=0 last_accelerate: 587f/09d0, nonlazy_posted: 0, L.
[  836.027951] 	44-...: (147 GPs behind) idle=f98/0/0 softirq=90/90 fqs=0 last_accelerate: 587f/09d4, nonlazy_posted: 0, L.
[  836.038805] 	45-...: (147 GPs behind) idle=f44/0/0 softirq=76/77 fqs=0 last_accelerate: 587f/09d4, nonlazy_posted: 0, L.
[  836.049658] 	46-...: (147 GPs behind) idle=f3c/0/0 softirq=77/77 fqs=0 last_accelerate: 587f/09d8, nonlazy_posted: 0, L.
[  836.060513] 	47-...: (146 GPs behind) idle=e60/0/0 softirq=77/78 fqs=0 last_accelerate: 587f/09dc, nonlazy_posted: 0, L.
[  836.071367] 	48-...: (49 GPs behind) idle=e10/0/0 softirq=78/80 fqs=0 last_accelerate: ef2c/09dc, nonlazy_posted: 0, L.
[  836.082134] 	49-...: (34 GPs behind) idle=dd4/0/0 softirq=72/72 fqs=0 last_accelerate: d805/09e0, nonlazy_posted: 0, L.
[  836.092902] 	50-...: (147 GPs behind) idle=c20/0/0 softirq=58/60 fqs=0 last_accelerate: 587f/09e4, nonlazy_posted: 0, L.
[  836.103756] 	51-...: (145 GPs behind) idle=bf4/0/0 softirq=63/63 fqs=0 last_accelerate: 587f/09e4, nonlazy_posted: 0, L.
[  836.114610] 	52-...: (47 GPs behind) idle=bf0/0/0 softirq=55/56 fqs=0 last_accelerate: f25c/09e8, nonlazy_posted: 0, L.
[  836.125377] 	53-...: (31 GPs behind) idle=ca8/0/0 softirq=51/53 fqs=0 last_accelerate: 03ea/09ec, nonlazy_posted: 0, L.
[  836.136144] 	54-...: (33 GPs behind) idle=c5c/0/0 softirq=54/54 fqs=0 last_accelerate: aff0/09ec, nonlazy_posted: 0, L.
[  836.146912] 	55-...: (146 GPs behind) idle=ab8/0/0 softirq=58/59 fqs=0 last_accelerate: 587f/09f0, nonlazy_posted: 0, L.
[  836.157766] 	56-...: (43 GPs behind) idle=f2c/0/0 softirq=67/67 fqs=0 last_accelerate: f8dc/09f4, nonlazy_posted: 0, L.
[  836.168533] 	57-...: (42 GPs behind) idle=a74/0/0 softirq=48/49 fqs=0 last_accelerate: 7e00/09f4, nonlazy_posted: 0, L.
[  836.179300] 	58-...: (29 GPs behind) idle=f7c/0/0 softirq=60/62 fqs=0 last_accelerate: 05ea/09f8, nonlazy_posted: 0, L.
[  836.190068] 	59-...: (103 GPs behind) idle=cbc/0/0 softirq=48/49 fqs=0 last_accelerate: 587f/09fc, nonlazy_posted: 0, L.
[  836.200921] 	60-...: (40 GPs behind) idle=97c/0/0 softirq=39/39 fqs=0 last_accelerate: fcdc/09fc, nonlazy_posted: 0, L.
[  836.211688] 	61-...: (20 GPs behind) idle=870/0/0 softirq=27/31 fqs=0 last_accelerate: edf5/0a00, nonlazy_posted: 0, L.
[  836.222456] 	62-...: (17 GPs behind) idle=7a8/0/0 softirq=33/34 fqs=0 last_accelerate: 01f3/0a04, nonlazy_posted: 0, L.
[  836.233223] 	63-...: (147 GPs behind) idle=6fc/0/0 softirq=33/35 fqs=0 last_accelerate: 587f/0a04, nonlazy_posted: 0, L.
[  836.244075] 	(detected by 1, t=2164 jiffies, g=112, c=111, q=2706)
[  836.250246] Task dump for CPU 2:
[  836.253460] swapper/2       R  running task        0     0      1 0x00000000
[  836.260496] Call trace:
[  836.262935] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.268062] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[  836.274836] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[  836.280223] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  836.285694] Task dump for CPU 3:
[  836.288908] swapper/3       R  running task        0     0      1 0x00000000
[  836.295944] Call trace:
[  836.298378] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.303502] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  836.308973] Task dump for CPU 4:
[  836.312187] swapper/4       R  running task        0     0      1 0x00000000
[  836.319223] Call trace:
[  836.321657] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.326781] [<          (null)>]           (null)
[  836.331470] Task dump for CPU 5:
[  836.334685] swapper/5       R  running task        0     0      1 0x00000000
[  836.341719] Call trace:
[  836.344153] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.349277] [<          (null)>]           (null)
[  836.353967] Task dump for CPU 6:
[  836.357181] swapper/6       R  running task        0     0      1 0x00000000
[  836.364216] Call trace:
[  836.366649] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.371773] [<          (null)>]           (null)
[  836.376463] Task dump for CPU 7:
[  836.379678] swapper/7       R  running task        0     0      1 0x00000000
[  836.386713] Call trace:
[  836.389146] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.394270] [<          (null)>]           (null)
[  836.398959] Task dump for CPU 8:
[  836.402173] swapper/8       R  running task        0     0      1 0x00000000
[  836.409209] Call trace:
[  836.411642] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.416766] [<          (null)>]           (null)
[  836.421455] Task dump for CPU 9:
[  836.424669] swapper/9       R  running task        0     0      1 0x00000000
[  836.431704] Call trace:
[  836.434137] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.439261] [<          (null)>]           (null)
[  836.443951] Task dump for CPU 10:
[  836.447251] swapper/10      R  running task        0     0      1 0x00000000
[  836.454287] Call trace:
[  836.456720] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.461844] [<          (null)>]           (null)
[  836.466533] Task dump for CPU 11:
[  836.469834] swapper/11      R  running task        0     0      1 0x00000000
[  836.476869] Call trace:
[  836.479303] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.484426] [<          (null)>]           (null)
[  836.489116] Task dump for CPU 12:
[  836.492417] swapper/12      R  running task        0     0      1 0x00000000
[  836.499451] Call trace:
[  836.501885] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.507009] [<          (null)>]           (null)
[  836.511698] Task dump for CPU 13:
[  836.514999] swapper/13      R  running task        0     0      1 0x00000000
[  836.522033] Call trace:
[  836.524467] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.529591] [<          (null)>]           (null)
[  836.534280] Task dump for CPU 14:
[  836.537581] swapper/14      R  running task        0     0      1 0x00000000
[  836.544616] Call trace:
[  836.547049] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.552173] [<          (null)>]           (null)
[  836.556863] Task dump for CPU 15:
[  836.560163] swapper/15      R  running task        0     0      1 0x00000000
[  836.567198] Call trace:
[  836.569632] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.574755] [<          (null)>]           (null)
[  836.579445] Task dump for CPU 16:
[  836.582746] swapper/16      R  running task        0     0      1 0x00000000
[  836.589780] Call trace:
[  836.592214] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.597338] [<          (null)>]           (null)
[  836.602027] Task dump for CPU 17:
[  836.605328] swapper/17      R  running task        0     0      1 0x00000000
[  836.612362] Call trace:
[  836.614796] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.619920] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  836.625391] Task dump for CPU 19:
[  836.628692] swapper/19      R  running task        0     0      1 0x00000000
[  836.635727] Call trace:
[  836.638160] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.643285] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[  836.650059] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[  836.655444] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  836.660915] Task dump for CPU 20:
[  836.664216] swapper/20      R  running task        0     0      1 0x00000000
[  836.671251] Call trace:
[  836.673684] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.678808] [<          (null)>]           (null)
[  836.683497] Task dump for CPU 21:
[  836.686798] swapper/21      R  running task        0     0      1 0x00000000
[  836.693833] Call trace:
[  836.696266] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.701391] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  836.706862] Task dump for CPU 23:
[  836.710162] swapper/23      R  running task        0     0      1 0x00000000
[  836.717197] Call trace:
[  836.719631] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.724754] [<          (null)>]           (null)
[  836.729444] Task dump for CPU 24:
[  836.732745] swapper/24      R  running task        0     0      1 0x00000000
[  836.739779] Call trace:
[  836.742213] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.747337] [<          (null)>]           (null)
[  836.752026] Task dump for CPU 25:
[  836.755327] swapper/25      R  running task        0     0      1 0x00000000
[  836.762362] Call trace:
[  836.764795] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.769919] [<          (null)>]           (null)
[  836.774608] Task dump for CPU 26:
[  836.777909] swapper/26      R  running task        0     0      1 0x00000000
[  836.784944] Call trace:
[  836.787377] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.792501] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  836.797972] Task dump for CPU 27:
[  836.801273] swapper/27      R  running task        0     0      1 0x00000000
[  836.808308] Call trace:
[  836.810741] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.815865] [<          (null)>]           (null)
[  836.820555] Task dump for CPU 28:
[  836.823855] swapper/28      R  running task        0     0      1 0x00000000
[  836.830890] Call trace:
[  836.833323] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.838448] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  836.843918] Task dump for CPU 29:
[  836.847219] swapper/29      R  running task        0     0      1 0x00000000
[  836.854254] Call trace:
[  836.856687] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.861811] [<          (null)>]           (null)
[  836.866500] Task dump for CPU 30:
[  836.869801] swapper/30      R  running task        0     0      1 0x00000000
[  836.876836] Call trace:
[  836.879269] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.884394] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  836.889865] Task dump for CPU 31:
[  836.893166] swapper/31      R  running task        0     0      1 0x00000000
[  836.900200] Call trace:
[  836.902634] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.907758] [<          (null)>]           (null)
[  836.912447] Task dump for CPU 32:
[  836.915749] swapper/32      R  running task        0     0      1 0x00000000
[  836.922784] Call trace:
[  836.925217] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.930341] [<          (null)>]           (null)
[  836.935031] Task dump for CPU 33:
[  836.938332] swapper/33      R  running task        0     0      1 0x00000000
[  836.945367] Call trace:
[  836.947801] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.952925] [<          (null)>]           (null)
[  836.957614] Task dump for CPU 35:
[  836.960916] swapper/35      R  running task        0     0      1 0x00000000
[  836.967951] Call trace:
[  836.970384] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.975508] [<          (null)>]           (null)
[  836.980198] Task dump for CPU 36:
[  836.983499] swapper/36      R  running task        0     0      1 0x00000000
[  836.990534] Call trace:
[  836.992967] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  836.998091] [<          (null)>]           (null)
[  837.002781] Task dump for CPU 37:
[  837.006082] swapper/37      R  running task        0     0      1 0x00000000
[  837.013117] Call trace:
[  837.015551] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.020675] [<          (null)>]           (null)
[  837.025365] Task dump for CPU 38:
[  837.028666] swapper/38      R  running task        0     0      1 0x00000000
[  837.035701] Call trace:
[  837.038135] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.043259] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  837.048730] Task dump for CPU 39:
[  837.052031] swapper/39      R  running task        0     0      1 0x00000000
[  837.059067] Call trace:
[  837.061500] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.066624] [<          (null)>]           (null)
[  837.071314] Task dump for CPU 40:
[  837.074615] swapper/40      R  running task        0     0      1 0x00000000
[  837.081650] Call trace:
[  837.084083] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.089207] [<          (null)>]           (null)
[  837.093897] Task dump for CPU 41:
[  837.097198] swapper/41      R  running task        0     0      1 0x00000000
[  837.104234] Call trace:
[  837.106667] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.111791] [<          (null)>]           (null)
[  837.116480] Task dump for CPU 42:
[  837.119781] swapper/42      R  running task        0     0      1 0x00000000
[  837.126817] Call trace:
[  837.129250] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.134374] [<          (null)>]           (null)
[  837.139064] Task dump for CPU 43:
[  837.142365] swapper/43      R  running task        0     0      1 0x00000000
[  837.149400] Call trace:
[  837.151833] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.156957] [<          (null)>]           (null)
[  837.161647] Task dump for CPU 44:
[  837.164948] swapper/44      R  running task        0     0      1 0x00000000
[  837.171984] Call trace:
[  837.174417] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.179541] [<          (null)>]           (null)
[  837.184231] Task dump for CPU 45:
[  837.187532] swapper/45      R  running task        0     0      1 0x00000000
[  837.194567] Call trace:
[  837.197000] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.202124] [<          (null)>]           (null)
[  837.206813] Task dump for CPU 46:
[  837.210114] swapper/46      R  running task        0     0      1 0x00000000
[  837.217150] Call trace:
[  837.219583] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.224707] [<          (null)>]           (null)
[  837.229397] Task dump for CPU 47:
[  837.232698] swapper/47      R  running task        0     0      1 0x00000000
[  837.239733] Call trace:
[  837.242166] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.247290] [<          (null)>]           (null)
[  837.251980] Task dump for CPU 48:
[  837.255281] swapper/48      R  running task        0     0      1 0x00000000
[  837.262316] Call trace:
[  837.264749] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.269873] [<          (null)>]           (null)
[  837.274563] Task dump for CPU 49:
[  837.277864] swapper/49      R  running task        0     0      1 0x00000000
[  837.284899] Call trace:
[  837.287333] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.292457] [<          (null)>]           (null)
[  837.297146] Task dump for CPU 50:
[  837.300448] swapper/50      R  running task        0     0      1 0x00000000
[  837.307483] Call trace:
[  837.309916] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.315040] [<          (null)>]           (null)
[  837.319729] Task dump for CPU 51:
[  837.323031] swapper/51      R  running task        0     0      1 0x00000000
[  837.330065] Call trace:
[  837.332499] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.337623] [<          (null)>]           (null)
[  837.342313] Task dump for CPU 52:
[  837.345614] swapper/52      R  running task        0     0      1 0x00000000
[  837.352649] Call trace:
[  837.355083] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.360207] [<          (null)>]           (null)
[  837.364896] Task dump for CPU 53:
[  837.368198] swapper/53      R  running task        0     0      1 0x00000000
[  837.375233] Call trace:
[  837.377666] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.382790] [<          (null)>]           (null)
[  837.387480] Task dump for CPU 54:
[  837.390781] swapper/54      R  running task        0     0      1 0x00000000
[  837.397817] Call trace:
[  837.400250] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.405375] [<ffff00000813cf78>] rcu_eqs_enter_common.isra.32+0x1b8/0x228
[  837.412149] [<ffff00000813d044>] rcu_idle_enter+0x5c/0x60
[  837.417534] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  837.423005] Task dump for CPU 55:
[  837.426305] swapper/55      R  running task        0     0      1 0x00000000
[  837.433341] Call trace:
[  837.435774] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.440898] [<          (null)>]           (null)
[  837.445588] Task dump for CPU 56:
[  837.448889] swapper/56      R  running task        0     0      1 0x00000000
[  837.455924] Call trace:
[  837.458358] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.463482] [<          (null)>]           (null)
[  837.468172] Task dump for CPU 57:
[  837.471473] swapper/57      R  running task        0     0      1 0x00000000
[  837.478508] Call trace:
[  837.480942] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.486066] [<ffff000008ff9df0>] __cpu_online_mask+0x0/0x8
[  837.491536] Task dump for CPU 58:
[  837.494837] swapper/58      R  running task        0     0      1 0x00000000
[  837.501873] Call trace:
[  837.504306] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.509430] [<          (null)>]           (null)
[  837.514120] Task dump for CPU 59:
[  837.517421] swapper/59      R  running task        0     0      1 0x00000000
[  837.524456] Call trace:
[  837.526889] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.532013] [<          (null)>]           (null)
[  837.536703] Task dump for CPU 60:
[  837.540004] swapper/60      R  running task        0     0      1 0x00000000
[  837.547039] Call trace:
[  837.549473] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.554597] [<          (null)>]           (null)
[  837.559287] Task dump for CPU 61:
[  837.562588] swapper/61      R  running task        0     0      1 0x00000000
[  837.569623] Call trace:
[  837.572056] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.577180] [<          (null)>]           (null)
[  837.581870] Task dump for CPU 62:
[  837.585171] swapper/62      R  running task        0     0      1 0x00000000
[  837.592206] Call trace:
[  837.594640] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.599764] [<          (null)>]           (null)
[  837.604453] Task dump for CPU 63:
[  837.607754] swapper/63      R  running task        0     0      1 0x00000000
[  837.614790] Call trace:
[  837.617223] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.622347] [<          (null)>]           (null)
[  837.627039] rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
[  837.636416] rcu_preempt     S    0     9      2 0x00000000
[  837.641888] Call trace:
[  837.644321] [<ffff000008085cb8>] __switch_to+0x90/0xa8
[  837.649448] [<ffff000008a16724>] __schedule+0x1a4/0x720
[  837.654659] [<ffff000008a16ce0>] schedule+0x40/0xa8
[  837.659524] [<ffff000008a1a2f0>] schedule_timeout+0x178/0x358
[  837.665257] [<ffff00000813e694>] rcu_gp_kthread+0x534/0x7b8
[  837.670817] [<ffff0000080f33d0>] kthread+0x108/0x138
[  837.675767] [<ffff0000080836c0>] ret_from_fork+0x10/0x50


And here is the trace from 8-ish seconds before, which is where I presume the problem would lie.
If more is needed, ping me and I'll get them uploaded somewhere.
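
For what it's worth, a quick sanity check of the jiffies numbers above
(my own arithmetic; the HZ=250 figure is inferred from the delayed_work
expiries in the trace, not taken from the .config):

    # All figures come from the logs in this mail; HZ is an inference.
    # Two timer_expire_entry events for timer=ffff8017d3923c08:
    #   825.003087 s -> now=4295098128
    #   826.027087 s -> now=4295098384
    jiffies_delta = 4295098384 - 4295098128   # 256 jiffies
    seconds_delta = 826.027087 - 825.003087   # ~1.024 s
    hz = jiffies_delta / seconds_delta        # ~250, so presumably CONFIG_HZ=250

    # "(detected by 1, t=2164 jiffies, ...)": roughly how long the current
    # grace period has been running when the stall is reported.
    print("GP age  : %.1f s" % (2164 / hz))   # ~8.7 s, matching "8ish seconds"

    # "rcu_preempt kthread starved for 2508 jiffies!"
    print("starved : %.1f s" % (2508 / hz))   # ~10 s with the GP kthread not running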


          <idle>-0     [000] d.h2   825.003085: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d.h1   825.003085: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=823324002340
          <idle>-0     [000] d.h1   825.003086: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d.s2   825.003087: timer_cancel: timer=ffff8017d3923c08
          <idle>-0     [000] d.s1   825.003087: timer_expire_entry: timer=ffff8017d3923c08 function=delayed_work_timer_fn now=4295098128
          <idle>-0     [000] dns1   825.003089: timer_expire_exit: timer=ffff8017d3923c08
          <idle>-0     [000] dn.2   825.003091: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=823328000000 softexpires=823328000000
     kworker/0:0-3     [000] d..1   825.003093: timer_start: timer=ffff8017d3923c08 function=delayed_work_timer_fn expires=4295098378 [timeout=250] cpu=0 idx=98 flags=I
          <idle>-0     [000] d..1   825.003095: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   825.003095: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   825.003096: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=824316000000 softexpires=824316000000
          <idle>-0     [000] dn.2   825.234230: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   825.234230: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=823556000000 softexpires=823556000000
          <idle>-0     [001] dn.2   825.234235: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [000] d..1   825.234235: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] dn.2   825.234235: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=823556000000 softexpires=823556000000
          <idle>-0     [000] d..2   825.234236: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   825.234236: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=824316000000 softexpires=824316000000
          <idle>-0     [001] d..1   825.234244: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   825.234244: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   825.234245: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=1264347202655 softexpires=1264347202655
          <idle>-0     [000] d.h2   825.995084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [034] d.h2   825.995085: hrtimer_cancel: hrtimer=ffff8017dba91808
          <idle>-0     [034] d.h1   825.995085: hrtimer_expire_entry: hrtimer=ffff8017dba91808 function=tick_sched_timer now=824316002180
          <idle>-0     [000] d.h1   825.995085: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=824316002240
          <idle>-0     [000] d.h1   825.995086: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [034] d.h1   825.995087: hrtimer_expire_exit: hrtimer=ffff8017dba91808
          <idle>-0     [000] d.s2   825.995087: timer_cancel: timer=ffff8017d3923408
          <idle>-0     [000] d.s1   825.995087: timer_expire_entry: timer=ffff8017d3923408 function=delayed_work_timer_fn now=4295098376
          <idle>-0     [034] d.s2   825.995087: timer_cancel: timer=ffff00000909a9a0
          <idle>-0     [034] d.s1   825.995088: timer_expire_entry: timer=ffff00000909a9a0 function=delayed_work_timer_fn now=4295098376
          <idle>-0     [000] dns1   825.995089: timer_expire_exit: timer=ffff8017d3923408
          <idle>-0     [034] dns1   825.995089: timer_expire_exit: timer=ffff00000909a9a0
          <idle>-0     [000] dn.2   825.995091: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=824320000000 softexpires=824320000000
          <idle>-0     [034] dn.2   825.995092: hrtimer_start: hrtimer=ffff8017dba91808 function=tick_sched_timer expires=824320000000 softexpires=824320000000
     kworker/0:0-3     [000] d..1   825.995093: timer_start: timer=ffff8017d3923408 function=delayed_work_timer_fn expires=4295098626 [timeout=250] cpu=0 idx=65 flags=I
          <idle>-0     [000] d..1   825.995096: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   825.995096: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   825.995096: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=824348000000 softexpires=824348000000
    kworker/34:2-1587  [034] d..1   825.995097: timer_start: timer=ffff00000909a9a0 function=delayed_work_timer_fn expires=4295098626 [timeout=250] cpu=34 idx=65 flags=I
          <idle>-0     [034] d..1   825.995100: tick_stop: success=1 dependency=NONE
          <idle>-0     [034] d..2   825.995101: hrtimer_cancel: hrtimer=ffff8017dba91808
          <idle>-0     [034] d..2   825.995101: hrtimer_start: hrtimer=ffff8017dba91808 function=tick_sched_timer expires=825340000000 softexpires=825340000000
          <idle>-0     [000] d.h2   826.027085: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d.h1   826.027085: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=824348002340
          <idle>-0     [000] d.h1   826.027086: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d.s2   826.027087: timer_cancel: timer=ffff8017d3923c08
          <idle>-0     [000] d.s1   826.027087: timer_expire_entry: timer=ffff8017d3923c08 function=delayed_work_timer_fn now=4295098384
          <idle>-0     [000] dns1   826.027088: timer_expire_exit: timer=ffff8017d3923c08
          <idle>-0     [000] dn.2   826.027091: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=824352000000 softexpires=824352000000
     kworker/0:0-3     [000] d..1   826.027093: timer_start: timer=ffff8017d3923c08 function=delayed_work_timer_fn expires=4295098634 [timeout=250] cpu=0 idx=66 flags=I
          <idle>-0     [000] d..1   826.027095: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   826.027095: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   826.027096: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=825340000000 softexpires=825340000000
          <idle>-0     [000] dn.2   826.185331: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   826.185331: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=824508000000 softexpires=824508000000
          <idle>-0     [001] dn.2   826.185336: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [000] d..1   826.185336: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] dn.2   826.185336: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=824508000000 softexpires=824508000000
          <idle>-0     [000] d..2   826.185337: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   826.185337: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=825340000000 softexpires=825340000000
          <idle>-0     [001] d..1   826.185349: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   826.185349: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   826.185350: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=1265299202655 softexpires=1265299202655
          <idle>-0     [000] dn.2   826.382948: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   826.382949: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=824704000000 softexpires=824704000000
          <idle>-0     [001] dn.2   826.382954: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [000] d..1   826.382954: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] dn.2   826.382954: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=824704000000 softexpires=824704000000
          <idle>-0     [000] d..2   826.382955: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   826.382955: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=825340000000 softexpires=825340000000
          <idle>-0     [001] d..1   826.382963: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   826.382963: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   826.382964: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=1265495202655 softexpires=1265495202655
          <idle>-0     [000] dn.2   826.863667: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   826.863667: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=825188000000 softexpires=825188000000
          <idle>-0     [001] dn.2   826.863672: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [000] d..1   826.863672: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] dn.2   826.863672: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=825188000000 softexpires=825188000000
          <idle>-0     [000] d..2   826.863673: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   826.863673: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=825340000000 softexpires=825340000000
          <idle>-0     [001] d..1   826.863680: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   826.863681: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   826.863681: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=1265979202655 softexpires=1265979202655
The point roughly 8 seconds before the issue was detected is somewhere around here.

          <idle>-0     [000] d.h2   827.019085: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [034] d.h2   827.019085: hrtimer_cancel: hrtimer=ffff8017dba91808
          <idle>-0     [034] d.h1   827.019085: hrtimer_expire_entry: hrtimer=ffff8017dba91808 function=tick_sched_timer now=825340002280
          <idle>-0     [000] d.h1   827.019085: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=825340002260
          <idle>-0     [034] d.h1   827.019086: hrtimer_expire_exit: hrtimer=ffff8017dba91808
          <idle>-0     [000] d.h1   827.019087: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [034] d.s2   827.019087: timer_cancel: timer=ffff00000909a9a0
          <idle>-0     [000] d.s2   827.019088: timer_cancel: timer=ffff8017d3923408
          <idle>-0     [034] d.s1   827.019088: timer_expire_entry: timer=ffff00000909a9a0 function=delayed_work_timer_fn now=4295098632
          <idle>-0     [000] d.s1   827.019088: timer_expire_entry: timer=ffff8017d3923408 function=delayed_work_timer_fn now=4295098632
          <idle>-0     [034] dns1   827.019089: timer_expire_exit: timer=ffff00000909a9a0
          <idle>-0     [000] dns1   827.019089: timer_expire_exit: timer=ffff8017d3923408
          <idle>-0     [000] dns2   827.019090: timer_cancel: timer=ffff0000090295a8
          <idle>-0     [000] dns1   827.019090: timer_expire_entry: timer=ffff0000090295a8 function=delayed_work_timer_fn now=4295098632
          <idle>-0     [000] dns1   827.019090: timer_expire_exit: timer=ffff0000090295a8
          <idle>-0     [034] dn.2   827.019092: hrtimer_start: hrtimer=ffff8017dba91808 function=tick_sched_timer expires=825344000000 softexpires=825344000000
          <idle>-0     [000] dn.2   827.019093: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=825344000000 softexpires=825344000000
     kworker/0:0-3     [000] d..1   827.019095: timer_start: timer=ffff8017d3923408 function=delayed_work_timer_fn expires=4295098882 [timeout=250] cpu=0 idx=97 flags=I
    kworker/34:2-1587  [034] d..1   827.019098: timer_start: timer=ffff00000909a9a0 function=delayed_work_timer_fn expires=4295098882 [timeout=250] cpu=34 idx=97 flags=I
          <idle>-0     [034] d..1   827.019101: tick_stop: success=1 dependency=NONE
          <idle>-0     [034] d..2   827.019101: hrtimer_cancel: hrtimer=ffff8017dba91808
     kworker/0:0-3     [000] d..1   827.019102: timer_start: timer=ffff0000090295a8 function=delayed_work_timer_fn expires=4295099000 [timeout=368] cpu=0 idx=81 flags=D|I
          <idle>-0     [034] d..2   827.019102: hrtimer_start: hrtimer=ffff8017dba91808 function=tick_sched_timer expires=826364000000 softexpires=826364000000
          <idle>-0     [000] d..1   827.019104: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   827.019104: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   827.019105: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=825372000000 softexpires=825372000000
          <idle>-0     [000] d.h2   827.051085: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d.h1   827.051085: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=825372002340
          <idle>-0     [000] d.h1   827.051086: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d.s2   827.051087: timer_cancel: timer=ffff8017d3923c08
          <idle>-0     [000] d.s1   827.051087: timer_expire_entry: timer=ffff8017d3923c08 function=delayed_work_timer_fn now=4295098640
          <idle>-0     [000] dns1   827.051088: timer_expire_exit: timer=ffff8017d3923c08
          <idle>-0     [000] dn.2   827.051091: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=825376000000 softexpires=825376000000
     kworker/0:0-3     [000] d..1   827.051093: timer_start: timer=ffff8017d3923c08 function=delayed_work_timer_fn expires=4295098890 [timeout=250] cpu=0 idx=98 flags=I
          <idle>-0     [000] d..1   827.051095: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   827.051095: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   827.051096: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826364000000 softexpires=826364000000
          <idle>-0     [000] dn.2   827.153486: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   827.153486: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=825476000000 softexpires=825476000000
          <idle>-0     [001] dn.2   827.153491: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [000] d..1   827.153491: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] dn.2   827.153491: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=825476000000 softexpires=825476000000
          <idle>-0     [000] d..2   827.153491: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   827.153492: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826364000000 softexpires=826364000000
          <idle>-0     [001] d..1   827.153499: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   827.153500: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   827.153500: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=1266267202655 softexpires=1266267202655
          <idle>-0     [000] dn.2   827.463397: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   827.463398: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=825788000000 softexpires=825788000000
          <idle>-0     [001] dn.2   827.463402: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [000] d..1   827.463403: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] dn.2   827.463403: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=825788000000 softexpires=825788000000
          <idle>-0     [000] d..2   827.463403: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   827.463403: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826364000000 softexpires=826364000000
          <idle>-0     [001] d..1   827.463411: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   827.463411: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   827.463412: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=1266579202655 softexpires=1266579202655
          <idle>-0     [000] dn.2   827.563461: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   827.563462: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=825888000000 softexpires=825888000000
          <idle>-0     [001] dn.2   827.563466: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [000] d..1   827.563467: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] dn.2   827.563467: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=825888000000 softexpires=825888000000
          <idle>-0     [000] d..2   827.563467: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   827.563468: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826364000000 softexpires=826364000000
              sh-1993  [001] ....   827.563474: timer_init: timer=ffff8017db00fb40
              sh-1993  [001] d..1   827.563475: timer_start: timer=ffff8017db00fb40 function=process_timeout expires=4295098770 [timeout=2] cpu=1 idx=0 flags=
          <idle>-0     [001] d..1   827.563477: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   827.563478: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   827.563478: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=825896000000 softexpires=825896000000
          <idle>-0     [001] d.h2   827.575084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d.h1   827.575084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=825896001620
          <idle>-0     [001] d.h1   827.575086: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d.s2   827.575087: timer_cancel: timer=ffff8017db00fb40
          <idle>-0     [001] ..s1   827.575087: timer_expire_entry: timer=ffff8017db00fb40 function=process_timeout now=4295098771
          <idle>-0     [001] .ns1   827.575089: timer_expire_exit: timer=ffff8017db00fb40
          <idle>-0     [001] dns2   827.575089: timer_cancel: timer=ffff8017fbe5e558
          <idle>-0     [001] dns1   827.575089: timer_expire_entry: timer=ffff8017fbe5e558 function=delayed_work_timer_fn now=4295098771
          <idle>-0     [001] dns1   827.575091: timer_expire_exit: timer=ffff8017fbe5e558
          <idle>-0     [001] dn.2   827.575094: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=825900000000 softexpires=825900000000
              sh-1993  [001] ...1   827.575136: hrtimer_init: hrtimer=ffff8017d4f1e8a0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
              sh-1993  [001] ...1   827.575137: hrtimer_init: hrtimer=ffff8017d4f1e8e0 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
              sh-1993  [001] ....   827.575145: hrtimer_init: hrtimer=ffff80176f273888 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_REL
          <idle>-0     [018] dn.2   827.575215: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [018] dn.2   827.575216: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=825900000000 softexpires=825900000000
          <idle>-0     [001] d..1   827.575297: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   827.575298: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   827.575299: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=825916000000 softexpires=825916000000
              ps-2012  [018] ....   827.576609: timer_init: timer=ffff801764b98188
              ps-2012  [018] ....   827.576614: timer_init: timer=ffff801764b98608
              ps-2012  [018] ....   827.576633: timer_init: timer=ffff801764b98188
              ps-2012  [018] ....   827.576635: timer_init: timer=ffff801764b98608
              ps-2012  [018] d.h2   827.579085: hrtimer_cancel: hrtimer=ffff8017fbc74808
              ps-2012  [018] d.h1   827.579086: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=825900002120
              ps-2012  [018] d.h1   827.579093: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
              ps-2012  [018] d.h2   827.579094: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=825904000000 softexpires=825904000000
          <idle>-0     [019] dn.2   827.579107: hrtimer_start: hrtimer=ffff8017fbc8f808 function=tick_sched_timer expires=825904000000 softexpires=825904000000
     rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
     rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295098773 [timeout=1] cpu=19 idx=0 flags=
          <idle>-0     [019] d..1   827.579119: tick_stop: success=1 dependency=NONE
          <idle>-0     [019] d..2   827.579119: hrtimer_cancel: hrtimer=ffff8017fbc8f808
          <idle>-0     [019] d..2   827.579119: hrtimer_start: hrtimer=ffff8017fbc8f808 function=tick_sched_timer expires=840668000000 softexpires=840668000000
          <idle>-0     [018] d..1   827.580189: tick_stop: success=1 dependency=NONE
          <idle>-0     [018] d..2   827.580190: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [018] d..2   827.580191: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=825916000000 softexpires=825916000000
          <idle>-0     [018] d.h2   827.595084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.595084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.595085: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=825916001860
          <idle>-0     [001] d.h1   827.595085: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=825916001960
          <idle>-0     [001] d.h1   827.595086: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.595087: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [018] d..2   827.595089: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=825932000000 softexpires=825932000000
          <idle>-0     [001] d..2   827.595090: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=825932000000 softexpires=825932000000
          <idle>-0     [018] d.h2   827.611084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.611084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.611084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=825932001300
          <idle>-0     [001] d.h1   827.611084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=825932001520
          <idle>-0     [018] d.h1   827.611085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.611085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   827.611088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=825948000000 softexpires=825948000000
          <idle>-0     [018] d..2   827.611090: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=825948000000 softexpires=825948000000
          <idle>-0     [018] d.h2   827.627084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.627084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.627084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=825948001320
          <idle>-0     [001] d.h1   827.627084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=825948001560
          <idle>-0     [001] d.h1   827.627085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.627086: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [018] d..2   827.627087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=825964000000 softexpires=825964000000
          <idle>-0     [001] d..2   827.627088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=825964000000 softexpires=825964000000
          <idle>-0     [018] d.h2   827.643084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.643084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.643084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=825964001260
          <idle>-0     [001] d.h1   827.643084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=825964001440
          <idle>-0     [018] d.h1   827.643085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.643085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   827.643087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=825980000000 softexpires=825980000000
          <idle>-0     [018] d..2   827.643088: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=825980000000 softexpires=825980000000
          <idle>-0     [018] d.h2   827.659084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.659084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.659084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=825980001300
          <idle>-0     [001] d.h1   827.659084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=825980001540
          <idle>-0     [018] d.h1   827.659085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.659085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   827.659086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=825996000000 softexpires=825996000000
          <idle>-0     [001] d..2   827.659088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=825996000000 softexpires=825996000000
          <idle>-0     [018] d.h2   827.675084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.675084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.675084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=825996001320
          <idle>-0     [001] d.h1   827.675084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=825996001520
          <idle>-0     [018] d.h1   827.675085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.675085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   827.675087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826012000000 softexpires=826012000000
          <idle>-0     [018] d..2   827.675088: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826012000000 softexpires=826012000000
          <idle>-0     [018] d.h2   827.691084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.691084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.691084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826012001280
          <idle>-0     [001] d.h1   827.691084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826012001480
          <idle>-0     [018] d.h1   827.691085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.691085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   827.691086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826028000000 softexpires=826028000000
          <idle>-0     [001] d..2   827.691088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826028000000 softexpires=826028000000
          <idle>-0     [018] d.h2   827.707084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.707084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.707084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826028001300
          <idle>-0     [001] d.h1   827.707084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826028001520
          <idle>-0     [018] d.h1   827.707085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.707085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   827.707087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826044000000 softexpires=826044000000
          <idle>-0     [018] d..2   827.707087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826044000000 softexpires=826044000000
          <idle>-0     [018] d.h2   827.723084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [018] d.h1   827.723084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826044001280
          <idle>-0     [001] d.h2   827.723084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d.h1   827.723085: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826044001480
          <idle>-0     [018] d.h1   827.723085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.723086: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   827.723086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826060000000 softexpires=826060000000
          <idle>-0     [001] d..2   827.723088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826060000000 softexpires=826060000000
          <idle>-0     [018] d.h2   827.739084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.739084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.739084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826060001280
          <idle>-0     [001] d.h1   827.739084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826060001460
          <idle>-0     [018] d.h1   827.739085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.739085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   827.739087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826076000000 softexpires=826076000000
          <idle>-0     [001] d..2   827.739087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826076000000 softexpires=826076000000
          <idle>-0     [018] d.h2   827.755084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.755084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.755084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826076001280
          <idle>-0     [001] d.h1   827.755084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826076001460
          <idle>-0     [018] d.h1   827.755085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.755085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   827.755086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826092000000 softexpires=826092000000
          <idle>-0     [001] d..2   827.755088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826092000000 softexpires=826092000000
          <idle>-0     [018] d.h2   827.771084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.771084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.771084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826092001240
          <idle>-0     [001] d.h1   827.771084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826092001460
          <idle>-0     [018] d.h1   827.771085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.771085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   827.771087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826108000000 softexpires=826108000000
          <idle>-0     [001] d..2   827.771087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826108000000 softexpires=826108000000
          <idle>-0     [018] d.h2   827.787084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.787084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.787084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826108001280
          <idle>-0     [001] d.h1   827.787084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826108001500
          <idle>-0     [018] d.h1   827.787085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.787085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   827.787086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826124000000 softexpires=826124000000
          <idle>-0     [001] d..2   827.787087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826124000000 softexpires=826124000000
          <idle>-0     [018] d.h2   827.803084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.803084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.803084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826124001280
          <idle>-0     [001] d.h1   827.803084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826124001480
          <idle>-0     [018] d.h1   827.803085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.803085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   827.803087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826140000000 softexpires=826140000000
          <idle>-0     [018] d..2   827.803087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826140000000 softexpires=826140000000
          <idle>-0     [018] d.h2   827.819084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.819084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.819084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826140001280
          <idle>-0     [001] d.h1   827.819084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826140001480
          <idle>-0     [018] d.h1   827.819085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.819085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   827.819086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826156000000 softexpires=826156000000
          <idle>-0     [001] d..2   827.819088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826156000000 softexpires=826156000000
          <idle>-0     [018] d.h2   827.835084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.835084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.835084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826156001260
          <idle>-0     [001] d.h1   827.835084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826156001460
          <idle>-0     [018] d.h1   827.835085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.835085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   827.835087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826172000000 softexpires=826172000000
          <idle>-0     [018] d..2   827.835087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826172000000 softexpires=826172000000
          <idle>-0     [018] d.h2   827.851084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.851084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.851084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826172001260
          <idle>-0     [001] d.h1   827.851084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826172001440
          <idle>-0     [018] d.h1   827.851085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.851085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   827.851086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826188000000 softexpires=826188000000
          <idle>-0     [001] d..2   827.851087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826188000000 softexpires=826188000000
          <idle>-0     [018] d.h2   827.867084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.867084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.867084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826188001280
          <idle>-0     [001] d.h1   827.867084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826188001500
          <idle>-0     [018] d.h1   827.867085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.867085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   827.867087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826204000000 softexpires=826204000000
          <idle>-0     [001] d..2   827.867087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826204000000 softexpires=826204000000
          <idle>-0     [018] d.h2   827.883084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.883084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.883084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826204001260
          <idle>-0     [001] d.h1   827.883084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826204001460
          <idle>-0     [018] d.h1   827.883085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.883085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   827.883086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826220000000 softexpires=826220000000
          <idle>-0     [001] d..2   827.883088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826220000000 softexpires=826220000000
          <idle>-0     [018] d.h2   827.899084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.899084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.899084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826220001280
          <idle>-0     [001] d.h1   827.899084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826220001440
          <idle>-0     [018] d.h1   827.899085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.899085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   827.899087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826236000000 softexpires=826236000000
          <idle>-0     [001] d..2   827.899087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826236000000 softexpires=826236000000
          <idle>-0     [000] dn.2   827.913426: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   827.913427: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826236000000 softexpires=826236000000
          <idle>-0     [018] d.h2   827.915084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.915084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
              ps-2012  [000] d.h2   827.915084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d.h1   827.915084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826236001200
          <idle>-0     [001] d.h1   827.915084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826236001400
              ps-2012  [000] d.h1   827.915085: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=826236001280
          <idle>-0     [018] d.h1   827.915085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.915085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   827.915086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826252000000 softexpires=826252000000
              ps-2012  [000] d.h1   827.915088: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
              ps-2012  [000] d.h2   827.915088: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826240000000 softexpires=826240000000
          <idle>-0     [001] d..2   827.915090: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826252000000 softexpires=826252000000
          <idle>-0     [000] d..1   827.916071: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   827.916072: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   827.916073: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826252000000 softexpires=826252000000
          <idle>-0     [000] d..2   827.920194: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   827.920195: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826268000000 softexpires=826268000000
          <idle>-0     [018] d.h2   827.931084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.931084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.931084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826252001280
          <idle>-0     [001] d.h1   827.931084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826252001500
          <idle>-0     [018] d.h1   827.931085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.931085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   827.931088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826268000000 softexpires=826268000000
          <idle>-0     [018] d..2   827.931088: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826268000000 softexpires=826268000000
          <idle>-0     [000] d..2   827.935471: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   827.935471: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826284000000 softexpires=826284000000
          <idle>-0     [018] d.h2   827.947084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.947084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.947084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826268001260
          <idle>-0     [001] d.h1   827.947084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826268001480
          <idle>-0     [018] d.h1   827.947085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.947085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   827.947086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826284000000 softexpires=826284000000
          <idle>-0     [001] d..2   827.947088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826284000000 softexpires=826284000000
          <idle>-0     [000] d..2   827.952136: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   827.952137: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826300000000 softexpires=826300000000
          <idle>-0     [005] d..2   827.960777: hrtimer_cancel: hrtimer=ffff8017fbec8808
          <idle>-0     [005] d..2   827.960779: hrtimer_start: hrtimer=ffff8017fbec8808 function=tick_sched_timer expires=1267075202655 softexpires=1267075202655
          <idle>-0     [062] d..2   827.960781: hrtimer_cancel: hrtimer=ffff8017dba25808
          <idle>-0     [062] d..2   827.960783: hrtimer_start: hrtimer=ffff8017dba25808 function=tick_sched_timer expires=1267075202655 softexpires=1267075202655
          <idle>-0     [018] d.h2   827.963085: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.963085: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.963085: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826284002420
          <idle>-0     [001] d.h1   827.963085: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826284002560
          <idle>-0     [018] d.h1   827.963086: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.963087: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   827.963089: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826300000000 softexpires=826300000000
          <idle>-0     [001] d..2   827.963089: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826300000000 softexpires=826300000000
          <idle>-0     [000] d..2   827.967413: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   827.967414: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826316000000 softexpires=826316000000
          <idle>-0     [004] d..2   827.968778: hrtimer_cancel: hrtimer=ffff8017fbead808
          <idle>-0     [004] d..2   827.968779: hrtimer_start: hrtimer=ffff8017fbead808 function=tick_sched_timer expires=1267083202655 softexpires=1267083202655
          <idle>-0     [036] d..2   827.969900: hrtimer_cancel: hrtimer=ffff8017dbac7808
          <idle>-0     [036] d..2   827.969902: hrtimer_start: hrtimer=ffff8017dbac7808 function=tick_sched_timer expires=1267083202655 softexpires=1267083202655
          <idle>-0     [018] d.h2   827.979084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.979084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.979084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826300001260
          <idle>-0     [001] d.h1   827.979084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826300001480
          <idle>-0     [018] d.h1   827.979085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.979085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   827.979087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826316000000 softexpires=826316000000
          <idle>-0     [001] d..2   827.979088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826316000000 softexpires=826316000000
          <idle>-0     [039] d..2   827.980320: hrtimer_cancel: hrtimer=ffff8017dbb18808
          <idle>-0     [039] d..2   827.980321: hrtimer_start: hrtimer=ffff8017dbb18808 function=tick_sched_timer expires=1267095202655 softexpires=1267095202655
          <idle>-0     [000] d..2   827.984080: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   827.984080: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826332000000 softexpires=826332000000
          <idle>-0     [018] d.h2   827.995084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   827.995084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   827.995084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826316001280
          <idle>-0     [001] d.h1   827.995084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826316001460
          <idle>-0     [018] d.h1   827.995085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   827.995085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   827.995087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826332000000 softexpires=826332000000
          <idle>-0     [001] d..2   827.995088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826332000000 softexpires=826332000000
          <idle>-0     [000] d..2   827.999356: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   827.999356: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826348000000 softexpires=826348000000
          <idle>-0     [018] d.h2   828.011084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.011084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d.h1   828.011084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826332001500
          <idle>-0     [018] d.h1   828.011084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826332001300
          <idle>-0     [001] d.h1   828.011085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.011085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [018] d..2   828.011087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826348000000 softexpires=826348000000
          <idle>-0     [001] d..2   828.011088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826348000000 softexpires=826348000000
          <idle>-0     [000] d..2   828.016021: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.016022: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826364000000 softexpires=826364000000
          <idle>-0     [018] d.h2   828.027084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.027084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.027084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826348001300
          <idle>-0     [001] d.h1   828.027084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826348001520
          <idle>-0     [018] d.h1   828.027085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.027085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   828.027087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826364000000 softexpires=826364000000
          <idle>-0     [018] d..2   828.027087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826364000000 softexpires=826364000000
          <idle>-0     [000] d.h2   828.043085: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d.h2   828.043085: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.043085: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.043085: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826364002620
          <idle>-0     [034] d.h2   828.043085: hrtimer_cancel: hrtimer=ffff8017dba91808
          <idle>-0     [001] d.h1   828.043085: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826364002780
          <idle>-0     [000] d.h1   828.043086: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=826364002620
          <idle>-0     [034] d.h1   828.043086: hrtimer_expire_entry: hrtimer=ffff8017dba91808 function=tick_sched_timer now=826364002780
          <idle>-0     [018] d.h1   828.043086: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.043087: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [000] d.h1   828.043087: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d..2   828.043088: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826380000000 softexpires=826380000000
          <idle>-0     [034] d.h1   828.043088: hrtimer_expire_exit: hrtimer=ffff8017dba91808
          <idle>-0     [000] d.s2   828.043088: timer_cancel: timer=ffff8017d3923408
          <idle>-0     [000] d.s1   828.043088: timer_expire_entry: timer=ffff8017d3923408 function=delayed_work_timer_fn now=4295098888
          <idle>-0     [034] d.s2   828.043089: timer_cancel: timer=ffff00000909a9a0
          <idle>-0     [034] d.s1   828.043089: timer_expire_entry: timer=ffff00000909a9a0 function=delayed_work_timer_fn now=4295098888
          <idle>-0     [001] d..2   828.043089: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826380000000 softexpires=826380000000
          <idle>-0     [034] dns1   828.043090: timer_expire_exit: timer=ffff00000909a9a0
          <idle>-0     [000] dns1   828.043090: timer_expire_exit: timer=ffff8017d3923408
          <idle>-0     [034] dn.2   828.043094: hrtimer_start: hrtimer=ffff8017dba91808 function=tick_sched_timer expires=826368000000 softexpires=826368000000
          <idle>-0     [000] dn.2   828.043095: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826368000000 softexpires=826368000000
     kworker/0:0-3     [000] d..1   828.043098: timer_start: timer=ffff8017d3923408 function=delayed_work_timer_fn expires=4295099138 [timeout=250] cpu=0 idx=65 flags=I
          <idle>-0     [000] d..1   828.043101: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   828.043101: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.043101: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826380000000 softexpires=826380000000
    kworker/34:2-1587  [034] d..1   828.043102: timer_start: timer=ffff00000909a9a0 function=delayed_work_timer_fn expires=4295099138 [timeout=250] cpu=34 idx=65 flags=I
          <idle>-0     [034] d..1   828.043104: tick_stop: success=1 dependency=NONE
          <idle>-0     [034] d..2   828.043105: hrtimer_cancel: hrtimer=ffff8017dba91808
          <idle>-0     [034] d..2   828.043105: hrtimer_start: hrtimer=ffff8017dba91808 function=tick_sched_timer expires=827388000000 softexpires=827388000000
          <idle>-0     [000] d..2   828.047965: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.047965: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826396000000 softexpires=826396000000
          <idle>-0     [018] d.h2   828.059084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.059084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.059084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826380001280
          <idle>-0     [001] d.h1   828.059084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826380001500
          <idle>-0     [018] d.h1   828.059085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.059085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.059087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826396000000 softexpires=826396000000
          <idle>-0     [001] d..2   828.059088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826396000000 softexpires=826396000000
          <idle>-0     [000] d.h2   828.075084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d.h2   828.075084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.075084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.075084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826396001280
          <idle>-0     [000] d.h1   828.075084: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=826396001280
          <idle>-0     [001] d.h1   828.075084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826396001480
          <idle>-0     [018] d.h1   828.075085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.075085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [000] d.h1   828.075085: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d.s2   828.075086: timer_cancel: timer=ffff8017d3923c08
          <idle>-0     [018] d..2   828.075086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826412000000 softexpires=826412000000
          <idle>-0     [000] d.s1   828.075086: timer_expire_entry: timer=ffff8017d3923c08 function=delayed_work_timer_fn now=4295098896
          <idle>-0     [000] dns1   828.075088: timer_expire_exit: timer=ffff8017d3923c08
          <idle>-0     [001] d..2   828.075088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826412000000 softexpires=826412000000
          <idle>-0     [000] dn.2   828.075091: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826400000000 softexpires=826400000000
     kworker/0:0-3     [000] d..1   828.075093: timer_start: timer=ffff8017d3923c08 function=delayed_work_timer_fn expires=4295099146 [timeout=250] cpu=0 idx=66 flags=I
          <idle>-0     [000] d..1   828.075095: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   828.075095: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.075096: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826412000000 softexpires=826412000000
          <idle>-0     [000] d..2   828.079906: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.079907: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826428000000 softexpires=826428000000
          <idle>-0     [018] d.h2   828.091084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.091084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.091084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826412001280
          <idle>-0     [001] d.h1   828.091084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826412001460
          <idle>-0     [018] d.h1   828.091085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.091085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.091087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826428000000 softexpires=826428000000
          <idle>-0     [001] d..2   828.091087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826428000000 softexpires=826428000000
          <idle>-0     [000] d..2   828.095183: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.095183: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826444000000 softexpires=826444000000
          <idle>-0     [018] d.h2   828.107084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.107084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.107084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826428001260
          <idle>-0     [001] d.h1   828.107084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826428001480
          <idle>-0     [018] d.h1   828.107085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.107085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.107086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826444000000 softexpires=826444000000
          <idle>-0     [001] d..2   828.107088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826444000000 softexpires=826444000000
          <idle>-0     [000] d..2   828.111848: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.111849: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826460000000 softexpires=826460000000
          <idle>-0     [018] d.h2   828.123084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.123084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.123084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826444001280
          <idle>-0     [001] d.h1   828.123084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826444001460
          <idle>-0     [018] d.h1   828.123085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.123085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.123087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826460000000 softexpires=826460000000
          <idle>-0     [001] d..2   828.123087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826460000000 softexpires=826460000000
          <idle>-0     [000] d..2   828.127125: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.127126: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826476000000 softexpires=826476000000
          <idle>-0     [018] d.h2   828.139084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.139084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.139084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826460001260
          <idle>-0     [001] d.h1   828.139084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826460001460
          <idle>-0     [018] d.h1   828.139085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.139085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.139086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826476000000 softexpires=826476000000
          <idle>-0     [001] d..2   828.139088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826476000000 softexpires=826476000000
          <idle>-0     [000] d..2   828.143791: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.143791: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826492000000 softexpires=826492000000
          <idle>-0     [018] d.h2   828.155084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.155084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.155084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826476001280
          <idle>-0     [001] d.h1   828.155084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826476001480
          <idle>-0     [018] d.h1   828.155085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.155085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   828.155087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826492000000 softexpires=826492000000
          <idle>-0     [018] d..2   828.155087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826492000000 softexpires=826492000000
          <idle>-0     [000] d..2   828.160457: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.160457: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826508000000 softexpires=826508000000
          <idle>-0     [018] d.h2   828.171084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.171084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.171084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826492001300
          <idle>-0     [001] d.h1   828.171084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826492001500
          <idle>-0     [018] d.h1   828.171085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.171085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.171086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826508000000 softexpires=826508000000
          <idle>-0     [001] d..2   828.171087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826508000000 softexpires=826508000000
          <idle>-0     [000] d..2   828.175733: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.175734: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826524000000 softexpires=826524000000
          <idle>-0     [018] d.h2   828.187084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.187084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.187084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826508001300
          <idle>-0     [001] d.h1   828.187084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826508001480
          <idle>-0     [018] d.h1   828.187085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.187085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.187087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826524000000 softexpires=826524000000
          <idle>-0     [001] d..2   828.187087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826524000000 softexpires=826524000000
          <idle>-0     [000] d..2   828.192399: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.192399: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826540000000 softexpires=826540000000
          <idle>-0     [018] d.h2   828.203084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.203084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.203084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826524001260
          <idle>-0     [001] d.h1   828.203084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826524001460
          <idle>-0     [018] d.h1   828.203085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.203085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.203086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826540000000 softexpires=826540000000
          <idle>-0     [001] d..2   828.203087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826540000000 softexpires=826540000000
          <idle>-0     [000] d..2   828.207676: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.207676: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826556000000 softexpires=826556000000
          <idle>-0     [018] d.h2   828.219084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.219084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.219084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826540001260
          <idle>-0     [001] d.h1   828.219084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826540001420
          <idle>-0     [018] d.h1   828.219085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.219085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.219087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826556000000 softexpires=826556000000
          <idle>-0     [001] d..2   828.219087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826556000000 softexpires=826556000000
          <idle>-0     [000] d..2   828.224341: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.224342: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826572000000 softexpires=826572000000
          <idle>-0     [018] d.h2   828.235084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.235084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.235084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826556001260
          <idle>-0     [001] d.h1   828.235084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826556001440
          <idle>-0     [018] d.h1   828.235085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.235085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.235086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826572000000 softexpires=826572000000
          <idle>-0     [001] d..2   828.235087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826572000000 softexpires=826572000000
          <idle>-0     [000] d..2   828.239618: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.239619: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826588000000 softexpires=826588000000
          <idle>-0     [000] dn.2   828.249341: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   828.249342: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826572000000 softexpires=826572000000
          <idle>-0     [018] d.h2   828.251084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.251084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
              ps-2012  [000] d.h2   828.251084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d.h1   828.251084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826572001200
          <idle>-0     [001] d.h1   828.251084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826572001420
              ps-2012  [000] d.h1   828.251085: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=826572001280
          <idle>-0     [018] d.h1   828.251085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.251085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.251087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826588000000 softexpires=826588000000
              ps-2012  [000] d.h1   828.251088: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
              ps-2012  [000] d.h2   828.251088: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826576000000 softexpires=826576000000
          <idle>-0     [001] d..2   828.251093: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826588000000 softexpires=826588000000
          <idle>-0     [000] d..1   828.251937: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   828.251937: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.251938: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826588000000 softexpires=826588000000
          <idle>-0     [000] d..2   828.256023: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.256024: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826604000000 softexpires=826604000000
          <idle>-0     [018] d.h2   828.267084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [018] d.h1   828.267084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826588001240
          <idle>-0     [001] d.h2   828.267084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d.h1   828.267085: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826588001440
          <idle>-0     [018] d.h1   828.267085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.267086: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.267086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826604000000 softexpires=826604000000
          <idle>-0     [001] d..2   828.267088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826604000000 softexpires=826604000000
          <idle>-0     [000] d..2   828.271300: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.271301: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826620000000 softexpires=826620000000
          <idle>-0     [018] d.h2   828.283084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.283084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.283084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826604001240
          <idle>-0     [001] d.h1   828.283084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826604001420
          <idle>-0     [018] d.h1   828.283085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.283085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.283087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826620000000 softexpires=826620000000
          <idle>-0     [001] d..2   828.283087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826620000000 softexpires=826620000000
          <idle>-0     [000] d..2   828.287966: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.287967: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826636000000 softexpires=826636000000
          <idle>-0     [018] d.h2   828.299084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.299084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.299084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826620001260
          <idle>-0     [001] d.h1   828.299084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826620001460
          <idle>-0     [018] d.h1   828.299085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.299085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.299086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826636000000 softexpires=826636000000
          <idle>-0     [001] d..2   828.299087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826636000000 softexpires=826636000000
          <idle>-0     [000] d..2   828.303242: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.303243: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826652000000 softexpires=826652000000
          <idle>-0     [018] d.h2   828.315084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.315084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.315084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826636001260
          <idle>-0     [001] d.h1   828.315084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826636001460
          <idle>-0     [018] d.h1   828.315085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.315085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.315087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826652000000 softexpires=826652000000
          <idle>-0     [001] d..2   828.315087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826652000000 softexpires=826652000000
          <idle>-0     [000] d..2   828.319908: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.319909: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826668000000 softexpires=826668000000
          <idle>-0     [018] d.h2   828.331084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.331084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.331084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826652001240
          <idle>-0     [001] d.h1   828.331084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826652001440
          <idle>-0     [018] d.h1   828.331085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.331085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.331086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826668000000 softexpires=826668000000
          <idle>-0     [001] d..2   828.331087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826668000000 softexpires=826668000000
          <idle>-0     [000] d..2   828.335185: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.335185: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826684000000 softexpires=826684000000
          <idle>-0     [018] d.h2   828.347084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.347084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.347084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826668001260
          <idle>-0     [001] d.h1   828.347084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826668001460
          <idle>-0     [018] d.h1   828.347085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.347085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.347087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826684000000 softexpires=826684000000
          <idle>-0     [001] d..2   828.347087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826684000000 softexpires=826684000000
          <idle>-0     [000] d..2   828.351850: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.351851: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826700000000 softexpires=826700000000
          <idle>-0     [018] d.h2   828.363084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.363084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.363084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826684001260
          <idle>-0     [001] d.h1   828.363084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826684001460
          <idle>-0     [018] d.h1   828.363085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.363085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.363086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826700000000 softexpires=826700000000
          <idle>-0     [001] d..2   828.363087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826700000000 softexpires=826700000000
          <idle>-0     [000] d..2   828.367127: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.367128: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826716000000 softexpires=826716000000
          <idle>-0     [018] d.h2   828.379084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.379084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.379084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826700001260
          <idle>-0     [001] d.h1   828.379084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826700001440
          <idle>-0     [018] d.h1   828.379085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.379085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   828.379087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826716000000 softexpires=826716000000
          <idle>-0     [018] d..2   828.379087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826716000000 softexpires=826716000000
          <idle>-0     [000] d..2   828.383793: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.383794: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826732000000 softexpires=826732000000
          <idle>-0     [018] d.h2   828.395084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.395084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.395084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826716001260
          <idle>-0     [001] d.h1   828.395084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826716001480
          <idle>-0     [018] d.h1   828.395085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.395085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.395086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826732000000 softexpires=826732000000
          <idle>-0     [001] d..2   828.395087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826732000000 softexpires=826732000000
          <idle>-0     [000] d..2   828.400458: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.400459: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826748000000 softexpires=826748000000
          <idle>-0     [018] d.h2   828.411084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.411084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.411084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826732001300
          <idle>-0     [001] d.h1   828.411084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826732001500
          <idle>-0     [018] d.h1   828.411085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.411085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   828.411087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826748000000 softexpires=826748000000
          <idle>-0     [018] d..2   828.411087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826748000000 softexpires=826748000000
          <idle>-0     [000] d..2   828.415735: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.415736: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826764000000 softexpires=826764000000
          <idle>-0     [018] d.h2   828.427084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.427084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.427084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826748001320
          <idle>-0     [001] d.h1   828.427084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826748001520
          <idle>-0     [018] d.h1   828.427085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.427085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.427086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826764000000 softexpires=826764000000
          <idle>-0     [001] d..2   828.427087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826764000000 softexpires=826764000000
          <idle>-0     [000] d..2   828.432401: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.432401: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826780000000 softexpires=826780000000
          <idle>-0     [018] d.h2   828.443084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.443084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.443084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826764001260
          <idle>-0     [001] d.h1   828.443084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826764001480
          <idle>-0     [018] d.h1   828.443085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.443085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.443087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826780000000 softexpires=826780000000
          <idle>-0     [001] d..2   828.443087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826780000000 softexpires=826780000000
          <idle>-0     [000] d..2   828.447678: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.447678: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826796000000 softexpires=826796000000
          <idle>-0     [018] d.h2   828.459084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.459084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.459084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826780001260
          <idle>-0     [001] d.h1   828.459084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826780001460
          <idle>-0     [018] d.h1   828.459085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.459085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.459086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826796000000 softexpires=826796000000
          <idle>-0     [001] d..2   828.459087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826796000000 softexpires=826796000000
          <idle>-0     [000] d..2   828.464343: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.464344: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826812000000 softexpires=826812000000
          <idle>-0     [018] d.h2   828.475084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.475084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.475084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826796001240
          <idle>-0     [001] d.h1   828.475084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826796001460
          <idle>-0     [018] d.h1   828.475085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.475085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.475087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826812000000 softexpires=826812000000
          <idle>-0     [001] d..2   828.475087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826812000000 softexpires=826812000000
          <idle>-0     [000] d..2   828.479620: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.479621: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826828000000 softexpires=826828000000
          <idle>-0     [018] d.h2   828.491084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.491084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.491084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826812001260
          <idle>-0     [001] d.h1   828.491084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826812001460
          <idle>-0     [018] d.h1   828.491085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.491085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.491086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826828000000 softexpires=826828000000
          <idle>-0     [001] d..2   828.491087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826828000000 softexpires=826828000000
          <idle>-0     [000] d..2   828.496286: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.496286: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826844000000 softexpires=826844000000
          <idle>-0     [018] d.h2   828.507084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.507084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.507084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826828001260
          <idle>-0     [001] d.h1   828.507084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826828001480
          <idle>-0     [018] d.h1   828.507085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.507085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   828.507087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826844000000 softexpires=826844000000
          <idle>-0     [018] d..2   828.507087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826844000000 softexpires=826844000000
          <idle>-0     [000] d..2   828.511562: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.511563: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826860000000 softexpires=826860000000
          <idle>-0     [018] d.h2   828.523084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.523084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.523084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826844001300
          <idle>-0     [001] d.h1   828.523084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826844001480
          <idle>-0     [018] d.h1   828.523085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.523085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.523087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826860000000 softexpires=826860000000
          <idle>-0     [001] d..2   828.523087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826860000000 softexpires=826860000000
          <idle>-0     [000] d..2   828.528228: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.528229: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826876000000 softexpires=826876000000
          <idle>-0     [018] d.h2   828.539084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.539084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.539084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826860001260
          <idle>-0     [001] d.h1   828.539084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826860001460
          <idle>-0     [018] d.h1   828.539085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.539085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.539087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826876000000 softexpires=826876000000
          <idle>-0     [001] d..2   828.539087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826876000000 softexpires=826876000000
          <idle>-0     [000] d..2   828.543505: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.543505: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826892000000 softexpires=826892000000
          <idle>-0     [018] d.h2   828.555084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.555084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.555084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826876001240
          <idle>-0     [001] d.h1   828.555084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826876001420
          <idle>-0     [018] d.h1   828.555085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.555085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.555087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826892000000 softexpires=826892000000
          <idle>-0     [001] d..2   828.555088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826892000000 softexpires=826892000000
          <idle>-0     [000] d..2   828.560170: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.560171: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826908000000 softexpires=826908000000
          <idle>-0     [018] d.h2   828.571084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.571084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.571084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826892001320
          <idle>-0     [001] d.h1   828.571084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826892001520
          <idle>-0     [018] d.h1   828.571085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.571085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.571087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826908000000 softexpires=826908000000
          <idle>-0     [001] d..2   828.571088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826908000000 softexpires=826908000000
          <idle>-0     [000] d..2   828.575447: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.575448: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826924000000 softexpires=826924000000
          <idle>-0     [000] dn.2   828.585170: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   828.585171: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826908000000 softexpires=826908000000
          <idle>-0     [018] d.h2   828.587084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.587084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
              ps-2012  [000] d.h1   828.587084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d.h1   828.587084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826908001200
          <idle>-0     [001] d.h1   828.587084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826908001420
              ps-2012  [000] d.h.   828.587085: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=826908001320
          <idle>-0     [018] d.h1   828.587085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.587085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.587086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826924000000 softexpires=826924000000
              ps-2012  [000] d.h.   828.587088: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
              ps-2012  [000] d.h1   828.587088: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826912000000 softexpires=826912000000
              ps-2012  [000] d.s1   828.587090: timer_cancel: timer=ffff0000090295a8
              ps-2012  [000] d.s.   828.587090: timer_expire_entry: timer=ffff0000090295a8 function=delayed_work_timer_fn now=4295099024
              ps-2012  [000] dns.   828.587092: timer_expire_exit: timer=ffff0000090295a8
          <idle>-0     [001] d..2   828.587095: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826924000000 softexpires=826924000000
          <idle>-0     [001] dn.2   828.587101: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] dn.2   828.587101: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826912000000 softexpires=826912000000
          <idle>-0     [018] dn.2   828.587106: hrtimer_cancel: hrtimer=ffff8017fbc74808
     kworker/1:1-1158  [001] d..1   828.587107: timer_start: timer=ffff8017fbe5e558 function=delayed_work_timer_fn expires=4295099247 [timeout=223] cpu=1 idx=131 flags=D|I
          <idle>-0     [018] dn.2   828.587107: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826912000000 softexpires=826912000000
     kworker/0:0-3     [000] d..1   828.587108: timer_start: timer=ffff0000090295a8 function=delayed_work_timer_fn expires=4295099250 [timeout=226] cpu=0 idx=80 flags=D|I
          <idle>-0     [001] d..1   828.587110: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   828.587110: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   828.587110: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826924000000 softexpires=826924000000
     kworker/0:0-3     [000] d..1   828.587112: timer_start: timer=ffff8017fbe43558 function=delayed_work_timer_fn expires=4295099250 [timeout=226] cpu=0 idx=101 flags=D|I
    kworker/18:1-1177  [018] d..1   828.587115: timer_start: timer=ffff8017fbc76558 function=delayed_work_timer_fn expires=4295099446 [timeout=422] cpu=18 idx=70 flags=D|I
          <idle>-0     [018] d..1   828.587119: tick_stop: success=1 dependency=NONE
          <idle>-0     [018] d..2   828.587119: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [018] d..2   828.587120: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826924000000 softexpires=826924000000
          <idle>-0     [000] d..1   828.587808: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   828.587809: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.587809: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826924000000 softexpires=826924000000
          <idle>-0     [000] d..2   828.591939: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.591940: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826940000000 softexpires=826940000000
          <idle>-0     [018] d.h2   828.603084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.603084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.603084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826924001400
          <idle>-0     [001] d.h1   828.603084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826924001640
          <idle>-0     [018] d.h1   828.603085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.603086: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.603088: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826940000000 softexpires=826940000000
          <idle>-0     [001] d..2   828.603088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826940000000 softexpires=826940000000
          <idle>-0     [000] d..2   828.607216: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.607217: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826956000000 softexpires=826956000000
          <idle>-0     [018] d.h2   828.619084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.619084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.619084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826940001300
          <idle>-0     [001] d.h1   828.619084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826940001520
          <idle>-0     [018] d.h1   828.619085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.619085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.619087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826956000000 softexpires=826956000000
          <idle>-0     [001] d..2   828.619088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826956000000 softexpires=826956000000
          <idle>-0     [000] d..2   828.623882: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.623882: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826972000000 softexpires=826972000000
          <idle>-0     [018] d.h2   828.635084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.635084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.635084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826956001340
          <idle>-0     [001] d.h1   828.635084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826956001540
          <idle>-0     [018] d.h1   828.635085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.635085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   828.635087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826972000000 softexpires=826972000000
          <idle>-0     [018] d..2   828.635087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826972000000 softexpires=826972000000
          <idle>-0     [000] d..2   828.639158: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.639159: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=826988000000 softexpires=826988000000
          <idle>-0     [018] d.h2   828.651084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.651084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.651084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826972001300
          <idle>-0     [001] d.h1   828.651084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826972001520
          <idle>-0     [018] d.h1   828.651085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.651085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.651086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=826988000000 softexpires=826988000000
          <idle>-0     [001] d..2   828.651088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=826988000000 softexpires=826988000000
          <idle>-0     [000] d..2   828.655824: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.655825: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827004000000 softexpires=827004000000
          <idle>-0     [018] d.h2   828.667084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.667084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.667084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=826988001320
          <idle>-0     [001] d.h1   828.667084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=826988001540
          <idle>-0     [018] d.h1   828.667085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.667085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.667087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827004000000 softexpires=827004000000
          <idle>-0     [001] d..2   828.667087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827004000000 softexpires=827004000000
          <idle>-0     [000] d..2   828.671101: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.671101: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827020000000 softexpires=827020000000
          <idle>-0     [018] d.h2   828.683084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.683084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.683084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827004001340
          <idle>-0     [001] d.h1   828.683084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827004001520
          <idle>-0     [018] d.h1   828.683085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.683085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.683086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827020000000 softexpires=827020000000
          <idle>-0     [001] d..2   828.683088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827020000000 softexpires=827020000000
          <idle>-0     [000] d..2   828.687766: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.687767: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827036000000 softexpires=827036000000
          <idle>-0     [018] d.h2   828.699084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.699084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.699084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827020001320
          <idle>-0     [001] d.h1   828.699084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827020001520
          <idle>-0     [018] d.h1   828.699085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.699085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.699087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827036000000 softexpires=827036000000
          <idle>-0     [001] d..2   828.699088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827036000000 softexpires=827036000000
          <idle>-0     [000] d..2   828.704432: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.704433: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827052000000 softexpires=827052000000
          <idle>-0     [018] d.h2   828.715084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.715084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.715084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827036001320
          <idle>-0     [001] d.h1   828.715084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827036001520
          <idle>-0     [018] d.h1   828.715085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.715085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.715086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827052000000 softexpires=827052000000
          <idle>-0     [001] d..2   828.715088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827052000000 softexpires=827052000000
          <idle>-0     [000] d..2   828.719709: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.719709: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827068000000 softexpires=827068000000
          <idle>-0     [018] d.h2   828.731084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.731084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.731084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827052001320
          <idle>-0     [001] d.h1   828.731084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827052001520
          <idle>-0     [018] d.h1   828.731085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.731085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.731087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827068000000 softexpires=827068000000
          <idle>-0     [001] d..2   828.731087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827068000000 softexpires=827068000000
          <idle>-0     [000] d..2   828.736374: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.736375: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827084000000 softexpires=827084000000
          <idle>-0     [018] d.h2   828.747084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.747084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.747084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827068001320
          <idle>-0     [001] d.h1   828.747084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827068001500
          <idle>-0     [018] d.h1   828.747085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.747085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.747086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827084000000 softexpires=827084000000
          <idle>-0     [001] d..2   828.747088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827084000000 softexpires=827084000000
          <idle>-0     [000] d..2   828.751651: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.751652: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827100000000 softexpires=827100000000
          <idle>-0     [018] d.h2   828.763084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.763084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.763084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827084001340
          <idle>-0     [001] d.h1   828.763084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827084001560
          <idle>-0     [018] d.h1   828.763085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.763085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.763087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827100000000 softexpires=827100000000
          <idle>-0     [001] d..2   828.763087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827100000000 softexpires=827100000000
          <idle>-0     [000] d..2   828.768317: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.768317: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827116000000 softexpires=827116000000
          <idle>-0     [018] d.h2   828.779084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.779084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.779084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827100001320
          <idle>-0     [001] d.h1   828.779085: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827100001520
          <idle>-0     [018] d.h1   828.779085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.779086: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.779086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827116000000 softexpires=827116000000
          <idle>-0     [001] d..2   828.779088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827116000000 softexpires=827116000000
          <idle>-0     [000] d..2   828.783594: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.783594: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827132000000 softexpires=827132000000
          <idle>-0     [018] d.h2   828.795084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.795084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.795084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827116001320
          <idle>-0     [001] d.h1   828.795084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827116001520
          <idle>-0     [018] d.h1   828.795085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.795085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.795087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827132000000 softexpires=827132000000
          <idle>-0     [001] d..2   828.795087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827132000000 softexpires=827132000000
          <idle>-0     [000] d..2   828.800259: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.800260: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827148000000 softexpires=827148000000
          <idle>-0     [018] d.h2   828.811084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.811084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.811084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827132001320
          <idle>-0     [001] d.h1   828.811084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827132001520
          <idle>-0     [018] d.h1   828.811085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.811085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.811086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827148000000 softexpires=827148000000
          <idle>-0     [001] d..2   828.811088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827148000000 softexpires=827148000000
          <idle>-0     [000] d..2   828.815536: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.815537: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827164000000 softexpires=827164000000
          <idle>-0     [018] d.h2   828.827084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.827084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.827084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827148001300
          <idle>-0     [001] d.h1   828.827084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827148001500
          <idle>-0     [018] d.h1   828.827085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.827085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.827087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827164000000 softexpires=827164000000
          <idle>-0     [001] d..2   828.827087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827164000000 softexpires=827164000000
          <idle>-0     [000] d..2   828.832202: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.832202: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827180000000 softexpires=827180000000
          <idle>-0     [018] d.h2   828.843084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.843084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.843084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827164001320
          <idle>-0     [001] d.h1   828.843084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827164001520
          <idle>-0     [018] d.h1   828.843085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.843085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.843086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827180000000 softexpires=827180000000
          <idle>-0     [001] d..2   828.843088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827180000000 softexpires=827180000000
          <idle>-0     [000] d..2   828.847478: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.847479: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827196000000 softexpires=827196000000
          <idle>-0     [018] d.h2   828.859084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.859084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.859084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827180001300
          <idle>-0     [001] d.h1   828.859084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827180001500
          <idle>-0     [018] d.h1   828.859085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.859085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.859087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827196000000 softexpires=827196000000
          <idle>-0     [001] d..2   828.859087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827196000000 softexpires=827196000000
          <idle>-0     [000] d..2   828.864144: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.864145: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827212000000 softexpires=827212000000
          <idle>-0     [018] d.h2   828.875084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.875084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.875084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827196001320
          <idle>-0     [001] d.h1   828.875084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827196001500
          <idle>-0     [018] d.h1   828.875085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.875085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.875086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827212000000 softexpires=827212000000
          <idle>-0     [001] d..2   828.875087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827212000000 softexpires=827212000000
          <idle>-0     [000] d..2   828.879421: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.879421: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827228000000 softexpires=827228000000
          <idle>-0     [018] d.h2   828.891084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.891084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.891084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827212001340
          <idle>-0     [001] d.h1   828.891084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827212001560
          <idle>-0     [018] d.h1   828.891085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.891085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.891087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827228000000 softexpires=827228000000
          <idle>-0     [001] d..2   828.891087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827228000000 softexpires=827228000000
          <idle>-0     [000] d..2   828.896086: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.896087: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827244000000 softexpires=827244000000
          <idle>-0     [018] d.h2   828.907084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.907084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.907084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827228001300
          <idle>-0     [001] d.h1   828.907084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827228001500
          <idle>-0     [018] d.h1   828.907085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.907085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.907086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827244000000 softexpires=827244000000
          <idle>-0     [001] d..2   828.907088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827244000000 softexpires=827244000000
          <idle>-0     [000] d..2   828.911363: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.911364: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827260000000 softexpires=827260000000
          <idle>-0     [000] dn.2   828.921086: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   828.921087: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827244000000 softexpires=827244000000
          <idle>-0     [018] d.h2   828.923084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.923084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
              ps-2012  [000] d.h1   828.923084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d.h1   828.923084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827244001340
          <idle>-0     [001] d.h1   828.923084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827244001520
              ps-2012  [000] d.h.   828.923085: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=827244001400
          <idle>-0     [018] d.h1   828.923085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.923085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.923087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827260000000 softexpires=827260000000
              ps-2012  [000] d.h.   828.923088: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
              ps-2012  [000] d.h1   828.923088: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827248000000 softexpires=827248000000
          <idle>-0     [001] d..2   828.923089: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827260000000 softexpires=827260000000
          <idle>-0     [000] d..1   828.923707: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   828.923708: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.923708: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827260000000 softexpires=827260000000
          <idle>-0     [000] d..2   828.927855: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.927856: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827276000000 softexpires=827276000000
          <idle>-0     [018] d.h2   828.939084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.939084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.939084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827260001320
          <idle>-0     [001] d.h1   828.939084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827260001560
          <idle>-0     [018] d.h1   828.939085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.939085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.939086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827276000000 softexpires=827276000000
          <idle>-0     [001] d..2   828.939088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827276000000 softexpires=827276000000
          <idle>-0     [000] d..2   828.943132: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.943133: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827292000000 softexpires=827292000000
          <idle>-0     [018] d.h2   828.955084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.955084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.955084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827276001340
          <idle>-0     [001] d.h1   828.955084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827276001560
          <idle>-0     [018] d.h1   828.955085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.955085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.955087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827292000000 softexpires=827292000000
          <idle>-0     [001] d..2   828.955087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827292000000 softexpires=827292000000
          <idle>-0     [000] d..2   828.959798: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.959798: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827308000000 softexpires=827308000000
          <idle>-0     [018] d.h2   828.971084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.971084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.971084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827292001300
          <idle>-0     [001] d.h1   828.971084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827292001480
          <idle>-0     [018] d.h1   828.971085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.971085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.971086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827308000000 softexpires=827308000000
          <idle>-0     [001] d..2   828.971088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827308000000 softexpires=827308000000
          <idle>-0     [000] d..2   828.976463: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.976464: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827324000000 softexpires=827324000000
          <idle>-0     [018] d.h2   828.987084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   828.987084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   828.987084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827308001320
          <idle>-0     [001] d.h1   828.987084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827308001520
          <idle>-0     [018] d.h1   828.987085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   828.987085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   828.987087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827324000000 softexpires=827324000000
          <idle>-0     [001] d..2   828.987087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827324000000 softexpires=827324000000
          <idle>-0     [000] d..2   828.991740: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   828.991741: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827340000000 softexpires=827340000000
          <idle>-0     [018] d.h2   829.003084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.003084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.003084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827324001300
          <idle>-0     [001] d.h1   829.003084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827324001520
          <idle>-0     [018] d.h1   829.003085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.003085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.003086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827340000000 softexpires=827340000000
          <idle>-0     [001] d..2   829.003088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827340000000 softexpires=827340000000
          <idle>-0     [000] d..2   829.008406: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.008406: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827356000000 softexpires=827356000000
          <idle>-0     [018] d.h2   829.019084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.019084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.019084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827340001340
          <idle>-0     [001] d.h1   829.019084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827340001580
          <idle>-0     [018] d.h1   829.019085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.019085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.019087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827356000000 softexpires=827356000000
          <idle>-0     [001] d..2   829.019087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827356000000 softexpires=827356000000
          <idle>-0     [000] d..2   829.023682: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.023683: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827372000000 softexpires=827372000000
          <idle>-0     [018] d.h2   829.035084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.035084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.035084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827356001300
          <idle>-0     [001] d.h1   829.035084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827356001520
          <idle>-0     [018] d.h1   829.035085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.035085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.035087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827372000000 softexpires=827372000000
          <idle>-0     [001] d..2   829.035088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827372000000 softexpires=827372000000
          <idle>-0     [000] d..2   829.040348: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.040349: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827388000000 softexpires=827388000000
          <idle>-0     [018] d.h2   829.051084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.051084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.051084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827372001300
          <idle>-0     [001] d.h1   829.051084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827372001480
          <idle>-0     [018] d.h1   829.051085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.051085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   829.051087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827388000000 softexpires=827388000000
          <idle>-0     [018] d..2   829.051087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827388000000 softexpires=827388000000
          <idle>-0     [001] d.h2   829.067085: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [000] d.h2   829.067085: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d.h2   829.067085: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.067086: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827388002960
          <idle>-0     [018] d.h1   829.067086: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827388003120
          <idle>-0     [034] d.h2   829.067086: hrtimer_cancel: hrtimer=ffff8017dba91808
          <idle>-0     [000] d.h1   829.067086: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=827388002980
          <idle>-0     [034] d.h1   829.067086: hrtimer_expire_entry: hrtimer=ffff8017dba91808 function=tick_sched_timer now=827388003140
          <idle>-0     [018] d.h1   829.067087: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.067087: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [000] d.h1   829.067088: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [034] d.h1   829.067088: hrtimer_expire_exit: hrtimer=ffff8017dba91808
          <idle>-0     [018] d..2   829.067088: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827404000000 softexpires=827404000000
          <idle>-0     [034] d.s2   829.067089: timer_cancel: timer=ffff00000909a9a0
          <idle>-0     [000] d.s2   829.067089: timer_cancel: timer=ffff8017d3923408
          <idle>-0     [034] d.s1   829.067089: timer_expire_entry: timer=ffff00000909a9a0 function=delayed_work_timer_fn now=4295099144
          <idle>-0     [000] d.s1   829.067089: timer_expire_entry: timer=ffff8017d3923408 function=delayed_work_timer_fn now=4295099144
          <idle>-0     [001] d..2   829.067089: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827404000000 softexpires=827404000000
          <idle>-0     [034] dns1   829.067091: timer_expire_exit: timer=ffff00000909a9a0
          <idle>-0     [000] dns1   829.067091: timer_expire_exit: timer=ffff8017d3923408
          <idle>-0     [034] dn.2   829.067094: hrtimer_start: hrtimer=ffff8017dba91808 function=tick_sched_timer expires=827392000000 softexpires=827392000000
          <idle>-0     [000] dn.2   829.067094: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827392000000 softexpires=827392000000
     kworker/0:0-3     [000] d..1   829.067097: timer_start: timer=ffff8017d3923408 function=delayed_work_timer_fn expires=4295099394 [timeout=250] cpu=0 idx=97 flags=I
          <idle>-0     [000] d..1   829.067100: tick_stop: success=1 dependency=NONE
    kworker/34:2-1587  [034] d..1   829.067101: timer_start: timer=ffff00000909a9a0 function=delayed_work_timer_fn expires=4295099394 [timeout=250] cpu=34 idx=97 flags=I
          <idle>-0     [000] d..2   829.067101: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.067102: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827404000000 softexpires=827404000000
          <idle>-0     [034] d..1   829.067104: tick_stop: success=1 dependency=NONE
          <idle>-0     [034] d..2   829.067104: hrtimer_cancel: hrtimer=ffff8017dba91808
          <idle>-0     [034] d..2   829.067104: hrtimer_start: hrtimer=ffff8017dba91808 function=tick_sched_timer expires=828412000000 softexpires=828412000000
          <idle>-0     [000] d..2   829.072292: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.072292: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827420000000 softexpires=827420000000
          <idle>-0     [018] d.h2   829.083084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.083084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.083084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827404001380
          <idle>-0     [001] d.h1   829.083084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827404001600
          <idle>-0     [018] d.h1   829.083085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.083086: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.083087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827420000000 softexpires=827420000000
          <idle>-0     [001] d..2   829.083088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827420000000 softexpires=827420000000
          <idle>-0     [000] d.h2   829.099084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d.h2   829.099084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.099084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.099084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827420001460
          <idle>-0     [000] d.h1   829.099084: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=827420001460
          <idle>-0     [001] d.h1   829.099084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827420001660
          <idle>-0     [018] d.h1   829.099085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.099085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [000] d.h1   829.099086: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d.s2   829.099086: timer_cancel: timer=ffff8017d3923c08
          <idle>-0     [018] d..2   829.099087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827436000000 softexpires=827436000000
          <idle>-0     [000] d.s1   829.099087: timer_expire_entry: timer=ffff8017d3923c08 function=delayed_work_timer_fn now=4295099152
          <idle>-0     [001] d..2   829.099088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827436000000 softexpires=827436000000
          <idle>-0     [000] dns1   829.099088: timer_expire_exit: timer=ffff8017d3923c08
          <idle>-0     [000] dn.2   829.099092: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827424000000 softexpires=827424000000
     kworker/0:0-3     [000] d..1   829.099094: timer_start: timer=ffff8017d3923c08 function=delayed_work_timer_fn expires=4295099402 [timeout=250] cpu=0 idx=98 flags=I
          <idle>-0     [000] d..1   829.099096: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.099097: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.099097: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827436000000 softexpires=827436000000
          <idle>-0     [000] d..2   829.104233: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.104233: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827452000000 softexpires=827452000000
          <idle>-0     [018] d.h2   829.115084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.115084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.115084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827436001340
          <idle>-0     [001] d.h1   829.115084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827436001540
          <idle>-0     [018] d.h1   829.115085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.115085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.115087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827452000000 softexpires=827452000000
          <idle>-0     [001] d..2   829.115087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827452000000 softexpires=827452000000
          <idle>-0     [000] d..2   829.119510: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.119510: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827468000000 softexpires=827468000000
          <idle>-0     [018] d.h2   829.131084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.131084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.131084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827452001360
          <idle>-0     [001] d.h1   829.131084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827452001580
          <idle>-0     [018] d.h1   829.131085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.131085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.131086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827468000000 softexpires=827468000000
          <idle>-0     [001] d..2   829.131087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827468000000 softexpires=827468000000
          <idle>-0     [000] d..2   829.136175: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.136176: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827484000000 softexpires=827484000000
          <idle>-0     [018] d.h2   829.147084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.147084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.147084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827468001400
          <idle>-0     [001] d.h1   829.147084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827468001620
          <idle>-0     [018] d.h1   829.147085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.147085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.147087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827484000000 softexpires=827484000000
          <idle>-0     [001] d..2   829.147087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827484000000 softexpires=827484000000
          <idle>-0     [000] d..2   829.151452: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.151453: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827500000000 softexpires=827500000000
          <idle>-0     [018] d.h2   829.163084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.163084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.163084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827484001300
          <idle>-0     [001] d.h1   829.163084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827484001480
          <idle>-0     [018] d.h1   829.163085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.163085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.163086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827500000000 softexpires=827500000000
          <idle>-0     [001] d..2   829.163087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827500000000 softexpires=827500000000
          <idle>-0     [000] d..2   829.168118: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.168118: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827516000000 softexpires=827516000000
          <idle>-0     [018] d.h2   829.179084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.179084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d.h1   829.179084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827500001540
          <idle>-0     [018] d.h1   829.179084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827500001340
          <idle>-0     [001] d.h1   829.179085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.179085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d..2   829.179087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827516000000 softexpires=827516000000
          <idle>-0     [018] d..2   829.179088: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827516000000 softexpires=827516000000
          <idle>-0     [000] d..2   829.183394: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.183395: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827532000000 softexpires=827532000000
          <idle>-0     [018] d.h2   829.195084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.195084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.195084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827516001340
          <idle>-0     [001] d.h1   829.195084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827516001540
          <idle>-0     [018] d.h1   829.195085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.195085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.195086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827532000000 softexpires=827532000000
          <idle>-0     [001] d..2   829.195088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827532000000 softexpires=827532000000
          <idle>-0     [000] d..2   829.200060: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.200061: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827548000000 softexpires=827548000000
          <idle>-0     [018] d.h2   829.211084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.211084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.211084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827532001380
          <idle>-0     [001] d.h1   829.211084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827532001560
          <idle>-0     [018] d.h1   829.211085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.211085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   829.211087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827548000000 softexpires=827548000000
          <idle>-0     [018] d..2   829.211087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827548000000 softexpires=827548000000
          <idle>-0     [000] d..2   829.215337: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.215337: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827564000000 softexpires=827564000000
          <idle>-0     [018] d.h2   829.227084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.227084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.227084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827548001340
          <idle>-0     [001] d.h1   829.227084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827548001560
          <idle>-0     [018] d.h1   829.227085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.227085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.227086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827564000000 softexpires=827564000000
          <idle>-0     [001] d..2   829.227088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827564000000 softexpires=827564000000
          <idle>-0     [000] d..2   829.232002: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.232003: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827580000000 softexpires=827580000000
          <idle>-0     [018] d.h2   829.243084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.243084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.243084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827564001360
          <idle>-0     [001] d.h1   829.243084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827564001580
          <idle>-0     [018] d.h1   829.243085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.243085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.243087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827580000000 softexpires=827580000000
          <idle>-0     [001] d..2   829.243087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827580000000 softexpires=827580000000
          <idle>-0     [000] d..2   829.247279: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.247280: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827596000000 softexpires=827596000000
          <idle>-0     [000] dn.2   829.257002: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   829.257003: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827580000000 softexpires=827580000000
          <idle>-0     [018] d.h2   829.259084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.259084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
              ps-2012  [000] d.h1   829.259084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d.h1   829.259084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827580001360
          <idle>-0     [001] d.h1   829.259084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827580001540
              ps-2012  [000] d.h.   829.259085: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=827580001460
          <idle>-0     [018] d.h1   829.259085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.259085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.259086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827596000000 softexpires=827596000000
              ps-2012  [000] d.h.   829.259088: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
              ps-2012  [000] d.h1   829.259089: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827584000000 softexpires=827584000000
          <idle>-0     [001] d..2   829.259094: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827596000000 softexpires=827596000000
          <idle>-0     [000] d..1   829.259641: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.259642: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.259642: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827596000000 softexpires=827596000000
          <idle>-0     [000] d..2   829.263771: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.263772: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827612000000 softexpires=827612000000
          <idle>-0     [018] d.h2   829.275084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.275084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.275084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827596001380
          <idle>-0     [001] d.h1   829.275084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827596001620
          <idle>-0     [018] d.h1   829.275085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.275085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.275087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827612000000 softexpires=827612000000
          <idle>-0     [001] d..2   829.275087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827612000000 softexpires=827612000000
          <idle>-0     [000] d..2   829.280437: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.280437: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827628000000 softexpires=827628000000
          <idle>-0     [018] d.h2   829.291084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.291084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.291084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827612001360
          <idle>-0     [001] d.h1   829.291084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827612001580
          <idle>-0     [018] d.h1   829.291085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.291085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.291086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827628000000 softexpires=827628000000
          <idle>-0     [001] d..2   829.291088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827628000000 softexpires=827628000000
          <idle>-0     [000] d..2   829.295714: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.295714: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827644000000 softexpires=827644000000
          <idle>-0     [018] d.h2   829.307084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.307084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.307084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827628001420
          <idle>-0     [001] d.h1   829.307084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827628001640
          <idle>-0     [018] d.h1   829.307085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.307085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.307087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827644000000 softexpires=827644000000
          <idle>-0     [001] d..2   829.307087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827644000000 softexpires=827644000000
          <idle>-0     [000] d..2   829.312379: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.312380: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827660000000 softexpires=827660000000
          <idle>-0     [018] d.h2   829.323084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.323084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.323084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827644001340
          <idle>-0     [001] d.h1   829.323085: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827644001560
          <idle>-0     [018] d.h1   829.323085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.323086: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.323086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827660000000 softexpires=827660000000
          <idle>-0     [001] d..2   829.323088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827660000000 softexpires=827660000000
          <idle>-0     [000] d..2   829.327656: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.327657: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827676000000 softexpires=827676000000
          <idle>-0     [018] d.h2   829.339084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.339084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.339084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827660001460
          <idle>-0     [001] d.h1   829.339084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827660001660
          <idle>-0     [018] d.h1   829.339085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.339085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.339087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827676000000 softexpires=827676000000
          <idle>-0     [001] d..2   829.339087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827676000000 softexpires=827676000000
          <idle>-0     [000] d..2   829.344322: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.344322: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827692000000 softexpires=827692000000
          <idle>-0     [018] d.h2   829.355084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.355084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.355084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827676001340
          <idle>-0     [001] d.h1   829.355084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827676001560
          <idle>-0     [018] d.h1   829.355085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.355085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.355086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827692000000 softexpires=827692000000
          <idle>-0     [001] d..2   829.355088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827692000000 softexpires=827692000000
          <idle>-0     [000] d..2   829.359599: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.359599: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827708000000 softexpires=827708000000
          <idle>-0     [018] d.h2   829.371084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.371084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.371084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827692001360
          <idle>-0     [001] d.h1   829.371084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827692001580
          <idle>-0     [018] d.h1   829.371085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.371085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.371087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827708000000 softexpires=827708000000
          <idle>-0     [001] d..2   829.371087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827708000000 softexpires=827708000000
          <idle>-0     [000] d..2   829.376264: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.376265: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827724000000 softexpires=827724000000
          <idle>-0     [018] d.h2   829.387084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.387084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.387084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827708001360
          <idle>-0     [001] d.h1   829.387084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827708001560
          <idle>-0     [018] d.h1   829.387085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.387085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.387086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827724000000 softexpires=827724000000
          <idle>-0     [001] d..2   829.387088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827724000000 softexpires=827724000000
          <idle>-0     [000] d..2   829.391541: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.391541: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827740000000 softexpires=827740000000
          <idle>-0     [018] d.h2   829.403084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.403084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.403084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827724001380
          <idle>-0     [001] d.h1   829.403084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827724001580
          <idle>-0     [018] d.h1   829.403085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.403085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.403087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827740000000 softexpires=827740000000
          <idle>-0     [001] d..2   829.403087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827740000000 softexpires=827740000000
          <idle>-0     [000] d..2   829.408206: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.408207: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827756000000 softexpires=827756000000
          <idle>-0     [018] d.h2   829.419084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.419084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.419084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827740001360
          <idle>-0     [001] d.h1   829.419084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827740001560
          <idle>-0     [018] d.h1   829.419085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.419085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.419086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827756000000 softexpires=827756000000
          <idle>-0     [001] d..2   829.419088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827756000000 softexpires=827756000000
          <idle>-0     [000] d..2   829.423483: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.423484: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827772000000 softexpires=827772000000
          <idle>-0     [018] d.h2   829.435084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.435084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.435084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827756001340
          <idle>-0     [001] d.h1   829.435084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827756001560
          <idle>-0     [018] d.h1   829.435085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.435085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.435087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827772000000 softexpires=827772000000
          <idle>-0     [001] d..2   829.435087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827772000000 softexpires=827772000000
          <idle>-0     [000] d..2   829.440149: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.440149: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827788000000 softexpires=827788000000
          <idle>-0     [018] d.h2   829.451084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.451084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.451084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827772001360
          <idle>-0     [001] d.h1   829.451084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827772001560
          <idle>-0     [018] d.h1   829.451085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.451085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.451087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827788000000 softexpires=827788000000
          <idle>-0     [001] d..2   829.451088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827788000000 softexpires=827788000000
          <idle>-0     [000] d..2   829.455426: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.455426: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827804000000 softexpires=827804000000
          <idle>-0     [018] d.h2   829.467084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.467084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.467084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827788001380
          <idle>-0     [001] d.h1   829.467084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827788001560
          <idle>-0     [018] d.h1   829.467085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.467085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   829.467087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827804000000 softexpires=827804000000
          <idle>-0     [018] d..2   829.467087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827804000000 softexpires=827804000000
          <idle>-0     [000] d..2   829.472091: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.472092: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827820000000 softexpires=827820000000
          <idle>-0     [018] d.h2   829.483084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.483084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.483084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827804001360
          <idle>-0     [001] d.h1   829.483084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827804001580
          <idle>-0     [018] d.h1   829.483085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.483085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d.s2   829.483086: timer_cancel: timer=ffff8017fbe5e558
          <idle>-0     [018] d..2   829.483086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827820000000 softexpires=827820000000
          <idle>-0     [001] d.s1   829.483087: timer_expire_entry: timer=ffff8017fbe5e558 function=delayed_work_timer_fn now=4295099248
          <idle>-0     [001] dns1   829.483089: timer_expire_exit: timer=ffff8017fbe5e558
          <idle>-0     [001] dn.2   829.483091: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827808000000 softexpires=827808000000
          <idle>-0     [001] d..1   829.483097: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   829.483098: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   829.483098: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827820000000 softexpires=827820000000
          <idle>-0     [000] d..2   829.487368: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.487369: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827836000000 softexpires=827836000000
          <idle>-0     [018] d.h2   829.499084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.499084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.499084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827820001420
          <idle>-0     [001] d.h1   829.499084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827820001660
          <idle>-0     [018] d.h1   829.499085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.499085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.499087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827836000000 softexpires=827836000000
          <idle>-0     [001] d..2   829.499088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827836000000 softexpires=827836000000
          <idle>-0     [000] d..2   829.504034: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.504034: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827852000000 softexpires=827852000000
          <idle>-0     [018] d.h2   829.515084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.515084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.515084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827836001640
          <idle>-0     [001] d.h1   829.515084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827836001780
          <idle>-0     [018] d.h1   829.515085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.515085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.515087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827852000000 softexpires=827852000000
          <idle>-0     [001] d..2   829.515088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827852000000 softexpires=827852000000
          <idle>-0     [000] d..2   829.519310: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.519311: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827868000000 softexpires=827868000000
          <idle>-0     [018] d.h2   829.531084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.531084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.531084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827852001340
          <idle>-0     [001] d.h1   829.531084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827852001540
          <idle>-0     [018] d.h1   829.531085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.531085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.531087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827868000000 softexpires=827868000000
          <idle>-0     [001] d..2   829.531087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827868000000 softexpires=827868000000
          <idle>-0     [000] d..2   829.535976: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.535977: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827884000000 softexpires=827884000000
          <idle>-0     [018] d.h2   829.547084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.547084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.547084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827868001360
          <idle>-0     [001] d.h1   829.547084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827868001560
          <idle>-0     [018] d.h1   829.547085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.547085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.547086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827884000000 softexpires=827884000000
          <idle>-0     [001] d..2   829.547088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827884000000 softexpires=827884000000
          <idle>-0     [000] d..2   829.551253: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.551253: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827900000000 softexpires=827900000000
          <idle>-0     [018] d.h2   829.563084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.563084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.563084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827884001380
          <idle>-0     [001] d.h1   829.563084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827884001580
          <idle>-0     [018] d.h1   829.563085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.563085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.563087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827900000000 softexpires=827900000000
          <idle>-0     [001] d..2   829.563087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827900000000 softexpires=827900000000
          <idle>-0     [000] d..2   829.567918: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.567919: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827916000000 softexpires=827916000000
          <idle>-0     [018] d.h2   829.579084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.579084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.579084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827900001360
          <idle>-0     [001] d.h1   829.579084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827900001600
          <idle>-0     [018] d.h1   829.579085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.579085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.579087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827916000000 softexpires=827916000000
          <idle>-0     [001] d..2   829.579088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827916000000 softexpires=827916000000
          <idle>-0     [000] d..2   829.583195: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.583196: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827932000000 softexpires=827932000000
          <idle>-0     [000] dn.2   829.592918: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   829.592919: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827916000000 softexpires=827916000000
          <idle>-0     [018] d.h2   829.595084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.595084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
              ps-2012  [000] d.h1   829.595084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d.h1   829.595084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827916001380
          <idle>-0     [001] d.h1   829.595084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827916001600
              ps-2012  [000] d.h.   829.595085: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=827916001380
          <idle>-0     [018] d.h1   829.595085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.595086: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.595087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827932000000 softexpires=827932000000
              ps-2012  [000] d.h.   829.595088: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
              ps-2012  [000] d.h1   829.595088: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827920000000 softexpires=827920000000
              ps-2012  [000] d.s1   829.595090: timer_cancel: timer=ffff8017fbe43558
              ps-2012  [000] d.s.   829.595090: timer_expire_entry: timer=ffff8017fbe43558 function=delayed_work_timer_fn now=4295099276
              ps-2012  [000] dns.   829.595092: timer_expire_exit: timer=ffff8017fbe43558
              ps-2012  [000] dns1   829.595092: timer_cancel: timer=ffff0000090295a8
              ps-2012  [000] dns.   829.595093: timer_expire_entry: timer=ffff0000090295a8 function=delayed_work_timer_fn now=4295099276
              ps-2012  [000] dns.   829.595093: timer_expire_exit: timer=ffff0000090295a8
          <idle>-0     [001] d..2   829.595094: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827932000000 softexpires=827932000000
     kworker/0:0-3     [000] d..1   829.595102: timer_start: timer=ffff8017fbe43558 function=delayed_work_timer_fn expires=4295099500 [timeout=224] cpu=0 idx=111 flags=D|I
     kworker/0:0-3     [000] d..1   829.595109: timer_start: timer=ffff0000090295a8 function=delayed_work_timer_fn expires=4295099500 [timeout=224] cpu=0 idx=111 flags=D|I
          <idle>-0     [001] dn.2   829.595344: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] dn.2   829.595344: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827920000000 softexpires=827920000000
          <idle>-0     [000] d..1   829.595363: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.595364: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.595365: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827932000000 softexpires=827932000000
          <idle>-0     [001] d..1   829.595418: tick_stop: success=1 dependency=NONE
          <idle>-0     [001] d..2   829.595419: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d..2   829.595419: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827932000000 softexpires=827932000000
          <idle>-0     [000] d..2   829.599340: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.599341: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827948000000 softexpires=827948000000
          <idle>-0     [018] d.h2   829.611084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.611084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.611085: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827932001760
          <idle>-0     [001] d.h1   829.611085: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827932002100
          <idle>-0     [018] d.h1   829.611086: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.611086: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.611087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827948000000 softexpires=827948000000
          <idle>-0     [001] d..2   829.611090: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827948000000 softexpires=827948000000
          <idle>-0     [000] d..2   829.616006: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.616006: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827964000000 softexpires=827964000000
          <idle>-0     [018] d.h2   829.627084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.627084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.627084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827948001600
          <idle>-0     [001] d.h1   829.627084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827948001860
          <idle>-0     [018] d.h1   829.627085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.627086: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.627087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827964000000 softexpires=827964000000
          <idle>-0     [001] d..2   829.627088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827964000000 softexpires=827964000000
          <idle>-0     [000] d..2   829.631282: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.631283: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827980000000 softexpires=827980000000
          <idle>-0     [018] d.h2   829.643084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.643084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.643084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827964001500
          <idle>-0     [001] d.h1   829.643084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827964001720
          <idle>-0     [018] d.h1   829.643085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.643085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.643087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827980000000 softexpires=827980000000
          <idle>-0     [001] d..2   829.643088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827980000000 softexpires=827980000000
          <idle>-0     [000] d..2   829.647948: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.647949: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=827996000000 softexpires=827996000000
          <idle>-0     [018] d.h2   829.659084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.659084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.659084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827980001460
          <idle>-0     [001] d.h1   829.659084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827980001680
          <idle>-0     [018] d.h1   829.659085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.659085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.659087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=827996000000 softexpires=827996000000
          <idle>-0     [001] d..2   829.659088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=827996000000 softexpires=827996000000
          <idle>-0     [000] d..2   829.663225: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.663225: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828012000000 softexpires=828012000000
          <idle>-0     [018] d.h2   829.675084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.675084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.675084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=827996001340
          <idle>-0     [001] d.h1   829.675084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=827996001560
          <idle>-0     [018] d.h1   829.675085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.675085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.675086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828012000000 softexpires=828012000000
          <idle>-0     [001] d..2   829.675088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828012000000 softexpires=828012000000
          <idle>-0     [000] d..2   829.679890: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.679891: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828028000000 softexpires=828028000000
          <idle>-0     [018] d.h2   829.691084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.691084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.691084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828012001360
          <idle>-0     [001] d.h1   829.691084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828012001560
          <idle>-0     [018] d.h1   829.691085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.691085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.691087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828028000000 softexpires=828028000000
          <idle>-0     [001] d..2   829.691087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828028000000 softexpires=828028000000
          <idle>-0     [000] d..2   829.695167: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.695168: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828044000000 softexpires=828044000000
          <idle>-0     [018] d.h2   829.707084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.707084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.707084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828028001380
          <idle>-0     [001] d.h1   829.707084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828028001600
          <idle>-0     [018] d.h1   829.707085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.707085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.707086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828044000000 softexpires=828044000000
          <idle>-0     [001] d..2   829.707088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828044000000 softexpires=828044000000
          <idle>-0     [000] d..2   829.711833: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.711833: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828060000000 softexpires=828060000000
          <idle>-0     [018] d.h2   829.723084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.723084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [001] d.h1   829.723084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828044001580
          <idle>-0     [018] d.h1   829.723084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828044001380
          <idle>-0     [001] d.h1   829.723085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.723085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d..2   829.723087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828060000000 softexpires=828060000000
          <idle>-0     [018] d..2   829.723088: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828060000000 softexpires=828060000000
          <idle>-0     [000] d..2   829.727110: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.727110: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828076000000 softexpires=828076000000
          <idle>-0     [018] d.h2   829.739084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.739084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.739084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828060001360
          <idle>-0     [001] d.h1   829.739084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828060001580
          <idle>-0     [018] d.h1   829.739085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.739085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.739086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828076000000 softexpires=828076000000
          <idle>-0     [001] d..2   829.739088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828076000000 softexpires=828076000000
          <idle>-0     [000] d..2   829.743775: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.743776: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828092000000 softexpires=828092000000
          <idle>-0     [018] d.h2   829.755084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.755084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.755084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828076001360
          <idle>-0     [001] d.h1   829.755084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828076001540
          <idle>-0     [001] d.h1   829.755085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.755085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [018] d..2   829.755087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828092000000 softexpires=828092000000
          <idle>-0     [001] d..2   829.755087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828092000000 softexpires=828092000000
          <idle>-0     [000] d..2   829.760441: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.760441: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828108000000 softexpires=828108000000
          <idle>-0     [018] d.h2   829.771084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.771084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.771084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828092001360
          <idle>-0     [001] d.h1   829.771084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828092001540
          <idle>-0     [018] d.h1   829.771085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.771085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.771086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828108000000 softexpires=828108000000
          <idle>-0     [001] d..2   829.771088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828108000000 softexpires=828108000000
          <idle>-0     [000] d..2   829.775718: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.775718: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828124000000 softexpires=828124000000
          <idle>-0     [018] d.h2   829.787084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.787084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.787084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828108001380
          <idle>-0     [001] d.h1   829.787084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828108001600
          <idle>-0     [018] d.h1   829.787085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.787085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.787087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828124000000 softexpires=828124000000
          <idle>-0     [001] d..2   829.787088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828124000000 softexpires=828124000000
          <idle>-0     [000] d..2   829.792383: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.792384: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828140000000 softexpires=828140000000
          <idle>-0     [018] d.h2   829.803084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.803084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.803084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828124001360
          <idle>-0     [001] d.h1   829.803084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828124001540
          <idle>-0     [018] d.h1   829.803085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.803085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.803086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828140000000 softexpires=828140000000
          <idle>-0     [001] d..2   829.803088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828140000000 softexpires=828140000000
          <idle>-0     [000] d..2   829.807660: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.807661: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828156000000 softexpires=828156000000
          <idle>-0     [018] d.h2   829.819084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.819084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.819084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828140001340
          <idle>-0     [001] d.h1   829.819084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828140001520
          <idle>-0     [018] d.h1   829.819085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.819086: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.819087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828156000000 softexpires=828156000000
          <idle>-0     [001] d..2   829.819088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828156000000 softexpires=828156000000
          <idle>-0     [000] d..2   829.824326: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.824326: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828172000000 softexpires=828172000000
          <idle>-0     [018] d.h2   829.835084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.835084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.835084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828156001340
          <idle>-0     [001] d.h1   829.835084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828156001540
          <idle>-0     [018] d.h1   829.835085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.835085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.835086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828172000000 softexpires=828172000000
          <idle>-0     [001] d..2   829.835088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828172000000 softexpires=828172000000
          <idle>-0     [000] d..2   829.839602: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.839603: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828188000000 softexpires=828188000000
          <idle>-0     [018] d.h2   829.851084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.851084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.851084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828172001380
          <idle>-0     [001] d.h1   829.851084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828172001580
          <idle>-0     [018] d.h1   829.851085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.851085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.851087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828188000000 softexpires=828188000000
          <idle>-0     [001] d..2   829.851088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828188000000 softexpires=828188000000
          <idle>-0     [000] d..2   829.856268: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.856269: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828204000000 softexpires=828204000000
          <idle>-0     [018] d.h2   829.867084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.867084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.867084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828188001420
          <idle>-0     [001] d.h1   829.867084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828188001640
          <idle>-0     [018] d.h1   829.867085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.867085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.867086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828204000000 softexpires=828204000000
          <idle>-0     [001] d..2   829.867088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828204000000 softexpires=828204000000
          <idle>-0     [000] d..2   829.871545: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.871545: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828220000000 softexpires=828220000000
          <idle>-0     [000] dn.2   829.872935: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   829.872936: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828196000000 softexpires=828196000000
          <idle>-0     [000] d..1   829.872940: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.872941: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.872941: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828220000000 softexpires=828220000000
          <idle>-0     [000] dn.2   829.874323: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   829.874324: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828196000000 softexpires=828196000000
          <idle>-0     [000] d..1   829.874327: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.874327: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.874328: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828220000000 softexpires=828220000000
          <idle>-0     [000] dn.2   829.875713: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   829.875713: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828200000000 softexpires=828200000000
          <idle>-0     [000] d..1   829.875717: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.875717: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.875718: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828220000000 softexpires=828220000000
          <idle>-0     [000] dn.2   829.877101: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   829.877101: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828200000000 softexpires=828200000000
          <idle>-0     [000] d..1   829.877104: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.877105: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.877105: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828220000000 softexpires=828220000000
          <idle>-0     [000] dn.2   829.878489: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   829.878490: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828200000000 softexpires=828200000000
          <idle>-0     [000] d..1   829.878493: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.878493: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.878494: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828220000000 softexpires=828220000000
          <idle>-0     [000] dn.2   829.879879: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   829.879880: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828204000000 softexpires=828204000000
          <idle>-0     [000] d..1   829.879883: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.879883: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.879884: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828220000000 softexpires=828220000000
          <idle>-0     [000] dn.2   829.881267: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   829.881268: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828204000000 softexpires=828204000000
          <idle>-0     [000] d..1   829.881271: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.881271: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.881271: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828220000000 softexpires=828220000000
          <idle>-0     [000] dn.2   829.882656: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   829.882656: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828204000000 softexpires=828204000000
          <idle>-0     [000] d..1   829.882659: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.882660: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.882660: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828220000000 softexpires=828220000000
          <idle>-0     [018] d.h2   829.883084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.883084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.883084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828204001360
          <idle>-0     [001] d.h1   829.883084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828204001580
          <idle>-0     [018] d.h1   829.883085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.883085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.883087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828220000000 softexpires=828220000000
          <idle>-0     [001] d..2   829.883087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828220000000 softexpires=828220000000
          <idle>-0     [000] dn.2   829.884045: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   829.884046: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828208000000 softexpires=828208000000
          <idle>-0     [000] d..1   829.884049: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.884049: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.884049: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828220000000 softexpires=828220000000
          <idle>-0     [000] dn.2   829.885433: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   829.885434: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828208000000 softexpires=828208000000
          <idle>-0     [000] d..1   829.885437: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.885437: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.885438: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828220000000 softexpires=828220000000
          <idle>-0     [000] dn.2   829.886822: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   829.886823: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828208000000 softexpires=828208000000
          <idle>-0     [000] d..1   829.886826: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.886826: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.886826: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828220000000 softexpires=828220000000
          <idle>-0     [000] dn.2   829.888212: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   829.888213: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828212000000 softexpires=828212000000
          <idle>-0     [000] d..1   829.888216: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.888216: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.888217: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828236000000 softexpires=828236000000
          <idle>-0     [000] dn.2   829.889600: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   829.889600: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828212000000 softexpires=828212000000
          <idle>-0     [000] d..1   829.889603: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.889604: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.889604: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828236000000 softexpires=828236000000
          <idle>-0     [000] dn.2   829.890989: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   829.890989: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828212000000 softexpires=828212000000
          <idle>-0     [000] d..1   829.890992: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.890992: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.890993: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828236000000 softexpires=828236000000
          <idle>-0     [000] dn.2   829.892378: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   829.892379: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828216000000 softexpires=828216000000
          <idle>-0     [000] d..1   829.892382: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.892382: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.892383: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828236000000 softexpires=828236000000
          <idle>-0     [000] dn.2   829.893766: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   829.893767: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828216000000 softexpires=828216000000
          <idle>-0     [000] d..1   829.893770: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.893770: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.893771: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828236000000 softexpires=828236000000
          <idle>-0     [000] dn.2   829.895155: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] dn.2   829.895156: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828220000000 softexpires=828220000000
              sh-1993  [000] ....   829.895158: timer_init: timer=ffff8017db00fb40
              sh-1993  [000] d..1   829.895158: timer_start: timer=ffff8017db00fb40 function=process_timeout expires=4295099353 [timeout=2] cpu=0 idx=0 flags=
          <idle>-0     [000] d..1   829.895161: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.895161: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.895161: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828228000000 softexpires=828228000000
          <idle>-0     [018] d.h2   829.899084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.899084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.899084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828220001340
          <idle>-0     [001] d.h1   829.899084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828220001540
          <idle>-0     [018] d.h1   829.899085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.899085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.899086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828236000000 softexpires=828236000000
          <idle>-0     [001] d..2   829.899088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828236000000 softexpires=828236000000
          <idle>-0     [000] d.h2   829.907084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d.h1   829.907084: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=828228001680
          <idle>-0     [000] d.h1   829.907086: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d.s2   829.907087: timer_cancel: timer=ffff8017db00fb40
          <idle>-0     [000] ..s1   829.907087: timer_expire_entry: timer=ffff8017db00fb40 function=process_timeout now=4295099354
          <idle>-0     [000] .ns1   829.907089: timer_expire_exit: timer=ffff8017db00fb40
          <idle>-0     [000] dn.2   829.907092: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828232000000 softexpires=828232000000
          <idle>-0     [000] d..1   829.907120: tick_stop: success=1 dependency=NONE
          <idle>-0     [000] d..2   829.907120: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [000] d..2   829.907121: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828252000000 softexpires=828252000000
          <idle>-0     [018] d.h2   829.915084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.915084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.915084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828236001260
          <idle>-0     [001] d.h1   829.915084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828236001480
          <idle>-0     [018] d.h1   829.915085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.915085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.915087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828252000000 softexpires=828252000000
          <idle>-0     [001] d..2   829.915087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828252000000 softexpires=828252000000
          <idle>-0     [018] d.h2   829.931084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [000] d.h2   829.931084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [001] d.h2   829.931084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.931084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828252001300
          <idle>-0     [000] d.h1   829.931084: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=828252001300
          <idle>-0     [001] d.h1   829.931084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828252001460
          <idle>-0     [018] d.h1   829.931085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.931085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [000] d.h1   829.931085: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d..2   829.931086: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828268000000 softexpires=828268000000
          <idle>-0     [001] d..2   829.931088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828268000000 softexpires=828268000000
          <idle>-0     [000] d..2   829.931098: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828268000000 softexpires=828268000000
          <idle>-0     [001] d.h2   829.947084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h2   829.947084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [000] d.h2   829.947084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [001] d.h1   829.947084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828268001500
          <idle>-0     [018] d.h1   829.947084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828268001480
          <idle>-0     [000] d.h1   829.947084: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=828268001740
          <idle>-0     [001] d.h1   829.947085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.947085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [000] d.h1   829.947086: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d..2   829.947087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828284000000 softexpires=828284000000
          <idle>-0     [001] d..2   829.947087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828284000000 softexpires=828284000000
          <idle>-0     [000] d..2   829.947089: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828284000000 softexpires=828284000000
          <idle>-0     [000] d.h2   829.963084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d.h2   829.963084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   829.963084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   829.963084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828284001460
          <idle>-0     [000] d.h1   829.963084: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=828284001480
          <idle>-0     [001] d.h1   829.963084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828284001600
          <idle>-0     [018] d.h1   829.963085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [000] d.h1   829.963085: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [001] d.h1   829.963085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.963087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828300000000 softexpires=828300000000
          <idle>-0     [001] d..2   829.963088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828300000000 softexpires=828300000000
          <idle>-0     [000] d..2   829.963088: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828300000000 softexpires=828300000000
          <idle>-0     [018] d.h2   829.979085: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [018] d.h1   829.979085: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828300002260
          <idle>-0     [000] d.h2   829.979087: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [001] d.h2   829.979088: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [000] d.h1   829.979088: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=828300005100
          <idle>-0     [001] d.h1   829.979088: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828300005260
          <idle>-0     [018] d.h1   829.979088: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [000] d.h1   829.979089: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [001] d.h1   829.979089: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d..2   829.979090: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828316000000 softexpires=828316000000
          <idle>-0     [001] d..2   829.979091: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828316000000 softexpires=828316000000
          <idle>-0     [000] d..2   829.979092: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828316000000 softexpires=828316000000
          <idle>-0     [001] d.h2   829.995084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h2   829.995084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [000] d.h2   829.995084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d.h1   829.995084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828316001520
          <idle>-0     [000] d.h1   829.995084: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=828316001680
          <idle>-0     [001] d.h1   829.995084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828316001460
          <idle>-0     [018] d.h1   829.995085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h1   829.995085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [000] d.h1   829.995085: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d..2   829.995087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828332000000 softexpires=828332000000
          <idle>-0     [001] d..2   829.995088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828332000000 softexpires=828332000000
          <idle>-0     [000] d..2   829.995089: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828332000000 softexpires=828332000000
          <idle>-0     [000] d.h2   830.011084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d.h2   830.011084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [001] d.h2   830.011084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   830.011084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828332001440
          <idle>-0     [000] d.h1   830.011084: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=828332001480
          <idle>-0     [001] d.h1   830.011084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828332001640
          <idle>-0     [000] d.h1   830.011085: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [001] d.h1   830.011085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   830.011085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [018] d..2   830.011088: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828348000000 softexpires=828348000000
          <idle>-0     [001] d..2   830.011088: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828348000000 softexpires=828348000000
          <idle>-0     [000] d..2   830.011088: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828348000000 softexpires=828348000000
          <idle>-0     [001] d.h2   830.027084: hrtimer_cancel: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h2   830.027084: hrtimer_cancel: hrtimer=ffff8017fbc74808
          <idle>-0     [000] d.h2   830.027084: hrtimer_cancel: hrtimer=ffff8017fbe41808
          <idle>-0     [001] d.h1   830.027084: hrtimer_expire_entry: hrtimer=ffff8017fbe5c808 function=tick_sched_timer now=828348001500
          <idle>-0     [018] d.h1   830.027084: hrtimer_expire_entry: hrtimer=ffff8017fbc74808 function=tick_sched_timer now=828348001500
          <idle>-0     [000] d.h1   830.027084: hrtimer_expire_entry: hrtimer=ffff8017fbe41808 function=tick_sched_timer now=828348001660
          <idle>-0     [001] d.h1   830.027085: hrtimer_expire_exit: hrtimer=ffff8017fbe5c808
          <idle>-0     [018] d.h1   830.027085: hrtimer_expire_exit: hrtimer=ffff8017fbc74808
          <idle>-0     [000] d.h1   830.027085: hrtimer_expire_exit: hrtimer=ffff8017fbe41808
          <idle>-0     [018] d..2   830.027087: hrtimer_start: hrtimer=ffff8017fbc74808 function=tick_sched_timer expires=828364000000 softexpires=828364000000
          <idle>-0     [001] d..2   830.027087: hrtimer_start: hrtimer=ffff8017fbe5c808 function=tick_sched_timer expires=828364000000 softexpires=828364000000
          <idle>-0     [000] d..2   830.027091: hrtimer_start: hrtimer=ffff8017fbe41808 function=tick_sched_timer expires=828364000000 softexpires=828364000000
 

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-28 13:13                                                               ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-28 13:13 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Paul E. McKenney, dzickus, sfr, linuxarm, Nicholas Piggin,
	abdhalee, sparclinux, akpm, linuxppc-dev, David Miller,
	linux-arm-kernel

On Fri, 28 Jul 2017 20:54:16 +0800
Boqun Feng <boqun.feng@gmail.com> wrote:

> Hi Jonathan,
> 
> FWIW, there is a wakeup-missing issue in swake_up() and swake_up_all():
> 
> 	https://marc.info/?l=linux-kernel&m=149750022019663
> 
> and RCU began to use swait/wake last year, so I thought this could be
> relevant.
> 
> Could you try the following patch and see if it works? Thanks.
Sadly it seems to be a no...  It splatted before I could even get
the tracing set up.  Back to staring at logs and hoping something
will stand out!

Jonathan
> 
> Regards,
> Boqun
> 
> ------------------>8  
> Subject: [PATCH] swait: Remove the lockless swait_active() check in
>  swake_up*()
> 
> Steven Rostedt reported a potential race in RCU core because of
> swake_up():
> 
>         CPU0                            CPU1
>         ----                            ----
>                                 __call_rcu_core() {
> 
>                                  spin_lock(rnp_root)
>                                  need_wake = __rcu_start_gp() {
>                                   rcu_start_gp_advanced() {
>                                    gp_flags = FLAG_INIT
>                                   }
>                                  }
> 
>  rcu_gp_kthread() {
>    swait_event_interruptible(wq,
>         gp_flags & FLAG_INIT) {
>    spin_lock(q->lock)
> 
>                                 *fetch wq->task_list here! *
> 
>    list_add(wq->task_list, q->task_list)
>    spin_unlock(q->lock);
> 
>    *fetch old value of gp_flags here *
> 
>                                  spin_unlock(rnp_root)
> 
>                                  rcu_gp_kthread_wake() {
>                                   swake_up(wq) {
>                                    swait_active(wq) {
>                                     list_empty(wq->task_list)
> 
>                                    } * return false *
> 
>   if (condition) * false *
>     schedule();
> 
> In this case, a wakeup is missed, which could cause the rcu_gp_kthread
> to wait for a long time.
> 
> The reason for this is that we do a lockless swait_active() check in
> swake_up(). To fix this, we can either 1) add a smp_mb() in swake_up()
> before swait_active() to provide the proper order or 2) simply remove
> the swait_active() in swake_up().
> 
> Solution 2 not only fixes this problem but also keeps the swait and
> wait API as close as possible, as wake_up() doesn't provide a full
> barrier and doesn't do a lockless check of the wait queue either.
> Moreover, there are users already using swait_active() to do their quick
> checks for the wait queues, so it makes less sense that swake_up() and
> swake_up_all() do this on their own.
> 
> This patch then removes the lockless swait_active() check in swake_up()
> and swake_up_all().
> 
> Reported-by: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> ---
>  kernel/sched/swait.c | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c
> index 3d5610dcce11..2227e183e202 100644
> --- a/kernel/sched/swait.c
> +++ b/kernel/sched/swait.c
> @@ -33,9 +33,6 @@ void swake_up(struct swait_queue_head *q)
>  {
>  	unsigned long flags;
>  
> -	if (!swait_active(q))
> -		return;
> -
>  	raw_spin_lock_irqsave(&q->lock, flags);
>  	swake_up_locked(q);
>  	raw_spin_unlock_irqrestore(&q->lock, flags);
> @@ -51,9 +48,6 @@ void swake_up_all(struct swait_queue_head *q)
>  	struct swait_queue *curr;
>  	LIST_HEAD(tmp);
>  
> -	if (!swait_active(q))
> -		return;
> -
>  	raw_spin_lock_irq(&q->lock);
>  	list_splice_init(&q->task_list, &tmp);
>  	while (!list_empty(&tmp)) {

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-28 13:24                                                             ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-28 13:24 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: dzickus, sfr, linuxarm, Nicholas Piggin, abdhalee, sparclinux,
	akpm, linuxppc-dev, David Miller, linux-arm-kernel

On Fri, 28 Jul 2017 08:44:11 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Thu, 27 Jul 2017 09:52:45 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Thu, Jul 27, 2017 at 05:39:23PM +0100, Jonathan Cameron wrote:  
> > > On Thu, 27 Jul 2017 14:49:03 +0100
> > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> > >     
> > > > On Thu, 27 Jul 2017 05:49:13 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > >     
> > > > > On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:      
> > > > > > On Wed, 26 Jul 2017 18:42:14 -0700
> > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > > >         
> > > > > > > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:        
> > > > > >         
> > > > > > > > Indeed, that really wouldn't explain how we end up with a RCU stall
> > > > > > > > dump listing almost all of the cpus as having missed a grace period.          
> > > > > > > 
> > > > > > > I have seen stranger things, but admittedly not often.        
> > > > > > 
> > > > > > So the backtraces show the RCU gp thread in schedule_timeout.
> > > > > > 
> > > > > > Are you sure that its timeout has expired and it's not being scheduled,
> > > > > > or could it be a bad (large) timeout (looks unlikely) or that it's being
> > > > > > scheduled but not correctly noting gps on other CPUs?
> > > > > > 
> > > > > > It's not in R state, so if it's not being scheduled at all, then it's
> > > > > > because the timer has not fired:        
> > > > > 
> > > > > Good point, Nick!
> > > > > 
> > > > > Jonathan, could you please reproduce collecting timer event tracing?      
> > > > I'm a little new to tracing (only started playing with it last week)
> > > > so fingers crossed I've set it up right.  No splats yet.  Was getting
> > > > splats on reading out the trace when running with the RCU stall timer
> > > > set to 4 so have increased that back to the default and am rerunning.
> > > > 
> > > > This may take a while.  Correct me if I've gotten this wrong to save time
> > > > 
> > > > echo "timer:*" > /sys/kernel/debug/tracing/set_event
> > > > 
> > > > when it dumps, just send you the relevant part of what is in
> > > > /sys/kernel/debug/tracing/trace?    
> > > 
> > > Interestingly the only thing that can make it trip for me with tracing on
> > > is peeking in the tracing buffers.  Not sure this is a valid case or
> > > not.
> > > 
> > > Anyhow all timer activity seems to stop around the area of interest.
> > > 
> > > 

Firstly, sorry to those who got the rather silly-length email a minute ago.
It bounced on the list (fair enough - I was just being lazy about getting
data past our firewalls).

Ok.  Some info.  I disabled a few drivers (USB and SAS) in the interest of
having fewer timer events.  The issue became much easier to trigger (on some
runs it hit before I could get tracing up and running).

The logs are large enough that pastebin doesn't like them - please shout if
another time period is of interest.

https://pastebin.com/iUZDfQGM for the timer trace.
https://pastebin.com/3w1F7amH for dmesg.  

The relevant timeout on the RCU stall detector was 8 seconds.  The event is
detected at around 835 seconds.

It's a lot of logs, so I haven't identified a smoking gun yet but there
may well be one in there.

Jonathan
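
For context on the "timer has not fired" point quoted above: the
timer_start events with function=process_timeout in the trace are what
schedule_timeout() arms before sleeping, and the sleeping task only goes
back into the R state when that callback runs.  A simplified sketch of
the mechanism (paraphrasing kernel/time/timer.c of this era, not a
verbatim copy - the _sketch suffix is only there to mark it as such):

#include <linux/jiffies.h>
#include <linux/sched.h>
#include <linux/timer.h>

/* Timer callback: make the task that went to sleep runnable again. */
static void process_timeout(unsigned long __data)
{
	wake_up_process((struct task_struct *)__data);
}

/* Roughly what schedule_timeout() does for a finite timeout. */
static signed long schedule_timeout_sketch(signed long timeout)
{
	struct timer_list timer;
	unsigned long expire = timeout + jiffies;

	setup_timer_on_stack(&timer, process_timeout, (unsigned long)current);
	mod_timer(&timer, expire);		/* timer_start event in the trace */
	schedule();				/* task is not in R state while here */
	del_singleshot_timer_sync(&timer);	/* in case we were woken early */
	destroy_timer_on_stack(&timer);

	return expire > jiffies ? expire - jiffies : 0;
}

So if no timer_expire_entry for the gp kthread's process_timeout timer
ever shows up in the trace, the kthread stays asleep even though its
timeout should long since have elapsed.
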

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-28 14:55                                                               ` Paul E. McKenney
  0 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-28 14:55 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Jonathan Cameron, dzickus, sfr, linuxarm, Nicholas Piggin,
	abdhalee, sparclinux, akpm, linuxppc-dev, David Miller,
	linux-arm-kernel

On Fri, Jul 28, 2017 at 08:54:16PM +0800, Boqun Feng wrote:
> Hi Jonathan,
> 
> FWIW, there is wakeup-missing issue in swake_up() and swake_up_all():
> 
> 	https://marc.info/?l=linux-kernel&m=149750022019663
> 
> and RCU begins to use swait/wake last year, so I thought this could be
> relevant.
> 
> Could you try the following patch and see if it works? Thanks.
> 
> Regards,
> Boqun
> 
> ------------------>8
> Subject: [PATCH] swait: Remove the lockless swait_active() check in
>  swake_up*()
> 
> Steven Rostedt reported a potential race in RCU core because of
> swake_up():
> 
>         CPU0                            CPU1
>         ----                            ----
>                                 __call_rcu_core() {
> 
>                                  spin_lock(rnp_root)
>                                  need_wake = __rcu_start_gp() {
>                                   rcu_start_gp_advanced() {
>                                    gp_flags = FLAG_INIT
>                                   }
>                                  }
> 
>  rcu_gp_kthread() {
>    swait_event_interruptible(wq,
>         gp_flags & FLAG_INIT) {

So the idea is that we get the old value of ->gp_flags here, correct?

>    spin_lock(q->lock)
> 
>                                 *fetch wq->task_list here! *

And the above fetch is really part of the swait_active() called out
below, right?

>    list_add(wq->task_list, q->task_list)
>    spin_unlock(q->lock);
> 
>    *fetch old value of gp_flags here *

And here we fetch the old value of ->gp_flags again, this time under
the lock, right?

>                                  spin_unlock(rnp_root)
> 
>                                  rcu_gp_kthread_wake() {
>                                   swake_up(wq) {
>                                    swait_active(wq) {
>                                     list_empty(wq->task_list)
> 
>                                    } * return false *
> 
>   if (condition) * false *
>     schedule();
> 
> In this case, a wakeup is missed, which could cause the rcu_gp_kthread
> waits for a long time.
> 
> The reason of this is that we do a lockless swait_active() check in
> swake_up(). To fix this, we can either 1) add a smp_mb() in swake_up()
> before swait_active() to provide the proper order or 2) simply remove
> the swait_active() in swake_up().
> 
> The solution 2 not only fixes this problem but also keeps the swait and
> wait API as close as possible, as wake_up() doesn't provide a full
> barrier and doesn't do a lockless check of the wait queue either.
> Moreover, there are users already using swait_active() to do their quick
> checks for the wait queues, so it make less sense that swake_up() and
> swake_up_all() do this on their own.
> 
> This patch then removes the lockless swait_active() check in swake_up()
> and swake_up_all().
> 
> Reported-by: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>

Even though Jonathan's testing indicates that it didn't fix this
particular problem:

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  kernel/sched/swait.c | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c
> index 3d5610dcce11..2227e183e202 100644
> --- a/kernel/sched/swait.c
> +++ b/kernel/sched/swait.c
> @@ -33,9 +33,6 @@ void swake_up(struct swait_queue_head *q)
>  {
>  	unsigned long flags;
> 
> -	if (!swait_active(q))
> -		return;
> -
>  	raw_spin_lock_irqsave(&q->lock, flags);
>  	swake_up_locked(q);
>  	raw_spin_unlock_irqrestore(&q->lock, flags);
> @@ -51,9 +48,6 @@ void swake_up_all(struct swait_queue_head *q)
>  	struct swait_queue *curr;
>  	LIST_HEAD(tmp);
> 
> -	if (!swait_active(q))
> -		return;
> -
>  	raw_spin_lock_irq(&q->lock);
>  	list_splice_init(&q->task_list, &tmp);
>  	while (!list_empty(&tmp)) {
> -- 
> 2.13.0
> 

^ permalink raw reply	[flat|nested] 241+ messages in thread
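
For readers skimming the patch above: the changelog mentions two possible
fixes and takes option 2 (dropping the lockless check entirely).  For
comparison, a minimal sketch of what option 1 would have looked like --
keeping the fast path but ordering it with a full barrier -- is below.
This is an illustration only, not code from the thread or from any tree:

	void swake_up(struct swait_queue_head *q)
	{
		unsigned long flags;

		/*
		 * Order the waker's earlier store to the wait condition
		 * against the lockless read of q->task_list inside
		 * swait_active(), pairing with the ordering the waiter
		 * relies on between enqueueing itself and re-checking
		 * the condition.
		 */
		smp_mb();
		if (!swait_active(q))
			return;

		raw_spin_lock_irqsave(&q->lock, flags);
		swake_up_locked(q);
		raw_spin_unlock_irqrestore(&q->lock, flags);
	}

Option 2 was preferred in the thread because it keeps swake_up() close to
wake_up(), which takes the queue lock unconditionally and does no lockless
check of its own.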

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-28 13:24                                                             ` Jonathan Cameron
@ 2017-07-28 16:55                                                               ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-28 16:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:
> On Fri, 28 Jul 2017 08:44:11 +0100
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

[ . . . ]

> Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> fewer timer events.  Issue became much easier to trigger (on some runs before
> I could get tracing up and running).
> 
> So logs are large enough that pastebin doesn't like them - please shout if
> another timer period is of interest.
> 
> https://pastebin.com/iUZDfQGM for the timer trace.
> https://pastebin.com/3w1F7amH for dmesg.  
> 
> The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> detected around 835.
> 
> It's a lot of logs, so I haven't identified a smoking gun yet but there
> may well be one in there.

The dmesg says:

rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1

So I look for "rcu_preempt" timer events and find these:

rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout 

Next look for "ffff8017d5fc7da0" and I don't find anything else.

The timeout was one jiffy, and more than a second later, no expiration.
Is it possible that this event was lost?  I am not seeing any sign of
this in the trace.

I don't see any sign of CPU hotplug (and I test with lots of that in
any case).

The last time we saw something like this it was a timer HW/driver problem,
but it is a bit hard to imagine such a problem affecting both ARM64
and SPARC.  ;-)

Thomas, any debugging suggestions?

							Thanx, Paul


^ permalink raw reply	[flat|nested] 241+ messages in thread
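
A decoding note for the stall message quoted above: "->state=0x1" is
TASK_INTERRUPTIBLE, and RCU_GP_WAIT_FQS means the grace-period kthread was
parked in its force-quiescent-state wait.  Roughly paraphrased from the
kernel/rcu/tree.c of this era (names and details approximate, not a
verbatim quote), the wait in question is:

	/* In the grace-period kthread's FQS loop: */
	rsp->gp_state = RCU_GP_WAIT_FQS;
	ret = swait_event_interruptible_timeout(rsp->gp_wq,
			rcu_gp_fqs_check_wake(rsp, &gf), j);

The timeout side of that swait is built on schedule_timeout(), which is
why the process_timeout timer above belongs to the rcu_preempt kthread:
if neither the wakeup nor the timer expiry arrives, the kthread sits in
this wait and the starvation message eventually fires.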

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-28 16:55                                                               ` Paul E. McKenney
  (?)
@ 2017-07-28 17:27                                                                 ` Jonathan Cameron
  -1 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-28 17:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 28 Jul 2017 09:55:29 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:
> > On Fri, 28 Jul 2017 08:44:11 +0100
> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:  
> 
> [ . . . ]
> 
> > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > fewer timer events.  Issue became much easier to trigger (on some runs before
> > I could get tracing up and running).
> > 
> > So logs are large enough that pastebin doesn't like them - please shout if
> > another timer period is of interest.
> > 
> > https://pastebin.com/iUZDfQGM for the timer trace.
> > https://pastebin.com/3w1F7amH for dmesg.  
> > 
> > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > detected around 835.
> > 
> > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > may well be one in there.  
> 
> The dmesg says:
> 
> rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> 
> So I look for "rcu_preempt" timer events and find these:
> 
> rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout 
> 
> Next look for "ffff8017d5fc7da0" and I don't find anything else.
It does show up off the bottom of what would fit in pastebin...

     rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
     rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
     rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=

> The timeout was one jiffy, and more than a second later, no expiration.
> Is it possible that this event was lost?  I am not seeing any sign of
> this in the trace.
> 
> I don't see any sign of CPU hotplug (and I test with lots of that in
> any case).
> 
> The last time we saw something like this it was a timer HW/driver problem,
> but it is a bit hard to imagine such a problem affecting both ARM64
> and SPARC.  ;-)
Could be different issues, both of which were hidden by that lockup detector.

There is an errata work around for the timers on this particular board.
I'm only vaguely aware of it, so may be unconnected.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb

Seems unlikely though! + we've not yet seen it on the other chips that the
erratum affects (not that that means much).

Jonathan

> 
> Thomas, any debugging suggestions?
> 
> 							Thanx, Paul
> 


^ permalink raw reply	[flat|nested] 241+ messages in thread
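
Putting rough numbers on the trace lines above (HZ=250 is my assumption
here, inferred from the figures below rather than stated in the thread):

	timer_start  at 827.579115 s, timeout=1 jiffy (~4 ms at HZ=250)
	timer_cancel at 837.681077 s
	gap          = 837.681077 - 827.579115 ~= 10.10 s ~= 2525 jiffies

That lines up with the "starved for 2508 jiffies" in the stall report,
i.e. the one-jiffy sleep appears to have lasted essentially the entire
stall instead of expiring.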

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-28 14:55                                                               ` Paul E. McKenney
  (?)
@ 2017-07-28 18:41                                                                 ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-28 18:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 28, 2017 at 07:55:30AM -0700, Paul E. McKenney wrote:
> On Fri, Jul 28, 2017 at 08:54:16PM +0800, Boqun Feng wrote:
> > Hi Jonathan,
> > 
> > FWIW, there is wakeup-missing issue in swake_up() and swake_up_all():
> > 
> > 	https://marc.info/?l=linux-kernel&m=149750022019663
> > 
> > and RCU begins to use swait/wake last year, so I thought this could be
> > relevant.
> > 
> > Could you try the following patch and see if it works? Thanks.
> > 
> > Regards,
> > Boqun
> > 
> > ------------------>8
> > Subject: [PATCH] swait: Remove the lockless swait_active() check in
> >  swake_up*()
> > 
> > Steven Rostedt reported a potential race in RCU core because of
> > swake_up():
> > 
> >         CPU0                            CPU1
> >         ----                            ----
> >                                 __call_rcu_core() {
> > 
> >                                  spin_lock(rnp_root)
> >                                  need_wake = __rcu_start_gp() {
> >                                   rcu_start_gp_advanced() {
> >                                    gp_flags = FLAG_INIT
> >                                   }
> >                                  }
> > 
> >  rcu_gp_kthread() {
> >    swait_event_interruptible(wq,
> >         gp_flags & FLAG_INIT) {
> 
> So the idea is that we get the old value of ->gp_flags here, correct?
> 
> >    spin_lock(q->lock)
> > 
> >                                 *fetch wq->task_list here! *
> 
> And the above fetch is really part of the swait_active() called out
> below, right?
> 
> >    list_add(wq->task_list, q->task_list)
> >    spin_unlock(q->lock);
> > 
> >    *fetch old value of gp_flags here *
> 
> And here we fetch the old value of ->gp_flags again, this time under
> the lock, right?
> 
> >                                  spin_unlock(rnp_root)
> > 
> >                                  rcu_gp_kthread_wake() {
> >                                   swake_up(wq) {
> >                                    swait_active(wq) {
> >                                     list_empty(wq->task_list)
> > 
> >                                    } * return false *
> > 
> >   if (condition) * false *
> >     schedule();
> > 
> > In this case, a wakeup is missed, which could cause the rcu_gp_kthread
> > waits for a long time.
> > 
> > The reason of this is that we do a lockless swait_active() check in
> > swake_up(). To fix this, we can either 1) add a smp_mb() in swake_up()
> > before swait_active() to provide the proper order or 2) simply remove
> > the swait_active() in swake_up().
> > 
> > The solution 2 not only fixes this problem but also keeps the swait and
> > wait API as close as possible, as wake_up() doesn't provide a full
> > barrier and doesn't do a lockless check of the wait queue either.
> > Moreover, there are users already using swait_active() to do their quick
> > checks for the wait queues, so it make less sense that swake_up() and
> > swake_up_all() do this on their own.
> > 
> > This patch then removes the lockless swait_active() check in swake_up()
> > and swake_up_all().
> > 
> > Reported-by: Steven Rostedt <rostedt@goodmis.org>
> > Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> 
> Even though Jonathan's testing indicates that it didn't fix this
> particular problem:
> 
> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

And while we are at it:

Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> > ---
> >  kernel/sched/swait.c | 6 ------
> >  1 file changed, 6 deletions(-)
> > 
> > diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c
> > index 3d5610dcce11..2227e183e202 100644
> > --- a/kernel/sched/swait.c
> > +++ b/kernel/sched/swait.c
> > @@ -33,9 +33,6 @@ void swake_up(struct swait_queue_head *q)
> >  {
> >  	unsigned long flags;
> > 
> > -	if (!swait_active(q))
> > -		return;
> > -
> >  	raw_spin_lock_irqsave(&q->lock, flags);
> >  	swake_up_locked(q);
> >  	raw_spin_unlock_irqrestore(&q->lock, flags);
> > @@ -51,9 +48,6 @@ void swake_up_all(struct swait_queue_head *q)
> >  	struct swait_queue *curr;
> >  	LIST_HEAD(tmp);
> > 
> > -	if (!swait_active(q))
> > -		return;
> > -
> >  	raw_spin_lock_irq(&q->lock);
> >  	list_splice_init(&q->task_list, &tmp);
> >  	while (!list_empty(&tmp)) {
> > -- 
> > 2.13.0
> > 


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-28 12:54                                                             ` Boqun Feng
  (?)
@ 2017-07-28 18:42                                                               ` David Miller
  -1 siblings, 0 replies; 241+ messages in thread
From: David Miller @ 2017-07-28 18:42 UTC (permalink / raw)
  To: linux-arm-kernel

From: Boqun Feng <boqun.feng@gmail.com>
Date: Fri, 28 Jul 2017 20:54:16 +0800

> Hi Jonathan,
> 
> FWIW, there is wakeup-missing issue in swake_up() and swake_up_all():
> 
> 	https://marc.info/?l=linux-kernel&m=149750022019663
> 
> and RCU begins to use swait/wake last year, so I thought this could be
> relevant.
> 
> Could you try the following patch and see if it works? Thanks.

Just FYI I'm testing this patch now on sparc64...

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-28 17:27                                                                 ` Jonathan Cameron
  (?)
@ 2017-07-28 19:03                                                                   ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-28 19:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:
> On Fri, 28 Jul 2017 09:55:29 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:
> > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:  
> > 
> > [ . . . ]
> > 
> > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > > fewer timer events.  Issue became much easier to trigger (on some runs before
> > > I could get tracing up and running).
> > > 
> > > So logs are large enough that pastebin doesn't like them - please shout if
> > > another timer period is of interest.
> > > 
> > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > https://pastebin.com/3w1F7amH for dmesg.  
> > > 
> > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > detected around 835.
> > > 
> > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > may well be one in there.  
> > 
> > The dmesg says:
> > 
> > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > 
> > So I look for "rcu_preempt" timer events and find these:
> > 
> > rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> > rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout 
> > 
> > Next look for "ffff8017d5fc7da0" and I don't find anything else.
> It does show up off the bottom of what would fit in pastebin...
> 
>      rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
>      rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
>      rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=

Odd.  I would expect an expiration...  And ten seconds is way longer
than the requested one jiffy!

> > The timeout was one jiffy, and more than a second later, no expiration.
> > Is it possible that this event was lost?  I am not seeing any sign of
> > this in the trace.
> > 
> > I don't see any sign of CPU hotplug (and I test with lots of that in
> > any case).
> > 
> > The last time we saw something like this it was a timer HW/driver problem,
> > but it is a bit hard to imagine such a problem affecting both ARM64
> > and SPARC.  ;-)
> Could be different issues, both of which were hidden by that lockup detector.
> 
> There is an errata work around for the timers on this particular board.
> I'm only vaguely aware of it, so may be unconnected.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> 
> Seems unlikely though! + we've not yet seen it on the other chips that the
> erratum affects (not that that means much).

If you can reproduce quickly, might be worth trying anyway...

							Thanx, Paul

> Jonathan
> 
> > 
> > Thomas, any debugging suggestions?
> > 
> > 							Thanx, Paul
> > 
> 


^ permalink raw reply	[flat|nested] 241+ messages in thread
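
Some background on why an expiration would be expected: the timer being
traced is the one schedule_timeout() arms on behalf of the sleeping
kthread, and its callback is process_timeout(), which simply wakes the
task back up.  A simplified paraphrase (not a verbatim quote -- see
kernel/time/timer.c for the real code, which handles more cases) looks
like this:

	static void process_timeout(unsigned long __data)
	{
		wake_up_process((struct task_struct *)__data);
	}

	signed long schedule_timeout(signed long timeout)
	{
		struct timer_list timer;
		unsigned long expire = timeout + jiffies;

		setup_timer_on_stack(&timer, process_timeout,
				     (unsigned long)current);
		mod_timer(&timer, expire);
		schedule();		/* sleep until wakeup or expiry */
		del_timer_sync(&timer);
		destroy_timer_on_stack(&timer);

		return max_t(signed long, 0, expire - jiffies);
	}

So a timer_start with [timeout=1] should normally be followed by an
expiry and a wakeup of rcu_preempt roughly one jiffy later; ten seconds
with no expiry points at either a lost/late timer interrupt or a lost
trace event, which is what is being chased here.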

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-28 18:41                                                                 ` Paul E. McKenney
  (?)
@ 2017-07-28 19:09                                                                   ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-28 19:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 28, 2017 at 11:41:29AM -0700, Paul E. McKenney wrote:
> On Fri, Jul 28, 2017 at 07:55:30AM -0700, Paul E. McKenney wrote:
> > On Fri, Jul 28, 2017 at 08:54:16PM +0800, Boqun Feng wrote:

[ . . . ]

> > Even though Jonathan's testing indicates that it didn't fix this
> > particular problem:
> > 
> > Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> And while we are at it:
> 
> Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Not because it fixed the TREE01 issue -- it did not.  But as near
as I can see, it didn't cause any additional issues.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-28 18:41                                                                 ` Paul E. McKenney
  (?)
@ 2017-07-29  1:20                                                                   ` Boqun Feng
  -1 siblings, 0 replies; 241+ messages in thread
From: Boqun Feng @ 2017-07-29  1:20 UTC (permalink / raw)
  To: linux-arm-kernel

[-- Attachment #1: Type: text/plain, Size: 5322 bytes --]

On Fri, Jul 28, 2017 at 11:41:29AM -0700, Paul E. McKenney wrote:
> On Fri, Jul 28, 2017 at 07:55:30AM -0700, Paul E. McKenney wrote:
> > On Fri, Jul 28, 2017 at 08:54:16PM +0800, Boqun Feng wrote:
> > > Hi Jonathan,
> > > 
> > > FWIW, there is wakeup-missing issue in swake_up() and swake_up_all():
> > > 
> > > 	https://marc.info/?l=linux-kernel&m=149750022019663
> > > 
> > > and RCU begins to use swait/wake last year, so I thought this could be
> > > relevant.
> > > 
> > > Could you try the following patch and see if it works? Thanks.
> > > 
> > > Regards,
> > > Boqun
> > > 
> > > ------------------>8
> > > Subject: [PATCH] swait: Remove the lockless swait_active() check in
> > >  swake_up*()
> > > 
> > > Steven Rostedt reported a potential race in RCU core because of
> > > swake_up():
> > > 
> > >         CPU0                            CPU1
> > >         ----                            ----
> > >                                 __call_rcu_core() {
> > > 
> > >                                  spin_lock(rnp_root)
> > >                                  need_wake = __rcu_start_gp() {
> > >                                   rcu_start_gp_advanced() {
> > >                                    gp_flags = FLAG_INIT
> > >                                   }
> > >                                  }
> > > 
> > >  rcu_gp_kthread() {
> > >    swait_event_interruptible(wq,
> > >         gp_flags & FLAG_INIT) {
> > 
> > So the idea is that we get the old value of ->gp_flags here, correct?
> > 

Yes.

> > >    spin_lock(q->lock)
> > > 
> > >                                 *fetch wq->task_list here! *
> > 
> > And the above fetch is really part of the swait_active() called out
> > below, right?
> > 

Right.

> > >    list_add(wq->task_list, q->task_list)
> > >    spin_unlock(q->lock);
> > > 
> > >    *fetch old value of gp_flags here *
> > 
> > And here we fetch the old value of ->gp_flags again, this time under
> > the lock, right?
> > 

Hmm.. a bit different: this time it is still lockless, but *after* the waiter
has enqueued itself. We could rely on the spin_lock(q->lock) above to pair
with a spin_unlock() from another critical section that accesses the wait
queue (typically from some waker). But in the case Steven came up with,
there is a lockless access to the wait queue from the waker, so such a
pair doesn't exist. That ends up with the waker seeing an empty wait
queue and doing nothing, while the waiter still observes the old value
after its enqueue and goes to sleep.

> > >                                  spin_unlock(rnp_root)
> > > 
> > >                                  rcu_gp_kthread_wake() {
> > >                                   swake_up(wq) {
> > >                                    swait_active(wq) {
> > >                                     list_empty(wq->task_list)
> > > 
> > >                                    } * return false *
> > > 
> > >   if (condition) * false *
> > >     schedule();
> > > 
> > > In this case, a wakeup is missed, which could cause the rcu_gp_kthread
> > > waits for a long time.
> > > 
> > > The reason of this is that we do a lockless swait_active() check in
> > > swake_up(). To fix this, we can either 1) add a smp_mb() in swake_up()
> > > before swait_active() to provide the proper order or 2) simply remove
> > > the swait_active() in swake_up().
> > > 
> > > The solution 2 not only fixes this problem but also keeps the swait and
> > > wait API as close as possible, as wake_up() doesn't provide a full
> > > barrier and doesn't do a lockless check of the wait queue either.
> > > Moreover, there are users already using swait_active() to do their quick
> > > checks for the wait queues, so it make less sense that swake_up() and
> > > swake_up_all() do this on their own.
> > > 
> > > This patch then removes the lockless swait_active() check in swake_up()
> > > and swake_up_all().
> > > 
> > > Reported-by: Steven Rostedt <rostedt@goodmis.org>
> > > Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> > 
> > Even though Jonathan's testing indicates that it didn't fix this
> > particular problem:
> > 
> > Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> And while we are at it:
> 
> Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 

Thanks!

Regards,
Boqun

> > > ---
> > >  kernel/sched/swait.c | 6 ------
> > >  1 file changed, 6 deletions(-)
> > > 
> > > diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c
> > > index 3d5610dcce11..2227e183e202 100644
> > > --- a/kernel/sched/swait.c
> > > +++ b/kernel/sched/swait.c
> > > @@ -33,9 +33,6 @@ void swake_up(struct swait_queue_head *q)
> > >  {
> > >  	unsigned long flags;
> > > 
> > > -	if (!swait_active(q))
> > > -		return;
> > > -
> > >  	raw_spin_lock_irqsave(&q->lock, flags);
> > >  	swake_up_locked(q);
> > >  	raw_spin_unlock_irqrestore(&q->lock, flags);
> > > @@ -51,9 +48,6 @@ void swake_up_all(struct swait_queue_head *q)
> > >  	struct swait_queue *curr;
> > >  	LIST_HEAD(tmp);
> > > 
> > > -	if (!swait_active(q))
> > > -		return;
> > > -
> > >  	raw_spin_lock_irq(&q->lock);
> > >  	list_splice_init(&q->task_list, &tmp);
> > >  	while (!list_empty(&tmp)) {
> > > -- 
> > > 2.13.0
> > > 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 241+ messages in thread
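
Boqun's description of the missing pairing can be condensed into a small
store-buffering example.  The sketch below is standalone userspace code
written purely for illustration; "flag" and "queued" are stand-ins for
->gp_flags and the swait queue's task_list, not kernel identifiers, and a
single run will usually not hit the window -- it only shows the outcome
that nothing forbids when the waker's side has no barrier:

	#include <pthread.h>
	#include <stdatomic.h>
	#include <stdio.h>

	static atomic_int flag;    /* stand-in for the wait condition     */
	static atomic_int queued;  /* stand-in for !list_empty(task_list) */

	static int r_waiter;       /* condition value the waiter re-read  */
	static int r_waker;        /* queue state the waker observed      */

	static void *waiter(void *unused)
	{
		/* prepare_to_swait(): enqueue ourselves... */
		atomic_store_explicit(&queued, 1, memory_order_relaxed);
		/* ...then re-check the condition before sleeping. */
		r_waiter = atomic_load_explicit(&flag, memory_order_relaxed);
		return NULL;
	}

	static void *waker(void *unused)
	{
		/* __rcu_start_gp(): make the condition true... */
		atomic_store_explicit(&flag, 1, memory_order_relaxed);
		/* ...then the lockless swait_active() check. */
		r_waker = atomic_load_explicit(&queued, memory_order_relaxed);
		return NULL;
	}

	int main(void)
	{
		pthread_t a, b;

		pthread_create(&a, NULL, waiter, NULL);
		pthread_create(&b, NULL, waker, NULL);
		pthread_join(a, NULL);
		pthread_join(b, NULL);

		/* r_waker == 0 && r_waiter == 0 is the lost wakeup:
		 * the waker skips the wakeup and the waiter sleeps. */
		printf("waker saw queued=%d, waiter saw flag=%d\n",
		       r_waker, r_waiter);
		return 0;
	}

Taking q->lock unconditionally in the waker (the patch above) or putting
a full barrier between the condition store and the queue check both rule
that outcome out.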

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-28 19:09                                                                   ` Paul E. McKenney
  (?)
@ 2017-07-30 13:37                                                                     ` Boqun Feng
  -1 siblings, 0 replies; 241+ messages in thread
From: Boqun Feng @ 2017-07-30 13:37 UTC (permalink / raw)
  To: linux-arm-kernel

[-- Attachment #1: Type: text/plain, Size: 1122 bytes --]

On Fri, Jul 28, 2017 at 12:09:56PM -0700, Paul E. McKenney wrote:
> On Fri, Jul 28, 2017 at 11:41:29AM -0700, Paul E. McKenney wrote:
> > On Fri, Jul 28, 2017 at 07:55:30AM -0700, Paul E. McKenney wrote:
> > > On Fri, Jul 28, 2017 at 08:54:16PM +0800, Boqun Feng wrote:
> 
> [ . . . ]
> 
> > > Even though Jonathan's testing indicates that it didn't fix this
> > > particular problem:
> > > 
> > > Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > 
> > And while we are at it:
> > 
> > Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> Not because it fixed the TREE01 issue -- it did not.  But as near
> as I can see, it didn't cause any additional issues.
> 

Understood.

I'm still working on waketorture to build a test case that can trigger
this problem in the real world. My original plan was to send this patch
out once waketorture could show that it actually resolves some potential
bugs, but I have put it out here early in case it may help.

I will send it out with your Tested-by and Acked-by and continue to work
on waketorture.

Regards,
Boqun

> 							Thanx, Paul
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-30 13:37                                                                     ` Boqun Feng
  (?)
@ 2017-07-30 16:59                                                                       ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-30 16:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Jul 30, 2017 at 09:37:47PM +0800, Boqun Feng wrote:
> On Fri, Jul 28, 2017 at 12:09:56PM -0700, Paul E. McKenney wrote:
> > On Fri, Jul 28, 2017 at 11:41:29AM -0700, Paul E. McKenney wrote:
> > > On Fri, Jul 28, 2017 at 07:55:30AM -0700, Paul E. McKenney wrote:
> > > > On Fri, Jul 28, 2017 at 08:54:16PM +0800, Boqun Feng wrote:
> > 
> > [ . . . ]
> > 
> > > > Even though Jonathan's testing indicates that it didn't fix this
> > > > particular problem:
> > > > 
> > > > Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > 
> > > And while we are at it:
> > > 
> > > Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > 
> > Not because it fixed the TREE01 issue -- it did not.  But as near
> > as I can see, it didn't cause any additional issues.
> > 
> 
> Understood.
> 
> I'm still working on waketorture to build a test case that can trigger
> this problem in the real world. My original plan was to send this patch
> out once waketorture could show that it actually resolves some potential
> bugs, but I have put it out here early in case it may help.
> 
> I will send it out with your Tested-by and Acked-by and continue to work
> on waketorture.

Sounds good!

Given that Jonathan's traces didn't show a timer expiration, the problem
might be in the timers.

							Thanx, Paul
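
On the missing expiration: in the timer core, each expiring timer's
callback is bracketed by the timer_expire_entry / timer_expire_exit
tracepoints, so their absence for the rcu_preempt timer means the expiry
path never ran it -- or the events were simply lost.  A paraphrased
sketch of that dispatch step (from memory of the 4.13-era code, not a
verbatim copy):

	/* Each callback invocation is wrapped by the expiry tracepoints. */
	static void call_timer_fn(struct timer_list *timer,
				  void (*fn)(unsigned long), unsigned long data)
	{
		trace_timer_expire_entry(timer);
		fn(data);
		trace_timer_expire_exit(timer);
	}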


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-28 19:03                                                                   ` Paul E. McKenney
  (?)
@ 2017-07-31 11:08                                                                     ` Jonathan Cameron
  -1 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-31 11:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 28 Jul 2017 12:03:50 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:
> > On Fri, 28 Jul 2017 09:55:29 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:  
> > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:    
> > > 
> > > [ . . . ]
> > >   
> > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > > > fewer timer events.  The issue became much easier to trigger (on some runs before
> > > > I could get tracing up and running).
> > > > 
> > > > The logs are large enough that pastebin doesn't like them - please shout if
> > > > another timer period is of interest.
> > > > 
> > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > 
> > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > > detected around 835.
> > > > 
> > > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > > may well be one in there.    
> > > 
> > > The dmesg says:
> > > 
> > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > 
> > > So I look for "rcu_preempt" timer events and find these:
> > > 
> > > rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> > > rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout 
> > > 
> > > Next look for "ffff8017d5fc7da0" and I don't find anything else.  
> > It does show up off the bottom of what would fit in pastebin...
> > 
> >      rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
> >      rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
> >      rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=  
> 
> Odd.  I would expect an expiration...  And ten seconds is way longer
> than the requested one jiffy!
> 
> > > The timeout was one jiffy, and more than a second later, no expiration.
> > > Is it possible that this event was lost?  I am not seeing any sign of
> > > this in the trace.
> > > 
> > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > any case).
> > > 
> > > The last time we saw something like this it was a timer HW/driver problem,
> > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > and SPARC.  ;-)  
> > Could be different issues, both of which were hidden by that lockup detector.
> > 
> > There is an errata work around for the timers on this particular board.
> > I'm only vaguely aware of it, so may be unconnected.
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > 
> > Seems unlikely though! + we've not yet seen it on the other chips that
> > the errata affects (not that that means much).  
> 
> If you can reproduce quickly, might be worth trying anyway...
> 
> 							Thanx, Paul
The errata fix was already running for all of those tests.

I'll have a dig into the timers today and see where I get to.

Jonathan
> 
> > Jonathan
> >   
> > > 
> > > Thomas, any debugging suggestions?
> > > 
> > > 							Thanx, Paul
> > >   
> >   
> 
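
For context on the trace lines above: the timer_start events with
function=process_timeout come from schedule_timeout(), which the
rcu_preempt grace-period kthread reaches through its one-jiffy FQS wait.
A paraphrased sketch of that path (from memory of the 4.13-era code, not
a verbatim copy):

	/* The on-stack timer's callback just wakes the sleeping task. */
	static void process_timeout(unsigned long data)
	{
		wake_up_process((struct task_struct *)data);
	}

	signed long schedule_timeout(signed long timeout)
	{
		struct timer_list timer;
		unsigned long expire = timeout + jiffies;  /* timeout=1 jiffy here */

		setup_timer_on_stack(&timer, process_timeout,
				     (unsigned long)current);	/* emits timer_init */
		mod_timer(&timer, expire);	/* emits timer_start */
		/* caller has already set the task state, e.g. TASK_INTERRUPTIBLE */
		schedule();			/* sleep until woken */
		del_timer_sync(&timer);		/* emits timer_cancel */
		destroy_timer_on_stack(&timer);

		return expire - jiffies;	/* > 0 if woken early */
	}

So if the timer never fires and nothing else wakes the task, the kthread
sleeps indefinitely, which matches the "rcu_preempt kthread starved"
message in the dmesg.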


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-26 23:15                                           ` Paul E. McKenney
  (?)
@ 2017-07-31 11:09                                             ` Jonathan Cameron
  -1 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-31 11:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 26 Jul 2017 16:15:05 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Wed, Jul 26, 2017 at 03:45:40PM -0700, David Miller wrote:
> > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > Date: Wed, 26 Jul 2017 15:36:58 -0700
> >   
> > > And without CONFIG_SOFTLOCKUP_DETECTOR, I see five runs of 24 with RCU
> > > CPU stall warnings.  So it seems likely that CONFIG_SOFTLOCKUP_DETECTOR
> > > really is having an effect.  
> > 
> > Thanks for all of the info Paul, I'll digest this and scan over the
> > code myself.
> > 
> > Just out of curiosity, what x86 idle method is your machine using?
> > The mwait one or the one which simply uses 'halt'?  The mwait variant
> > might mask this bug, and halt would be a lot closer to how sparc64 and
> > Jonathan's system operates.  
> 
> My kernel builds with CONFIG_INTEL_IDLE=n, which I believe means that
> I am not using the mwait one.  Here is a grep for IDLE in my .config:
> 
> 	CONFIG_NO_HZ_IDLE=y
> 	CONFIG_GENERIC_SMP_IDLE_THREAD=y
> 	# CONFIG_IDLE_PAGE_TRACKING is not set
> 	CONFIG_ACPI_PROCESSOR_IDLE=y
> 	CONFIG_CPU_IDLE=y
> 	# CONFIG_CPU_IDLE_GOV_LADDER is not set
> 	CONFIG_CPU_IDLE_GOV_MENU=y
> 	# CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
> 	# CONFIG_INTEL_IDLE is not set
> 
> > On sparc64 the cpu yield we do in the idle loop sleeps the cpu.  Its
> > local TICK register keeps advancing, and the local timer therefore
> > will still trigger.  Also, any externally generated interrupts
> > (including cross calls) will wake up the cpu as well.
> > 
> > The tick-sched code is really tricky wrt. NO_HZ even in the NO_HZ_IDLE
> > case.  One of my running theories is that we miss scheduling a tick
> > due to a race.  That would be consistent with the behavior we see
> > in the RCU dumps, I think.  
> 
> But wouldn't you have to miss a -lot- of ticks to get an RCU CPU stall
> warning?  By default, your grace period needs to extend for more than
> 21 seconds (more than one-third of a -minute-) to get one.  Or do
> you mean that the ticks get shut off now and forever, as opposed to
> just losing one of them?
> 
> > Anyways, just a theory, and that's why I keep mentioning that commit
> > about the revert of the revert (specifically
> > 411fe24e6b7c283c3a1911450cdba6dd3aaea56e).
> > 
> > :-)  
> 
> I am running an overnight test in preparation for attempting to push
> some fixes for regressions into 4.12, but will try reverting this
> and enabling CONFIG_HZ_PERIODIC tomorrow.
> 
> Jonathan, might the commit that Dave points out above be what reduces
> the probability of occurrence as you test older releases?
I just got around to trying this out of curiosity.  Superficially it did
appear to make the issue somewhat harder to hit (it took over 30 minutes),
but otherwise the issue looks much the same with or without that patch.

Next on my list is to disable hrtimers entirely and see what happens.

Jonathan
> 
> 							Thanx, Paul
> 
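
On the idle-method question above: "the one which simply uses 'halt'" is
the x86 default_idle() path, as opposed to the mwait-based intel_idle
driver that CONFIG_INTEL_IDLE=n leaves out.  A stripped-down sketch
(illustrative, not verbatim kernel code):

	/*
	 * The "halt" idle method: enable interrupts and halt until the
	 * next one arrives, so waking up depends entirely on a timer
	 * interrupt or IPI actually being delivered.
	 */
	void default_idle(void)
	{
		safe_halt();	/* sti; hlt */
	}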


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-31 11:09                                             ` Jonathan Cameron
  (?)
@ 2017-07-31 11:55                                               ` Jonathan Cameron
  -1 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-31 11:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 31 Jul 2017 12:09:08 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Wed, 26 Jul 2017 16:15:05 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Wed, Jul 26, 2017 at 03:45:40PM -0700, David Miller wrote:  
> > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > Date: Wed, 26 Jul 2017 15:36:58 -0700
> > >     
> > > > And without CONFIG_SOFTLOCKUP_DETECTOR, I see five runs of 24 with RCU
> > > > CPU stall warnings.  So it seems likely that CONFIG_SOFTLOCKUP_DETECTOR
> > > > really is having an effect.    
> > > 
> > > Thanks for all of the info Paul, I'll digest this and scan over the
> > > code myself.
> > > 
> > > Just out of curiosity, what x86 idle method is your machine using?
> > > The mwait one or the one which simply uses 'halt'?  The mwait variant
> > > might mask this bug, and halt would be a lot closer to how sparc64 and
> > > Jonathan's system operates.    
> > 
> > My kernel builds with CONFIG_INTEL_IDLE=n, which I believe means that
> > I am not using the mwait one.  Here is a grep for IDLE in my .config:
> > 
> > 	CONFIG_NO_HZ_IDLE=y
> > 	CONFIG_GENERIC_SMP_IDLE_THREAD=y
> > 	# CONFIG_IDLE_PAGE_TRACKING is not set
> > 	CONFIG_ACPI_PROCESSOR_IDLE=y
> > 	CONFIG_CPU_IDLE=y
> > 	# CONFIG_CPU_IDLE_GOV_LADDER is not set
> > 	CONFIG_CPU_IDLE_GOV_MENU=y
> > 	# CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
> > 	# CONFIG_INTEL_IDLE is not set
> >   
> > > On sparc64 the cpu yield we do in the idle loop sleeps the cpu.  Its
> > > local TICK register keeps advancing, and the local timer therefore
> > > will still trigger.  Also, any externally generated interrupts
> > > (including cross calls) will wake up the cpu as well.
> > > 
> > > The tick-sched code is really tricky wrt. NO_HZ even in the NO_HZ_IDLE
> > > case.  One of my running theories is that we miss scheduling a tick
> > > due to a race.  That would be consistent with the behavior we see
> > > in the RCU dumps, I think.    
> > 
> > But wouldn't you have to miss a -lot- of ticks to get an RCU CPU stall
> > warning?  By default, your grace period needs to extend for more than
> > 21 seconds (more than one-third of a -minute-) to get one.  Or do
> > you mean that the ticks get shut off now and forever, as opposed to
> > just losing one of them?
> >   
> > > Anyways, just a theory, and that's why I keep mentioning that commit
> > > about the revert of the revert (specifically
> > > 411fe24e6b7c283c3a1911450cdba6dd3aaea56e).
> > > 
> > > :-)    
> > 
> > I am running an overnight test in preparation for attempting to push
> > some fixes for regressions into 4.12, but will try reverting this
> > and enabling CONFIG_HZ_PERIODIC tomorrow.
> > 
> > Jonathan, might the commit that Dave points out above be what reduces
> > the probability of occurrence as you test older releases?  
> I just got around to trying this out of curiosity.  Superficially it did
> appear to make the issue somewhat harder to hit (it took over 30 minutes),
> but otherwise the issue looks much the same with or without that patch.
> 
> Next on my list is to disable hrtimers entirely and see what happens.
> 
> Jonathan
> > 
> > 							Thanx, Paul
> >   
> 
> _______________________________________________
> linuxarm mailing list
> linuxarm@huawei.com
> http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-31 11:08                                                                     ` Jonathan Cameron
  (?)
@ 2017-07-31 15:04                                                                       ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-31 15:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:
> On Fri, 28 Jul 2017 12:03:50 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:
> > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >   
> > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:  
> > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:    
> > > > 
> > > > [ . . . ]
> > > >   
> > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > > > > fewer timer events.  The issue became much easier to trigger (on some runs before
> > > > > I could get tracing up and running).
> > > > > 
> > > > > The logs are large enough that pastebin doesn't like them - please shout if
> > > > > another timer period is of interest.
> > > > > 
> > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > 
> > > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > > > detected around 835.
> > > > > 
> > > > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > > > may well be one in there.    
> > > > 
> > > > The dmesg says:
> > > > 
> > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > 
> > > > So I look for "rcu_preempt" timer events and find these:
> > > > 
> > > > rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> > > > rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout 
> > > > 
> > > > Next look for "ffff8017d5fc7da0" and I don't find anything else.  
> > > It does show up off the bottom of what would fit in pastebin...
> > > 
> > >      rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
> > >      rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
> > >      rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=  
> > 
> > Odd.  I would expect an expiration...  And ten seconds is way longer
> > than the requested one jiffy!
> > 
> > > > The timeout was one jiffy, and more than a second later, no expiration.
> > > > Is it possible that this event was lost?  I am not seeing any sign of
> > > > this in the trace.
> > > > 
> > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > any case).
> > > > 
> > > > The last time we saw something like this it was a timer HW/driver problem,
> > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > and SPARC.  ;-)  
> > > Could be different issues, both of which were hidden by that lockup detector.
> > > 
> > > There is an errata work around for the timers on this particular board.
> > > I'm only vaguely aware of it, so may be unconnected.
> > > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > 
> > > Seems unlikely though! + we've not yet seen it on the other chips that
> > > errata affects (not that that means much).  
> > 
> > If you can reproduce quickly, might be worth trying anyway...
> > 
> > 							Thanx, Paul
> Errata fix is running already and was for all those tests.

I was afraid of that...  ;-)

> I'll have a dig into the timers today and see where I get to.

Look forward to seeing what you find!

							Thanx, Paul

> Jonathan
> > 
> > > Jonathan
> > >   
> > > > 
> > > > Thomas, any debugging suggestions?
> > > > 
> > > > 							Thanx, Paul
> > > >   
> > >   
> > 
> 


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-31 15:04                                                                       ` Paul E. McKenney
  0 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-31 15:04 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: dzickus, sfr, linuxarm, Nicholas Piggin, abdhalee, sparclinux,
	akpm, linuxppc-dev, David Miller, linux-arm-kernel, tglx

On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:
> On Fri, 28 Jul 2017 12:03:50 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:
> > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >   
> > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:  
> > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:    
> > > > 
> > > > [ . . . ]
> > > >   
> > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > > > > fewer timer events.  Issue became much easier to trigger (on some runs before
> > > > > I could get tracing up and running).
> > > > > 
> > > > > So logs are large enough that pastebin doesn't like them - please shout if
> > > > > another timer period is of interest.
> > > > > 
> > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > 
> > > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > > > detected around 835.
> > > > > 
> > > > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > > > may well be one in there.    
> > > > 
> > > > The dmesg says:
> > > > 
> > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > 
> > > > So I look for "rcu_preempt" timer events and find these:
> > > > 
> > > > rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> > > > rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout 
> > > > 
> > > > Next look for "ffff8017d5fc7da0" and I don't find anything else.  
> > > It does show up off the bottom of what would fit in pastebin...
> > > 
> > >      rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
> > >      rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
> > >      rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=  
> > 
> > Odd.  I would expect an expiration...  And ten seconds is way longer
> > than the requested one jiffy!
> > 
> > > > The timeout was one jiffy, and more than a second later, no expiration.
> > > > Is it possible that this event was lost?  I am not seeing any sign of
> > > > this in the trace.
> > > > 
> > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > any case).
> > > > 
> > > > The last time we saw something like this it was a timer HW/driver problem,
> > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > and SPARC.  ;-)  
> > > Could be different issues, both of which were hidden by that lockup detector.
> > > 
> > > There is an errata work around for the timers on this particular board.
> > > I'm only vaguely aware of it, so may be unconnected.
> > > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > 
> > > Seems unlikely though! + we've not yet seen it on the other chips that
> > > errata affects (not that that means much).  
> > 
> > If you can reproduce quickly, might be worth trying anyway...
> > 
> > 							Thanx, Paul
> Errata fix is running already and was for all those tests.

I was afraid of that...  ;-)

> I'll have a dig into the timers today and see where I get to.

Look forward to seeing what you find!

							Thanx, Paul

> Jonathan
> > 
> > > Jonathan
> > >   
> > > > 
> > > > Thomas, any debugging suggestions?
> > > > 
> > > > 							Thanx, Paul
> > > >   
> > >   
> > 
> 

^ permalink raw reply	[flat|nested] 241+ messages in thread

* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-31 15:04                                                                       ` Paul E. McKenney
  0 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-07-31 15:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:
> On Fri, 28 Jul 2017 12:03:50 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:
> > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >   
> > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:  
> > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:    
> > > > 
> > > > [ . . . ]
> > > >   
> > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > > > > fewer timer events.  Issue became much easier to trigger (on some runs before
> > > > > I could get tracing up and running).
> > > > > 
> > > > > So logs are large enough that pastebin doesn't like them - please shout if
> > > > > another timer period is of interest.
> > > > > 
> > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > 
> > > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > > > detected around 835.
> > > > > 
> > > > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > > > may well be one in there.    
> > > > 
> > > > The dmesg says:
> > > > 
> > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > 
> > > > So I look for "rcu_preempt" timer events and find these:
> > > > 
> > > > rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> > > > rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout 
> > > > 
> > > > Next look for "ffff8017d5fc7da0" and I don't find anything else.  
> > > It does show up off the bottom of what would fit in pastebin...
> > > 
> > >      rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
> > >      rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
> > >      rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=  
> > 
> > Odd.  I would expect an expiration...  And ten seconds is way longer
> > than the requested one jiffy!
> > 
> > > > The timeout was one jiffy, and more than a second later, no expiration.
> > > > Is it possible that this event was lost?  I am not seeing any sign of
> > > > this in the trace.
> > > > 
> > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > any case).
> > > > 
> > > > The last time we saw something like this it was a timer HW/driver problem,
> > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > and SPARC.  ;-)  
> > > Could be different issues, both of which were hidden by that lockup detector.
> > > 
> > > There is an errata work around for the timers on this particular board.
> > > I'm only vaguely aware of it, so may be unconnected.
> > > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > 
> > > Seems unlikely though! + we've not yet seen it on the other chips that
> > > errata affects (not that that means much).  
> > 
> > If you can reproduce quickly, might be worth trying anyway...
> > 
> > 							Thanx, Paul
> Errata fix is running already and was for all those tests.

I was afraid of that...  ;-)

> I'll have a dig into the timers today and see where I get to.

Look forward to seeing what you find!

							Thanx, Paul

> Jonathan
> > 
> > > Jonathan
> > >   
> > > > 
> > > > Thomas, any debugging suggestions?
> > > > 
> > > > 							Thanx, Paul
> > > >   
> > >   
> > 
> 

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-31 15:04                                                                       ` Paul E. McKenney
  (?)
@ 2017-07-31 15:27                                                                         ` Jonathan Cameron
  -1 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-31 15:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 31 Jul 2017 08:04:11 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:
> > On Fri, 28 Jul 2017 12:03:50 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:  
> > > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > >     
> > > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:    
> > > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:      
> > > > > 
> > > > > [ . . . ]
> > > > >     
> > > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > > > > > fewer timer events.  Issue became much easier to trigger (on some runs before
> > > > > > I could get tracing up and running).
> > > > > > 
> > > > > > So logs are large enough that pastebin doesn't like them - please shout if
> > > > > > another timer period is of interest.
> > > > > > 
> > > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > > 
> > > > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > > > > detected around 835.
> > > > > > 
> > > > > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > > > > may well be one in there.      
> > > > > 
> > > > > The dmesg says:
> > > > > 
> > > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > 
> > > > > So I look for "rcu_preempt" timer events and find these:
> > > > > 
> > > > > rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> > > > > rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout 
> > > > > 
> > > > > Next look for "ffff8017d5fc7da0" and I don't find anything else.    
> > > > It does show up off the bottom of what would fit in pastebin...
> > > > 
> > > >      rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
> > > >      rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
> > > >      rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=    
> > > 
> > > Odd.  I would expect an expiration...  And ten seconds is way longer
> > > than the requested one jiffy!
> > >   
> > > > > The timeout was one jiffy, and more than a second later, no expiration.
> > > > > Is it possible that this event was lost?  I am not seeing any sign of
> > > > > this in the trace.
> > > > > 
> > > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > > any case).
> > > > > 
> > > > > The last time we saw something like this it was a timer HW/driver problem,
> > > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > > and SPARC.  ;-)    
> > > > Could be different issues, both of which were hidden by that lockup detector.
> > > > 
> > > > There is an errata work around for the timers on this particular board.
> > > > I'm only vaguely aware of it, so may be unconnected.
> > > > 
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > > 
> > > > Seems unlikely though! + we've not yet seen it on the other chips that
> > > > errata affects (not that that means much).    
> > > 
> > > If you can reproduce quickly, might be worth trying anyway...
> > > 
> > > 							Thanx, Paul  
> > Errata fix is running already and was for all those tests.  
> 
> I was afraid of that...  ;-)
It's a pretty rare errata it seems.  Not actually managed to catch
one yet. 
> 
> > I'll have a dig into the timers today and see where I get to.  
> 
> Look forward to seeing what you find!
Nothing obvious turning up other than we don't seem to have the issue
when we aren't running hrtimers.

On the plus side, I just got a report that it is affecting our d03
boards, which is good given that I couldn't tell what the difference
could be wrt this issue!

It indeed looks like we are consistently missing a timer before
the rcu splat occurs.

J
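
For anyone lining the trace up with the code: the timer_init/timer_start
pair quoted above (function=process_timeout, [timeout=1]) is the on-stack
timer that schedule_timeout() arms for the rcu_preempt kthread's one-jiffy
sleep.  Roughly, paraphrasing kernel/time/timer.c from around v4.13 (error
handling and the MAX_SCHEDULE_TIMEOUT case trimmed, so treat this as a
sketch rather than the exact code):

	static void process_timeout(unsigned long __data)
	{
		wake_up_process((struct task_struct *)__data);
	}

	signed long schedule_timeout(signed long timeout)
	{
		struct timer_list timer;
		unsigned long expire = timeout + jiffies;

		/* Caller has already set TASK_INTERRUPTIBLE (->state=0x1). */
		setup_timer_on_stack(&timer, process_timeout,
				     (unsigned long)current);
		mod_timer(&timer, expire);	/* -> timer_start event */
		schedule();			/* sleep until woken */
		/* Emits timer_cancel if the timer is still pending: */
		del_singleshot_timer_sync(&timer);
		destroy_timer_on_stack(&timer);

		timeout = expire - jiffies;
		return timeout < 0 ? 0 : timeout;
	}

If that expiry is lost, wake_up_process() never runs, the kthread stays in
TASK_INTERRUPTIBLE, and the grace period sits still until some unrelated
wakeup comes along - which would fit the ten-second gap between the
timer_start at 827.58 and the timer_cancel at 837.68 quoted above.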
> 
> 							Thanx, Paul
> 
> > Jonathan  
> > >   
> > > > Jonathan
> > > >     
> > > > > 
> > > > > Thomas, any debugging suggestions?
> > > > > 
> > > > > 							Thanx, Paul
> > > > >     
> > > >     
> > >   
> >   
> 


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-31 15:27                                                                         ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-31 15:27 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: dzickus, sfr, linuxarm, Nicholas Piggin, abdhalee, sparclinux,
	akpm, linuxppc-dev, David Miller, linux-arm-kernel, tglx

On Mon, 31 Jul 2017 08:04:11 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:
> > On Fri, 28 Jul 2017 12:03:50 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:  
> > > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > >     
> > > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:    
> > > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:      
> > > > > 
> > > > > [ . . . ]
> > > > >     
> > > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > > > > > fewer timer events.  Issue became much easier to trigger (on some runs before
> > > > > > I could get tracing up and running).
> > > > > > 
> > > > > > So logs are large enough that pastebin doesn't like them - please shout if
> > > > > > another timer period is of interest.
> > > > > > 
> > > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > > 
> > > > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > > > > detected around 835.
> > > > > > 
> > > > > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > > > > may well be one in there.      
> > > > > 
> > > > > The dmesg says:
> > > > > 
> > > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > 
> > > > > So I look for "rcu_preempt" timer events and find these:
> > > > > 
> > > > > rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> > > > > rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout 
> > > > > 
> > > > > Next look for "ffff8017d5fc7da0" and I don't find anything else.    
> > > > It does show up off the bottom of what would fit in pastebin...
> > > > 
> > > >      rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
> > > >      rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
> > > >      rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=    
> > > 
> > > Odd.  I would expect an expiration...  And ten seconds is way longer
> > > than the requested one jiffy!
> > >   
> > > > > The timeout was one jiffy, and more than a second later, no expiration.
> > > > > Is it possible that this event was lost?  I am not seeing any sign of
> > > > > this in the trace.
> > > > > 
> > > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > > any case).
> > > > > 
> > > > > The last time we saw something like this it was a timer HW/driver problem,
> > > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > > and SPARC.  ;-)    
> > > > Could be different issues, both of which were hidden by that lockup detector.
> > > > 
> > > > There is an errata work around for the timers on this particular board.
> > > > I'm only vaguely aware of it, so may be unconnected.
> > > > 
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > > 
> > > > Seems unlikely though! + we've not yet seen it on the other chips that
> > > > errata affects (not that that means much).    
> > > 
> > > If you can reproduce quickly, might be worth trying anyway...
> > > 
> > > 							Thanx, Paul  
> > Errata fix is running already and was for all those tests.  
> 
> I was afraid of that...  ;-)
It's a pretty rare errata it seems.  Not actually managed to catch
one yet. 
> 
> > I'll have a dig into the timers today and see where I get to.  
> 
> Look forward to seeing what you find!
Nothing obvious turning up other than we don't seem to have the issue
when we aren't running hrtimers.

On the plus side, I just got a report that it is affecting our d03
boards, which is good given that I couldn't tell what the difference
could be wrt this issue!

It indeed looks like we are consistently missing a timer before
the rcu splat occurs.

J
> 
> 							Thanx, Paul
> 
> > Jonathan  
> > >   
> > > > Jonathan
> > > >     
> > > > > 
> > > > > Thomas, any debugging suggestions?
> > > > > 
> > > > > 							Thanx, Paul
> > > > >     
> > > >     
> > >   
> >   
> 

^ permalink raw reply	[flat|nested] 241+ messages in thread

* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-07-31 15:27                                                                         ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-07-31 15:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 31 Jul 2017 08:04:11 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:
> > On Fri, 28 Jul 2017 12:03:50 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:  
> > > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > >     
> > > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:    
> > > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:      
> > > > > 
> > > > > [ . . . ]
> > > > >     
> > > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > > > > > fewer timer events.  Issue became much easier to trigger (on some runs before
> > > > > > I could get tracing up and running).
> > > > > > 
> > > > > > So logs are large enough that pastebin doesn't like them - please shout if
> > > > > > another timer period is of interest.
> > > > > > 
> > > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > > 
> > > > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > > > > detected around 835.
> > > > > > 
> > > > > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > > > > may well be one in there.      
> > > > > 
> > > > > The dmesg says:
> > > > > 
> > > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > 
> > > > > So I look for "rcu_preempt" timer events and find these:
> > > > > 
> > > > > rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> > > > > rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout 
> > > > > 
> > > > > Next look for "ffff8017d5fc7da0" and I don't find anything else.    
> > > > It does show up off the bottom of what would fit in pastebin...
> > > > 
> > > >      rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
> > > >      rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
> > > >      rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=    
> > > 
> > > Odd.  I would expect an expiration...  And ten seconds is way longer
> > > than the requested one jiffy!
> > >   
> > > > > The timeout was one jiffy, and more than a second later, no expiration.
> > > > > Is it possible that this event was lost?  I am not seeing any sign of
> > > > > this in the trace.
> > > > > 
> > > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > > any case).
> > > > > 
> > > > > The last time we saw something like this it was a timer HW/driver problem,
> > > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > > and SPARC.  ;-)    
> > > > Could be different issues, both of which were hidden by that lockup detector.
> > > > 
> > > > There is an errata work around for the timers on this particular board.
> > > > I'm only vaguely aware of it, so may be unconnected.
> > > > 
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > > 
> > > > Seems unlikely though! + we've not yet seen it on the other chips that
> > > > errata affects (not that that means much).    
> > > 
> > > If you can reproduce quickly, might be worth trying anyway...
> > > 
> > > 							Thanx, Paul  
> > Errata fix is running already and was for all those tests.  
> 
> I was afraid of that...  ;-)
It's a pretty rare errata it seems.  Not actually managed to catch
one yet. 
> 
> > I'll have a dig into the timers today and see where I get to.  
> 
> Look forward to seeing what you find!
Nothing obvious turning up other than we don't seem to have the issue
when we aren't running hrtimers.

On the plus side, I just got a report that it is affecting our d03
boards, which is good given that I couldn't tell what the difference
could be wrt this issue!

It indeed looks like we are consistently missing a timer before
the rcu splat occurs.

J
> 
> 							Thanx, Paul
> 
> > Jonathan  
> > >   
> > > > Jonathan
> > > >     
> > > > > 
> > > > > Thomas, any debugging suggestions?
> > > > > 
> > > > > 							Thanx, Paul
> > > > >     
> > > >     
> > >   
> >   
> 

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-31 11:55                                               ` Jonathan Cameron
  (?)
@ 2017-08-01 10:53                                                 ` Jonathan Cameron
  -1 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-08-01 10:53 UTC (permalink / raw)
  To: linux-arm-kernel


Sorry - accidental send.  No content!

Jonathan

On Mon, 31 Jul 2017 12:55:48 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Mon, 31 Jul 2017 12:09:08 +0100
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> 
> > On Wed, 26 Jul 2017 16:15:05 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > On Wed, Jul 26, 2017 at 03:45:40PM -0700, David Miller wrote:    
> > > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > > Date: Wed, 26 Jul 2017 15:36:58 -0700
> > > >       
> > > > > And without CONFIG_SOFTLOCKUP_DETECTOR, I see five runs of 24 with RCU
> > > > > CPU stall warnings.  So it seems likely that CONFIG_SOFTLOCKUP_DETECTOR
> > > > > really is having an effect.      
> > > > 
> > > > Thanks for all of the info Paul, I'll digest this and scan over the
> > > > code myself.
> > > > 
> > > > Just out of curiosity, what x86 idle method is your machine using?
> > > > The mwait one or the one which simply uses 'halt'?  The mwait variant
> > > > might mask this bug, and halt would be a lot closer to how sparc64 and
> > > > Jonathan's system operates.      
> > > 
> > > My kernel builds with CONFIG_INTEL_IDLE=n, which I believe means that
> > > I am not using the mwait one.  Here is a grep for IDLE in my .config:
> > > 
> > > 	CONFIG_NO_HZ_IDLE=y
> > > 	CONFIG_GENERIC_SMP_IDLE_THREAD=y
> > > 	# CONFIG_IDLE_PAGE_TRACKING is not set
> > > 	CONFIG_ACPI_PROCESSOR_IDLE=y
> > > 	CONFIG_CPU_IDLE=y
> > > 	# CONFIG_CPU_IDLE_GOV_LADDER is not set
> > > 	CONFIG_CPU_IDLE_GOV_MENU=y
> > > 	# CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
> > > 	# CONFIG_INTEL_IDLE is not set
> > >     
> > > > On sparc64 the cpu yield we do in the idle loop sleeps the cpu.  Its
> > > > local TICK register keeps advancing, and the local timer therefore
> > > > will still trigger.  Also, any externally generated interrupts
> > > > (including cross calls) will wake up the cpu as well.
> > > > 
> > > > The tick-sched code is really tricky wrt. NO_HZ even in the NO_HZ_IDLE
> > > > case.  One of my running theories is that we miss scheduling a tick
> > > > due to a race.  That would be consistent with the behavior we see
> > > > in the RCU dumps, I think.      
> > > 
> > > But wouldn't you have to miss a -lot- of ticks to get an RCU CPU stall
> > > warning?  By default, your grace period needs to extend for more than
> > > 21 seconds (more than one-third of a -minute-) to get one.  Or do
> > > you mean that the ticks get shut off now and forever, as opposed to
> > > just losing one of them?
> > >     
> > > > Anyways, just a theory, and that's why I keep mentioning that commit
> > > > about the revert of the revert (specifically
> > > > 411fe24e6b7c283c3a1911450cdba6dd3aaea56e).
> > > > 
> > > > :-)      
> > > 
> > > I am running an overnight test in preparation for attempting to push
> > > some fixes for regressions into 4.12, but will try reverting this
> > > and enabling CONFIG_HZ_PERIODIC tomorrow.
> > > 
> > > Jonathan, might the commit that Dave points out above be what reduces
> > > the probability of occurrence as you test older releases?    
> > I just got around to trying this out of curiosity.  Superficially it did
> > appear to possibly make the issue harder to hit (it took over 30 minutes),
> > but the issue otherwise looks much the same with or without that patch.
> > 
> > Just out of curiosity, next thing on my list is to disable hrtimers entirely
> > and see what happens.
> > 
> > Jonathan  
> > > 
> > > 							Thanx, Paul
> > >     
> > 
> > _______________________________________________
> > linuxarm mailing list
> > linuxarm@huawei.com
> > http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm  
> 
> _______________________________________________
> linuxarm mailing list
> linuxarm@huawei.com
> http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-01 10:53                                                 ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-08-01 10:53 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: dzickus, sfr, linuxarm, npiggin, abdhalee, sparclinux, akpm,
	linuxppc-dev, David Miller, linux-arm-kernel


Sorry - accidental send.  No content!

Jonathan

On Mon, 31 Jul 2017 12:55:48 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Mon, 31 Jul 2017 12:09:08 +0100
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> 
> > On Wed, 26 Jul 2017 16:15:05 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > On Wed, Jul 26, 2017 at 03:45:40PM -0700, David Miller wrote:    
> > > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > > Date: Wed, 26 Jul 2017 15:36:58 -0700
> > > >       
> > > > > And without CONFIG_SOFTLOCKUP_DETECTOR, I see five runs of 24 with RCU
> > > > > CPU stall warnings.  So it seems likely that CONFIG_SOFTLOCKUP_DETECTOR
> > > > > really is having an effect.      
> > > > 
> > > > Thanks for all of the info Paul, I'll digest this and scan over the
> > > > code myself.
> > > > 
> > > > Just out of curiosity, what x86 idle method is your machine using?
> > > > The mwait one or the one which simply uses 'halt'?  The mwait variant
> > > > might mask this bug, and halt would be a lot closer to how sparc64 and
> > > > Jonathan's system operates.      
> > > 
> > > My kernel builds with CONFIG_INTEL_IDLE=n, which I believe means that
> > > I am not using the mwait one.  Here is a grep for IDLE in my .config:
> > > 
> > > 	CONFIG_NO_HZ_IDLE=y
> > > 	CONFIG_GENERIC_SMP_IDLE_THREAD=y
> > > 	# CONFIG_IDLE_PAGE_TRACKING is not set
> > > 	CONFIG_ACPI_PROCESSOR_IDLE=y
> > > 	CONFIG_CPU_IDLE=y
> > > 	# CONFIG_CPU_IDLE_GOV_LADDER is not set
> > > 	CONFIG_CPU_IDLE_GOV_MENU=y
> > > 	# CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
> > > 	# CONFIG_INTEL_IDLE is not set
> > >     
> > > > On sparc64 the cpu yield we do in the idle loop sleeps the cpu.  Its
> > > > local TICK register keeps advancing, and the local timer therefore
> > > > will still trigger.  Also, any externally generated interrupts
> > > > (including cross calls) will wake up the cpu as well.
> > > > 
> > > > The tick-sched code is really tricky wrt. NO_HZ even in the NO_HZ_IDLE
> > > > case.  One of my running theories is that we miss scheduling a tick
> > > > due to a race.  That would be consistent with the behavior we see
> > > > in the RCU dumps, I think.      
> > > 
> > > But wouldn't you have to miss a -lot- of ticks to get an RCU CPU stall
> > > warning?  By default, your grace period needs to extend for more than
> > > 21 seconds (more than one-third of a -minute-) to get one.  Or do
> > > you mean that the ticks get shut off now and forever, as opposed to
> > > just losing one of them?
> > >     
> > > > Anyways, just a theory, and that's why I keep mentioning that commit
> > > > about the revert of the revert (specifically
> > > > 411fe24e6b7c283c3a1911450cdba6dd3aaea56e).
> > > > 
> > > > :-)      
> > > 
> > > I am running an overnight test in preparation for attempting to push
> > > some fixes for regressions into 4.12, but will try reverting this
> > > and enabling CONFIG_HZ_PERIODIC tomorrow.
> > > 
> > > Jonathan, might the commit that Dave points out above be what reduces
> > > the probability of occurrence as you test older releases?    
> > I just got around to trying this out of curiosity.  Superficially it did
> > appear to possibly make the issue harder to hit (it took over 30 minutes),
> > but the issue otherwise looks much the same with or without that patch.
> > 
> > Just out of curiosity, next thing on my list is to disable hrtimers entirely
> > and see what happens.
> > 
> > Jonathan  
> > > 
> > > 							Thanx, Paul
> > >     
> > 
> > _______________________________________________
> > linuxarm mailing list
> > linuxarm@huawei.com
> > http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm  
> 
> _______________________________________________
> linuxarm mailing list
> linuxarm@huawei.com
> http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm

^ permalink raw reply	[flat|nested] 241+ messages in thread

* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-01 10:53                                                 ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-08-01 10:53 UTC (permalink / raw)
  To: linux-arm-kernel


Sorry - accidental send.  No content!

Jonathan

On Mon, 31 Jul 2017 12:55:48 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Mon, 31 Jul 2017 12:09:08 +0100
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> 
> > On Wed, 26 Jul 2017 16:15:05 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > On Wed, Jul 26, 2017 at 03:45:40PM -0700, David Miller wrote:    
> > > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > > Date: Wed, 26 Jul 2017 15:36:58 -0700
> > > >       
> > > > > And without CONFIG_SOFTLOCKUP_DETECTOR, I see five runs of 24 with RCU
> > > > > CPU stall warnings.  So it seems likely that CONFIG_SOFTLOCKUP_DETECTOR
> > > > > really is having an effect.      
> > > > 
> > > > Thanks for all of the info Paul, I'll digest this and scan over the
> > > > code myself.
> > > > 
> > > > Just out of curiosity, what x86 idle method is your machine using?
> > > > The mwait one or the one which simply uses 'halt'?  The mwait variant
> > > > might mask this bug, and halt would be a lot closer to how sparc64 and
> > > > Jonathan's system operates.      
> > > 
> > > My kernel builds with CONFIG_INTEL_IDLE=n, which I believe means that
> > > I am not using the mwait one.  Here is a grep for IDLE in my .config:
> > > 
> > > 	CONFIG_NO_HZ_IDLE=y
> > > 	CONFIG_GENERIC_SMP_IDLE_THREAD=y
> > > 	# CONFIG_IDLE_PAGE_TRACKING is not set
> > > 	CONFIG_ACPI_PROCESSOR_IDLE=y
> > > 	CONFIG_CPU_IDLE=y
> > > 	# CONFIG_CPU_IDLE_GOV_LADDER is not set
> > > 	CONFIG_CPU_IDLE_GOV_MENU=y
> > > 	# CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
> > > 	# CONFIG_INTEL_IDLE is not set
> > >     
> > > > On sparc64 the cpu yield we do in the idle loop sleeps the cpu.  Its
> > > > local TICK register keeps advancing, and the local timer therefore
> > > > will still trigger.  Also, any externally generated interrupts
> > > > (including cross calls) will wake up the cpu as well.
> > > > 
> > > > The tick-sched code is really tricky wrt. NO_HZ even in the NO_HZ_IDLE
> > > > case.  One of my running theories is that we miss scheduling a tick
> > > > due to a race.  That would be consistent with the behavior we see
> > > > in the RCU dumps, I think.      
> > > 
> > > But wouldn't you have to miss a -lot- of ticks to get an RCU CPU stall
> > > warning?  By default, your grace period needs to extend for more than
> > > 21 seconds (more than one-third of a -minute-) to get one.  Or do
> > > you mean that the ticks get shut off now and forever, as opposed to
> > > just losing one of them?
> > >     
> > > > Anyways, just a theory, and that's why I keep mentioning that commit
> > > > about the revert of the revert (specifically
> > > > 411fe24e6b7c283c3a1911450cdba6dd3aaea56e).
> > > > 
> > > > :-)      
> > > 
> > > I am running an overnight test in preparation for attempting to push
> > > some fixes for regressions into 4.12, but will try reverting this
> > > and enabling CONFIG_HZ_PERIODIC tomorrow.
> > > 
> > > Jonathan, might the commit that Dave points out above be what reduces
> > > the probability of occurrence as you test older releases?    
> > I just got around to trying this out of curiosity.  Superficially it did
> > appear to possibly make the issue harder to hit (it took over 30 minutes),
> > but the issue otherwise looks much the same with or without that patch.
> > 
> > Just out of curiosity, next thing on my list is to disable hrtimers entirely
> > and see what happens.
> > 
> > Jonathan  
> > > 
> > > 							Thanx, Paul
> > >     
> > 
> > _______________________________________________
> > linuxarm mailing list
> > linuxarm at huawei.com
> > http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm  
> 
> _______________________________________________
> linuxarm mailing list
> linuxarm at huawei.com
> http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-07-31 15:27                                                                         ` Jonathan Cameron
  (?)
@ 2017-08-01 18:46                                                                           ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-08-01 18:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jul 31, 2017 at 04:27:57PM +0100, Jonathan Cameron wrote:
> On Mon, 31 Jul 2017 08:04:11 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:
> > > On Fri, 28 Jul 2017 12:03:50 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >   
> > > > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:  
> > > > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > >     
> > > > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:    
> > > > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:      
> > > > > > 
> > > > > > [ . . . ]
> > > > > >     
> > > > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > > > > > > fewer timer events.  Issue became much easier to trigger (on some runs before
> > > > > > > I could get tracing up and running).
> > > > > > > 
> > > > > > > So logs are large enough that pastebin doesn't like them - please shout if
> > > > > > > another timer period is of interest.
> > > > > > > 
> > > > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > > > 
> > > > > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > > > > > detected around 835.
> > > > > > > 
> > > > > > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > > > > > may well be one in there.      
> > > > > > 
> > > > > > The dmesg says:
> > > > > > 
> > > > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > > 
> > > > > > So I look for "rcu_preempt" timer events and find these:
> > > > > > 
> > > > > > rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> > > > > > rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout 
> > > > > > 
> > > > > > Next look for "ffff8017d5fc7da0" and I don't find anything else.    
> > > > > It does show up off the bottom of what would fit in pastebin...
> > > > > 
> > > > >      rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
> > > > >      rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
> > > > >      rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=    
> > > > 
> > > > Odd.  I would expect an expiration...  And ten seconds is way longer
> > > > than the requested one jiffy!
> > > >   
> > > > > > The timeout was one jiffy, and more than a second later, no expiration.
> > > > > > Is it possible that this event was lost?  I am not seeing any sign of
> > > > > > this in the trace.
> > > > > > 
> > > > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > > > any case).
> > > > > > 
> > > > > > The last time we saw something like this it was a timer HW/driver problem,
> > > > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > > > and SPARC.  ;-)    
> > > > > Could be different issues, both of which were hidden by that lockup detector.
> > > > > 
> > > > > There is an errata work around for the timers on this particular board.
> > > > > I'm only vaguely aware of it, so may be unconnected.
> > > > > 
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > > > 
> > > > > Seems unlikely though! + we've not yet seen it on the other chips that
> > > > > errata affects (not that that means much).    
> > > > 
> > > > If you can reproduce quickly, might be worth trying anyway...
> > > > 
> > > > 							Thanx, Paul  
> > > Errata fix is running already and was for all those tests.  
> > 
> > I was afraid of that...  ;-)
> It's a pretty rare errata it seems.  Not actually managed to catch
> one yet. 
> > 
> > > I'll have a dig into the timers today and see where I get to.  
> > 
> > Look forward to seeing what you find!
> Nothing obvious turning up other than we don't seem to have the issue
> when we aren't running hrtimers.
> 
> On the plus side, I just got a report that it is affecting our d03
> boards, which is good given that I couldn't tell what the difference
> could be wrt this issue!
> 
> It indeed looks like we are consistently missing a timer before
> the rcu splat occurs.

And for my part, my tests with CONFIG_HZ_PERIODIC=y and
CONFIG_RCU_FAST_NO_HZ=n showed roughly the same failure rate
as other runs.

Missing a timer can most certainly give RCU severe heartburn!  ;-)
Do you have what you need to track down the missing timer?  

							Thanx, Paul
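
One crude way to catch a lost expiry in the act, independent of RCU, would
be a throwaway kthread that keeps asking for one-jiffy sleeps (the same
schedule_timeout()/process_timeout() path the rcu_preempt kthread uses) and
complains whenever a sleep overruns by a large margin.  Rough sketch only -
a hypothetical debug module, nothing like it exists in the tree and the
names are made up:

	#include <linux/module.h>
	#include <linux/kthread.h>
	#include <linux/sched.h>
	#include <linux/jiffies.h>
	#include <linux/smp.h>
	#include <linux/err.h>

	static struct task_struct *checker;

	static int jiffy_sleep_check(void *unused)
	{
		while (!kthread_should_stop()) {
			unsigned long start = jiffies;

			/* One-jiffy sleep backed by an on-stack timer,
			 * just like the rcu_preempt kthread's wait. */
			schedule_timeout_interruptible(1);

			/* Complain if the wakeup was wildly late. */
			if (time_after(jiffies, start + 2 * HZ))
				pr_warn("jiffy_sleep_check: 1-jiffy sleep took %lu jiffies (woke on cpu %d)\n",
					jiffies - start,
					raw_smp_processor_id());
		}
		return 0;
	}

	static int __init jiffy_sleep_check_init(void)
	{
		checker = kthread_run(jiffy_sleep_check, NULL,
				      "jiffy_sleep_chk");
		return PTR_ERR_OR_ZERO(checker);
	}

	static void __exit jiffy_sleep_check_exit(void)
	{
		kthread_stop(checker);
	}

	module_init(jiffy_sleep_check_init);
	module_exit(jiffy_sleep_check_exit);
	MODULE_LICENSE("GPL");

Same caveat as RCU itself, of course: if the timer never fires at all, the
warning only appears once something else wakes the thread, but the delay it
reports (paired with the timer trace) should help narrow things down.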


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-01 18:46                                                                           ` Paul E. McKenney
  0 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-08-01 18:46 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: dzickus, sfr, linuxarm, Nicholas Piggin, abdhalee, sparclinux,
	akpm, linuxppc-dev, David Miller, linux-arm-kernel, tglx

On Mon, Jul 31, 2017 at 04:27:57PM +0100, Jonathan Cameron wrote:
> On Mon, 31 Jul 2017 08:04:11 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:
> > > On Fri, 28 Jul 2017 12:03:50 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >   
> > > > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:  
> > > > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > >     
> > > > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:    
> > > > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:      
> > > > > > 
> > > > > > [ . . . ]
> > > > > >     
> > > > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > > > > > > fewer timer events.  Issue became much easier to trigger (on some runs before
> > > > > > > I could get tracing up and running).
> > > > > > > 
> > > > > > > So logs are large enough that pastebin doesn't like them - please shout if
> > > > > > > another timer period is of interest.
> > > > > > > 
> > > > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > > > 
> > > > > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > > > > > detected around 835.
> > > > > > > 
> > > > > > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > > > > > may well be one in there.      
> > > > > > 
> > > > > > The dmesg says:
> > > > > > 
> > > > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > > 
> > > > > > So I look for "rcu_preempt" timer events and find these:
> > > > > > 
> > > > > > rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> > > > > > rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout 
> > > > > > 
> > > > > > Next look for "ffff8017d5fc7da0" and I don't find anything else.    
> > > > > It does show up off the bottom of what would fit in pastebin...
> > > > > 
> > > > >      rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
> > > > >      rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
> > > > >      rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=    
> > > > 
> > > > Odd.  I would expect an expiration...  And ten seconds is way longer
> > > > than the requested one jiffy!
> > > >   
> > > > > > The timeout was one jiffy, and more than a second later, no expiration.
> > > > > > Is it possible that this event was lost?  I am not seeing any sign of
> > > > > > this in the trace.
> > > > > > 
> > > > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > > > any case).
> > > > > > 
> > > > > > The last time we saw something like this it was a timer HW/driver problem,
> > > > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > > > and SPARC.  ;-)    
> > > > > Could be different issues, both of which were hidden by that lockup detector.
> > > > > 
> > > > > There is an errata work around for the timers on this particular board.
> > > > > I'm only vaguely aware of it, so may be unconnected.
> > > > > 
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > > > 
> > > > > Seems unlikely though! + we've not yet seen it on the other chips that
> > > > > errata affects (not that that means much).    
> > > > 
> > > > If you can reproduce quickly, might be worth trying anyway...
> > > > 
> > > > 							Thanx, Paul  
> > > Errata fix is running already and was for all those tests.  
> > 
> > I was afraid of that...  ;-)
> It's a pretty rare errata it seems.  Not actually managed to catch
> one yet. 
> > 
> > > I'll have a dig into the timers today and see where I get to.  
> > 
> > Look forward to seeing what you find!
> Nothing obvious turning up other than we don't seem to have the issue
> when we aren't running hrtimers.
> 
> On the plus side, I just got a report that it is affecting our d03
> boards, which is good given that I couldn't tell what the difference
> could be wrt this issue!
> 
> It indeed looks like we are consistently missing a timer before
> the rcu splat occurs.

And for my part, my tests with CONFIG_HZ_PERIODIC=y and
CONFIG_RCU_FAST_NO_HZ=n showed roughly the same failure rate
as other runs.

Missing a timer can most certainly give RCU severe heartburn!  ;-)
Do you have what you need to track down the missing timer?  

							Thanx, Paul

^ permalink raw reply	[flat|nested] 241+ messages in thread

* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-01 18:46                                                                           ` Paul E. McKenney
  0 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-08-01 18:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jul 31, 2017 at 04:27:57PM +0100, Jonathan Cameron wrote:
> On Mon, 31 Jul 2017 08:04:11 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:
> > > On Fri, 28 Jul 2017 12:03:50 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >   
> > > > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:  
> > > > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > >     
> > > > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:    
> > > > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:      
> > > > > > 
> > > > > > [ . . . ]
> > > > > >     
> > > > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > > > > > > fewer timer events.  Issue became much easier to trigger (on some runs before
> > > > > > > I could get tracing up and running).
> > > > > > > 
> > > > > > > So logs are large enough that pastebin doesn't like them - please shout if
> > > > > > > another timer period is of interest.
> > > > > > > 
> > > > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > > > 
> > > > > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > > > > > detected around 835.
> > > > > > > 
> > > > > > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > > > > > may well be one in there.      
> > > > > > 
> > > > > > The dmesg says:
> > > > > > 
> > > > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > > 
> > > > > > So I look for "rcu_preempt" timer events and find these:
> > > > > > 
> > > > > > rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> > > > > > rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout 
> > > > > > 
> > > > > > Next look for "ffff8017d5fc7da0" and I don't find anything else.    
> > > > > It does show up off the bottom of what would fit in pastebin...
> > > > > 
> > > > >      rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
> > > > >      rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
> > > > >      rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=    
> > > > 
> > > > Odd.  I would expect an expiration...  And ten seconds is way longer
> > > > than the requested one jiffy!
> > > >   
> > > > > > The timeout was one jiffy, and more than a second later, no expiration.
> > > > > > Is it possible that this event was lost?  I am not seeing any sign of
> > > > > > this in the trace.
> > > > > > 
> > > > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > > > any case).
> > > > > > 
> > > > > > The last time we saw something like this it was a timer HW/driver problem,
> > > > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > > > and SPARC.  ;-)    
> > > > > Could be different issues, both of which were hidden by that lockup detector.
> > > > > 
> > > > > There is an errata work around for the timers on this particular board.
> > > > > I'm only vaguely aware of it, so may be unconnected.
> > > > > 
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > > > 
> > > > > Seems unlikely though! + we've not yet seen it on the other chips that
> > > > > errata affects (not that that means much).    
> > > > 
> > > > If you can reproduce quickly, might be worth trying anyway...
> > > > 
> > > > 							Thanx, Paul  
> > > Errata fix is running already and was for all those tests.  
> > 
> > I was afraid of that...  ;-)
> It's a pretty rare errata it seems.  Not actually managed to catch
> one yet. 
> > 
> > > I'll have a dig into the timers today and see where I get to.  
> > 
> > Look forward to seeing what you find!
> Nothing obvious turning up other than we don't seem to have the issue
> when we aren't running hrtimers.
> 
> On the plus side, I just got a report that it is affecting our d03
> boards, which is good given that I couldn't tell what the difference
> could be wrt this issue!
> 
> It indeed looks like we are consistently missing a timer before
> the rcu splat occurs.

And for my part, my tests with CONFIG_HZ_PERIODIC=y and
CONFIG_RCU_FAST_NO_HZ=n showed roughly the same failure rate
as other runs.

Missing a timer can most certainly give RCU severe heartburn!  ;-)
Do you have what you need to track down the missing timer?  

							Thanx, Paul
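
Along the same lines, if the full timer trace is too big to eyeball, a
small userspace helper can flag exactly this pattern - a timer_start whose
expiry or cancel turns up much later than requested, or never.  Throwaway
and hypothetical (not from any tree; it assumes the default trace line
layout quoted above, i.e. "comm-pid [cpu] flags timestamp: event: ..."):

	/* find_lost_timers.c: cc -O2 -o find_lost_timers find_lost_timers.c
	 * Usage: ./find_lost_timers < timer_trace.txt */
	#include <stdio.h>
	#include <string.h>

	#define MAX_PENDING	4096
	#define GAP_SECS	1.0	/* report anything slower than this */

	static struct pending {
		char addr[32];		/* e.g. "ffff8017d5fc7da0" */
		double ts;		/* timestamp of the timer_start */
		char line[512];		/* the timer_start line itself */
	} tab[MAX_PENDING];
	static int ntab;

	static int find(const char *addr)
	{
		for (int i = 0; i < ntab; i++)	/* linear scan: debug tool */
			if (!strcmp(tab[i].addr, addr))
				return i;
		return -1;
	}

	int main(void)
	{
		char line[512];

		while (fgets(line, sizeof(line), stdin)) {
			char addr[32];
			double ts;
			char *p = strstr(line, "timer=");

			if (!p || sscanf(line, "%*s %*s %*s %lf", &ts) != 1 ||
			    sscanf(p, "timer=%31[0-9a-fx]", addr) != 1)
				continue;

			if (strstr(line, "timer_start:")) {
				int i = find(addr);

				if (i < 0 && ntab < MAX_PENDING)
					i = ntab++;
				if (i < 0)
					continue;
				strcpy(tab[i].addr, addr);
				tab[i].ts = ts;
				strncpy(tab[i].line, line, sizeof(tab[i].line) - 1);
				tab[i].line[sizeof(tab[i].line) - 1] = '\0';
			} else if (strstr(line, "timer_expire_entry:") ||
				   strstr(line, "timer_cancel:")) {
				int i = find(addr);

				if (i < 0)
					continue;
				if (ts - tab[i].ts > GAP_SECS)
					printf("%.6fs late: %s",
					       ts - tab[i].ts, tab[i].line);
				tab[i] = tab[--ntab];
			}
		}
		for (int i = 0; i < ntab; i++)
			printf("never fired: %s", tab[i].line);
		return 0;
	}

Fed the trace above, it should single out the ffff8017d5fc7da0 timer_start
at 827.58 as roughly ten seconds late (the timer_cancel at 837.68), i.e.
the same thing already found by hand, without the manual grepping.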

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-01 18:46                                                                           ` Paul E. McKenney
  (?)
@ 2017-08-02 16:25                                                                             ` Jonathan Cameron
  -1 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-08-02 16:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 1 Aug 2017 11:46:46 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Mon, Jul 31, 2017 at 04:27:57PM +0100, Jonathan Cameron wrote:
> > On Mon, 31 Jul 2017 08:04:11 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:  
> > > > On Fri, 28 Jul 2017 12:03:50 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > >     
> > > > > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:    
> > > > > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > > >       
> > > > > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:      
> > > > > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:        
> > > > > > > 
> > > > > > > [ . . . ]
> > > > > > >       
> > > > > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > > > > > > > fewer timer events.  Issue became much easier to trigger (on some runs before
> > > > > > > > I could get tracing up and running).
> > > > > > > > 
> > > > > > > > So logs are large enough that pastebin doesn't like them - please shout if
> > > > > > > > another timer period is of interest.
> > > > > > > > 
> > > > > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > > > > 
> > > > > > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > > > > > > detected around 835.
> > > > > > > > 
> > > > > > > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > > > > > > may well be one in there.        
> > > > > > > 
> > > > > > > The dmesg says:
> > > > > > > 
> > > > > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > > > 
> > > > > > > So I look for "rcu_preempt" timer events and find these:
> > > > > > > 
> > > > > > > rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> > > > > > > rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout
> > > > > > > 
> > > > > > > Next look for "ffff8017d5fc7da0" and I don't find anything else.      
> > > > > > It does show up off the bottom of what would fit in pastebin...
> > > > > > 
> > > > > >      rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
> > > > > >      rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
> > > > > >      rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=
> > > > > 
> > > > > Odd.  I would expect an expiration...  And ten seconds is way longer
> > > > > than the requested one jiffy!
> > > > >     
> > > > > > > The timeout was one jiffy, and more than a second later, no expiration.
> > > > > > > Is it possible that this event was lost?  I am not seeing any sign of
> > > > > > > this in the trace.
> > > > > > > 
> > > > > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > > > > any case).
> > > > > > > 
> > > > > > > The last time we saw something like this it was a timer HW/driver problem,
> > > > > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > > > > and SPARC.  ;-)      
> > > > > > Could be different issues, both of which were hidden by that lockup detector.
> > > > > > 
> > > > > > There is an errata work around for the timers on this particular board.
> > > > > > I'm only vaguely aware of it, so may be unconnected.
> > > > > > 
> > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > > > > 
> > > > > > Seems unlikely though! + we've not yet seen it on the other chips that
> > > > > > errata effects (not that that means much).      
> > > > > 
> > > > > If you can reproduce quickly, might be worth trying anyway...
> > > > > 
> > > > > 							Thanx, Paul    
> > > > Errata fix is running already and was for all those tests.    
> > > 
> > > I was afraid of that...  ;-)  
> > It's a pretty rare errata it seems.  Not actually managed to catch
> > one yet.   
> > >   
> > > > I'll have a dig into the timers today and see where I get to.    
> > > 
> > > Look forward to seeing what you find!  
> > Nothing obvious turning up other than we don't seem to have issue
> > when we aren't running hrtimers.
> > 
> > On a plus side I just got a report that it is effecting our d03
> > boards which is good on the basis I couldn't tell what the difference
> > could be wrt to this issue!
> > 
> > It indeed looks like we are consistently missing a timer before
> > the rcu splat occurs.  
> 
> And for my part, my tests with CONFIG_HZ_PERIODIC=y and
> CONFIG_RCU_FAST_NO_HZ=n showed roughly the same failure rate
> as other runs.
> 
> Missing a timer can most certainly give RCU severe heartburn!  ;-)
> Do you have what you need to track down the missing timer?  

I've not managed to make much progress yet.  Turning on any additional
tracing in that area seems to make the issue stop happening, or at least
occur very infrequently, which certainly makes it 'fun' to find.

As a long shot I applied a locking fix from another reported issue that
was causing rcu stalls; things held up for much longer, but the stall
eventually still occurred.

(from the thread rcu_sched stall while waiting in csd_lock_wait())

Jonathan


> 
> 							Thanx, Paul
> 


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-02 16:25                                                                             ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-08-02 16:25 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: dzickus, sfr, linuxarm, Nicholas Piggin, abdhalee, sparclinux,
	akpm, linuxppc-dev, David Miller, linux-arm-kernel, tglx

On Tue, 1 Aug 2017 11:46:46 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Mon, Jul 31, 2017 at 04:27:57PM +0100, Jonathan Cameron wrote:
> > On Mon, 31 Jul 2017 08:04:11 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:  
> > > > On Fri, 28 Jul 2017 12:03:50 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > >     
> > > > > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:    
> > > > > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > > >       
> > > > > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:      
> > > > > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:        
> > > > > > > 
> > > > > > > [ . . . ]
> > > > > > >       
> > > > > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > > > > > > > fewer timer events.  Issue became much easier to trigger (on some runs before
> > > > > > > > I could get tracing up and running).
> > > > > > > > 
> > > > > > > > So logs are large enough that pastebin doesn't like them - please shout if
> > > > > > > > another timer period is of interest.
> > > > > > > > 
> > > > > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > > > > 
> > > > > > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > > > > > > detected around 835.
> > > > > > > > 
> > > > > > > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > > > > > > may well be one in there.        
> > > > > > > 
> > > > > > > The dmesg says:
> > > > > > > 
> > > > > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > > > 
> > > > > > > So I look for "rcu_preempt" timer events and find these:
> > > > > > > 
> > > > > > > rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> > > > > > > rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout 
> > > > > > > 
> > > > > > > Next look for "ffff8017d5fc7da0" and I don't find anything else.      
> > > > > > It does show up off the bottom of what would fit in pastebin...
> > > > > > 
> > > > > >      rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
> > > > > >      rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
> > > > > >      rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=      
> > > > > 
> > > > > Odd.  I would expect an expiration...  And ten seconds is way longer
> > > > > than the requested one jiffy!
> > > > >     
> > > > > > > The timeout was one jiffy, and more than a second later, no expiration.
> > > > > > > Is it possible that this event was lost?  I am not seeing any sign of
> > > > > > > this in the trace.
> > > > > > > 
> > > > > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > > > > any case).
> > > > > > > 
> > > > > > > The last time we saw something like this it was a timer HW/driver problem,
> > > > > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > > > > and SPARC.  ;-)      
> > > > > > Could be different issues, both of which were hidden by that lockup detector.
> > > > > > 
> > > > > > There is an errata work around for the timers on this particular board.
> > > > > > I'm only vaguely aware of it, so may be unconnected.
> > > > > > 
> > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > > > > 
> > > > > > Seems unlikely though! + we've not yet seen it on the other chips that
> > > > > > errata effects (not that that means much).      
> > > > > 
> > > > > If you can reproduce quickly, might be worth trying anyway...
> > > > > 
> > > > > 							Thanx, Paul    
> > > > Errata fix is running already and was for all those tests.    
> > > 
> > > I was afraid of that...  ;-)  
> > It's a pretty rare errata it seems.  Not actually managed to catch
> > one yet.   
> > >   
> > > > I'll have a dig into the timers today and see where I get to.    
> > > 
> > > Look forward to seeing what you find!  
> > Nothing obvious turning up other than we don't seem to have issue
> > when we aren't running hrtimers.
> > 
> > On a plus side I just got a report that it is effecting our d03
> > boards which is good on the basis I couldn't tell what the difference
> > could be wrt to this issue!
> > 
> > It indeed looks like we are consistently missing a timer before
> > the rcu splat occurs.  
> 
> And for my part, my tests with CONFIG_HZ_PERIODIC=y and
> CONFIG_RCU_FAST_NO_HZ=n showed roughly the same failure rate
> as other runs.
> 
> Missing a timer can most certainly give RCU severe heartburn!  ;-)
> Do you have what you need to track down the missing timer?  

I've not managed to make much progress yet.  Turning on any additional
tracing in that area seems to make the issue stop happening, or at least
occur very infrequently, which certainly makes it 'fun' to find.

As a long shot I applied a locking fix from another reported issue that
was causing rcu stalls; things held up for much longer, but the stall
eventually still occurred.

(from the thread rcu_sched stall while waiting in csd_lock_wait())

Jonathan


> 
> 							Thanx, Paul
> 

^ permalink raw reply	[flat|nested] 241+ messages in thread

* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-02 16:25                                                                             ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-08-02 16:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 1 Aug 2017 11:46:46 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Mon, Jul 31, 2017 at 04:27:57PM +0100, Jonathan Cameron wrote:
> > On Mon, 31 Jul 2017 08:04:11 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:  
> > > > On Fri, 28 Jul 2017 12:03:50 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > >     
> > > > > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:    
> > > > > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > > >       
> > > > > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:      
> > > > > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:        
> > > > > > > 
> > > > > > > [ . . . ]
> > > > > > >       
> > > > > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > > > > > > > fewer timer events.  Issue became much easier to trigger (on some runs before
> > > > > > > > I could get tracing up and running).
> > > > > > > > 
> > > > > > > > So logs are large enough that pastebin doesn't like them - please shout if
> > > > > > > > another timer period is of interest.
> > > > > > > > 
> > > > > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > > > > 
> > > > > > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > > > > > > detected around 835.
> > > > > > > > 
> > > > > > > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > > > > > > may well be one in there.        
> > > > > > > 
> > > > > > > The dmesg says:
> > > > > > > 
> > > > > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > > > 
> > > > > > > So I look for "rcu_preempt" timer events and find these:
> > > > > > > 
> > > > > > > rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> > > > > > > rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout 
> > > > > > > 
> > > > > > > Next look for "ffff8017d5fc7da0" and I don't find anything else.      
> > > > > > It does show up off the bottom of what would fit in pastebin...
> > > > > > 
> > > > > >      rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
> > > > > >      rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
> > > > > >      rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=      
> > > > > 
> > > > > Odd.  I would expect an expiration...  And ten seconds is way longer
> > > > > than the requested one jiffy!
> > > > >     
> > > > > > > The timeout was one jiffy, and more than a second later, no expiration.
> > > > > > > Is it possible that this event was lost?  I am not seeing any sign of
> > > > > > > this in the trace.
> > > > > > > 
> > > > > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > > > > any case).
> > > > > > > 
> > > > > > > The last time we saw something like this it was a timer HW/driver problem,
> > > > > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > > > > and SPARC.  ;-)      
> > > > > > Could be different issues, both of which were hidden by that lockup detector.
> > > > > > 
> > > > > > There is an errata work around for the timers on this particular board.
> > > > > > I'm only vaguely aware of it, so may be unconnected.
> > > > > > 
> > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > > > > 
> > > > > > Seems unlikely though! + we've not yet seen it on the other chips that
> > > > > > errata effects (not that that means much).      
> > > > > 
> > > > > If you can reproduce quickly, might be worth trying anyway...
> > > > > 
> > > > > 							Thanx, Paul    
> > > > Errata fix is running already and was for all those tests.    
> > > 
> > > I was afraid of that...  ;-)  
> > It's a pretty rare errata it seems.  Not actually managed to catch
> > one yet.   
> > >   
> > > > I'll have a dig into the timers today and see where I get to.    
> > > 
> > > Look forward to seeing what you find!  
> > Nothing obvious turning up other than we don't seem to have issue
> > when we aren't running hrtimers.
> > 
> > On a plus side I just got a report that it is effecting our d03
> > boards which is good on the basis I couldn't tell what the difference
> > could be wrt to this issue!
> > 
> > It indeed looks like we are consistently missing a timer before
> > the rcu splat occurs.  
> 
> And for my part, my tests with CONFIG_HZ_PERIODIC=y and
> CONFIG_RCU_FAST_NO_HZ=n showed roughly the same failure rate
> as other runs.
> 
> Missing a timer can most certainly give RCU severe heartburn!  ;-)
> Do you have what you need to track down the missing timer?  

I've not managed to make much progress yet.  Turning on any additional
tracing in that area seems to make the issue stop happening, or at least
occur very infrequently, which certainly makes it 'fun' to find.

As a long shot I applied a locking fix from another reported issue that
was causing rcu stalls; things held up for much longer, but the stall
eventually still occurred.

(from the thread rcu_sched stall while waiting in csd_lock_wait())

Jonathan


> 
> 							Thanx, Paul
> 

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-02 16:25                                                                             ` Jonathan Cameron
  (?)
@ 2017-08-15 15:47                                                                               ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-08-15 15:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 02, 2017 at 05:25:55PM +0100, Jonathan Cameron wrote:
> On Tue, 1 Aug 2017 11:46:46 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Mon, Jul 31, 2017 at 04:27:57PM +0100, Jonathan Cameron wrote:
> > > On Mon, 31 Jul 2017 08:04:11 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >   
> > > > On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:  
> > > > > On Fri, 28 Jul 2017 12:03:50 -0700
> > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > >     
> > > > > > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:    
> > > > > > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > > > >       
> > > > > > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:      
> > > > > > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:        
> > > > > > > > 
> > > > > > > > [ . . . ]
> > > > > > > >       
> > > > > > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > > > > > > > > fewer timer events.  Issue became much easier to trigger (on some runs before
> > > > > > > > > I could get tracing up and running).
> > > > > > > > > 
> > > > > > > > > So logs are large enough that pastebin doesn't like them - please shout if
> > > > > > > > > another timer period is of interest.
> > > > > > > > > 
> > > > > > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > > > > > 
> > > > > > > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > > > > > > > detected around 835.
> > > > > > > > > 
> > > > > > > > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > > > > > > > may well be one in there.        
> > > > > > > > 
> > > > > > > > The dmesg says:
> > > > > > > > 
> > > > > > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > > > > 
> > > > > > > > So I look for "rcu_preempt" timer events and find these:
> > > > > > > > 
> > > > > > > > rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> > > > > > > > rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout
> > > > > > > > 
> > > > > > > > Next look for "ffff8017d5fc7da0" and I don't find anything else.      
> > > > > > > It does show up off the bottom of what would fit in pastebin...
> > > > > > > 
> > > > > > >      rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
> > > > > > >      rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
> > > > > > >      rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=
> > > > > > 
> > > > > > Odd.  I would expect an expiration...  And ten seconds is way longer
> > > > > > than the requested one jiffy!
> > > > > >     
> > > > > > > > The timeout was one jiffy, and more than a second later, no expiration.
> > > > > > > > Is it possible that this event was lost?  I am not seeing any sign of
> > > > > > > > this in the trace.
> > > > > > > > 
> > > > > > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > > > > > any case).
> > > > > > > > 
> > > > > > > > The last time we saw something like this it was a timer HW/driver problem,
> > > > > > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > > > > > and SPARC.  ;-)      
> > > > > > > Could be different issues, both of which were hidden by that lockup detector.
> > > > > > > 
> > > > > > > There is an errata work around for the timers on this particular board.
> > > > > > > I'm only vaguely aware of it, so may be unconnected.
> > > > > > > 
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > > > > > 
> > > > > > > Seems unlikely though! + we've not yet seen it on the other chips that
> > > > > > > errata effects (not that that means much).      
> > > > > > 
> > > > > > If you can reproduce quickly, might be worth trying anyway...
> > > > > > 
> > > > > > 							Thanx, Paul    
> > > > > Errata fix is running already and was for all those tests.    
> > > > 
> > > > I was afraid of that...  ;-)  
> > > It's a pretty rare errata it seems.  Not actually managed to catch
> > > one yet.   
> > > >   
> > > > > I'll have a dig into the timers today and see where I get to.    
> > > > 
> > > > Look forward to seeing what you find!  
> > > Nothing obvious turning up other than we don't seem to have issue
> > > when we aren't running hrtimers.
> > > 
> > > On a plus side I just got a report that it is effecting our d03
> > > boards which is good on the basis I couldn't tell what the difference
> > > could be wrt to this issue!
> > > 
> > > It indeed looks like we are consistently missing a timer before
> > > the rcu splat occurs.  
> > 
> > And for my part, my tests with CONFIG_HZ_PERIODIC=y and
> > CONFIG_RCU_FAST_NO_HZ=n showed roughly the same failure rate
> > as other runs.
> > 
> > Missing a timer can most certainly give RCU severe heartburn!  ;-)
> > Do you have what you need to track down the missing timer?  
> 
> Not managed to make much progress yet.  Turning on any additional tracing
> in that area seems to make the issue stop happening or at least
> occur very infrequently. Which certainly makes it 'fun' to find.
> 
> As a long shot I applied a locking fix from another reported issue that
> was causing rcu stalls and it seemed good for much longer, but
> eventually still occurred.
> 
> (from the thread rcu_sched stall while waiting in csd_lock_wait())

On the perhaps unlikely off-chance that it helps locate something,
here is a patch that adds a trace_printk() to check how long a CPU
believes that it can sleep when going idle.  The thought is to check
whether a CPU with a timer set to expire in one jiffy thinks that it
can sleep for (say) 30 seconds.

Didn't find anything for my problem, but I believe that yours is
different, so...

							Thanx, Paul

------------------------------------------------------------------------

commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Mon Aug 14 08:54:39 2017 -0700

    EXP: Trace tick return from tick_nohz_stop_sched_tick
    
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index c7a899c5ce64..7358a5073dfb 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
 	 * (not only the tick).
 	 */
 	ts->sleep_length = ktime_sub(dev->next_event, now);
+	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
 	return tick;
 }
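
(If it is any use, one possible follow-up on top of this patch, not part
of the patch itself and with a purely arbitrary 30-second threshold,
would be to log only the implausible cases, along these lines:

	s64 sleep_us = (tick - ktime_get()) / 1000;

	if (sleep_us > 30 * USEC_PER_SEC)
		trace_printk("implausibly long idle sleep: %lld us\n", sleep_us);

so that a CPU claiming it can sleep for half a minute while a one-jiffy
timer is pending stands out without flooding the trace buffer.)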
 


^ permalink raw reply related	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-15 15:47                                                                               ` Paul E. McKenney
  0 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-08-15 15:47 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: dzickus, sfr, linuxarm, Nicholas Piggin, abdhalee, sparclinux,
	akpm, linuxppc-dev, David Miller, linux-arm-kernel, tglx

On Wed, Aug 02, 2017 at 05:25:55PM +0100, Jonathan Cameron wrote:
> On Tue, 1 Aug 2017 11:46:46 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Mon, Jul 31, 2017 at 04:27:57PM +0100, Jonathan Cameron wrote:
> > > On Mon, 31 Jul 2017 08:04:11 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >   
> > > > On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:  
> > > > > On Fri, 28 Jul 2017 12:03:50 -0700
> > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > >     
> > > > > > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:    
> > > > > > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > > > >       
> > > > > > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:      
> > > > > > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:        
> > > > > > > > 
> > > > > > > > [ . . . ]
> > > > > > > >       
> > > > > > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > > > > > > > > fewer timer events.  Issue became much easier to trigger (on some runs before
> > > > > > > > > I could get tracing up and running).
> > > > > > > > > 
> > > > > > > > > So logs are large enough that pastebin doesn't like them - please shout if
> > > > > > > > > another timer period is of interest.
> > > > > > > > > 
> > > > > > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > > > > > 
> > > > > > > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > > > > > > > detected around 835.
> > > > > > > > > 
> > > > > > > > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > > > > > > > may well be one in there.        
> > > > > > > > 
> > > > > > > > The dmesg says:
> > > > > > > > 
> > > > > > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > > > > 
> > > > > > > > So I look for "rcu_preempt" timer events and find these:
> > > > > > > > 
> > > > > > > > rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> > > > > > > > rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout 
> > > > > > > > 
> > > > > > > > Next look for "ffff8017d5fc7da0" and I don't find anything else.      
> > > > > > > It does show up off the bottom of what would fit in pastebin...
> > > > > > > 
> > > > > > >      rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
> > > > > > >      rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
> > > > > > >      rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=      
> > > > > > 
> > > > > > Odd.  I would expect an expiration...  And ten seconds is way longer
> > > > > > than the requested one jiffy!
> > > > > >     
> > > > > > > > The timeout was one jiffy, and more than a second later, no expiration.
> > > > > > > > Is it possible that this event was lost?  I am not seeing any sign of
> > > > > > > > this in the trace.
> > > > > > > > 
> > > > > > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > > > > > any case).
> > > > > > > > 
> > > > > > > > The last time we saw something like this it was a timer HW/driver problem,
> > > > > > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > > > > > and SPARC.  ;-)      
> > > > > > > Could be different issues, both of which were hidden by that lockup detector.
> > > > > > > 
> > > > > > > There is an errata work around for the timers on this particular board.
> > > > > > > I'm only vaguely aware of it, so may be unconnected.
> > > > > > > 
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > > > > > 
> > > > > > > Seems unlikely though! + we've not yet seen it on the other chips that
> > > > > > > errata effects (not that that means much).      
> > > > > > 
> > > > > > If you can reproduce quickly, might be worth trying anyway...
> > > > > > 
> > > > > > 							Thanx, Paul    
> > > > > Errata fix is running already and was for all those tests.    
> > > > 
> > > > I was afraid of that...  ;-)  
> > > It's a pretty rare errata it seems.  Not actually managed to catch
> > > one yet.   
> > > >   
> > > > > I'll have a dig into the timers today and see where I get to.    
> > > > 
> > > > Look forward to seeing what you find!  
> > > Nothing obvious turning up other than we don't seem to have issue
> > > when we aren't running hrtimers.
> > > 
> > > On a plus side I just got a report that it is effecting our d03
> > > boards which is good on the basis I couldn't tell what the difference
> > > could be wrt to this issue!
> > > 
> > > It indeed looks like we are consistently missing a timer before
> > > the rcu splat occurs.  
> > 
> > And for my part, my tests with CONFIG_HZ_PERIODIC=y and
> > CONFIG_RCU_FAST_NO_HZ=n showed roughly the same failure rate
> > as other runs.
> > 
> > Missing a timer can most certainly give RCU severe heartburn!  ;-)
> > Do you have what you need to track down the missing timer?  
> 
> Not managed to make much progress yet.  Turning on any additional tracing
> in that area seems to make the issue stop happening or at least
> occur very infrequently. Which certainly makes it 'fun' to find.
> 
> As a long shot I applied a locking fix from another reported issue that
> was causing rcu stalls and it seemed good for much longer, but
> eventually still occurred.
> 
> (from the thread rcu_sched stall while waiting in csd_lock_wait())

On the perhaps unlikely off-chance that it helps locate something,
here is a patch that adds a trace_printk() to check how long a CPU
believes that it can sleep when going idle.  The thought is to check
whether a CPU with a timer set to expire in one jiffy thinks that it
can sleep for (say) 30 seconds.

Didn't find anything for my problem, but I believe that yours is
different, so...

							Thanx, Paul

------------------------------------------------------------------------

commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Mon Aug 14 08:54:39 2017 -0700

    EXP: Trace tick return from tick_nohz_stop_sched_tick
    
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index c7a899c5ce64..7358a5073dfb 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
 	 * (not only the tick).
 	 */
 	ts->sleep_length = ktime_sub(dev->next_event, now);
+	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
 	return tick;
 }
 

^ permalink raw reply related	[flat|nested] 241+ messages in thread

* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-15 15:47                                                                               ` Paul E. McKenney
  0 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-08-15 15:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 02, 2017 at 05:25:55PM +0100, Jonathan Cameron wrote:
> On Tue, 1 Aug 2017 11:46:46 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Mon, Jul 31, 2017 at 04:27:57PM +0100, Jonathan Cameron wrote:
> > > On Mon, 31 Jul 2017 08:04:11 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >   
> > > > On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:  
> > > > > On Fri, 28 Jul 2017 12:03:50 -0700
> > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > >     
> > > > > > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:    
> > > > > > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > > > >       
> > > > > > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:      
> > > > > > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:        
> > > > > > > > 
> > > > > > > > [ . . . ]
> > > > > > > >       
> > > > > > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > > > > > > > > fewer timer events.  Issue became much easier to trigger (on some runs before
> > > > > > > > > I could get tracing up and running).
> > > > > > > > > 
> > > > > > > > > So logs are large enough that pastebin doesn't like them - please shout if
> > > > > > > > > another timer period is of interest.
> > > > > > > > > 
> > > > > > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > > > > > 
> > > > > > > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > > > > > > > detected around 835.
> > > > > > > > > 
> > > > > > > > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > > > > > > > may well be one in there.        
> > > > > > > > 
> > > > > > > > The dmesg says:
> > > > > > > > 
> > > > > > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > > > > 
> > > > > > > > So I look for "rcu_preempt" timer events and find these:
> > > > > > > > 
> > > > > > > > rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> > > > > > > > rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout 
> > > > > > > > 
> > > > > > > > Next look for "ffff8017d5fc7da0" and I don't find anything else.      
> > > > > > > It does show up off the bottom of what would fit in pastebin...
> > > > > > > 
> > > > > > >      rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
> > > > > > >      rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
> > > > > > >      rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=      
> > > > > > 
> > > > > > Odd.  I would expect an expiration...  And ten seconds is way longer
> > > > > > than the requested one jiffy!
> > > > > >     
> > > > > > > > The timeout was one jiffy, and more than a second later, no expiration.
> > > > > > > > Is it possible that this event was lost?  I am not seeing any sign of
> > > > > > > > this in the trace.
> > > > > > > > 
> > > > > > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > > > > > any case).
> > > > > > > > 
> > > > > > > > The last time we saw something like this it was a timer HW/driver problem,
> > > > > > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > > > > > and SPARC.  ;-)      
> > > > > > > Could be different issues, both of which were hidden by that lockup detector.
> > > > > > > 
> > > > > > > There is an errata work around for the timers on this particular board.
> > > > > > > I'm only vaguely aware of it, so may be unconnected.
> > > > > > > 
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > > > > > 
> > > > > > > Seems unlikely though! + we've not yet seen it on the other chips that
> > > > > > > errata effects (not that that means much).      
> > > > > > 
> > > > > > If you can reproduce quickly, might be worth trying anyway...
> > > > > > 
> > > > > > 							Thanx, Paul    
> > > > > Errata fix is running already and was for all those tests.    
> > > > 
> > > > I was afraid of that...  ;-)  
> > > It's a pretty rare errata it seems.  Not actually managed to catch
> > > one yet.   
> > > >   
> > > > > I'll have a dig into the timers today and see where I get to.    
> > > > 
> > > > Look forward to seeing what you find!  
> > > Nothing obvious turning up other than we don't seem to have issue
> > > when we aren't running hrtimers.
> > > 
> > > On a plus side I just got a report that it is effecting our d03
> > > boards which is good on the basis I couldn't tell what the difference
> > > could be wrt to this issue!
> > > 
> > > It indeed looks like we are consistently missing a timer before
> > > the rcu splat occurs.  
> > 
> > And for my part, my tests with CONFIG_HZ_PERIODIC=y and
> > CONFIG_RCU_FAST_NO_HZ=n showed roughly the same failure rate
> > as other runs.
> > 
> > Missing a timer can most certainly give RCU severe heartburn!  ;-)
> > Do you have what you need to track down the missing timer?  
> 
> Not managed to make much progress yet.  Turning on any additional tracing
> in that area seems to make the issue stop happening or at least
> occur very infrequently. Which certainly makes it 'fun' to find.
> 
> As a long shot I applied a locking fix from another reported issue that
> was causing rcu stalls and it seemed good for much longer, but
> eventually still occurred.
> 
> (from the thread rcu_sched stall while waiting in csd_lock_wait())

On the perhaps unlikely off-chance that it helps locate something,
here is a patch that adds a trace_printk() to check how long a CPU
believes that it can sleep when going idle.  The thought is to check
whether a CPU with a timer set to expire in one jiffy thinks that it
can sleep for (say) 30 seconds.

Didn't find anything for my problem, but I believe that yours is
different, so...

							Thanx, Paul

------------------------------------------------------------------------

commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Mon Aug 14 08:54:39 2017 -0700

    EXP: Trace tick return from tick_nohz_stop_sched_tick
    
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index c7a899c5ce64..7358a5073dfb 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
 	 * (not only the tick).
 	 */
 	ts->sleep_length = ktime_sub(dev->next_event, now);
+	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
 	return tick;
 }
 

^ permalink raw reply related	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-15 15:47                                                                               ` Paul E. McKenney
  (?)
@ 2017-08-16  1:24                                                                                 ` Jonathan Cameron
  -1 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-08-16  1:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 15 Aug 2017 08:47:43 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Wed, Aug 02, 2017 at 05:25:55PM +0100, Jonathan Cameron wrote:
> > On Tue, 1 Aug 2017 11:46:46 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > On Mon, Jul 31, 2017 at 04:27:57PM +0100, Jonathan Cameron wrote:  
> > > > On Mon, 31 Jul 2017 08:04:11 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > >     
> > > > > On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:    
> > > > > > On Fri, 28 Jul 2017 12:03:50 -0700
> > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > > >       
> > > > > > > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:      
> > > > > > > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > > > > >         
> > > > > > > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:        
> > > > > > > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > > > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:          
> > > > > > > > > 
> > > > > > > > > [ . . . ]
> > > > > > > > >         
> > > > > > > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > > > > > > > > > fewer timer events.  Issue became much easier to trigger (on some runs before
> > > > > > > > > > I could get tracing up and running).
> > > > > > > > > > 
> > > > > > > > > > So logs are large enough that pastebin doesn't like them - please shout if
> > > > > > > > > > another timer period is of interest.
> > > > > > > > > > 
> > > > > > > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > > > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > > > > > > 
> > > > > > > > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > > > > > > > > detected around 835.
> > > > > > > > > > 
> > > > > > > > > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > > > > > > > > may well be one in there.          
> > > > > > > > > 
> > > > > > > > > The dmesg says:
> > > > > > > > > 
> > > > > > > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > > > > > 
> > > > > > > > > So I look for "rcu_preempt" timer events and find these:
> > > > > > > > > 
> > > > > > > > > rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> > > > > > > > > rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout
> > > > > > > > > 
> > > > > > > > > Next look for "ffff8017d5fc7da0" and I don't find anything else.        
> > > > > > > > It does show up off the bottom of what would fit in pastebin...
> > > > > > > > 
> > > > > > > >      rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
> > > > > > > >      rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
> > > > > > > >      rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=
> > > > > > > 
> > > > > > > Odd.  I would expect an expiration...  And ten seconds is way longer
> > > > > > > than the requested one jiffy!
> > > > > > >       
> > > > > > > > > The timeout was one jiffy, and more than a second later, no expiration.
> > > > > > > > > Is it possible that this event was lost?  I am not seeing any sign of
> > > > > > > > > this in the trace.
> > > > > > > > > 
> > > > > > > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > > > > > > any case).
> > > > > > > > > 
> > > > > > > > > The last time we saw something like this it was a timer HW/driver problem,
> > > > > > > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > > > > > > and SPARC.  ;-)        
> > > > > > > > Could be different issues, both of which were hidden by that lockup detector.
> > > > > > > > 
> > > > > > > > There is an errata work around for the timers on this particular board.
> > > > > > > > I'm only vaguely aware of it, so may be unconnected.
> > > > > > > > 
> > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > > > > > > 
> > > > > > > > Seems unlikely though! + we've not yet seen it on the other chips that
> > > > > > > > errata effects (not that that means much).        
> > > > > > > 
> > > > > > > If you can reproduce quickly, might be worth trying anyway...
> > > > > > > 
> > > > > > > 							Thanx, Paul      
> > > > > > Errata fix is running already and was for all those tests.      
> > > > > 
> > > > > I was afraid of that...  ;-)    
> > > > It's a pretty rare errata it seems.  Not actually managed to catch
> > > > one yet.     
> > > > >     
> > > > > > I'll have a dig into the timers today and see where I get to.      
> > > > > 
> > > > > Look forward to seeing what you find!    
> > > > Nothing obvious turning up other than we don't seem to have issue
> > > > when we aren't running hrtimers.
> > > > 
> > > > On a plus side I just got a report that it is effecting our d03
> > > > boards which is good on the basis I couldn't tell what the difference
> > > > could be wrt to this issue!
> > > > 
> > > > It indeed looks like we are consistently missing a timer before
> > > > the rcu splat occurs.    
> > > 
> > > And for my part, my tests with CONFIG_HZ_PERIODIC=y and
> > > CONFIG_RCU_FAST_NO_HZ=n showed roughly the same failure rate
> > > as other runs.
> > > 
> > > Missing a timer can most certainly give RCU severe heartburn!  ;-)
> > > Do you have what you need to track down the missing timer?    
> > 
> > Not managed to make much progress yet.  Turning on any additional tracing
> > in that area seems to make the issue stop happening or at least
> > occur very infrequently. Which certainly makes it 'fun' to find.
> > 
> > As a long shot I applied a locking fix from another reported issue that
> > was causing rcu stalls and it seemed good for much longer, but
> > eventually still occurred.
> > 
> > (from the thread rcu_sched stall while waiting in csd_lock_wait())  
> 
> On the perhaps unlikely off-chance that it helps locate something,
> here is a patch that adds a trace_printk() to check how long a CPU
> believes that it can sleep when going idle.  The thought is to check
> to see if a CPU with a timer set to expire in one jiffy thinks that
> can sleep for (say) 30 seconds.
> 
> Didn't find anything for my problem, but I believe that yours is
> different, so...
> 
> 							Thanx, Paul

Hi Paul,

Thanks for the suggestion.  I hadn't thought to look at the
expected time being wrong.

I have noted that adding other tracepoints (and turning them on)
in that function causes the problem to 'disappear', though, so it
would seem to be very timing sensitive.  Fingers crossed this doesn't
have the same effect!

Our progress on this has been a bit limited, partly as I have been
traveling and haven't sorted out remote hardware access.
It may be the start of next week before I can try this out.

Agreed, the problems look to be different.

The interesting question is whether the other known cases fall
into one category or the other.

Thanks for the help and good luck with your variant!

Jonathan
> 
> ------------------------------------------------------------------------
> 
> commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Date:   Mon Aug 14 08:54:39 2017 -0700
> 
>     EXP: Trace tick return from tick_nohz_stop_sched_tick
>     
>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index c7a899c5ce64..7358a5073dfb 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>  	 * (not only the tick).
>  	 */
>  	ts->sleep_length = ktime_sub(dev->next_event, now);
> +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
>  	return tick;
>  }
>  
> 


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-16  1:24                                                                                 ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-08-16  1:24 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: dzickus, sfr, linuxarm, Nicholas Piggin, abdhalee, sparclinux,
	akpm, linuxppc-dev, David Miller, linux-arm-kernel, tglx

On Tue, 15 Aug 2017 08:47:43 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Wed, Aug 02, 2017 at 05:25:55PM +0100, Jonathan Cameron wrote:
> > On Tue, 1 Aug 2017 11:46:46 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > On Mon, Jul 31, 2017 at 04:27:57PM +0100, Jonathan Cameron wrote:  
> > > > On Mon, 31 Jul 2017 08:04:11 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > >     
> > > > > On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:    
> > > > > > On Fri, 28 Jul 2017 12:03:50 -0700
> > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > > >       
> > > > > > > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:      
> > > > > > > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > > > > >         
> > > > > > > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:        
> > > > > > > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > > > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:          
> > > > > > > > > 
> > > > > > > > > [ . . . ]
> > > > > > > > >         
> > > > > > > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > > > > > > > > > fewer timer events.  Issue became much easier to trigger (on some runs before
> > > > > > > > > > I could get tracing up and running).
> > > > > > > > > > 
> > > > > > > > > > So logs are large enough that pastebin doesn't like them - please shout if
> > > > > > > > > > another timer period is of interest.
> > > > > > > > > > 
> > > > > > > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > > > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > > > > > > 
> > > > > > > > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > > > > > > > > detected around 835.
> > > > > > > > > > 
> > > > > > > > > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > > > > > > > > may well be one in there.          
> > > > > > > > > 
> > > > > > > > > The dmesg says:
> > > > > > > > > 
> > > > > > > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > > > > > 
> > > > > > > > > So I look for "rcu_preempt" timer events and find these:
> > > > > > > > > 
> > > > > > > > > rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> > > > > > > > > rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout 
> > > > > > > > > 
> > > > > > > > > Next look for "ffff8017d5fc7da0" and I don't find anything else.        
> > > > > > > > It does show up off the bottom of what would fit in pastebin...
> > > > > > > > 
> > > > > > > >      rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
> > > > > > > >      rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
> > > > > > > >      rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=        
> > > > > > > 
> > > > > > > Odd.  I would expect an expiration...  And ten seconds is way longer
> > > > > > > than the requested one jiffy!
> > > > > > >       
> > > > > > > > > The timeout was one jiffy, and more than a second later, no expiration.
> > > > > > > > > Is it possible that this event was lost?  I am not seeing any sign of
> > > > > > > > > this in the trace.
> > > > > > > > > 
> > > > > > > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > > > > > > any case).
> > > > > > > > > 
> > > > > > > > > The last time we saw something like this it was a timer HW/driver problem,
> > > > > > > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > > > > > > and SPARC.  ;-)        
> > > > > > > > Could be different issues, both of which were hidden by that lockup detector.
> > > > > > > > 
> > > > > > > > There is an errata work around for the timers on this particular board.
> > > > > > > > I'm only vaguely aware of it, so may be unconnected.
> > > > > > > > 
> > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > > > > > > 
> > > > > > > > Seems unlikely though! + we've not yet seen it on the other chips that
> > > > > > > > errata effects (not that that means much).        
> > > > > > > 
> > > > > > > If you can reproduce quickly, might be worth trying anyway...
> > > > > > > 
> > > > > > > 							Thanx, Paul      
> > > > > > Errata fix is running already and was for all those tests.      
> > > > > 
> > > > > I was afraid of that...  ;-)    
> > > > It's a pretty rare errata it seems.  Not actually managed to catch
> > > > one yet.     
> > > > >     
> > > > > > I'll have a dig into the timers today and see where I get to.      
> > > > > 
> > > > > Look forward to seeing what you find!    
> > > > Nothing obvious turning up other than we don't seem to have issue
> > > > when we aren't running hrtimers.
> > > > 
> > > > On a plus side I just got a report that it is effecting our d03
> > > > boards which is good on the basis I couldn't tell what the difference
> > > > could be wrt to this issue!
> > > > 
> > > > It indeed looks like we are consistently missing a timer before
> > > > the rcu splat occurs.    
> > > 
> > > And for my part, my tests with CONFIG_HZ_PERIODIC=y and
> > > CONFIG_RCU_FAST_NO_HZ=n showed roughly the same failure rate
> > > as other runs.
> > > 
> > > Missing a timer can most certainly give RCU severe heartburn!  ;-)
> > > Do you have what you need to track down the missing timer?    
> > 
> > Not managed to make much progress yet.  Turning on any additional tracing
> > in that area seems to make the issue stop happening or at least
> > occur very infrequently. Which certainly makes it 'fun' to find.
> > 
> > As a long shot I applied a locking fix from another reported issue that
> > was causing rcu stalls and it seemed good for much longer, but
> > eventually still occurred.
> > 
> > (from the thread rcu_sched stall while waiting in csd_lock_wait())  
> 
> On the perhaps unlikely off-chance that it helps locate something,
> here is a patch that adds a trace_printk() to check how long a CPU
> believes that it can sleep when going idle.  The thought is to check
> to see if a CPU with a timer set to expire in one jiffy thinks that
> can sleep for (say) 30 seconds.
> 
> Didn't find anything for my problem, but I believe that yours is
> different, so...
> 
> 							Thanx, Paul

Hi Paul,

Thanks for the suggestion.  I hadn't thought to look at the
expected time being wrong.

I have noted that adding other tracepoints (and turning them on)
in that function causes the problem to 'disappear', though, so it
would seem to be very timing sensitive.  Fingers crossed this doesn't
have the same effect!

Our progress on this has been a bit limited, partly as I have been
traveling and haven't sorted out remote hardware access.
It may be the start of next week before I can try this out.

Agreed, the problems look to be different.

An interesting question is whether the other known cases fall
into one category or the other?

Thanks for the help and good luck with your variant!

Jonathan
> 
> ------------------------------------------------------------------------
> 
> commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Date:   Mon Aug 14 08:54:39 2017 -0700
> 
>     EXP: Trace tick return from tick_nohz_stop_sched_tick
>     
>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index c7a899c5ce64..7358a5073dfb 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>  	 * (not only the tick).
>  	 */
>  	ts->sleep_length = ktime_sub(dev->next_event, now);
> +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
>  	return tick;
>  }
>  
> 

^ permalink raw reply	[flat|nested] 241+ messages in thread

* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-16  1:24                                                                                 ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-08-16  1:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 15 Aug 2017 08:47:43 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Wed, Aug 02, 2017 at 05:25:55PM +0100, Jonathan Cameron wrote:
> > On Tue, 1 Aug 2017 11:46:46 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > On Mon, Jul 31, 2017 at 04:27:57PM +0100, Jonathan Cameron wrote:  
> > > > On Mon, 31 Jul 2017 08:04:11 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > >     
> > > > > On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:    
> > > > > > On Fri, 28 Jul 2017 12:03:50 -0700
> > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > > >       
> > > > > > > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:      
> > > > > > > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > > > > >         
> > > > > > > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:        
> > > > > > > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > > > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:          
> > > > > > > > > 
> > > > > > > > > [ . . . ]
> > > > > > > > >         
> > > > > > > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
> > > > > > > > > > fewer timer events.  Issue became much easier to trigger (on some runs before
> > > > > > > > > > I could get tracing up and running)
> > > > > > > > > > 
> > > > > > > > > > So logs are large enough that pastebin doesn't like them - please shout if
> > > > > > > > > > another timer period is of interest.
> > > > > > > > > > 
> > > > > > > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > > > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > > > > > > 
> > > > > > > > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > > > > > > > > detected around 835.
> > > > > > > > > > 
> > > > > > > > > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > > > > > > > > may well be one in there.          
> > > > > > > > > 
> > > > > > > > > The dmesg says:
> > > > > > > > > 
> > > > > > > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > > > > > 
> > > > > > > > > So I look for "rcu_preempt" timer events and find these:
> > > > > > > > > 
> > > > > > > > > rcu_preempt-9     [019] ....   827.579114: timer_init: timer=ffff8017d5fc7da0
> > > > > > > > > rcu_preempt-9     [019] d..1   827.579115: timer_start: timer=ffff8017d5fc7da0 function=process_timeout 
> > > > > > > > > 
> > > > > > > > > Next look for "ffff8017d5fc7da0" and I don't find anything else.        
> > > > > > > > It does show up off the bottom of what would fit in pastebin...
> > > > > > > > 
> > > > > > > >      rcu_preempt-9     [001] d..1   837.681077: timer_cancel: timer=ffff8017d5fc7da0
> > > > > > > >      rcu_preempt-9     [001] ....   837.681086: timer_init: timer=ffff8017d5fc7da0
> > > > > > > >      rcu_preempt-9     [001] d..1   837.681087: timer_start: timer=ffff8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] cpu=1 idx=0 flags=        
> > > > > > > 
> > > > > > > Odd.  I would expect an expiration...  And ten seconds is way longer
> > > > > > > than the requested one jiffy!
> > > > > > >       
> > > > > > > > > The timeout was one jiffy, and more than a second later, no expiration.
> > > > > > > > > Is it possible that this event was lost?  I am not seeing any sign of
> > > > > > > > > this in the trace.
> > > > > > > > > 
> > > > > > > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > > > > > > any case).
> > > > > > > > > 
> > > > > > > > > The last time we saw something like this it was a timer HW/driver problem,
> > > > > > > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > > > > > > and SPARC.  ;-)        
> > > > > > > > Could be different issues, both of which were hidden by that lockup detector.
> > > > > > > > 
> > > > > > > > There is an errata work around for the timers on this particular board.
> > > > > > > > I'm only vaguely aware of it, so may be unconnected.
> > > > > > > > 
> > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > > > > > > 
> > > > > > > > Seems unlikely though! + we've not yet seen it on the other chips that
> > > > > > > > errata affects (not that that means much).        
> > > > > > > 
> > > > > > > If you can reproduce quickly, might be worth trying anyway...
> > > > > > > 
> > > > > > > 							Thanx, Paul      
> > > > > > Errata fix is running already and was for all those tests.      
> > > > > 
> > > > > I was afraid of that...  ;-)    
> > > > It's a pretty rare errata it seems.  Not actually managed to catch
> > > > one yet.     
> > > > >     
> > > > > > I'll have a dig into the timers today and see where I get to.      
> > > > > 
> > > > > Look forward to seeing what you find!    
> > > > Nothing obvious turning up other than we don't seem to have the issue
> > > > when we aren't running hrtimers.
> > > > 
> > > > On the plus side I just got a report that it is affecting our d03
> > > > boards, which is good on the basis that I couldn't tell what the difference
> > > > could be wrt this issue!
> > > > 
> > > > It indeed looks like we are consistently missing a timer before
> > > > the rcu splat occurs.    
> > > 
> > > And for my part, my tests with CONFIG_HZ_PERIODIC=y and
> > > CONFIG_RCU_FAST_NO_HZ=n showed roughly the same failure rate
> > > as other runs.
> > > 
> > > Missing a timer can most certainly give RCU severe heartburn!  ;-)
> > > Do you have what you need to track down the missing timer?    
> > 
> > Not managed to make much progress yet.  Turning on any additional tracing
> > in that area seems to make the issue stop happening or at least
> > occur very infrequently. Which certainly makes it 'fun' to find.
> > 
> > As a long shot I applied a locking fix from another reported issue that
> > was causing rcu stalls and it seemed good for much longer, but
> > eventually still occurred.
> > 
> > (from the thread rcu_sched stall while waiting in csd_lock_wait())  
> 
> On the perhaps unlikely off-chance that it helps locate something,
> here is a patch that adds a trace_printk() to check how long a CPU
> believes that it can sleep when going idle.  The thought is to check
> to see if a CPU with a timer set to expire in one jiffy thinks that
> it can sleep for (say) 30 seconds.
> 
> Didn't find anything for my problem, but I believe that yours is
> different, so...
> 
> 							Thanx, Paul

Hi Paul,

Thanks for the suggestion.  I hadn't thought to look at the
expected time being wrong.

I have noted that adding other tracepoints (and turning them on)
in that function causes the problem to 'disappear', though, so it
would seem very timing sensitive.  Fingers crossed this doesn't
have the same effect!

Our progress on this has been a bit limited partly as I have been
traveling and haven't sorted out remote hardware access.
May be the start of next week before I can try this out.

Agreed, the problems look to be different.

An interesting question is whether the other known cases fall
into one category or the other?

Thanks for the help and good luck with your variant!

Jonathan
> 
> ------------------------------------------------------------------------
> 
> commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Date:   Mon Aug 14 08:54:39 2017 -0700
> 
>     EXP: Trace tick return from tick_nohz_stop_sched_tick
>     
>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index c7a899c5ce64..7358a5073dfb 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>  	 * (not only the tick).
>  	 */
>  	ts->sleep_length = ktime_sub(dev->next_event, now);
> +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
>  	return tick;
>  }
>  
> 

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-15 15:47                                                                               ` Paul E. McKenney
  (?)
@ 2017-08-16 12:43                                                                                 ` Michael Ellerman
  -1 siblings, 0 replies; 241+ messages in thread
From: Michael Ellerman @ 2017-08-16 12:43 UTC (permalink / raw)
  To: linux-arm-kernel

"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
...
>
> commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Date:   Mon Aug 14 08:54:39 2017 -0700
>
>     EXP: Trace tick return from tick_nohz_stop_sched_tick
>     
>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index c7a899c5ce64..7358a5073dfb 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>  	 * (not only the tick).
>  	 */
>  	ts->sleep_length = ktime_sub(dev->next_event, now);
> +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
>  	return tick;
>  }

Should I be seeing negative values? A small sample:

          <idle>-0     [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
          <idle>-0     [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
          <idle>-0     [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
          <idle>-0     [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
          <idle>-0     [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
          <idle>-0     [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
          <idle>-0     [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
          <idle>-0     [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
          <idle>-0     [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
          <idle>-0     [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018


I have a full trace, I'll send it to you off-list.

cheers

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-16 12:43                                                                                 ` Michael Ellerman
  0 siblings, 0 replies; 241+ messages in thread
From: Michael Ellerman @ 2017-08-16 12:43 UTC (permalink / raw)
  To: paulmck, Jonathan Cameron
  Cc: dzickus, sfr, linuxarm, Nicholas Piggin, abdhalee, tglx,
	sparclinux, akpm, linuxppc-dev, David Miller, linux-arm-kernel

"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
...
>
> commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Date:   Mon Aug 14 08:54:39 2017 -0700
>
>     EXP: Trace tick return from tick_nohz_stop_sched_tick
>     
>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index c7a899c5ce64..7358a5073dfb 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>  	 * (not only the tick).
>  	 */
>  	ts->sleep_length = ktime_sub(dev->next_event, now);
> +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
>  	return tick;
>  }

Should I be seeing negative values? A small sample:

          <idle>-0     [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
          <idle>-0     [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
          <idle>-0     [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
          <idle>-0     [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
          <idle>-0     [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
          <idle>-0     [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
          <idle>-0     [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
          <idle>-0     [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
          <idle>-0     [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
          <idle>-0     [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018


I have a full trace, I'll send it to you off-list.

cheers

^ permalink raw reply	[flat|nested] 241+ messages in thread

* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-16 12:43                                                                                 ` Michael Ellerman
  0 siblings, 0 replies; 241+ messages in thread
From: Michael Ellerman @ 2017-08-16 12:43 UTC (permalink / raw)
  To: linux-arm-kernel

"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
...
>
> commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Date:   Mon Aug 14 08:54:39 2017 -0700
>
>     EXP: Trace tick return from tick_nohz_stop_sched_tick
>     
>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index c7a899c5ce64..7358a5073dfb 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>  	 * (not only the tick).
>  	 */
>  	ts->sleep_length = ktime_sub(dev->next_event, now);
> +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
>  	return tick;
>  }

Should I be seeing negative values? A small sample:

          <idle>-0     [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
          <idle>-0     [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
          <idle>-0     [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
          <idle>-0     [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
          <idle>-0     [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
          <idle>-0     [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
          <idle>-0     [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
          <idle>-0     [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
          <idle>-0     [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
          <idle>-0     [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018


I have a full trace, I'll send it to you off-list.

cheers

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-16 12:43                                                                                 ` Michael Ellerman
  (?)
@ 2017-08-16 12:56                                                                                   ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-08-16 12:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
> ...
> >
> > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Date:   Mon Aug 14 08:54:39 2017 -0700
> >
> >     EXP: Trace tick return from tick_nohz_stop_sched_tick
> >     
> >     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> >
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index c7a899c5ce64..7358a5073dfb 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
> >  	 * (not only the tick).
> >  	 */
> >  	ts->sleep_length = ktime_sub(dev->next_event, now);
> > +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
> >  	return tick;
> >  }
> 
> Should I be seeing negative values? A small sample:

Maybe due to hypervisor preemption delays, but I confess that I am
surprised to see them this large.  1,602,250,019 microseconds is something
like a half hour, which could result in stall warnings all by itself.

>           <idle>-0     [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
>           <idle>-0     [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
>           <idle>-0     [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
>           <idle>-0     [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
>           <idle>-0     [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
>           <idle>-0     [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
>           <idle>-0     [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
>           <idle>-0     [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
>           <idle>-0     [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
>           <idle>-0     [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> 
> 
> I have a full trace, I'll send it to you off-list.

I will take a look!

							Thanx, Paul


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-16 12:56                                                                                   ` Paul E. McKenney
  0 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-08-16 12:56 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Jonathan Cameron, dzickus, sfr, linuxarm, Nicholas Piggin,
	abdhalee, tglx, sparclinux, akpm, linuxppc-dev, David Miller,
	linux-arm-kernel

On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
> ...
> >
> > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Date:   Mon Aug 14 08:54:39 2017 -0700
> >
> >     EXP: Trace tick return from tick_nohz_stop_sched_tick
> >     
> >     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> >
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index c7a899c5ce64..7358a5073dfb 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
> >  	 * (not only the tick).
> >  	 */
> >  	ts->sleep_length = ktime_sub(dev->next_event, now);
> > +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
> >  	return tick;
> >  }
> 
> Should I be seeing negative values? A small sample:

Maybe due to hypervisor preemption delays, but I confess that I am
surprised to see them this large.  1,602,250,019 microseconds is something
like a half hour, which could result in stall warnings all by itself.

>           <idle>-0     [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
>           <idle>-0     [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
>           <idle>-0     [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
>           <idle>-0     [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
>           <idle>-0     [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
>           <idle>-0     [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
>           <idle>-0     [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
>           <idle>-0     [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
>           <idle>-0     [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
>           <idle>-0     [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> 
> 
> I have a full trace, I'll send it to you off-list.

I will take a look!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 241+ messages in thread

* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-16 12:56                                                                                   ` Paul E. McKenney
  0 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-08-16 12:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
> ...
> >
> > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Date:   Mon Aug 14 08:54:39 2017 -0700
> >
> >     EXP: Trace tick return from tick_nohz_stop_sched_tick
> >     
> >     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> >
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index c7a899c5ce64..7358a5073dfb 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
> >  	 * (not only the tick).
> >  	 */
> >  	ts->sleep_length = ktime_sub(dev->next_event, now);
> > +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
> >  	return tick;
> >  }
> 
> Should I be seeing negative values? A small sample:

Maybe due to hypervisor preemption delays, but I confess that I am
surprised to see them this large.  1,602,250,019 microseconds is something
like a half hour, which could result in stall warnings all by itself.

>           <idle>-0     [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
>           <idle>-0     [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
>           <idle>-0     [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
>           <idle>-0     [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
>           <idle>-0     [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
>           <idle>-0     [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
>           <idle>-0     [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
>           <idle>-0     [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
>           <idle>-0     [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
>           <idle>-0     [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> 
> 
> I have a full trace, I'll send it to you off-list.

I will take a look!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-16 12:56                                                                                   ` Paul E. McKenney
  (?)
@ 2017-08-16 15:31                                                                                     ` Nicholas Piggin
  -1 siblings, 0 replies; 241+ messages in thread
From: Nicholas Piggin @ 2017-08-16 15:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 16 Aug 2017 05:56:17 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
> > ...  
> > >
> > > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> > > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > Date:   Mon Aug 14 08:54:39 2017 -0700
> > >
> > >     EXP: Trace tick return from tick_nohz_stop_sched_tick
> > >     
> > >     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > >
> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > index c7a899c5ce64..7358a5073dfb 100644
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
> > >  	 * (not only the tick).
> > >  	 */
> > >  	ts->sleep_length = ktime_sub(dev->next_event, now);
> > > +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
> > >  	return tick;
> > >  }  
> > 
> > Should I be seeing negative values? A small sample:  
> 
> Maybe due to hypervisor preemption delays, but I confess that I am
> surprised to see them this large.  1,602,250,019 microseconds is something
> like a half hour, which could result in stall warnings all by itself.
> 
> >           <idle>-0     [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
> >           <idle>-0     [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >           <idle>-0     [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >           <idle>-0     [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
> >           <idle>-0     [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
> >           <idle>-0     [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
> >           <idle>-0     [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
> >           <idle>-0     [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >           <idle>-0     [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >           <idle>-0     [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> > 
> > 
> > I have a full trace, I'll send it to you off-list.  
> 
> I will take a look!

I found this, I can't see that it would cause our symptoms, but it's
worth someone who knows the code taking a look at it.

--
cpuidle: fix broadcast control when broadcast can not be entered

When failing to enter broadcast timer mode for an idle state that
requires it, a new state is selected that does not require broadcast,
but the broadcast variable remains set. This causes
tick_broadcast_exit to be called despite not having entered broadcast
mode.

This causes the WARN_ON_ONCE(!irqs_disabled()) to trigger in some
cases, but otherwise does not appear to cause problems.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 drivers/cpuidle/cpuidle.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 60bb64f4329d..4453e27f855e 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -208,6 +208,7 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
 			return -EBUSY;
 		}
 		target_state = &drv->states[index];
+		broadcast = false;
 	}
 
 	/* Take note of the planned idle state. */
-- 
2.13.3
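
For context, the control flow around that hunk looked roughly like the
condensed sketch below.  This is paraphrased from the 4.13-era
drivers/cpuidle/cpuidle.c (the fallback-state lookup arguments are elided,
and other details may differ; the diff context above shows the fallback
assignment itself).  The point is that "broadcast" is decided up front from
the originally requested state, but the fallback path replaces target_state
without revisiting it:

	int cpuidle_enter_state(struct cpuidle_device *dev,
				struct cpuidle_driver *drv, int index)
	{
		struct cpuidle_state *target_state = &drv->states[index];
		bool broadcast = !!(target_state->flags & CPUIDLE_FLAG_TIMER_STOP);

		if (broadcast && tick_broadcast_enter()) {
			/*
			 * Could not switch to the broadcast timer: fall back
			 * to a shallower state that keeps the local tick alive.
			 */
			index = find_deepest_state(drv, dev, /* ... */);
			if (index < 0) {
				default_idle_call();
				return -EBUSY;
			}
			target_state = &drv->states[index];
			broadcast = false;	/* the one-line fix above */
		}

		/* ... enter target_state ... */

		if (broadcast) {
			/*
			 * Without the fix this also runs when broadcast mode
			 * was never entered, which is where the
			 * WARN_ON_ONCE(!irqs_disabled()) reports come from.
			 */
			if (WARN_ON_ONCE(!irqs_disabled()))
				local_irq_disable();
			tick_broadcast_exit();
		}
		/* ... */
	}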


^ permalink raw reply related	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-16 15:31                                                                                     ` Nicholas Piggin
  0 siblings, 0 replies; 241+ messages in thread
From: Nicholas Piggin @ 2017-08-16 15:31 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Michael Ellerman, Jonathan Cameron, dzickus, sfr, linuxarm,
	abdhalee, tglx, sparclinux, akpm, linuxppc-dev, David Miller,
	linux-arm-kernel, Rafael J. Wysocki

On Wed, 16 Aug 2017 05:56:17 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
> > ...  
> > >
> > > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> > > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > Date:   Mon Aug 14 08:54:39 2017 -0700
> > >
> > >     EXP: Trace tick return from tick_nohz_stop_sched_tick
> > >     
> > >     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > >
> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > index c7a899c5ce64..7358a5073dfb 100644
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
> > >  	 * (not only the tick).
> > >  	 */
> > >  	ts->sleep_length = ktime_sub(dev->next_event, now);
> > > +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
> > >  	return tick;
> > >  }  
> > 
> > Should I be seeing negative values? A small sample:  
> 
> Maybe due to hypervisor preemption delays, but I confess that I am
> surprised to see them this large.  1,602,250,019 microseconds is something
> like a half hour, which could result in stall warnings all by itself.
> 
> >           <idle>-0     [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
> >           <idle>-0     [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >           <idle>-0     [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >           <idle>-0     [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
> >           <idle>-0     [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
> >           <idle>-0     [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
> >           <idle>-0     [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
> >           <idle>-0     [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >           <idle>-0     [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >           <idle>-0     [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> > 
> > 
> > I have a full trace, I'll send it to you off-list.  
> 
> I will take a look!

I found this, I can't see that it would cause our symptoms, but it's
worth someone who knows the code taking a look at it.

--
cpuidle: fix broadcast control when broadcast can not be entered

When failing to enter broadcast timer mode for an idle state that
requires it, a new state is selected that does not require broadcast,
but the broadcast variable remains set. This causes
tick_broadcast_exit to be called despite not having entered broadcast
mode.

This causes the WARN_ON_ONCE(!irqs_disabled()) to trigger in some
cases, but otherwise does not appear to cause problems.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 drivers/cpuidle/cpuidle.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 60bb64f4329d..4453e27f855e 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -208,6 +208,7 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
 			return -EBUSY;
 		}
 		target_state = &drv->states[index];
+		broadcast = false;
 	}
 
 	/* Take note of the planned idle state. */
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 241+ messages in thread

* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-16 15:31                                                                                     ` Nicholas Piggin
  0 siblings, 0 replies; 241+ messages in thread
From: Nicholas Piggin @ 2017-08-16 15:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 16 Aug 2017 05:56:17 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
> > ...  
> > >
> > > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> > > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > Date:   Mon Aug 14 08:54:39 2017 -0700
> > >
> > >     EXP: Trace tick return from tick_nohz_stop_sched_tick
> > >     
> > >     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > >
> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > index c7a899c5ce64..7358a5073dfb 100644
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
> > >  	 * (not only the tick).
> > >  	 */
> > >  	ts->sleep_length = ktime_sub(dev->next_event, now);
> > > +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
> > >  	return tick;
> > >  }  
> > 
> > Should I be seeing negative values? A small sample:  
> 
> Maybe due to hypervisor preemption delays, but I confess that I am
> surprised to see them this large.  1,602,250,019 microseconds is something
> like a half hour, which could result in stall warnings all by itself.
> 
> >           <idle>-0     [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
> >           <idle>-0     [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >           <idle>-0     [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >           <idle>-0     [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
> >           <idle>-0     [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
> >           <idle>-0     [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
> >           <idle>-0     [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
> >           <idle>-0     [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >           <idle>-0     [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >           <idle>-0     [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> > 
> > 
> > I have a full trace, I'll send it to you off-list.  
> 
> I will take a look!

I found this, I can't see that it would cause our symptoms, but it's
worth someone who knows the code taking a look at it.

--
cpuidle: fix broadcast control when broadcast can not be entered

When failing to enter broadcast timer mode for an idle state that
requires it, a new state is selected that does not require broadcast,
but the broadcast variable remains set. This causes
tick_broadcast_exit to be called despite not having entered broadcast
mode.

This causes the WARN_ON_ONCE(!irqs_disabled()) to trigger in some
cases, but otherwise does not appear to cause problems.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 drivers/cpuidle/cpuidle.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 60bb64f4329d..4453e27f855e 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -208,6 +208,7 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
 			return -EBUSY;
 		}
 		target_state = &drv->states[index];
+		broadcast = false;
 	}
 
 	/* Take note of the planned idle state. */
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-16 12:56                                                                                   ` Paul E. McKenney
  (?)
@ 2017-08-16 16:27                                                                                     ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-08-16 16:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
> > ...
> > >
> > > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> > > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > Date:   Mon Aug 14 08:54:39 2017 -0700
> > >
> > >     EXP: Trace tick return from tick_nohz_stop_sched_tick
> > >     
> > >     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > >
> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > index c7a899c5ce64..7358a5073dfb 100644
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
> > >  	 * (not only the tick).
> > >  	 */
> > >  	ts->sleep_length = ktime_sub(dev->next_event, now);
> > > +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
> > >  	return tick;
> > >  }
> > 
> > Should I be seeing negative values? A small sample:
> 
> Maybe due to hypervisor preemption delays, but I confess that I am
> surprised to see them this large.  1,602,250,019 microseconds is something
> like a half hour, which could result in stall warnings all by itself.
> 
> >           <idle>-0     [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
> >           <idle>-0     [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >           <idle>-0     [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >           <idle>-0     [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
> >           <idle>-0     [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
> >           <idle>-0     [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
> >           <idle>-0     [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
> >           <idle>-0     [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >           <idle>-0     [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >           <idle>-0     [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> > 
> > 
> > I have a full trace, I'll send it to you off-list.
> 
> I will take a look!

And from your ps output, PID 9 is rcu_sched, which is the RCU grace-period
kthread that stalled.  This kthread was starved, based on this from your
dmesg:

[ 1602.067008] rcu_sched kthread starved for 2603 jiffies! g7275 c7274 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1

The RCU_GP_WAIT_FQS says that this kthread is periodically scanning for
idle-CPU and offline-CPU quiescent states, which means that its waits
will be accompanied by short timeouts.  The "starved for 2603 jiffies"
says that it has not run for one good long time.  The ->state is its
task_struct ->state field.
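
(For anyone decoding the splat: ->state=0x1 is TASK_INTERRUPTIBLE, and the
RCU_GP_WAIT_FQS(3) comes from the grace-period kthread's own bookkeeping,
which in kernels of roughly that vintage was a small set of #defines in
kernel/rcu/tree.h.  Reproduced from memory below, so the comments and the
names other than RCU_GP_WAIT_FQS may be slightly off:

	#define RCU_GP_IDLE	 0	/* No grace period in progress. */
	#define RCU_GP_WAIT_GPS	 1	/* Waiting for a grace-period start request. */
	#define RCU_GP_DONE_GPS	 2	/* Done waiting for grace-period start. */
	#define RCU_GP_WAIT_FQS	 3	/* Waiting in the force-quiescent-state timed wait. */
	#define RCU_GP_DOING_FQS 4	/* Scanning for quiescent states. */
	#define RCU_GP_CLEANUP	 5	/* Grace-period cleanup started. */
	#define RCU_GP_CLEANED	 6	/* Grace-period cleanup complete. */

So "RCU_GP_WAIT_FQS(3)" confirms the kthread went to sleep in its timed
force-quiescent-state wait and never got to run again.)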

The immediately preceding dmesg line is as follows:

[ 1602.063851]  (detected by 53, t=2603 jiffies, g=7275, c=7274, q=608)

In other words, the rcu_sched grace-period kthread has been starved
for the entire duration of the current grace period, as shown by the
t=2603.

Let's turn now to the trace output, looking for the last bit of the
rcu_sched task's activity:

       rcu_sched-9     [054] d...  1576.030096: timer_start: timer=c0000007fae1bc20 function=process_timeout expires=4295094922 [timeout=1] cpu=54 idx=0 flags=
    ksoftirqd/53-276   [053] ..s.  1576.030097: rcu_invoke_callback: rcu_sched rhp=c000000fcf8c4eb0 func=__d_free
       rcu_sched-9     [054] d...  1576.030097: rcu_utilization: Start context switch
    ksoftirqd/53-276   [053] ..s.  1576.030098: rcu_invoke_callback: rcu_sched rhp=c000000fcff74ee0 func=proc_i_callback
       rcu_sched-9     [054] d...  1576.030098: rcu_grace_period: rcu_sched 7275 cpuqs
       rcu_sched-9     [054] d...  1576.030099: rcu_utilization: End context switch

So this task set up a timer ("timer_start:") for one jiffy ("[timeout=1]",
but what is with "expires=4295094922"?)  and blocked ("rcu_utilization:
Start context switch" and "rcu_utilization: End context switch"),
recording its CPU's quiescent state in the process ("rcu_grace_period:
rcu_sched 7275 cpuqs").
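
That one-jiffy sleep is, in all likelihood, a timed wait that bottoms out in
schedule_timeout(), which is why the trace shows an ordinary timer-wheel
timer armed with process_timeout as its callback.  Condensed from memory of
that era's kernel/time/timer.c (pre-4.15 timer API, so the callback still
takes an unsigned long; the MAX_SCHEDULE_TIMEOUT and error paths are
omitted, details may differ):

	static void process_timeout(unsigned long __data)
	{
		/* Runs from the timer softirq and just wakes the sleeper. */
		wake_up_process((struct task_struct *)__data);
	}

	signed long __sched schedule_timeout(signed long timeout)
	{
		struct timer_list timer;
		unsigned long expire = timeout + jiffies;   /* timeout=1: next jiffy */

		setup_timer_on_stack(&timer, process_timeout, (unsigned long)current);
		mod_timer(&timer, expire);          /* the timer_start: event above */
		schedule();                         /* the context-switch events above */
		del_singleshot_timer_sync(&timer);  /* traced as timer_cancel: after wakeup */
		destroy_timer_on_stack(&timer);

		return expire - jiffies;
	}

So the thing to look for is a timer_expire_entry: event for that timer
address; without it, wake_up_process() never runs and the kthread stays
asleep.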

Of course, the timer will have expired in the context of some other task,
but a search for "c0000007fae1bc20" (see the "timer=" in the first trace
line above) shows nothing (to be painfully accurate, the search wraps back
to earlier uses of this timer by rcu_sched).  So the timer never did fire.

The next question is "what did CPU 054 do next?"  We find it entering idle:

          <idle>-0     [054] d...  1576.030167: tick_stop: success=1 dependency=NONE
          <idle>-0     [054] d...  1576.030167: hrtimer_cancel: hrtimer=c000000fff88c680
          <idle>-0     [054] d...  1576.030168: hrtimer_start: hrtimer=c000000fff88c680 function=tick_sched_timer expires=1610710000000 softexpires=1610710000000
          <idle>-0     [054] d...  1576.030170: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 34469831
          <idle>-0     [054] d...  1576.030171: rcu_dyntick: Start 140000000000000 0

So we have an hrtimer set for 1610710000000, whatever time that might
happen to map to.  And that is the last we hear from CPU 054, so it
apparently maps to something a bit too far into the future.  Let's
assume that this value is nanoseconds since boot, in which case we have
1,610.710000000 seconds, which is eight seconds after the stall-warning
message.  And -way- longer than the one-jiffy timeout requested!

Thomas, John, am I misinterpreting the timer trace event messages?

							Thanx, Paul


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-16 16:27                                                                                     ` Paul E. McKenney
  0 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-08-16 16:27 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Jonathan Cameron, dzickus, sfr, linuxarm, Nicholas Piggin,
	abdhalee, tglx, sparclinux, akpm, linuxppc-dev, David Miller,
	linux-arm-kernel, john.stultz

On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
> > ...
> > >
> > > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> > > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > Date:   Mon Aug 14 08:54:39 2017 -0700
> > >
> > >     EXP: Trace tick return from tick_nohz_stop_sched_tick
> > >     
> > >     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > >
> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > index c7a899c5ce64..7358a5073dfb 100644
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
> > >  	 * (not only the tick).
> > >  	 */
> > >  	ts->sleep_length = ktime_sub(dev->next_event, now);
> > > +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
> > >  	return tick;
> > >  }
> > 
> > Should I be seeing negative values? A small sample:
> 
> Maybe due to hypervisor preemption delays, but I confess that I am
> surprised to see them this large.  1,602,250,019 microseconds is something
> like a half hour, which could result in stall warnings all by itself.
> 
> >           <idle>-0     [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
> >           <idle>-0     [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >           <idle>-0     [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >           <idle>-0     [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
> >           <idle>-0     [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
> >           <idle>-0     [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
> >           <idle>-0     [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
> >           <idle>-0     [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >           <idle>-0     [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >           <idle>-0     [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> > 
> > 
> > I have a full trace, I'll send it to you off-list.
> 
> I will take a look!

And from your ps output, PID 9 is rcu_sched, which is the RCU grace-period
kthread that stalled.  This kthread was starved, based on this from your
dmesg:

[ 1602.067008] rcu_sched kthread starved for 2603 jiffies! g7275 c7274 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1

The RCU_GP_WAIT_FQS says that this kthread is periodically scanning for
idle-CPU and offline-CPU quiescent states, which means that its waits
will be accompanied by short timeouts.  The "starved for 2603 jiffies"
says that it has not run for one good long time.  The ->state is its
task_struct ->state field.

The immediately preceding dmesg line is as follows:

[ 1602.063851]  (detected by 53, t=2603 jiffies, g=7275, c=7274, q=608)

In other words, the rcu_sched grace-period kthread has been starved
for the entire duration of the current grace period, as shown by the
t=2603.

Let's turn now to the trace output, looking for the last bit of the
rcu_sched task's activity:

       rcu_sched-9     [054] d...  1576.030096: timer_start: timer=c0000007fae1bc20 function=process_timeout expires=4295094922 [timeout=1] cpu=54 idx=0 flags=
    ksoftirqd/53-276   [053] ..s.  1576.030097: rcu_invoke_callback: rcu_sched rhp=c000000fcf8c4eb0 func=__d_free
       rcu_sched-9     [054] d...  1576.030097: rcu_utilization: Start context switch
    ksoftirqd/53-276   [053] ..s.  1576.030098: rcu_invoke_callback: rcu_sched rhp=c000000fcff74ee0 func=proc_i_callback
       rcu_sched-9     [054] d...  1576.030098: rcu_grace_period: rcu_sched 7275 cpuqs
       rcu_sched-9     [054] d...  1576.030099: rcu_utilization: End context switch

So this task set up a timer ("timer_start:") for one jiffy ("[timeout=1]",
but what is with "expires=4295094922"?)  and blocked ("rcu_utilization:
Start context switch" and "rcu_utilization: End context switch"),
recording its CPU's quiescent state in the process ("rcu_grace_period:
rcu_sched 7275 cpuqs").

Of course, the timer will have expired in the context of some other task,
but a search for "c0000007fae1bc20" (see the "timer=" in the first trace
line above) shows nothing (to be painfully accurate, the search wraps back
to earlier uses of this timer by rcu_sched).  So the timer never did fire.

The next question is "what did CPU 054 do next?"  We find it entering idle:

          <idle>-0     [054] d...  1576.030167: tick_stop: success=1 dependency=NONE
          <idle>-0     [054] d...  1576.030167: hrtimer_cancel: hrtimer=c000000fff88c680
          <idle>-0     [054] d...  1576.030168: hrtimer_start: hrtimer=c000000fff88c680 function=tick_sched_timer expires=1610710000000 softexpires=1610710000000
          <idle>-0     [054] d...  1576.030170: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 34469831
          <idle>-0     [054] d...  1576.030171: rcu_dyntick: Start 140000000000000 0

So we have an hrtimer set for 1610710000000, whatever time that might
happen to map to.  And that is the last we hear from CPU 054, so it
apparently maps to something a bit too far into the future.  Let's
assume that this value is nanoseconds since boot, in which case we have
1,610.710000000 seconds, which is eight seconds after the stall-warning
message.  And -way- longer than the one-jiffy timeout requested!

Thomas, John, am I misinterpreting the timer trace event messages?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 241+ messages in thread

* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-16 16:27                                                                                     ` Paul E. McKenney
  0 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-08-16 16:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
> > ...
> > >
> > > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> > > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > Date:   Mon Aug 14 08:54:39 2017 -0700
> > >
> > >     EXP: Trace tick return from tick_nohz_stop_sched_tick
> > >     
> > >     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > >
> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > index c7a899c5ce64..7358a5073dfb 100644
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
> > >  	 * (not only the tick).
> > >  	 */
> > >  	ts->sleep_length = ktime_sub(dev->next_event, now);
> > > +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
> > >  	return tick;
> > >  }
> > 
> > Should I be seeing negative values? A small sample:
> 
> Maybe due to hypervisor preemption delays, but I confess that I am
> surprised to see them this large.  1,602,250,019 microseconds is something
> like a half hour, which could result in stall warnings all by itself.
> 
> >           <idle>-0     [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
> >           <idle>-0     [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >           <idle>-0     [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >           <idle>-0     [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
> >           <idle>-0     [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
> >           <idle>-0     [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
> >           <idle>-0     [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
> >           <idle>-0     [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >           <idle>-0     [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >           <idle>-0     [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> > 
> > 
> > I have a full trace, I'll send it to you off-list.
> 
> I will take a look!

And from your ps output, PID 9 is rcu_sched, which is the RCU grace-period
kthread that stalled.  This kthread was starved, based on this from your
dmesg:

[ 1602.067008] rcu_sched kthread starved for 2603 jiffies! g7275 c7274 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1

The RCU_GP_WAIT_FQS says that this kthread is periodically scanning for
idle-CPU and offline-CPU quiescent states, which means that its waits
will be accompanied by short timeouts.  The "starved for 2603 jiffies"
says that it has not run for one good long time.  The ->state is its
task_struct ->state field.

The immediately preceding dmesg line is as follows:

[ 1602.063851]  (detected by 53, t=2603 jiffies, g=7275, c=7274, q=608)

In other words, the rcu_sched grace-period kthread has been starved
for the entire duration of the current grace period, as shown by the
t=2603.

Let's turn now to the trace output, looking for the last bit of the
rcu_sched task's activity:

       rcu_sched-9     [054] d...  1576.030096: timer_start: timer=c0000007fae1bc20 function=process_timeout expires=4295094922 [timeout=1] cpu=54 idx=0 flags=
    ksoftirqd/53-276   [053] ..s.  1576.030097: rcu_invoke_callback: rcu_sched rhp=c000000fcf8c4eb0 func=__d_free
       rcu_sched-9     [054] d...  1576.030097: rcu_utilization: Start context switch
    ksoftirqd/53-276   [053] ..s.  1576.030098: rcu_invoke_callback: rcu_sched rhp=c000000fcff74ee0 func=proc_i_callback
       rcu_sched-9     [054] d...  1576.030098: rcu_grace_period: rcu_sched 7275 cpuqs
       rcu_sched-9     [054] d...  1576.030099: rcu_utilization: End context switch

So this task set up a timer ("timer_start:") for one jiffy ("[timeout=1]",
but what is with "expires=4295094922"?)  and blocked ("rcu_utilization:
Start context switch" and "rcu_utilization: End context switch"),
recording its CPU's quiescent state in the process ("rcu_grace_period:
rcu_sched 7275 cpuqs").

Of course, the timer will have expired in the context of some other task,
but a search for "c0000007fae1bc20" (see the "timer=" in the first trace
line above) shows nothing (to be painfully accurate, the search wraps back
to earlier uses of this timer by rcu_sched).  So the timer never did fire.

The next question is "what did CPU 054 do next?"  We find it entering idle:

          <idle>-0     [054] d...  1576.030167: tick_stop: success=1 dependency=NONE
          <idle>-0     [054] d...  1576.030167: hrtimer_cancel: hrtimer=c000000fff88c680
          <idle>-0     [054] d...  1576.030168: hrtimer_start: hrtimer=c000000fff88c680 function=tick_sched_timer expires=1610710000000 softexpires=1610710000000
          <idle>-0     [054] d...  1576.030170: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 34469831
          <idle>-0     [054] d...  1576.030171: rcu_dyntick: Start 140000000000000 0

So we have an hrtimer set for 1610710000000, whatever time that might
happen to map to.  And that is the last we hear from CPU 054, so it
apparently maps to something a bit too far into the future.  Let's
assume that this value is nanoseconds since boot, in which case we have
1,610.710000000 seconds, which is eight seconds after the stall-warning
message.  And -way- longer than the one-jiffy timeout requested!
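
As a sanity check on that arithmetic, a few lines of ordinary userspace C
(not kernel code; the three constants are copied from the trace and dmesg
excerpts above, and "nanoseconds since boot" is the assumption just stated):

#include <stdio.h>

int main(void)
{
	const long long hrtimer_expires_ns = 1610710000000LL;	/* hrtimer_start */
	const double timer_armed_s   = 1576.030096;	/* rcu_sched timer_start */
	const double stall_warning_s = 1602.067008;	/* dmesg stall message */
	double expires_s = hrtimer_expires_ns / 1e9;

	printf("hrtimer expiry: %.2f s after boot\n", expires_s);
	printf("%.2f s after the one-jiffy timer was armed\n",
	       expires_s - timer_armed_s);
	printf("%.2f s after the stall warning\n",
	       expires_s - stall_warning_s);
	return 0;
}

which works out to roughly 1610.71 s, 34.68 s and 8.64 s respectively.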

Thomas, John, am I misinterpreting the timer trace event messages?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-16 16:27                                                                                     ` Paul E. McKenney
  (?)
@ 2017-08-17 13:55                                                                                       ` Michael Ellerman
  -1 siblings, 0 replies; 241+ messages in thread
From: Michael Ellerman @ 2017-08-17 13:55 UTC (permalink / raw)
  To: linux-arm-kernel

"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:

> On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
>> On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
>> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
>> > ...
>> > >
>> > > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
>> > > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>> > > Date:   Mon Aug 14 08:54:39 2017 -0700
>> > >
>> > >     EXP: Trace tick return from tick_nohz_stop_sched_tick
>> > >     
>> > >     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>> > >
>> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
>> > > index c7a899c5ce64..7358a5073dfb 100644
>> > > --- a/kernel/time/tick-sched.c
>> > > +++ b/kernel/time/tick-sched.c
>> > > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>> > >  	 * (not only the tick).
>> > >  	 */
>> > >  	ts->sleep_length = ktime_sub(dev->next_event, now);
>> > > +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
>> > >  	return tick;
>> > >  }
>> > 
>> > Should I be seeing negative values? A small sample:
>> 
>> Maybe due to hypervisor preemption delays, but I confess that I am
>> surprised to see them this large.  1,602,250,019 microseconds is something
>> like a half hour, which could result in stall warnings all by itself.

Hmm. This is a bare metal machine. So no hypervisor.

>> I will take a look!
>
> And from your ps output, PID 9 is rcu_sched, which is the RCU grace-period
> kthread that stalled.  This kthread was starved, based on this from your
> dmesg:
>
> [ 1602.067008] rcu_sched kthread starved for 2603 jiffies! g7275 c7274 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
>
> The RCU_GP_WAIT_FQS says that this kthread is periodically scanning for
> idle-CPU and offline-CPU quiescent states, which means that its waits
> will be accompanied by short timeouts.  The "starved for 2603 jiffies"
> says that it has not run for one good long time.  The ->state is its
> task_struct ->state field.
>
> The immediately preceding dmesg line is as follows:
>
> [ 1602.063851]  (detected by 53, t=2603 jiffies, g=7275, c=7274, q=608)
>
> In other words, the rcu_sched grace-period kthread has been starved
> for the entire duration of the current grace period, as shown by the
> t=2603.
>
> Let's turn now to the trace output, looking for the last bit of the
> rcu_sched task's activity:
>
>        rcu_sched-9     [054] d...  1576.030096: timer_start: timer=c0000007fae1bc20 function=process_timeout expires=4295094922 [timeout=1] cpu=54 idx=0 flags=
>     ksoftirqd/53-276   [053] ..s.  1576.030097: rcu_invoke_callback: rcu_sched rhp=c000000fcf8c4eb0 func=__d_free
>        rcu_sched-9     [054] d...  1576.030097: rcu_utilization: Start context switch
>     ksoftirqd/53-276   [053] ..s.  1576.030098: rcu_invoke_callback: rcu_sched rhp=c000000fcff74ee0 func=proc_i_callback
>        rcu_sched-9     [054] d...  1576.030098: rcu_grace_period: rcu_sched 7275 cpuqs
>        rcu_sched-9     [054] d...  1576.030099: rcu_utilization: End context switch
>
> So this task set up a timer ("timer_start:") for one jiffy ("[timeout=1]",
> but what is with "expires=4295094922"?)

That's a good one.

I have HZ=100, and therefore:

INITIAL_JIFFIES = (1 << 32) - (300 * 100) = 4294937296

So the expires value of 4295094922 is:

4295094922 - 4294937296 = 157626

Jiffies since boot.

Or 1576,260,000,000 ns = 1576.26 s.
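
The same arithmetic as a quick userspace sketch (plain C, not kernel code;
the real INITIAL_JIFFIES macro is defined via an unsigned wrap, this just
reproduces the HZ=100 numbers worked out above):

#include <stdio.h>

int main(void)
{
	const unsigned long long hz = 100;
	const unsigned long long initial_jiffies =
		(1ULL << 32) - (300 * hz);		/* 4294937296 */
	const unsigned long long expires = 4295094922ULL;	/* from timer_start */
	unsigned long long since_boot = expires - initial_jiffies;

	printf("%llu jiffies = %.2f s after boot\n",
	       since_boot, (double)since_boot / hz);
	return 0;
}

which prints "157626 jiffies = 1576.26 s", in the same ballpark as the
1576.030096 trace timestamp at which the timer was armed.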

> Of course, the timer will have expired in the context of some other task,
> but a search for "c0000007fae1bc20" (see the "timer=" in the first trace
> line above) shows nothing (to be painfully accurate, the search wraps back
> to earlier uses of this timer by rcu_sched).  So the timer never did fire.

Or it just wasn't in the trace?

I'll try and get it to trace a bit longer and see if that is helpful.

cheers

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-17 13:55                                                                                       ` Michael Ellerman
  0 siblings, 0 replies; 241+ messages in thread
From: Michael Ellerman @ 2017-08-17 13:55 UTC (permalink / raw)
  To: paulmck
  Cc: Jonathan Cameron, dzickus, sfr, linuxarm, Nicholas Piggin,
	abdhalee, tglx, sparclinux, akpm, linuxppc-dev, David Miller,
	linux-arm-kernel, john.stultz

"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:

> On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
>> On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
>> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
>> > ...
>> > >
>> > > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
>> > > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>> > > Date:   Mon Aug 14 08:54:39 2017 -0700
>> > >
>> > >     EXP: Trace tick return from tick_nohz_stop_sched_tick
>> > >     
>> > >     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>> > >
>> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
>> > > index c7a899c5ce64..7358a5073dfb 100644
>> > > --- a/kernel/time/tick-sched.c
>> > > +++ b/kernel/time/tick-sched.c
>> > > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>> > >  	 * (not only the tick).
>> > >  	 */
>> > >  	ts->sleep_length = ktime_sub(dev->next_event, now);
>> > > +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
>> > >  	return tick;
>> > >  }
>> > 
>> > Should I be seeing negative values? A small sample:
>> 
>> Maybe due to hypervisor preemption delays, but I confess that I am
>> surprised to see them this large.  1,602,250,019 microseconds is something
>> like a half hour, which could result in stall warnings all by itself.

Hmm. This is a bare metal machine. So no hypervisor.

>> I will take a look!
>
> And from your ps output, PID 9 is rcu_sched, which is the RCU grace-period
> kthread that stalled.  This kthread was starved, based on this from your
> dmesg:
>
> [ 1602.067008] rcu_sched kthread starved for 2603 jiffies! g7275 c7274 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
>
> The RCU_GP_WAIT_FQS says that this kthread is periodically scanning for
> idle-CPU and offline-CPU quiescent states, which means that its waits
> will be accompanied by short timeouts.  The "starved for 2603 jiffies"
> says that it has not run for one good long time.  The ->state is its
> task_struct ->state field.
>
> The immediately preceding dmesg line is as follows:
>
> [ 1602.063851]  (detected by 53, t=2603 jiffies, g=7275, c=7274, q=608)
>
> In other words, the rcu_sched grace-period kthread has been starved
> for the entire duration of the current grace period, as shown by the
> t=2603.
>
> Let's turn now to the trace output, looking for the last bit of the
> rcu_sched task's activity:
>
>        rcu_sched-9     [054] d...  1576.030096: timer_start: timer=c0000007fae1bc20 function=process_timeout expires=4295094922 [timeout=1] cpu=54 idx=0 flags=
>     ksoftirqd/53-276   [053] ..s.  1576.030097: rcu_invoke_callback: rcu_sched rhp=c000000fcf8c4eb0 func=__d_free
>        rcu_sched-9     [054] d...  1576.030097: rcu_utilization: Start context switch
>     ksoftirqd/53-276   [053] ..s.  1576.030098: rcu_invoke_callback: rcu_sched rhp=c000000fcff74ee0 func=proc_i_callback
>        rcu_sched-9     [054] d...  1576.030098: rcu_grace_period: rcu_sched 7275 cpuqs
>        rcu_sched-9     [054] d...  1576.030099: rcu_utilization: End context switch
>
> So this task set up a timer ("timer_start:") for one jiffy ("[timeout=1]",
> but what is with "expires=4295094922"?)

That's a good one.

I have HZ=100, and therefore:

INITIAL_JIFFIES = (1 << 32) - (300 * 100) = 4294937296

So the expires value of 4295094922 is:

4295094922 - 4294937296 = 157626

Jiffies since boot.

Or 1576,260,000,000 ns == 1576.26 s.

> Of course, the timer will have expired in the context of some other task,
> but a search for "c0000007fae1bc20" (see the "timer=" in the first trace
> line above) shows nothing (to be painfully accurate, the search wraps back
> to earlier uses of this timer by rcu_sched).  So the timer never did fire.

Or it just wasn't in the trace?

I'll try and get it to trace a bit longer and see if that is helpful.

cheers

^ permalink raw reply	[flat|nested] 241+ messages in thread

* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-17 13:55                                                                                       ` Michael Ellerman
  0 siblings, 0 replies; 241+ messages in thread
From: Michael Ellerman @ 2017-08-17 13:55 UTC (permalink / raw)
  To: linux-arm-kernel

"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:

> On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
>> On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
>> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
>> > ...
>> > >
>> > > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
>> > > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>> > > Date:   Mon Aug 14 08:54:39 2017 -0700
>> > >
>> > >     EXP: Trace tick return from tick_nohz_stop_sched_tick
>> > >     
>> > >     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>> > >
>> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
>> > > index c7a899c5ce64..7358a5073dfb 100644
>> > > --- a/kernel/time/tick-sched.c
>> > > +++ b/kernel/time/tick-sched.c
>> > > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>> > >  	 * (not only the tick).
>> > >  	 */
>> > >  	ts->sleep_length = ktime_sub(dev->next_event, now);
>> > > +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
>> > >  	return tick;
>> > >  }
>> > 
>> > Should I be seeing negative values? A small sample:
>> 
>> Maybe due to hypervisor preemption delays, but I confess that I am
>> surprised to see them this large.  1,602,250,019 microseconds is something
>> like a half hour, which could result in stall warnings all by itself.

Hmm. This is a bare metal machine. So no hypervisor.

>> I will take a look!
>
> And from your ps output, PID 9 is rcu_sched, which is the RCU grace-period
> kthread that stalled.  This kthread was starved, based on this from your
> dmesg:
>
> [ 1602.067008] rcu_sched kthread starved for 2603 jiffies! g7275 c7274 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
>
> The RCU_GP_WAIT_FQS says that this kthread is periodically scanning for
> idle-CPU and offline-CPU quiescent states, which means that its waits
> will be accompanied by short timeouts.  The "starved for 2603 jiffies"
> says that it has not run for one good long time.  The ->state is its
> task_struct ->state field.
>
> The immediately preceding dmesg line is as follows:
>
> [ 1602.063851]  (detected by 53, t=2603 jiffies, g=7275, c=7274, q=608)
>
> In other words, the rcu_sched grace-period kthread has been starved
> for the entire duration of the current grace period, as shown by the
> t=2603.
>
> Let's turn now to the trace output, looking for the last bit of the
> rcu_sched task's activity:
>
>        rcu_sched-9     [054] d...  1576.030096: timer_start: timer=c0000007fae1bc20 function=process_timeout expires=4295094922 [timeout=1] cpu=54 idx=0 flags=
>     ksoftirqd/53-276   [053] ..s.  1576.030097: rcu_invoke_callback: rcu_sched rhp=c000000fcf8c4eb0 func=__d_free
>        rcu_sched-9     [054] d...  1576.030097: rcu_utilization: Start context switch
>     ksoftirqd/53-276   [053] ..s.  1576.030098: rcu_invoke_callback: rcu_sched rhp=c000000fcff74ee0 func=proc_i_callback
>        rcu_sched-9     [054] d...  1576.030098: rcu_grace_period: rcu_sched 7275 cpuqs
>        rcu_sched-9     [054] d...  1576.030099: rcu_utilization: End context switch
>
> So this task set up a timer ("timer_start:") for one jiffy ("[timeout=1]",
> but what is with "expires=4295094922"?)

That's a good one.

I have HZ=100, and therefore:

INITIAL_JIFFIES = (1 << 32) - (300 * 100) = 4294937296

So the expires value of 4295094922 is:

4295094922 - 4294937296 = 157626

Jiffies since boot.

Or 1576,260,000,000 ns == 1576.26 s.

> Of course, the timer will have expired in the context of some other task,
> but a search for "c0000007fae1bc20" (see the "timer=" in the first trace
> line above) shows nothing (to be painfully accurate, the search wraps back
> to earlier uses of this timer by rcu_sched).  So the timer never did fire.

Or it just wasn't in the trace?

I'll try and get it to trace a bit longer and see if that is helpful.

cheers

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-16 16:27                                                                                     ` Paul E. McKenney
  (?)
@ 2017-08-20  4:45                                                                                       ` Nicholas Piggin
  -1 siblings, 0 replies; 241+ messages in thread
From: Nicholas Piggin @ 2017-08-20  4:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 16 Aug 2017 09:27:31 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> 
> Thomas, John, am I misinterpreting the timer trace event messages?

So I did some digging, and what you find is that rcu_sched seems to do a
simple schedule_timeout(1) and just goes out to lunch for many seconds.
The process_timeout timer never fires (when it finally does wake after
one of these events, it usually removes the timer with del_timer_sync).

So this patch seems to fix it. Testing, comments welcome.

Thanks,
Nick

[PATCH] timers: Fix excessive granularity of new timers after a nohz idle

When a timer base is idle, it is forwarded when a new timer is added to
ensure that granularity does not become excessive. When not idle, the
timer tick is expected to increment the base.

However there is a window after a timer is restarted from nohz, when it
is marked not-idle, and before the timer tick on this CPU, where a timer
may be added on an ancient base that does not get forwarded (because
the timer base appears not-idle).

This results in excessive granularity. So much so that a 1 jiffy timeout
has blown out to 10s of seconds and triggered the RCU stall warning
detector.

Fix this by always forwarding the base when adding a new timer if it is
more than 1 jiffy behind. Another approach I looked at first was to note
if the base was idle but not yet run or forwarded, however this just
seemed to add more branches and complexity when it seems we can just
cover it with this test.

Also add a comment noting a case where we could get an unexpectedly
large granularity for a timer. I debugged this problem by adding
warnings for such cases, but it seems we can't add them in general due
to this corner case.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 kernel/time/timer.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 8f5d1bf18854..8f69b3105b8f 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -859,10 +859,10 @@ static inline void forward_timer_base(struct timer_base *base)
 	unsigned long jnow = READ_ONCE(jiffies);
 
 	/*
-	 * We only forward the base when it's idle and we have a delta between
-	 * base clock and jiffies.
+	 * We only forward the base when we have a delta between base clock
+	 * and jiffies. In the common case, run_timers will take care of it.
 	 */
-	if (!base->is_idle || (long) (jnow - base->clk) < 2)
+	if ((long) (jnow - base->clk) < 2)
 		return;
 
 	/*
@@ -938,6 +938,13 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
 	 * same array bucket then just return:
 	 */
 	if (timer_pending(timer)) {
+		/*
+		 * The downside of this optimization is that it can result in
+		 * larger granularity than you would get from adding a new
+		 * timer with this expiry. Would a timer flag for networking
+		 * be appropriate, then we can try to keep expiry of general
+		 * timers within ~1/8th of their interval?
+		 */
 		if (timer->expires == expires)
 			return 1;
 
-- 
2.13.3


^ permalink raw reply related	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-20  4:45                                                                                       ` Nicholas Piggin
  0 siblings, 0 replies; 241+ messages in thread
From: Nicholas Piggin @ 2017-08-20  4:45 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Michael Ellerman, Jonathan Cameron, dzickus, sfr, linuxarm,
	abdhalee, tglx, sparclinux, akpm, linuxppc-dev, David Miller,
	linux-arm-kernel, john.stultz

On Wed, 16 Aug 2017 09:27:31 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> 
> Thomas, John, am I misinterpreting the timer trace event messages?

So I did some digging, and what you find is that rcu_sched seems to do a
simple schedule_timeout(1) and just goes out to lunch for many seconds.
The process_timeout timer never fires (when it finally does wake after
one of these events, it usually removes the timer with del_timer_sync).

So this patch seems to fix it. Testing, comments welcome.

Thanks,
Nick

[PATCH] timers: Fix excessive granularity of new timers after a nohz idle

When a timer base is idle, it is forwarded when a new timer is added to
ensure that granularity does not become excessive. When not idle, the
timer tick is expected to increment the base.

However there is a window after a timer is restarted from nohz, when it
is marked not-idle, and before the timer tick on this CPU, where a timer
may be added on an ancient base that does not get forwarded (because
the timer base appears not-idle).

This results in excessive granularity. So much so that a 1 jiffy timeout
has blown out to 10s of seconds and triggered the RCU stall warning
detector.

Fix this by always forwarding the base when adding a new timer if it is
more than 1 jiffy behind. Another approach I looked at first was to note
if the base was idle but not yet run or forwarded, however this just
seemed to add more branches and complexity when it seems we can just
cover it with this test.

Also add a comment noting a case where we could get an unexpectedly
large granularity for a timer. I debugged this problem by adding
warnings for such cases, but it seems we can't add them in general due
to this corner case.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 kernel/time/timer.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 8f5d1bf18854..8f69b3105b8f 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -859,10 +859,10 @@ static inline void forward_timer_base(struct timer_base *base)
 	unsigned long jnow = READ_ONCE(jiffies);
 
 	/*
-	 * We only forward the base when it's idle and we have a delta between
-	 * base clock and jiffies.
+	 * We only forward the base when we have a delta between base clock
+	 * and jiffies. In the common case, run_timers will take care of it.
 	 */
-	if (!base->is_idle || (long) (jnow - base->clk) < 2)
+	if ((long) (jnow - base->clk) < 2)
 		return;
 
 	/*
@@ -938,6 +938,13 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
 	 * same array bucket then just return:
 	 */
 	if (timer_pending(timer)) {
+		/*
+		 * The downside of this optimization is that it can result in
+		 * larger granularity than you would get from adding a new
+		 * timer with this expiry. Would a timer flag for networking
+		 * be appropriate, then we can try to keep expiry of general
+		 * timers within ~1/8th of their interval?
+		 */
 		if (timer->expires == expires)
 			return 1;
 
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 241+ messages in thread

* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-20  4:45                                                                                       ` Nicholas Piggin
  0 siblings, 0 replies; 241+ messages in thread
From: Nicholas Piggin @ 2017-08-20  4:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 16 Aug 2017 09:27:31 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> 
> Thomas, John, am I misinterpreting the timer trace event messages?

So I did some digging, and what you find is that rcu_sched seems to do a
simple schedule_timeout(1) and just goes out to lunch for many seconds.
The process_timeout timer never fires (when it finally does wake after
one of these events, it usually removes the timer with del_timer_sync).

So this patch seems to fix it. Testing, comments welcome.

Thanks,
Nick

[PATCH] timers: Fix excessive granularity of new timers after a nohz idle

When a timer base is idle, it is forwarded when a new timer is added to
ensure that granularity does not become excessive. When not idle, the
timer tick is expected to increment the base.

However there is a window after a timer is restarted from nohz, when it
is marked not-idle, and before the timer tick on this CPU, where a timer
may be added on an ancient base that does not get forwarded (because
the timer base appears not-idle).

This results in excessive granularity. So much so that a 1 jiffy timeout
has blown out to 10s of seconds and triggered the RCU stall warning
detector.

Fix this by always forwarding the base when adding a new timer if it is
more than 1 jiffy behind. Another approach I looked at first was to note
if the base was idle but not yet run or forwarded, however this just
seemed to add more branches and complexity when it seems we can just
cover it with this test.

Also add a comment noting a case where we could get an unexpectedly
large granularity for a timer. I debugged this problem by adding
warnings for such cases, but it seems we can't add them in general due
to this corner case.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 kernel/time/timer.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 8f5d1bf18854..8f69b3105b8f 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -859,10 +859,10 @@ static inline void forward_timer_base(struct timer_base *base)
 	unsigned long jnow = READ_ONCE(jiffies);
 
 	/*
-	 * We only forward the base when it's idle and we have a delta between
-	 * base clock and jiffies.
+	 * We only forward the base when we have a delta between base clock
+	 * and jiffies. In the common case, run_timers will take care of it.
 	 */
-	if (!base->is_idle || (long) (jnow - base->clk) < 2)
+	if ((long) (jnow - base->clk) < 2)
 		return;
 
 	/*
@@ -938,6 +938,13 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
 	 * same array bucket then just return:
 	 */
 	if (timer_pending(timer)) {
+		/*
+		 * The downside of this optimization is that it can result in
+		 * larger granularity than you would get from adding a new
+		 * timer with this expiry. Would a timer flag for networking
+		 * be appropriate, then we can try to keep expiry of general
+		 * timers within ~1/8th of their interval?
+		 */
 		if (timer->expires == expires)
 			return 1;
 
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-20  4:45                                                                                       ` Nicholas Piggin
  (?)
@ 2017-08-20  5:01                                                                                         ` David Miller
  -1 siblings, 0 replies; 241+ messages in thread
From: David Miller @ 2017-08-20  5:01 UTC (permalink / raw)
  To: linux-arm-kernel

From: Nicholas Piggin <npiggin@gmail.com>
Date: Sun, 20 Aug 2017 14:45:53 +1000

> [PATCH] timers: Fix excessive granularity of new timers after a nohz idle

I just booted into this on my sparc64 box, let's see if it does the
trick :-)

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-20  5:01                                                                                         ` David Miller
  0 siblings, 0 replies; 241+ messages in thread
From: David Miller @ 2017-08-20  5:01 UTC (permalink / raw)
  To: npiggin
  Cc: paulmck, mpe, Jonathan.Cameron, dzickus, sfr, linuxarm, abdhalee,
	tglx, sparclinux, akpm, linuxppc-dev, linux-arm-kernel,
	john.stultz

From: Nicholas Piggin <npiggin@gmail.com>
Date: Sun, 20 Aug 2017 14:45:53 +1000

> [PATCH] timers: Fix excessive granularity of new timers after a nohz idle

I just booted into this on my sparc64 box, let's see if it does the
trick :-)

^ permalink raw reply	[flat|nested] 241+ messages in thread

* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-20  5:01                                                                                         ` David Miller
  0 siblings, 0 replies; 241+ messages in thread
From: David Miller @ 2017-08-20  5:01 UTC (permalink / raw)
  To: linux-arm-kernel

From: Nicholas Piggin <npiggin@gmail.com>
Date: Sun, 20 Aug 2017 14:45:53 +1000

> [PATCH] timers: Fix excessive granularity of new timers after a nohz idle

I just booted into this on my sparc64 box, let's see if it does the
trick :-)

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-20  4:45                                                                                       ` Nicholas Piggin
  (?)
@ 2017-08-20  5:04                                                                                         ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-08-20  5:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Aug 20, 2017 at 02:45:53PM +1000, Nicholas Piggin wrote:
> On Wed, 16 Aug 2017 09:27:31 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > 
> > Thomas, John, am I misinterpreting the timer trace event messages?
> 
> So I did some digging, and what you find is that rcu_sched seems to do a
> simple schedule_timeout(1) and just goes out to lunch for many seconds.
> The process_timeout timer never fires (when it finally does wake after
> one of these events, it usually removes the timer with del_timer_sync).
> 
> So this patch seems to fix it. Testing, comments welcome.

Fired up some tests, which should complete in about 12 hours.

Here is hoping!  ;-)

							Thanx, Paul


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-20  5:04                                                                                         ` Paul E. McKenney
  0 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-08-20  5:04 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Michael Ellerman, Jonathan Cameron, dzickus, sfr, linuxarm,
	abdhalee, tglx, sparclinux, akpm, linuxppc-dev, David Miller,
	linux-arm-kernel, john.stultz

On Sun, Aug 20, 2017 at 02:45:53PM +1000, Nicholas Piggin wrote:
> On Wed, 16 Aug 2017 09:27:31 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > 
> > Thomas, John, am I misinterpreting the timer trace event messages?
> 
> So I did some digging, and what you find is that rcu_sched seems to do a
> simple schedule_timeout(1) and just goes out to lunch for many seconds.
> The process_timeout timer never fires (when it finally does wake after
> one of these events, it usually removes the timer with del_timer_sync).
> 
> So this patch seems to fix it. Testing, comments welcome.

Fired up some tests, which should complete in about 12 hours.

Here is hoping!  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 241+ messages in thread

* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-20  5:04                                                                                         ` Paul E. McKenney
  0 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-08-20  5:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Aug 20, 2017 at 02:45:53PM +1000, Nicholas Piggin wrote:
> On Wed, 16 Aug 2017 09:27:31 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > 
> > Thomas, John, am I misinterpreting the timer trace event messages?
> 
> So I did some digging, and what you find is that rcu_sched seems to do a
> simple schedule_timeout(1) and just goes out to lunch for many seconds.
> The process_timeout timer never fires (when it finally does wake after
> one of these events, it usually removes the timer with del_timer_sync).
> 
> So this patch seems to fix it. Testing, comments welcome.

Fired up some tests, which should complete in about 12 hours.

Here is hoping!  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-20  4:45                                                                                       ` Nicholas Piggin
  (?)
@ 2017-08-20 13:00                                                                                         ` Nicholas Piggin
  -1 siblings, 0 replies; 241+ messages in thread
From: Nicholas Piggin @ 2017-08-20 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, 20 Aug 2017 14:45:53 +1000
Nicholas Piggin <npiggin@gmail.com> wrote:

> On Wed, 16 Aug 2017 09:27:31 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > 
> > Thomas, John, am I misinterpreting the timer trace event messages?  
> 
> So I did some digging, and what you find is that rcu_sched seems to do a
> simple schedule_timeout(1) and just goes out to lunch for many seconds.
> The process_timeout timer never fires (when it finally does wake after
> one of these events, it usually removes the timer with del_timer_sync).
> 
> So this patch seems to fix it. Testing, comments welcome.

Okay this had a problem of trying to forward the timer from a timer
callback function.

This was my other approach which also fixes the RCU warnings, but it's
a little more complex. I reworked it a bit so the mod_timer fast path
hopefully doesn't have much more overhead (actually by reading jiffies
only when needed, it probably saves a load).

Thanks,
Nick

--
[PATCH] timers: Fix excessive granularity of new timers after a nohz idle

When a timer base is idle, it is forwarded when a new timer is added to
ensure that granularity does not become excessive. When not idle, the
timer tick is expected to increment the base.

However there is a window after a timer is restarted from nohz, when it
is marked not-idle, and before the timer tick on this CPU, where a timer
may be added on an ancient base that does not get forwarded (because
the timer base appears not-idle).

This results in excessive granularity. So much so that a 1 jiffy timeout
has blown out to 10s of seconds and triggered the RCU stall warning
detector.

Fix this by keeping track of whether the timer has been idle since it was
last run or forwarded, and allow forwarding in the case that is true (even
if it is not currently idle).

Also add a comment noting a case where we could get an unexpectedly
large granularity for a timer. I debugged this problem by adding
warnings for such cases, but it seems we can't add them in general due
to this corner case.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 kernel/time/timer.c | 32 +++++++++++++++++++++++++++-----
 1 file changed, 27 insertions(+), 5 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 8f5d1bf18854..ee7b8b688b48 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -203,6 +203,7 @@ struct timer_base {
 	bool			migration_enabled;
 	bool			nohz_active;
 	bool			is_idle;
+	bool			was_idle;
 	DECLARE_BITMAP(pending_map, WHEEL_SIZE);
 	struct hlist_head	vectors[WHEEL_SIZE];
 } ____cacheline_aligned;
@@ -856,13 +857,19 @@ get_target_base(struct timer_base *base, unsigned tflags)
 
 static inline void forward_timer_base(struct timer_base *base)
 {
-	unsigned long jnow = READ_ONCE(jiffies);
+	unsigned long jnow;
 
 	/*
-	 * We only forward the base when it's idle and we have a delta between
-	 * base clock and jiffies.
+	 * We only forward the base when we are idle or have just come out
+	 * of idle (was_idle logic), and have a delta between base clock
+	 * and jiffies. In the common case, run_timers will take care of it.
 	 */
-	if (!base->is_idle || (long) (jnow - base->clk) < 2)
+	if (likely(!base->was_idle))
+		return;
+
+	jnow = READ_ONCE(jiffies);
+	base->was_idle = base->is_idle;
+	if ((long)(jnow - base->clk) < 2)
 		return;
 
 	/*
@@ -938,6 +945,13 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
 	 * same array bucket then just return:
 	 */
 	if (timer_pending(timer)) {
+		/*
+		 * The downside of this optimization is that it can result in
+		 * larger granularity than you would get from adding a new
+		 * timer with this expiry. Would a timer flag for networking
+		 * be appropriate, then we can try to keep expiry of general
+		 * timers within ~1/8th of their interval?
+		 */
 		if (timer->expires == expires)
 			return 1;
 
@@ -1499,8 +1513,10 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
 		/*
 		 * If we expect to sleep more than a tick, mark the base idle:
 		 */
-		if ((expires - basem) > TICK_NSEC)
+		if ((expires - basem) > TICK_NSEC) {
+			base->was_idle = true;
 			base->is_idle = true;
+		}
 	}
 	raw_spin_unlock(&base->lock);
 
@@ -1587,6 +1603,12 @@ static inline void __run_timers(struct timer_base *base)
 	struct hlist_head heads[LVL_DEPTH];
 	int levels;
 
+	/*
+	 * was_idle must be cleared before running timers so that any timer
+	 * functions that call mod_timer will not try to forward the base.
+	 */
+	base->was_idle = false;
+
 	if (!time_after_eq(jiffies, base->clk))
 		return;
 
-- 
2.13.3


^ permalink raw reply related	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-20 13:00                                                                                         ` Nicholas Piggin
  0 siblings, 0 replies; 241+ messages in thread
From: Nicholas Piggin @ 2017-08-20 13:00 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Michael Ellerman, Jonathan Cameron, dzickus, sfr, linuxarm,
	abdhalee, tglx, sparclinux, akpm, linuxppc-dev, David Miller,
	linux-arm-kernel, john.stultz

On Sun, 20 Aug 2017 14:45:53 +1000
Nicholas Piggin <npiggin@gmail.com> wrote:

> On Wed, 16 Aug 2017 09:27:31 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > 
> > Thomas, John, am I misinterpreting the timer trace event messages?  
> 
> So I did some digging, and what you find is that rcu_sched seems to do a
> simple schedule_timeout(1) and just goes out to lunch for many seconds.
> The process_timeout timer never fires (when it finally does wake after
> one of these events, it usually removes the timer with del_timer_sync).
> 
> So this patch seems to fix it. Testing, comments welcome.

Okay this had a problem of trying to forward the timer from a timer
callback function.

This was my other approach which also fixes the RCU warnings, but it's
a little more complex. I reworked it a bit so the mod_timer fast path
hopefully doesn't have much more overhead (actually by reading jiffies
only when needed, it probably saves a load).

Thanks,
Nick

--
[PATCH] timers: Fix excessive granularity of new timers after a nohz idle

When a timer base is idle, it is forwarded when a new timer is added to
ensure that granularity does not become excessive. When not idle, the
timer tick is expected to increment the base.

However there is a window after a timer is restarted from nohz, when it
is marked not-idle, and before the timer tick on this CPU, where a timer
may be added on an ancient base that does not get forwarded (because
the timer base appears not-idle).

This results in excessive granularity. So much so that a 1 jiffy timeout
has blown out to 10s of seconds and triggered the RCU stall warning
detector.

Fix this by keeping track of whether the timer has been idle since it was
last run or forwarded, and allow forwarding in the case that is true (even
if it is not currently idle).

Also add a comment noting a case where we could get an unexpectedly
large granularity for a timer. I debugged this problem by adding
warnings for such cases, but it seems we can't add them in general due
to this corner case.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 kernel/time/timer.c | 32 +++++++++++++++++++++++++++-----
 1 file changed, 27 insertions(+), 5 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 8f5d1bf18854..ee7b8b688b48 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -203,6 +203,7 @@ struct timer_base {
 	bool			migration_enabled;
 	bool			nohz_active;
 	bool			is_idle;
+	bool			was_idle;
 	DECLARE_BITMAP(pending_map, WHEEL_SIZE);
 	struct hlist_head	vectors[WHEEL_SIZE];
 } ____cacheline_aligned;
@@ -856,13 +857,19 @@ get_target_base(struct timer_base *base, unsigned tflags)
 
 static inline void forward_timer_base(struct timer_base *base)
 {
-	unsigned long jnow = READ_ONCE(jiffies);
+	unsigned long jnow;
 
 	/*
-	 * We only forward the base when it's idle and we have a delta between
-	 * base clock and jiffies.
+	 * We only forward the base when we are idle or have just come out
+	 * of idle (was_idle logic), and have a delta between base clock
+	 * and jiffies. In the common case, run_timers will take care of it.
 	 */
-	if (!base->is_idle || (long) (jnow - base->clk) < 2)
+	if (likely(!base->was_idle))
+		return;
+
+	jnow = READ_ONCE(jiffies);
+	base->was_idle = base->is_idle;
+	if ((long)(jnow - base->clk) < 2)
 		return;
 
 	/*
@@ -938,6 +945,13 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
 	 * same array bucket then just return:
 	 */
 	if (timer_pending(timer)) {
+		/*
+		 * The downside of this optimization is that it can result in
+		 * larger granularity than you would get from adding a new
+		 * timer with this expiry. Would a timer flag for networking
+		 * be appropriate, then we can try to keep expiry of general
+		 * timers within ~1/8th of their interval?
+		 */
 		if (timer->expires == expires)
 			return 1;
 
@@ -1499,8 +1513,10 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
 		/*
 		 * If we expect to sleep more than a tick, mark the base idle:
 		 */
-		if ((expires - basem) > TICK_NSEC)
+		if ((expires - basem) > TICK_NSEC) {
+			base->was_idle = true;
 			base->is_idle = true;
+		}
 	}
 	raw_spin_unlock(&base->lock);
 
@@ -1587,6 +1603,12 @@ static inline void __run_timers(struct timer_base *base)
 	struct hlist_head heads[LVL_DEPTH];
 	int levels;
 
+	/*
+	 * was_idle must be cleared before running timers so that any timer
+	 * functions that call mod_timer will not try to forward the base.
+	 */
+	base->was_idle = false;
+
 	if (!time_after_eq(jiffies, base->clk))
 		return;
 
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 241+ messages in thread

* RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-20 13:00                                                                                         ` Nicholas Piggin
  0 siblings, 0 replies; 241+ messages in thread
From: Nicholas Piggin @ 2017-08-20 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, 20 Aug 2017 14:45:53 +1000
Nicholas Piggin <npiggin@gmail.com> wrote:

> On Wed, 16 Aug 2017 09:27:31 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > 
> > Thomas, John, am I misinterpreting the timer trace event messages?  
> 
> So I did some digging, and what you find is that rcu_sched seems to do a
> simple schedule_timeout(1) and just goes out to lunch for many seconds.
> The process_timeout timer never fires (when it finally does wake after
> one of these events, it usually removes the timer with del_timer_sync).
> 
> So this patch seems to fix it. Testing, comments welcome.

Okay this had a problem of trying to forward the timer from a timer
callback function.

This was my other approach which also fixes the RCU warnings, but it's
a little more complex. I reworked it a bit so the mod_timer fast path
hopefully doesn't have much more overhead (actually by reading jiffies
only when needed, it probably saves a load).

Thanks,
Nick

--
[PATCH] timers: Fix excessive granularity of new timers after a nohz idle

When a timer base is idle, it is forwarded when a new timer is added to
ensure that granularity does not become excessive. When not idle, the
timer tick is expected to increment the base.

However there is a window after a timer is restarted from nohz, when it
is marked not-idle, and before the timer tick on this CPU, where a timer
may be added on an ancient base that does not get forwarded (because
the timer base appears not-idle).

This results in excessive granularity. So much so that a 1 jiffy timeout
has blown out to 10s of seconds and triggered the RCU stall warning
detector.

Fix this by keeping track of whether the timer has been idle since it was
last run or forwarded, and allow forwarding in the case that is true (even
if it is not currently idle).

Also add a comment noting a case where we could get an unexpectedly
large granularity for a timer. I debugged this problem by adding
warnings for such cases, but it seems we can't add them in general due
to this corner case.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 kernel/time/timer.c | 32 +++++++++++++++++++++++++++-----
 1 file changed, 27 insertions(+), 5 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 8f5d1bf18854..ee7b8b688b48 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -203,6 +203,7 @@ struct timer_base {
 	bool			migration_enabled;
 	bool			nohz_active;
 	bool			is_idle;
+	bool			was_idle;
 	DECLARE_BITMAP(pending_map, WHEEL_SIZE);
 	struct hlist_head	vectors[WHEEL_SIZE];
 } ____cacheline_aligned;
@@ -856,13 +857,19 @@ get_target_base(struct timer_base *base, unsigned tflags)
 
 static inline void forward_timer_base(struct timer_base *base)
 {
-	unsigned long jnow = READ_ONCE(jiffies);
+	unsigned long jnow;
 
 	/*
-	 * We only forward the base when it's idle and we have a delta between
-	 * base clock and jiffies.
+	 * We only forward the base when we are idle or have just come out
+	 * of idle (was_idle logic), and have a delta between base clock
+	 * and jiffies. In the common case, run_timers will take care of it.
 	 */
-	if (!base->is_idle || (long) (jnow - base->clk) < 2)
+	if (likely(!base->was_idle))
+		return;
+
+	jnow = READ_ONCE(jiffies);
+	base->was_idle = base->is_idle;
+	if ((long)(jnow - base->clk) < 2)
 		return;
 
 	/*
@@ -938,6 +945,13 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
 	 * same array bucket then just return:
 	 */
 	if (timer_pending(timer)) {
+		/*
+		 * The downside of this optimization is that it can result in
+		 * larger granularity than you would get from adding a new
+		 * timer with this expiry. Would a timer flag for networking
+		 * be appropriate, then we can try to keep expiry of general
+		 * timers within ~1/8th of their interval?
+		 */
 		if (timer->expires == expires)
 			return 1;
 
@@ -1499,8 +1513,10 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
 		/*
 		 * If we expect to sleep more than a tick, mark the base idle:
 		 */
-		if ((expires - basem) > TICK_NSEC)
+		if ((expires - basem) > TICK_NSEC) {
+			base->was_idle = true;
 			base->is_idle = true;
+		}
 	}
 	raw_spin_unlock(&base->lock);
 
@@ -1587,6 +1603,12 @@ static inline void __run_timers(struct timer_base *base)
 	struct hlist_head heads[LVL_DEPTH];
 	int levels;
 
+	/*
+	 * was_idle must be cleared before running timers so that any timer
+	 * functions that call mod_timer will not try to forward the base.
+	 */
+	base->was_idle = false;
+
 	if (!time_after_eq(jiffies, base->clk))
 		return;
 
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-20 13:00                                                                                         ` Nicholas Piggin
  (?)
@ 2017-08-20 18:35                                                                                           ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-08-20 18:35 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Aug 20, 2017 at 11:00:40PM +1000, Nicholas Piggin wrote:
> On Sun, 20 Aug 2017 14:45:53 +1000
> Nicholas Piggin <npiggin@gmail.com> wrote:
> 
> > On Wed, 16 Aug 2017 09:27:31 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > > 
> > > Thomas, John, am I misinterpreting the timer trace event messages?  
> > 
> > So I did some digging, and what you find is that rcu_sched seems to do a
> > simple schedule_timeout(1) and just goes out to lunch for many seconds.
> > The process_timeout timer never fires (when it finally does wake after
> > one of these events, it usually removes the timer with del_timer_sync).
> > 
> > So this patch seems to fix it. Testing, comments welcome.
> 
> Okay this had a problem of trying to forward the timer from a timer
> callback function.
> 
> This was my other approach which also fixes the RCU warnings, but it's
> a little more complex. I reworked it a bit so the mod_timer fast path
> hopefully doesn't have much more overhead (actually by reading jiffies
> only when needed, it probably saves a load).

Giving this one a whirl!

							Thanx, Paul

> Thanks,
> Nick
> 
> --
> [PATCH] timers: Fix excessive granularity of new timers after a nohz idle
> 
> When a timer base is idle, it is forwarded when a new timer is added to
> ensure that granularity does not become excessive. When not idle, the
> timer tick is expected to increment the base.
> 
> However there is a window after a timer is restarted from nohz, when it
> is marked not-idle, and before the timer tick on this CPU, where a timer
> may be added on an ancient base that does not get forwarded (because
> the timer base appears not-idle).
> 
> This results in excessive granularity. So much so that a 1 jiffy timeout
> has blown out to 10s of seconds and triggered the RCU stall warning
> detector.
> 
> Fix this by keeping track of whether the timer has been idle since it was
> last run or forwarded, and allow forwarding in the case that is true (even
> if it is not currently idle).
> 
> Also add a comment noting a case where we could get an unexpectedly
> large granularity for a timer. I debugged this problem by adding
> warnings for such cases, but it seems we can't add them in general due
> to this corner case.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  kernel/time/timer.c | 32 +++++++++++++++++++++++++++-----
>  1 file changed, 27 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> index 8f5d1bf18854..ee7b8b688b48 100644
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -203,6 +203,7 @@ struct timer_base {
>  	bool			migration_enabled;
>  	bool			nohz_active;
>  	bool			is_idle;
> +	bool			was_idle;
>  	DECLARE_BITMAP(pending_map, WHEEL_SIZE);
>  	struct hlist_head	vectors[WHEEL_SIZE];
>  } ____cacheline_aligned;
> @@ -856,13 +857,19 @@ get_target_base(struct timer_base *base, unsigned tflags)
> 
>  static inline void forward_timer_base(struct timer_base *base)
>  {
> -	unsigned long jnow = READ_ONCE(jiffies);
> +	unsigned long jnow;
> 
>  	/*
> -	 * We only forward the base when it's idle and we have a delta between
> -	 * base clock and jiffies.
> +	 * We only forward the base when we are idle or have just come out
> +	 * of idle (was_idle logic), and have a delta between base clock
> +	 * and jiffies. In the common case, run_timers will take care of it.
>  	 */
> -	if (!base->is_idle || (long) (jnow - base->clk) < 2)
> +	if (likely(!base->was_idle))
> +		return;
> +
> +	jnow = READ_ONCE(jiffies);
> +	base->was_idle = base->is_idle;
> +	if ((long)(jnow - base->clk) < 2)
>  		return;
> 
>  	/*
> @@ -938,6 +945,13 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
>  	 * same array bucket then just return:
>  	 */
>  	if (timer_pending(timer)) {
> +		/*
> +		 * The downside of this optimization is that it can result in
> +		 * larger granularity than you would get from adding a new
> +		 * timer with this expiry. Would a timer flag for networking
> +		 * be appropriate, then we can try to keep expiry of general
> +		 * timers within ~1/8th of their interval?
> +		 */
>  		if (timer->expires == expires)
>  			return 1;
> 
> @@ -1499,8 +1513,10 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
>  		/*
>  		 * If we expect to sleep more than a tick, mark the base idle:
>  		 */
> -		if ((expires - basem) > TICK_NSEC)
> +		if ((expires - basem) > TICK_NSEC) {
> +			base->was_idle = true;
>  			base->is_idle = true;
> +		}
>  	}
>  	raw_spin_unlock(&base->lock);
> 
> @@ -1587,6 +1603,12 @@ static inline void __run_timers(struct timer_base *base)
>  	struct hlist_head heads[LVL_DEPTH];
>  	int levels;
> 
> +	/*
> +	 * was_idle must be cleared before running timers so that any timer
> +	 * functions that call mod_timer will not try to forward the base.
> +	 */
> +	base->was_idle = false;
> +
>  	if (!time_after_eq(jiffies, base->clk))
>  		return;
> 
> -- 
> 2.13.3
> 


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-20 18:35                                                                                           ` Paul E. McKenney
  0 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-08-20 18:35 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Michael Ellerman, Jonathan Cameron, dzickus, sfr, linuxarm,
	abdhalee, tglx, sparclinux, akpm, linuxppc-dev, David Miller,
	linux-arm-kernel, john.stultz

On Sun, Aug 20, 2017 at 11:00:40PM +1000, Nicholas Piggin wrote:
> On Sun, 20 Aug 2017 14:45:53 +1000
> Nicholas Piggin <npiggin@gmail.com> wrote:
> 
> > On Wed, 16 Aug 2017 09:27:31 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > > 
> > > Thomas, John, am I misinterpreting the timer trace event messages?  
> > 
> > So I did some digging, and what you find is that rcu_sched seems to do a
> > simple schedule_timeout(1) and just goes out to lunch for many seconds.
> > The process_timeout timer never fires (when it finally does wake after
> > one of these events, it usually removes the timer with del_timer_sync).
> > 
> > So this patch seems to fix it. Testing, comments welcome.
> 
> Okay this had a problem of trying to forward the timer from a timer
> callback function.
> 
> This was my other approach which also fixes the RCU warnings, but it's
> a little more complex. I reworked it a bit so the mod_timer fast path
> hopefully doesn't have much more overhead (actually by reading jiffies
> only when needed, it probably saves a load).

Giving this one a whirl!

							Thanx, Paul

> Thanks,
> Nick
> 
> --
> [PATCH] timers: Fix excessive granularity of new timers after a nohz idle
> 
> When a timer base is idle, it is forwarded when a new timer is added to
> ensure that granularity does not become excessive. When not idle, the
> timer tick is expected to increment the base.
> 
> However there is a window after a timer is restarted from nohz, when it
> is marked not-idle, and before the timer tick on this CPU, where a timer
> may be added on an ancient base that does not get forwarded (because
> the timer base appears not-idle).
> 
> This results in excessive granularity. So much so that a 1 jiffy timeout
> has blown out to 10s of seconds and triggered the RCU stall warning
> detector.
> 
> Fix this by keeping track of whether the timer has been idle since it was
> last run or forwarded, and allow forwarding in the case that is true (even
> if it is not currently idle).
> 
> Also add a comment noting a case where we could get an unexpectedly
> large granularity for a timer. I debugged this problem by adding
> warnings for such cases, but it seems we can't add them in general due
> to this corner case.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  kernel/time/timer.c | 32 +++++++++++++++++++++++++++-----
>  1 file changed, 27 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> index 8f5d1bf18854..ee7b8b688b48 100644
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -203,6 +203,7 @@ struct timer_base {
>  	bool			migration_enabled;
>  	bool			nohz_active;
>  	bool			is_idle;
> +	bool			was_idle;
>  	DECLARE_BITMAP(pending_map, WHEEL_SIZE);
>  	struct hlist_head	vectors[WHEEL_SIZE];
>  } ____cacheline_aligned;
> @@ -856,13 +857,19 @@ get_target_base(struct timer_base *base, unsigned tflags)
> 
>  static inline void forward_timer_base(struct timer_base *base)
>  {
> -	unsigned long jnow = READ_ONCE(jiffies);
> +	unsigned long jnow;
> 
>  	/*
> -	 * We only forward the base when it's idle and we have a delta between
> -	 * base clock and jiffies.
> +	 * We only forward the base when we are idle or have just come out
> +	 * of idle (was_idle logic), and have a delta between base clock
> +	 * and jiffies. In the common case, run_timers will take care of it.
>  	 */
> -	if (!base->is_idle || (long) (jnow - base->clk) < 2)
> +	if (likely(!base->was_idle))
> +		return;
> +
> +	jnow = READ_ONCE(jiffies);
> +	base->was_idle = base->is_idle;
> +	if ((long)(jnow - base->clk) < 2)
>  		return;
> 
>  	/*
> @@ -938,6 +945,13 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
>  	 * same array bucket then just return:
>  	 */
>  	if (timer_pending(timer)) {
> +		/*
> +		 * The downside of this optimization is that it can result in
> +		 * larger granularity than you would get from adding a new
> +		 * timer with this expiry. Would a timer flag for networking
> +		 * be appropriate, then we can try to keep expiry of general
> +		 * timers within ~1/8th of their interval?
> +		 */
>  		if (timer->expires == expires)
>  			return 1;
> 
> @@ -1499,8 +1513,10 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
>  		/*
>  		 * If we expect to sleep more than a tick, mark the base idle:
>  		 */
> -		if ((expires - basem) > TICK_NSEC)
> +		if ((expires - basem) > TICK_NSEC) {
> +			base->was_idle = true;
>  			base->is_idle = true;
> +		}
>  	}
>  	raw_spin_unlock(&base->lock);
> 
> @@ -1587,6 +1603,12 @@ static inline void __run_timers(struct timer_base *base)
>  	struct hlist_head heads[LVL_DEPTH];
>  	int levels;
> 
> +	/*
> +	 * was_idle must be cleared before running timers so that any timer
> +	 * functions that call mod_timer will not try to forward the base.
> +	 */
> +	base->was_idle = false;
> +
>  	if (!time_after_eq(jiffies, base->clk))
>  		return;
> 
> -- 
> 2.13.3
> 

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-20 18:35                                                                                           ` Paul E. McKenney
@ 2017-08-20 21:14                                                                                             ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-08-20 21:14 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Aug 20, 2017 at 11:35:14AM -0700, Paul E. McKenney wrote:
> On Sun, Aug 20, 2017 at 11:00:40PM +1000, Nicholas Piggin wrote:
> > On Sun, 20 Aug 2017 14:45:53 +1000
> > Nicholas Piggin <npiggin@gmail.com> wrote:
> > 
> > > On Wed, 16 Aug 2017 09:27:31 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > > > 
> > > > Thomas, John, am I misinterpreting the timer trace event messages?  
> > > 
> > > So I did some digging, and what you find is that rcu_sched seems to do a
> > > simple schedule_timeout(1) and just goes out to lunch for many seconds.
> > > The process_timeout timer never fires (when it finally does wake after
> > > one of these events, it usually removes the timer with del_timer_sync).
> > > 
> > > So this patch seems to fix it. Testing, comments welcome.
> > 
> > Okay this had a problem of trying to forward the timer from a timer
> > callback function.
> > 
> > This was my other approach which also fixes the RCU warnings, but it's
> > a little more complex. I reworked it a bit so the mod_timer fast path
> > hopefully doesn't have much more overhead (actually by reading jiffies
> > only when needed, it probably saves a load).
> 
> Giving this one a whirl!

No joy here, but then again there are other reasons to believe that I
am seeing a different bug than Dave and Jonathan are.

OK, not -entirely- without joy -- 10 of 14 runs were error-free, which
is a good improvement over 0 of 84 for your earlier patch.  ;-)  But
not statistically different from what I see without either patch.

But no statistical difference compared to without patch, and I still
see the "rcu_sched kthread starved" messages.  For whatever it is worth,
by the way, I also see this: "hrtimer: interrupt took 5712368 ns".
Hmmm...  I am also seeing that without any of your patches.  Might
be hypervisor preemption, I guess.

							Thanx, Paul

PS.  I will be off the grid for the next day or so.  Eclipse day here...


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-20 21:14                                                                                             ` Paul E. McKenney
@ 2017-08-21  0:52                                                                                               ` Nicholas Piggin
  -1 siblings, 0 replies; 241+ messages in thread
From: Nicholas Piggin @ 2017-08-21  0:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, 20 Aug 2017 14:14:29 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Sun, Aug 20, 2017 at 11:35:14AM -0700, Paul E. McKenney wrote:
> > On Sun, Aug 20, 2017 at 11:00:40PM +1000, Nicholas Piggin wrote:  
> > > On Sun, 20 Aug 2017 14:45:53 +1000
> > > Nicholas Piggin <npiggin@gmail.com> wrote:
> > >   
> > > > On Wed, 16 Aug 2017 09:27:31 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:  
> > > > > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > > > > 
> > > > > Thomas, John, am I misinterpreting the timer trace event messages?    
> > > > 
> > > > So I did some digging, and what you find is that rcu_sched seems to do a
> > > > simple schedule_timeout(1) and just goes out to lunch for many seconds.
> > > > The process_timeout timer never fires (when it finally does wake after
> > > > one of these events, it usually removes the timer with del_timer_sync).
> > > > 
> > > > So this patch seems to fix it. Testing, comments welcome.  
> > > 
> > > Okay this had a problem of trying to forward the timer from a timer
> > > callback function.
> > > 
> > > This was my other approach which also fixes the RCU warnings, but it's
> > > a little more complex. I reworked it a bit so the mod_timer fast path
> > > hopefully doesn't have much more overhead (actually by reading jiffies
> > > only when needed, it probably saves a load).  
> > 
> > Giving this one a whirl!  
> 
> No joy here, but then again there are other reasons to believe that I
> am seeing a different bug than Dave and Jonathan are.
> 
> OK, not -entirely- without joy -- 10 of 14 runs were error-free, which
> is a good improvement over 0 of 84 for your earlier patch.  ;-)  But
> not statistically different from what I see without either patch.
> 
> But no statistical difference compared to without patch, and I still
> see the "rcu_sched kthread starved" messages.  For whatever it is worth,
> by the way, I also see this: "hrtimer: interrupt took 5712368 ns".
> Hmmm...  I am also seeing that without any of your patches.  Might
> be hypervisor preemption, I guess.

Okay it makes the warnings go away for me, but I'm just booting then
leaving the system idle. You're doing some CPU hotplug activity?

Thanks,
Nick

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-21  0:52                                                                                               ` Nicholas Piggin
@ 2017-08-21  6:06                                                                                                 ` Nicholas Piggin
  -1 siblings, 0 replies; 241+ messages in thread
From: Nicholas Piggin @ 2017-08-21  6:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 21 Aug 2017 10:52:58 +1000
Nicholas Piggin <npiggin@gmail.com> wrote:

> On Sun, 20 Aug 2017 14:14:29 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Sun, Aug 20, 2017 at 11:35:14AM -0700, Paul E. McKenney wrote:  
> > > On Sun, Aug 20, 2017 at 11:00:40PM +1000, Nicholas Piggin wrote:    
> > > > On Sun, 20 Aug 2017 14:45:53 +1000
> > > > Nicholas Piggin <npiggin@gmail.com> wrote:
> > > >     
> > > > > On Wed, 16 Aug 2017 09:27:31 -0700
> > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:    
> > > > > > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > > > > > 
> > > > > > Thomas, John, am I misinterpreting the timer trace event messages?      
> > > > > 
> > > > > So I did some digging, and what you find is that rcu_sched seems to do a
> > > > > simple schedule_timeout(1) and just goes out to lunch for many seconds.
> > > > > The process_timeout timer never fires (when it finally does wake after
> > > > > one of these events, it usually removes the timer with del_timer_sync).
> > > > > 
> > > > > So this patch seems to fix it. Testing, comments welcome.    
> > > > 
> > > > Okay this had a problem of trying to forward the timer from a timer
> > > > callback function.
> > > > 
> > > > This was my other approach which also fixes the RCU warnings, but it's
> > > > a little more complex. I reworked it a bit so the mod_timer fast path
> > > > hopefully doesn't have much more overhead (actually by reading jiffies
> > > > only when needed, it probably saves a load).    
> > > 
> > > Giving this one a whirl!    
> > 
> > No joy here, but then again there are other reasons to believe that I
> > am seeing a different bug than Dave and Jonathan are.
> > 
> > OK, not -entirely- without joy -- 10 of 14 runs were error-free, which
> > is a good improvement over 0 of 84 for your earlier patch.  ;-)  But
> > not statistically different from what I see without either patch.
> > 
> > But no statistical difference compared to without patch, and I still
> > see the "rcu_sched kthread starved" messages.  For whatever it is worth,
> > by the way, I also see this: "hrtimer: interrupt took 5712368 ns".
> > Hmmm...  I am also seeing that without any of your patches.  Might
> > be hypervisor preemption, I guess.  
> 
> Okay it makes the warnings go away for me, but I'm just booting then
> leaving the system idle. You're doing some CPU hotplug activity?

Okay found a bug in the patch (it was not forwarding properly before
adding the first timer after an idle) and a few other concerns.

There's still a problem of a timer function doing a mod timer from
within expire_timers. It can't forward the base, which might currently
be quite a way behind. I *think* after we close these gaps and get
timely wakeups for timers on there, it should not get too far behind
for standard timers.
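
To make that case concrete, here is a minimal, hypothetical sketch (made-up
names, 4.13-era setup_timer()/unsigned long callback API -- not code from
anywhere in this thread) of a timer function re-arming itself while
expire_timers is running:

static struct timer_list poll_timer;	/* hypothetical */

static void poll_timer_fn(unsigned long data)
{
	/*
	 * Called from expire_timers() via call_timer_fn() with the base
	 * lock dropped. Re-arming here cannot forward the base, so the
	 * new wheel index is computed against whatever base->clk happens
	 * to be, which may still be a long way behind jiffies.
	 */
	mod_timer(&poll_timer, jiffies + HZ / 10);
}

static void poll_start(void)
{
	setup_timer(&poll_timer, poll_timer_fn, 0);
	mod_timer(&poll_timer, jiffies + HZ / 10);
}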

Deferrable is a different story. Firstly it has no idle tracking so we
never forward it. Even if we wanted to, we can't do it reliably because
it could contain timers way behind the base. They are "deferrable", so
you get what you pay for, but this still means there's a window where
you can add a deferrable timer and get a far later expiry than you
asked for despite the CPU never going idle after you added it.
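
To illustrate (hypothetical code again, assuming the 4.13-era
init_timer_deferrable() style of API and made-up names), something like
this can end up expiring much later than jiffies + 2 if the deferrable
base was left behind by an earlier long idle period:

static struct timer_list housekeep_timer;	/* hypothetical */

static void housekeep_fn(unsigned long data)
{
	/* low priority housekeeping, happy to be deferred */
}

static void housekeep_arm(void)
{
	init_timer_deferrable(&housekeep_timer);
	housekeep_timer.function = housekeep_fn;
	housekeep_timer.data = 0;
	/*
	 * BASE_DEF is never forwarded, so base->clk can be far behind
	 * jiffies here. The bucket chosen for jiffies + 2 is computed
	 * against that stale clock, so the real expiry can be much
	 * later even though this CPU stays busy from now on.
	 */
	mod_timer(&housekeep_timer, jiffies + 2);
}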

All these problems would seem to go away if mod_timer just queued up
the timer to a single list on the base then pushed them into the
wheel during your wheel processing softirq... Although maybe you end
up with excessive passes over a big queue of timers. Anyway that
wouldn't be suitable for 4.13 even if it could work.
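
Very roughly, and ignoring locking and all of the pending-map/next-expiry
bookkeeping, the softirq side of that idea could look like the sketch
below (base->new_timers is a made-up field; enqueue_timer() and
calc_wheel_index() are the existing helpers in kernel/time/timer.c --
this is only a sketch, not a proposal):

static void flush_new_timers(struct timer_base *base)
{
	struct timer_list *timer;
	struct hlist_node *tmp;

	/* base->clk is assumed to have been caught up to jiffies already */
	hlist_for_each_entry_safe(timer, tmp, &base->new_timers, entry) {
		hlist_del(&timer->entry);
		enqueue_timer(base, timer,
			      calc_wheel_index(timer->expires, base->clk));
	}
}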

I'll send out an updated minimal fix after some more testing...

Thanks,
Nick

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-21  6:06                                                                                                 ` Nicholas Piggin
@ 2017-08-21 10:18                                                                                                   ` Jonathan Cameron
  -1 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-08-21 10:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 21 Aug 2017 16:06:05 +1000
Nicholas Piggin <npiggin@gmail.com> wrote:

> On Mon, 21 Aug 2017 10:52:58 +1000
> Nicholas Piggin <npiggin@gmail.com> wrote:
> 
> > On Sun, 20 Aug 2017 14:14:29 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > On Sun, Aug 20, 2017 at 11:35:14AM -0700, Paul E. McKenney wrote:    
> > > > On Sun, Aug 20, 2017 at 11:00:40PM +1000, Nicholas Piggin wrote:      
> > > > > On Sun, 20 Aug 2017 14:45:53 +1000
> > > > > Nicholas Piggin <npiggin@gmail.com> wrote:
> > > > >       
> > > > > > On Wed, 16 Aug 2017 09:27:31 -0700
> > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:      
> > > > > > > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > > > > > > 
> > > > > > > Thomas, John, am I misinterpreting the timer trace event messages?        
> > > > > > 
> > > > > > So I did some digging, and what you find is that rcu_sched seems to do a
> > > > > > simple schedule_timeout(1) and just goes out to lunch for many seconds.
> > > > > > The process_timeout timer never fires (when it finally does wake after
> > > > > > one of these events, it usually removes the timer with del_timer_sync).
> > > > > > 
> > > > > > So this patch seems to fix it. Testing, comments welcome.      
> > > > > 
> > > > > Okay this had a problem of trying to forward the timer from a timer
> > > > > callback function.
> > > > > 
> > > > > This was my other approach which also fixes the RCU warnings, but it's
> > > > > a little more complex. I reworked it a bit so the mod_timer fast path
> > > > > hopefully doesn't have much more overhead (actually by reading jiffies
> > > > > only when needed, it probably saves a load).      
> > > > 
> > > > Giving this one a whirl!      
> > > 
> > > No joy here, but then again there are other reasons to believe that I
> > > am seeing a different bug than Dave and Jonathan are.
> > > 
> > > OK, not -entirely- without joy -- 10 of 14 runs were error-free, which
> > > is a good improvement over 0 of 84 for your earlier patch.  ;-)  But
> > > not statistically different from what I see without either patch.
> > > 
> > > But no statistical difference compared to without patch, and I still
> > > see the "rcu_sched kthread starved" messages.  For whatever it is worth,
> > > by the way, I also see this: "hrtimer: interrupt took 5712368 ns".
> > > Hmmm...  I am also seeing that without any of your patches.  Might
> > > be hypervisor preemption, I guess.    
> > 
> > Okay it makes the warnings go away for me, but I'm just booting then
> > leaving the system idle. You're doing some CPU hotplug activity?  
> 
> Okay found a bug in the patch (it was not forwarding properly before
> adding the first timer after an idle) and a few other concerns.
> 
> There's still a problem of a timer function doing a mod timer from
> within expire_timers. It can't forward the base, which might currently
> be quite a way behind. I *think* after we close these gaps and get
> timely wakeups for timers on there, it should not get too far behind
> for standard timers.
> 
> Deferrable is a different story. Firstly it has no idle tracking so we
> never forward it. Even if we wanted to, we can't do it reliably because
> it could contain timers way behind the base. They are "deferrable", so
> you get what you pay for, but this still means there's a window where
> you can add a deferrable timer and get a far later expiry than you
> asked for despite the CPU never going idle after you added it.
> 
> All these problems would seem to go away if mod_timer just queued up
> the timer to a single list on the base then pushed them into the
> wheel during your wheel processing softirq... Although maybe you end
> up with excessive passes over a big queue of timers. Anyway that
> wouldn't be suitable for 4.13 even if it could work.
> 
> I'll send out an updated minimal fix after some more testing...

Hi All,

I'm back in the office with hardware access on our D05 64 core ARM64
boards.

I think we still have by far the quickest test cases for this so
feel free to ping me anything you want tested quickly (we were
looking at an average of less than 10 minutes to trigger
with machine idling).

Nick, I'm currently running your previous version and we are over an
hour in without even a single instance of the issue, so it looks like a
considerable improvement.  I'll see if I can line a couple of boards
up for an overnight run if you have your updated version out by then.

Be great to finally put this one to bed.

Thanks,

Jonathan

> 
> Thanks,
> Nick


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-21 10:18                                                                                                   ` Jonathan Cameron
@ 2017-08-21 14:19                                                                                                     ` Nicholas Piggin
  -1 siblings, 0 replies; 241+ messages in thread
From: Nicholas Piggin @ 2017-08-21 14:19 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 21 Aug 2017 11:18:33 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Mon, 21 Aug 2017 16:06:05 +1000
> Nicholas Piggin <npiggin@gmail.com> wrote:
> 
> > On Mon, 21 Aug 2017 10:52:58 +1000
> > Nicholas Piggin <npiggin@gmail.com> wrote:
> >   
> > > On Sun, 20 Aug 2017 14:14:29 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >     
> > > > On Sun, Aug 20, 2017 at 11:35:14AM -0700, Paul E. McKenney wrote:      
> > > > > On Sun, Aug 20, 2017 at 11:00:40PM +1000, Nicholas Piggin wrote:        
> > > > > > On Sun, 20 Aug 2017 14:45:53 +1000
> > > > > > Nicholas Piggin <npiggin@gmail.com> wrote:
> > > > > >         
> > > > > > > On Wed, 16 Aug 2017 09:27:31 -0700
> > > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:        
> > > > > > > > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > > > > > > > 
> > > > > > > > Thomas, John, am I misinterpreting the timer trace event messages?          
> > > > > > > 
> > > > > > > So I did some digging, and what you find is that rcu_sched seems to do a
> > > > > > > simple schedule_timeout(1) and just goes out to lunch for many seconds.
> > > > > > > The process_timeout timer never fires (when it finally does wake after
> > > > > > > one of these events, it usually removes the timer with del_timer_sync).
> > > > > > > 
> > > > > > > So this patch seems to fix it. Testing, comments welcome.        
> > > > > > 
> > > > > > Okay this had a problem of trying to forward the timer from a timer
> > > > > > callback function.
> > > > > > 
> > > > > > This was my other approach which also fixes the RCU warnings, but it's
> > > > > > a little more complex. I reworked it a bit so the mod_timer fast path
> > > > > > hopefully doesn't have much more overhead (actually by reading jiffies
> > > > > > only when needed, it probably saves a load).        
> > > > > 
> > > > > Giving this one a whirl!        
> > > > 
> > > > No joy here, but then again there are other reasons to believe that I
> > > > am seeing a different bug than Dave and Jonathan are.
> > > > 
> > > > OK, not -entirely- without joy -- 10 of 14 runs were error-free, which
> > > > is a good improvement over 0 of 84 for your earlier patch.  ;-)  But
> > > > not statistically different from what I see without either patch.
> > > > 
> > > > But no statistical difference compared to without patch, and I still
> > > > see the "rcu_sched kthread starved" messages.  For whatever it is worth,
> > > > by the way, I also see this: "hrtimer: interrupt took 5712368 ns".
> > > > Hmmm...  I am also seeing that without any of your patches.  Might
> > > > be hypervisor preemption, I guess.      
> > > 
> > > Okay it makes the warnings go away for me, but I'm just booting then
> > > leaving the system idle. You're doing some CPU hotplug activity?    
> > 
> > Okay found a bug in the patch (it was not forwarding properly before
> > adding the first timer after an idle) and a few other concerns.
> > 
> > There's still a problem of a timer function doing a mod timer from
> > within expire_timers. It can't forward the base, which might currently
> > be quite a way behind. I *think* after we close these gaps and get
> > timely wakeups for timers on there, it should not get too far behind
> > for standard timers.
> > 
> > Deferrable is a different story. Firstly it has no idle tracking so we
> > never forward it. Even if we wanted to, we can't do it reliably because
> > it could contain timers way behind the base. They are "deferrable", so
> > you get what you pay for, but this still means there's a window where
> > you can add a deferrable timer and get a far later expiry than you
> > asked for despite the CPU never going idle after you added it.
> > 
> > All these problems would seem to go away if mod_timer just queued up
> > the timer to a single list on the base then pushed them into the
> > wheel during your wheel processing softirq... Although maybe you end
> > up with excessive passes over a big queue of timers. Anyway that
> > wouldn't be suitable for 4.13 even if it could work.
> > 
> > I'll send out an updated minimal fix after some more testing...  
> 
> Hi All,
> 
> I'm back in the office with hardware access on our D05 64 core ARM64
> boards.
> 
> I think we still have by far the quickest test cases for this so
> feel free to ping me anything you want tested quickly (we were
> looking at an average of less than 10 minutes to trigger
> with machine idling).
> 
> Nick, I'm currently running your previous version and we are over an
> hour in without even a single instance of the issue, so it looks like a
> considerable improvement.  I'll see if I can line a couple of boards
> up for an overnight run if you have your updated version out by then.
> 
> Be great to finally put this one to bed.

Hi Jonathan,

Thanks here's an updated version with a couple more bugs fixed. If
you could try testing, that would be much appreciated.
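
If it helps with testing, below is a rough, hypothetical sketch (not
something from my tree, just an illustration with made-up names) of a
small module that warns when a 1 jiffy schedule_timeout() comes back
much later than it should -- which is the symptom we have been chasing:

#include <linux/module.h>
#include <linux/kthread.h>
#include <linux/delay.h>
#include <linux/ktime.h>
#include <linux/sched.h>
#include <linux/err.h>

static struct task_struct *chk_task;

static int timeout_chk(void *unused)
{
	while (!kthread_should_stop()) {
		ktime_t t0 = ktime_get();
		s64 ms;

		set_current_state(TASK_UNINTERRUPTIBLE);
		schedule_timeout(1);		/* ask for ~1 jiffy */

		ms = ktime_ms_delta(ktime_get(), t0);
		if (ms > 1000)
			pr_warn("timeout_chk: 1 jiffy took %lld ms\n", ms);

		msleep(5000);			/* let this CPU go idle again */
	}
	return 0;
}

static int __init timeout_chk_init(void)
{
	chk_task = kthread_run(timeout_chk, NULL, "timeout_chk");
	return PTR_ERR_OR_ZERO(chk_task);
}
module_init(timeout_chk_init);

static void __exit timeout_chk_exit(void)
{
	kthread_stop(chk_task);
}
module_exit(timeout_chk_exit);

MODULE_LICENSE("GPL");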

Thanks,
Nick

---
 kernel/time/timer.c | 43 +++++++++++++++++++++++++++++++++++--------
 1 file changed, 35 insertions(+), 8 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 8f5d1bf18854..2b9d2cdb3fac 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -203,6 +203,7 @@ struct timer_base {
 	bool			migration_enabled;
 	bool			nohz_active;
 	bool			is_idle;
+	bool			was_idle; /* was it idle since last run/fwded */
 	DECLARE_BITMAP(pending_map, WHEEL_SIZE);
 	struct hlist_head	vectors[WHEEL_SIZE];
 } ____cacheline_aligned;
@@ -856,13 +857,19 @@ get_target_base(struct timer_base *base, unsigned tflags)
 
 static inline void forward_timer_base(struct timer_base *base)
 {
-	unsigned long jnow = READ_ONCE(jiffies);
+	unsigned long jnow;
 
 	/*
-	 * We only forward the base when it's idle and we have a delta between
-	 * base clock and jiffies.
+	 * We only forward the base when we are idle or have just come out
+	 * of idle (was_idle logic), and have a delta between base clock
+	 * and jiffies. In the common case, run_timers will take care of it.
 	 */
-	if (!base->is_idle || (long) (jnow - base->clk) < 2)
+	if (likely(!base->was_idle))
+		return;
+
+	jnow = READ_ONCE(jiffies);
+	base->was_idle = base->is_idle;
+	if ((long)(jnow - base->clk) < 2)
 		return;
 
 	/*
@@ -938,6 +945,13 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
 	 * same array bucket then just return:
 	 */
 	if (timer_pending(timer)) {
+		/*
+		 * The downside of this optimization is that it can result in
+		 * larger granularity than you would get from adding a new
+		 * timer with this expiry. Would a timer flag for networking
+		 * be appropriate, then we can try to keep expiry of general
+		 * timers within ~1/8th of their interval?
+		 */
 		if (timer->expires == expires)
 			return 1;
 
@@ -948,6 +962,7 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
 		 * dequeue/enqueue dance.
 		 */
 		base = lock_timer_base(timer, &flags);
+		forward_timer_base(base);
 
 		clk = base->clk;
 		idx = calc_wheel_index(expires, clk);
@@ -964,6 +979,7 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
 		}
 	} else {
 		base = lock_timer_base(timer, &flags);
+		forward_timer_base(base);
 	}
 
 	ret = detach_if_pending(timer, base, false);
@@ -991,12 +1007,10 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
 			raw_spin_lock(&base->lock);
 			WRITE_ONCE(timer->flags,
 				   (timer->flags & ~TIMER_BASEMASK) | base->cpu);
+			forward_timer_base(base);
 		}
 	}
 
-	/* Try to forward a stale timer base clock */
-	forward_timer_base(base);
-
 	timer->expires = expires;
 	/*
 	 * If 'idx' was calculated above and the base time did not advance
@@ -1499,8 +1513,10 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
 		/*
 		 * If we expect to sleep more than a tick, mark the base idle:
 		 */
-		if ((expires - basem) > TICK_NSEC)
+		if ((expires - basem) > TICK_NSEC) {
+			base->was_idle = true;
 			base->is_idle = true;
+		}
 	}
 	raw_spin_unlock(&base->lock);
 
@@ -1611,6 +1627,17 @@ static __latent_entropy void run_timer_softirq(struct softirq_action *h)
 {
 	struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]);
 
+	/*
+	 * was_idle must be cleared before running timers so that any timer
+	 * functions that call mod_timer will not try to forward the base.
+	 *
+	 * The deferrable base does not do idle tracking at all, so we do
+	 * not forward it. This can result in very large variations in
+	 * granularity for deferrable timers, but they can be deferred for
+	 * long periods due to idle.
+	 */
+	base->was_idle = false;
+
 	__run_timers(base);
 	if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && base->nohz_active)
 		__run_timers(this_cpu_ptr(&timer_bases[BASE_DEF]));
-- 
2.13.3



^ permalink raw reply related	[flat|nested] 241+ messages in thread
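
A toy model of the was_idle logic in the patch above, runnable in
userspace. This is not kernel code: plain integers stand in for jiffies
and the wheel, there is no locking, and every name not taken from the
patch (toy_base, toy_mod_timer and friends) is invented for the
illustration. It shows why forwarding on the first mod_timer after an
idle period matters: the wheel buckets a timer by its distance from
base->clk, so a stale clock turns a short timeout into a long one.

#include <stdio.h>
#include <stdbool.h>

static unsigned long jiffies;		/* fake tick counter */

struct toy_base {
	unsigned long clk;		/* last jiffy the wheel was processed for */
	bool is_idle;			/* CPU expected to sleep more than a tick */
	bool was_idle;			/* idle at some point since last run/forward */
};

/* Same shape as forward_timer_base() in the patch: only catch up after idle. */
static void forward_timer_base(struct toy_base *base)
{
	if (!base->was_idle)
		return;			/* busy case: run_timers keeps clk current */

	base->was_idle = base->is_idle;
	if ((long)(jiffies - base->clk) < 2)
		return;

	base->clk = jiffies;		/* the real code also honours next_expiry */
}

/* The part of __mod_timer() that matters here: forward before bucketing. */
static long toy_mod_timer(struct toy_base *base, unsigned long expires)
{
	forward_timer_base(base);
	/* calc_wheel_index() works on (expires - base->clk); the bigger that
	 * distance, the coarser the wheel level the timer lands in. */
	return (long)(expires - base->clk);
}

int main(void)
{
	struct toy_base base = { .clk = 0 };

	/* get_next_timer_interrupt() marks the base idle, CPU sleeps ~1000 ticks */
	base.is_idle = base.was_idle = true;
	jiffies += 1000;
	base.is_idle = false;

	printf("distance seen by the wheel, with forward: %ld\n",
	       toy_mod_timer(&base, jiffies + 5));

	/* the same request against a stale clock that never gets forwarded */
	base.clk = 0;
	base.was_idle = false;
	printf("distance seen by the wheel, stale clk:    %ld\n",
	       toy_mod_timer(&base, jiffies + 5));

	return 0;
}

Compiled with something like "cc -o toy toy.c", this prints 5 for the
forwarded case and 1005 for the stale one.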

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-21 14:19                                                                                                     ` Nicholas Piggin
  0 siblings, 0 replies; 241+ messages in thread
From: Nicholas Piggin @ 2017-08-21 14:19 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Paul E. McKenney, Michael Ellerman, dzickus, sfr, linuxarm,
	abdhalee, tglx, sparclinux, akpm, linuxppc-dev, David Miller,
	linux-arm-kernel, john.stultz

On Mon, 21 Aug 2017 11:18:33 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Mon, 21 Aug 2017 16:06:05 +1000
> Nicholas Piggin <npiggin@gmail.com> wrote:
> 
> > On Mon, 21 Aug 2017 10:52:58 +1000
> > Nicholas Piggin <npiggin@gmail.com> wrote:
> >   
> > > On Sun, 20 Aug 2017 14:14:29 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >     
> > > > On Sun, Aug 20, 2017 at 11:35:14AM -0700, Paul E. McKenney wrote:      
> > > > > On Sun, Aug 20, 2017 at 11:00:40PM +1000, Nicholas Piggin wrote:        
> > > > > > On Sun, 20 Aug 2017 14:45:53 +1000
> > > > > > Nicholas Piggin <npiggin@gmail.com> wrote:
> > > > > >         
> > > > > > > On Wed, 16 Aug 2017 09:27:31 -0700
> > > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:        
> > > > > > > > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > > > > > > > 
> > > > > > > > Thomas, John, am I misinterpreting the timer trace event messages?          
> > > > > > > 
> > > > > > > So I did some digging, and what you find is that rcu_sched seems to do a
> > > > > > > simple schedule_timeout(1) and just goes out to lunch for many seconds.
> > > > > > > The process_timeout timer never fires (when it finally does wake after
> > > > > > > one of these events, it usually removes the timer with del_timer_sync).
> > > > > > > 
> > > > > > > So this patch seems to fix it. Testing, comments welcome.        
> > > > > > 
> > > > > > Okay this had a problem of trying to forward the timer from a timer
> > > > > > callback function.
> > > > > > 
> > > > > > This was my other approach which also fixes the RCU warnings, but it's
> > > > > > a little more complex. I reworked it a bit so the mod_timer fast path
> > > > > > hopefully doesn't have much more overhead (actually by reading jiffies
> > > > > > only when needed, it probably saves a load).        
> > > > > 
> > > > > Giving this one a whirl!        
> > > > 
> > > > No joy here, but then again there are other reasons to believe that I
> > > > am seeing a different bug than Dave and Jonathan are.
> > > > 
> > > > OK, not -entirely- without joy -- 10 of 14 runs were error-free, which
> > > > is a good improvement over 0 of 84 for your earlier patch.  ;-)  But
> > > > not statistically different from what I see without either patch.
> > > > 
> > > > But no statistical difference compared to without patch, and I still
> > > > see the "rcu_sched kthread starved" messages.  For whatever it is worth,
> > > > by the way, I also see this: "hrtimer: interrupt took 5712368 ns".
> > > > Hmmm...  I am also seeing that without any of your patches.  Might
> > > > be hypervisor preemption, I guess.      
> > > 
> > > Okay it makes the warnings go away for me, but I'm just booting then
> > > leaving the system idle. You're doing some CPU hotplug activity?    
> > 
> > Okay found a bug in the patch (it was not forwarding properly before
> > adding the first timer after an idle) and a few other concerns.
> > 
> > There's still a problem of a timer function doing a mod timer from
> > within expire_timers. It can't forward the base, which might currently
> > be quite a way behind. I *think* after we close these gaps and get
> > timely wakeups for timers on there, it should not get too far behind
> > for standard timers.
> > 
> > Deferrable is a different story. Firstly it has no idle tracking so we
> > never forward it. Even if we wanted to, we can't do it reliably because
> > it could contain timers way behind the base. They are "deferrable", so
> > you get what you pay for, but this still means there's a window where
> > you can add a deferrable timer and get a far later expiry than you
> > asked for despite the CPU never going idle after you added it.
> > 
> > All these problems would seem to go away if mod_timer just queued up
> > the timer to a single list on the base then pushed them into the
> > wheel during your wheel processing softirq... Although maybe you end
> > up with excessive passes over big queue of timers. Anyway that
> > wouldn't be suitable for 4.13 even if it could work.
> > 
> > I'll send out an updated minimal fix after some more testing...  
> 
> Hi All,
> 
> I'm back in the office with hardware access on our D05 64 core ARM64
> boards.
> 
> I think we still have by far the quickest test cases for this so
> feel free to ping me anything you want tested quickly (we were
> looking at an average of less than 10 minutes to trigger
> with machine idling).
> 
> Nick, I'm currently running your previous version and we are over an
> hour in now without any instances of the issue, so it looks like a
> considerable improvement.  I'll see if I can line a couple of boards
> up for an overnight run if you have your updated version out by then.
> 
> Be great to finally put this one to bed.

Hi Jonathan,

Thanks, here's an updated version with a couple more bugs fixed. If
you could try testing, that would be much appreciated.

Thanks,
Nick

---
 kernel/time/timer.c | 43 +++++++++++++++++++++++++++++++++++--------
 1 file changed, 35 insertions(+), 8 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 8f5d1bf18854..2b9d2cdb3fac 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -203,6 +203,7 @@ struct timer_base {
 	bool			migration_enabled;
 	bool			nohz_active;
 	bool			is_idle;
+	bool			was_idle; /* was it idle since last run/fwded */
 	DECLARE_BITMAP(pending_map, WHEEL_SIZE);
 	struct hlist_head	vectors[WHEEL_SIZE];
 } ____cacheline_aligned;
@@ -856,13 +857,19 @@ get_target_base(struct timer_base *base, unsigned tflags)
 
 static inline void forward_timer_base(struct timer_base *base)
 {
-	unsigned long jnow = READ_ONCE(jiffies);
+	unsigned long jnow;
 
 	/*
-	 * We only forward the base when it's idle and we have a delta between
-	 * base clock and jiffies.
+	 * We only forward the base when we are idle or have just come out
+	 * of idle (was_idle logic), and have a delta between base clock
+	 * and jiffies. In the common case, run_timers will take care of it.
 	 */
-	if (!base->is_idle || (long) (jnow - base->clk) < 2)
+	if (likely(!base->was_idle))
+		return;
+
+	jnow = READ_ONCE(jiffies);
+	base->was_idle = base->is_idle;
+	if ((long)(jnow - base->clk) < 2)
 		return;
 
 	/*
@@ -938,6 +945,13 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
 	 * same array bucket then just return:
 	 */
 	if (timer_pending(timer)) {
+		/*
+		 * The downside of this optimization is that it can result in
+		 * larger granularity than you would get from adding a new
+		 * timer with this expiry. Would a timer flag for networking
+		 * be appropriate, then we can try to keep expiry of general
+		 * timers within ~1/8th of their interval?
+		 */
 		if (timer->expires == expires)
 			return 1;
 
@@ -948,6 +962,7 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
 		 * dequeue/enqueue dance.
 		 */
 		base = lock_timer_base(timer, &flags);
+		forward_timer_base(base);
 
 		clk = base->clk;
 		idx = calc_wheel_index(expires, clk);
@@ -964,6 +979,7 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
 		}
 	} else {
 		base = lock_timer_base(timer, &flags);
+		forward_timer_base(base);
 	}
 
 	ret = detach_if_pending(timer, base, false);
@@ -991,12 +1007,10 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
 			raw_spin_lock(&base->lock);
 			WRITE_ONCE(timer->flags,
 				   (timer->flags & ~TIMER_BASEMASK) | base->cpu);
+			forward_timer_base(base);
 		}
 	}
 
-	/* Try to forward a stale timer base clock */
-	forward_timer_base(base);
-
 	timer->expires = expires;
 	/*
 	 * If 'idx' was calculated above and the base time did not advance
@@ -1499,8 +1513,10 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
 		/*
 		 * If we expect to sleep more than a tick, mark the base idle:
 		 */
-		if ((expires - basem) > TICK_NSEC)
+		if ((expires - basem) > TICK_NSEC) {
+			base->was_idle = true;
 			base->is_idle = true;
+		}
 	}
 	raw_spin_unlock(&base->lock);
 
@@ -1611,6 +1627,17 @@ static __latent_entropy void run_timer_softirq(struct softirq_action *h)
 {
 	struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]);
 
+	/*
+	 * was_idle must be cleared before running timers so that any timer
+	 * functions that call mod_timer will not try to forward the base.
+	 *
+	 * The deferrable base does not do idle tracking at all, so we do
+	 * not forward it. This can result in very large variations in
+	 * granularity for deferrable timers, but they can be deferred for
+	 * long periods due to idle.
+	 */
+	base->was_idle = false;
+
 	__run_timers(base);
 	if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && base->nohz_active)
 		__run_timers(this_cpu_ptr(&timer_bases[BASE_DEF]));
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 241+ messages in thread
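
To put numbers on the failure mode described in the quoted mail: once
base->clk has been left well behind jiffies, even a one-tick timeout is
bucketed as if it were a long one, and the wheel's per-level
granularity then rounds its expiry far out. The ladder below is only a
rough sketch of the wheel's levels (64 slots per level, each level 8x
coarser), and HZ=250 is just an assumed example rate; the point is the
order of magnitude, not the exact numbers.

#include <stdio.h>

#define ASSUMED_HZ 250UL	/* example tick rate, not taken from anywhere */

/*
 * Rough per-level granularity of the wheel: deltas under 64 ticks are
 * exact, under 512 they round to 8-tick steps, under 4096 to 64-tick
 * steps, and so on.
 */
static unsigned long toy_granularity(unsigned long delta)
{
	unsigned long limit = 64, gran = 1;

	while (delta >= limit) {
		limit *= 8;	/* next level covers 8x the range ... */
		gran *= 8;	/* ... at 8x coarser steps */
	}
	return gran;
}

int main(void)
{
	unsigned long stale = 30 * ASSUMED_HZ;	/* base->clk ~30s behind */
	unsigned long want = 1;			/* schedule_timeout(1) */
	unsigned long gran = toy_granularity(stale + want);

	printf("asked for %lu tick(s); seen as a %lu-tick timer, rounded in %lu-tick steps (~%lus)\n",
	       want, stale + want, gran, gran / ASSUMED_HZ);
	return 0;
}

With a 30 second lag this prints a 512-tick (roughly 2 second) step,
which is the sort of delay behind the stalls discussed above.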

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-21 15:02                                                                                                       ` Jonathan Cameron
  0 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-08-21 15:02 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Paul E. McKenney, Michael Ellerman, dzickus, sfr, linuxarm,
	abdhalee, tglx, sparclinux, akpm, linuxppc-dev, David Miller,
	linux-arm-kernel, john.stultz

On Tue, 22 Aug 2017 00:19:28 +1000
Nicholas Piggin <npiggin@gmail.com> wrote:

> On Mon, 21 Aug 2017 11:18:33 +0100
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> 
> > On Mon, 21 Aug 2017 16:06:05 +1000
> > Nicholas Piggin <npiggin@gmail.com> wrote:
> >   
> > > On Mon, 21 Aug 2017 10:52:58 +1000
> > > Nicholas Piggin <npiggin@gmail.com> wrote:
> > >     
> > > > On Sun, 20 Aug 2017 14:14:29 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > >       
> > > > > On Sun, Aug 20, 2017 at 11:35:14AM -0700, Paul E. McKenney wrote:        
> > > > > > On Sun, Aug 20, 2017 at 11:00:40PM +1000, Nicholas Piggin wrote:          
> > > > > > > On Sun, 20 Aug 2017 14:45:53 +1000
> > > > > > > Nicholas Piggin <npiggin@gmail.com> wrote:
> > > > > > >           
> > > > > > > > On Wed, 16 Aug 2017 09:27:31 -0700
> > > > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:          
> > > > > > > > > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > > > > > > > > 
> > > > > > > > > Thomas, John, am I misinterpreting the timer trace event messages?            
> > > > > > > > 
> > > > > > > > So I did some digging, and what you find is that rcu_sched seems to do a
> > > > > > > > simple schedule_timeout(1) and just goes out to lunch for many seconds.
> > > > > > > > The process_timeout timer never fires (when it finally does wake after
> > > > > > > > one of these events, it usually removes the timer with del_timer_sync).
> > > > > > > > 
> > > > > > > > So this patch seems to fix it. Testing, comments welcome.          
> > > > > > > 
> > > > > > > Okay this had a problem of trying to forward the timer from a timer
> > > > > > > callback function.
> > > > > > > 
> > > > > > > This was my other approach which also fixes the RCU warnings, but it's
> > > > > > > a little more complex. I reworked it a bit so the mod_timer fast path
> > > > > > > hopefully doesn't have much more overhead (actually by reading jiffies
> > > > > > > only when needed, it probably saves a load).          
> > > > > > 
> > > > > > Giving this one a whirl!          
> > > > > 
> > > > > No joy here, but then again there are other reasons to believe that I
> > > > > am seeing a different bug than Dave and Jonathan are.
> > > > > 
> > > > > OK, not -entirely- without joy -- 10 of 14 runs were error-free, which
> > > > > is a good improvement over 0 of 84 for your earlier patch.  ;-)  But
> > > > > not statistically different from what I see without either patch.
> > > > > 
> > > > > But no statistical difference compared to without patch, and I still
> > > > > see the "rcu_sched kthread starved" messages.  For whatever it is worth,
> > > > > by the way, I also see this: "hrtimer: interrupt took 5712368 ns".
> > > > > Hmmm...  I am also seeing that without any of your patches.  Might
> > > > > be hypervisor preemption, I guess.        
> > > > 
> > > > Okay it makes the warnings go away for me, but I'm just booting then
> > > > leaving the system idle. You're doing some CPU hotplug activity?      
> > > 
> > > Okay found a bug in the patch (it was not forwarding properly before
> > > adding the first timer after an idle) and a few other concerns.
> > > 
> > > There's still a problem of a timer function doing a mod timer from
> > > within expire_timers. It can't forward the base, which might currently
> > > be quite a way behind. I *think* after we close these gaps and get
> > > timely wakeups for timers on there, it should not get too far behind
> > > for standard timers.
> > > 
> > > Deferrable is a different story. Firstly it has no idle tracking so we
> > > never forward it. Even if we wanted to, we can't do it reliably because
> > > it could contain timers way behind the base. They are "deferrable", so
> > > you get what you pay for, but this still means there's a window where
> > > you can add a deferrable timer and get a far later expiry than you
> > > asked for despite the CPU never going idle after you added it.
> > > 
> > > All these problems would seem to go away if mod_timer just queued up
> > > the timer to a single list on the base then pushed them into the
> > > wheel during your wheel processing softirq... Although maybe you end
> > > up with excessive passes over big queue of timers. Anyway that
> > > wouldn't be suitable for 4.13 even if it could work.
> > > 
> > > I'll send out an updated minimal fix after some more testing...    
> > 
> > Hi All,
> > 
> > I'm back in the office with hardware access on our D05 64 core ARM64
> > boards.
> > 
> > I think we still have by far the quickest test cases for this so
> > feel free to ping me anything you want tested quickly (we were
> > looking at an average of less than 10 minutes to trigger
> > with machine idling).
> > 
> > Nick, I'm currently running your previous version and we are over an
> > hour in now without any instances of the issue, so it looks like a
> > considerable improvement.  I'll see if I can line a couple of boards
> > up for an overnight run if you have your updated version out by then.
> > 
> > Be great to finally put this one to bed.  
> 
> Hi Jonathan,
> 
> Thanks, here's an updated version with a couple more bugs fixed. If
> you could try testing, that would be much appreciated.
> 
> Thanks,
> Nick

Running now on 1 board. I'll grab another in a few hours and report back
in the morning if we don't see issues before I head off.

We got to about 5 hours on the previous version without a problem, versus
sub 10 minutes on the two baseline tests I ran without it, so even
with its bugs it seems to have dealt with the issue itself.

15 minutes in so far and all good.

Jonathan

> 
> ---
>  kernel/time/timer.c | 43 +++++++++++++++++++++++++++++++++++--------
>  1 file changed, 35 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> index 8f5d1bf18854..2b9d2cdb3fac 100644
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -203,6 +203,7 @@ struct timer_base {
>  	bool			migration_enabled;
>  	bool			nohz_active;
>  	bool			is_idle;
> +	bool			was_idle; /* was it idle since last run/fwded */
>  	DECLARE_BITMAP(pending_map, WHEEL_SIZE);
>  	struct hlist_head	vectors[WHEEL_SIZE];
>  } ____cacheline_aligned;
> @@ -856,13 +857,19 @@ get_target_base(struct timer_base *base, unsigned tflags)
>  
>  static inline void forward_timer_base(struct timer_base *base)
>  {
> -	unsigned long jnow = READ_ONCE(jiffies);
> +	unsigned long jnow;
>  
>  	/*
> -	 * We only forward the base when it's idle and we have a delta between
> -	 * base clock and jiffies.
> +	 * We only forward the base when we are idle or have just come out
> +	 * of idle (was_idle logic), and have a delta between base clock
> +	 * and jiffies. In the common case, run_timers will take care of it.
>  	 */
> -	if (!base->is_idle || (long) (jnow - base->clk) < 2)
> +	if (likely(!base->was_idle))
> +		return;
> +
> +	jnow = READ_ONCE(jiffies);
> +	base->was_idle = base->is_idle;
> +	if ((long)(jnow - base->clk) < 2)
>  		return;
>  
>  	/*
> @@ -938,6 +945,13 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
>  	 * same array bucket then just return:
>  	 */
>  	if (timer_pending(timer)) {
> +		/*
> +		 * The downside of this optimization is that it can result in
> +		 * larger granularity than you would get from adding a new
> +		 * timer with this expiry. Would a timer flag for networking
> +		 * be appropriate, then we can try to keep expiry of general
> +		 * timers within ~1/8th of their interval?
> +		 */
>  		if (timer->expires == expires)
>  			return 1;
>  
> @@ -948,6 +962,7 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
>  		 * dequeue/enqueue dance.
>  		 */
>  		base = lock_timer_base(timer, &flags);
> +		forward_timer_base(base);
>  
>  		clk = base->clk;
>  		idx = calc_wheel_index(expires, clk);
> @@ -964,6 +979,7 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
>  		}
>  	} else {
>  		base = lock_timer_base(timer, &flags);
> +		forward_timer_base(base);
>  	}
>  
>  	ret = detach_if_pending(timer, base, false);
> @@ -991,12 +1007,10 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
>  			raw_spin_lock(&base->lock);
>  			WRITE_ONCE(timer->flags,
>  				   (timer->flags & ~TIMER_BASEMASK) | base->cpu);
> +			forward_timer_base(base);
>  		}
>  	}
>  
> -	/* Try to forward a stale timer base clock */
> -	forward_timer_base(base);
> -
>  	timer->expires = expires;
>  	/*
>  	 * If 'idx' was calculated above and the base time did not advance
> @@ -1499,8 +1513,10 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
>  		/*
>  		 * If we expect to sleep more than a tick, mark the base idle:
>  		 */
> -		if ((expires - basem) > TICK_NSEC)
> +		if ((expires - basem) > TICK_NSEC) {
> +			base->was_idle = true;
>  			base->is_idle = true;
> +		}
>  	}
>  	raw_spin_unlock(&base->lock);
>  
> @@ -1611,6 +1627,17 @@ static __latent_entropy void run_timer_softirq(struct softirq_action *h)
>  {
>  	struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]);
>  
> +	/*
> +	 * was_idle must be cleared before running timers so that any timer
> +	 * functions that call mod_timer will not try to forward the base.
> +	 *
> +	 * The deferrable base does not do idle tracking at all, so we do
> +	 * not forward it. This can result in very large variations in
> +	 * granularity for deferrable timers, but they can be deferred for
> +	 * long periods due to idle.
> +	 */
> +	base->was_idle = false;
> +
>  	__run_timers(base);
>  	if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && base->nohz_active)
>  		__run_timers(this_cpu_ptr(&timer_bases[BASE_DEF]));

^ permalink raw reply	[flat|nested] 241+ messages in thread
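
The deferrable caveat quoted above can be put in the same terms: the
deferrable base is never forwarded at all, so in the window described
there - where its clock has fallen behind jiffies - the lag feeds
straight into the bucketing of the next deferrable timer, idle or not.
A sketch of that arithmetic, reusing the same rough 8x-per-level
ladder; the lag value and all names here are invented for illustration.

#include <stdio.h>

/* Same rough ladder as before: each wheel level is 8x coarser than the last. */
static unsigned long toy_granularity(unsigned long delta)
{
	unsigned long limit = 64, gran = 1;

	while (delta >= limit) {
		limit *= 8;
		gran *= 8;
	}
	return gran;
}

int main(void)
{
	/* assume the deferrable base's clk is ~2000 ticks stale, per the
	 * window described in the quoted mail; nothing forwards it */
	unsigned long lag = 2000;
	unsigned long want = 10;	/* a nominally 10-tick deferrable timer */

	printf("%lu-tick deferrable timer may land on a %lu-tick boundary\n",
	       want, toy_granularity(lag + want));
	return 0;
}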

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
@ 2017-08-21 20:55                                                                                                       ` David Miller
  0 siblings, 0 replies; 241+ messages in thread
From: David Miller @ 2017-08-21 20:55 UTC (permalink / raw)
  To: npiggin
  Cc: Jonathan.Cameron, paulmck, mpe, dzickus, sfr, linuxarm, abdhalee,
	tglx, sparclinux, akpm, linuxppc-dev, linux-arm-kernel,
	john.stultz

From: Nicholas Piggin <npiggin@gmail.com>
Date: Tue, 22 Aug 2017 00:19:28 +1000

> Thanks, here's an updated version with a couple more bugs fixed. If
> you could try testing, that would be much appreciated.

I'm not getting RCU stalls on sparc64 any longer with this patch.

I'm really happy you guys were able to figure out what was going
wrong. :-)

Feel free to add my Tested-by:

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-21  0:52                                                                                               ` Nicholas Piggin
  (?)
@ 2017-08-22  0:38                                                                                                 ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-08-22  0:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Aug 21, 2017 at 10:52:58AM +1000, Nicholas Piggin wrote:
> On Sun, 20 Aug 2017 14:14:29 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Sun, Aug 20, 2017 at 11:35:14AM -0700, Paul E. McKenney wrote:
> > > On Sun, Aug 20, 2017 at 11:00:40PM +1000, Nicholas Piggin wrote:  
> > > > On Sun, 20 Aug 2017 14:45:53 +1000
> > > > Nicholas Piggin <npiggin@gmail.com> wrote:
> > > >   
> > > > > On Wed, 16 Aug 2017 09:27:31 -0700
> > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:  
> > > > > > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > > > > > 
> > > > > > Thomas, John, am I misinterpreting the timer trace event messages?    
> > > > > 
> > > > > So I did some digging, and what you find is that rcu_sched seems to do a
> > > > > simple schedule_timeout(1) and just goes out to lunch for many seconds.
> > > > > The process_timeout timer never fires (when it finally does wake after
> > > > > one of these events, it usually removes the timer with del_timer_sync).
> > > > > 
> > > > > So this patch seems to fix it. Testing, comments welcome.  
> > > > 
> > > > Okay this had a problem of trying to forward the timer from a timer
> > > > callback function.
> > > > 
> > > > This was my other approach which also fixes the RCU warnings, but it's
> > > > a little more complex. I reworked it a bit so the mod_timer fast path
> > > > hopefully doesn't have much more overhead (actually by reading jiffies
> > > > only when needed, it probably saves a load).  
> > > 
> > > Giving this one a whirl!  
> > 
> > No joy here, but then again there are other reasons to believe that I
> > am seeing a different bug than Dave and Jonathan are.
> > 
> > OK, not -entirely- without joy -- 10 of 14 runs were error-free, which
> > is a good improvement over 0 of 84 for your earlier patch.  ;-)  But
> > not statistically different from what I see without either patch.
> > 
> > But no statistical difference compared to without patch, and I still
> > see the "rcu_sched kthread starved" messages.  For whatever it is worth,
> > by the way, I also see this: "hrtimer: interrupt took 5712368 ns".
> > Hmmm...  I am also seeing that without any of your patches.  Might
> > be hypervisor preemption, I guess.
> 
> Okay it makes the warnings go away for me, but I'm just booting then
> leaving the system idle. You're doing some CPU hotplug activity?

Yes, along with rcutorture, so a very different workload.

						Thanx, Paul


^ permalink raw reply	[flat|nested] 241+ messages in thread
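
For readers following along, the schedule_timeout(1) / process_timeout pattern
that the quoted analysis refers to boils down to roughly the following.  This is
a simplified, from-memory sketch of the ~v4.13 code, not verbatim kernel source;
MAX_SCHEDULE_TIMEOUT handling and error paths are omitted.

/*
 * Simplified sketch (not verbatim kernel source): schedule_timeout()
 * arms a one-shot timer whose callback, process_timeout(), wakes the
 * sleeping task.  If that timer is lost or never expires, the task
 * sleeps until something else wakes it -- the "goes out to lunch for
 * many seconds" symptom described above.
 */
static void process_timeout(unsigned long __data)
{
        wake_up_process((struct task_struct *)__data);
}

signed long schedule_timeout(signed long timeout)
{
        struct timer_list timer;
        unsigned long expire = timeout + jiffies;

        setup_timer_on_stack(&timer, process_timeout, (unsigned long)current);
        __mod_timer(&timer, expire, false);

        schedule();                        /* sleep; rely on the timer to wake us */

        /* The del_timer_sync()-based removal mentioned in the quote. */
        del_singleshot_timer_sync(&timer);
        destroy_timer_on_stack(&timer);

        timeout = expire - jiffies;
        return timeout < 0 ? 0 : timeout;
}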

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-21 20:55                                                                                                       ` David Miller
  (?)
@ 2017-08-22  7:49                                                                                                         ` Jonathan Cameron
  -1 siblings, 0 replies; 241+ messages in thread
From: Jonathan Cameron @ 2017-08-22  7:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 21 Aug 2017 13:55:04 -0700
David Miller <davem@davemloft.net> wrote:

> From: Nicholas Piggin <npiggin@gmail.com>
> Date: Tue, 22 Aug 2017 00:19:28 +1000
> 
> > Thanks here's an updated version with a couple more bugs fixed. If
> > you could try testing, that would be much appreciated.  
> 
> I'm not getting RCU stalls on sparc64 any longer with this patch.
> 
> I'm really happy you guys were able to figure out what was going
> wrong. :-)
> 
> Feel free to add my Tested-by:
> 

Likewise - 16 hours of clean run with the latest

Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Thanks for all the hard work everyone put into this one, great to
cross it off the list!

Jonathan

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-22  7:49                                                                                                         ` Jonathan Cameron
  (?)
@ 2017-08-22  8:51                                                                                                           ` Abdul Haleem
  -1 siblings, 0 replies; 241+ messages in thread
From: Abdul Haleem @ 2017-08-22  8:51 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: David Miller, npiggin, paulmck, mpe, dzickus, sfr, linuxarm,
	tglx, sparclinux, akpm, linuxppc-dev, linux-arm-kernel,
	john.stultz

On Tue, 2017-08-22 at 08:49 +0100, Jonathan Cameron wrote:
> On Mon, 21 Aug 2017 13:55:04 -0700
> David Miller <davem@davemloft.net> wrote:
> 
> > From: Nicholas Piggin <npiggin@gmail.com>
> > Date: Tue, 22 Aug 2017 00:19:28 +1000
> > 
> > > Thanks here's an updated version with a couple more bugs fixed. If
> > > you could try testing, that would be much appreciated.  
> > 
> > I'm not getting RCU stalls on sparc64 any longer with this patch.
> > 
> > I'm really happy you guys were able to figure out what was going
> > wrong. :-)
> > 
> > Feel free to add my Tested-by:
> > 
> 
> Likewise - 16 hours of clean run with the latest
> 
> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> Thanks for all the hard work everyone put into this one, great to
> cross it off the list!
> 
> Jonathan
> 

No more RCU stalls on PowerPC, system is clean when idle or with some
test runs.

Thank you all for your time and efforts in fixing this.

Reported-and-Tested-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>

-- 
Regards

Abdul Haleem
IBM Linux Technology Centre

^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-22  8:51                                                                                                           ` Abdul Haleem
  (?)
@ 2017-08-22 15:26                                                                                                             ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-08-22 15:26 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 22, 2017 at 02:21:32PM +0530, Abdul Haleem wrote:
> On Tue, 2017-08-22 at 08:49 +0100, Jonathan Cameron wrote:
> > On Mon, 21 Aug 2017 13:55:04 -0700
> > David Miller <davem@davemloft.net> wrote:
> > 
> > > From: Nicholas Piggin <npiggin@gmail.com>
> > > Date: Tue, 22 Aug 2017 00:19:28 +1000
> > > 
> > > > Thanks here's an updated version with a couple more bugs fixed. If
> > > > you could try testing, that would be much appreciated.  
> > > 
> > > I'm not getting RCU stalls on sparc64 any longer with this patch.
> > > 
> > > I'm really happy you guys were able to figure out what was going
> > > wrong. :-)
> > > 
> > > Feel free to add my Tested-by:
> > > 
> > 
> > Likewise - 16 hours of clean run with the latest
> > 
> > Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > 
> > Thanks for all the hard work everyone put into this one, great to
> > cross it off the list!
> > 
> > Jonathan
> > 
> 
> No more RCU stalls on PowerPC, system is clean when idle or with some
> test runs.
> 
> Thank you all for your time and efforts in fixing this.
> 
> Reported-and-Tested-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>

I am still seeing failures, but then again I am running rcutorture with
lots of CPU hotplug activity.  So I am probably seeing some other bug,
though it still looks a lot like a lost timer.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 241+ messages in thread

* Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
  2017-08-22 15:26                                                                                                             ` Paul E. McKenney
  (?)
@ 2017-09-06 12:28                                                                                                               ` Paul E. McKenney
  -1 siblings, 0 replies; 241+ messages in thread
From: Paul E. McKenney @ 2017-09-06 12:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 22, 2017 at 08:26:37AM -0700, Paul E. McKenney wrote:
> On Tue, Aug 22, 2017 at 02:21:32PM +0530, Abdul Haleem wrote:
> > On Tue, 2017-08-22 at 08:49 +0100, Jonathan Cameron wrote:

[ . . . ]

> > No more RCU stalls on PowerPC, system is clean when idle or with some
> > test runs.
> > 
> > Thank you all for your time and efforts in fixing this.
> > 
> > Reported-and-Tested-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
> 
> I am still seeing failures, but then again I am running rcutorture with
> lots of CPU hotplug activity.  So I am probably seeing some other bug,
> though it still looks a lot like a lost timer.

So one problem appears to be a timing-related deadlock between RCU and
timers.  The way that this can happen is that the outgoing CPU goes
offline (as in cpuhp_report_idle_dead() invoked from do_idle()) with
one of RCU's grace-period kthread's timers queued.  Now, if someone
waits for a grace period, either directly or indirectly, in a way that
blocks the hotplug notifiers, execution will never reach timers_dead_cpu(),
which means that the grace-period kthread will never wake, which will
mean that the grace period will never complete.  Classic deadlock.
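
Concretely, the shape of that deadlock is any hotplug teardown callback that
runs after the CPU is dead but before timers_dead_cpu() and that waits for a
grace period.  The callback below is invented purely for illustration; it is
not an existing kernel function:

/*
 * Hypothetical teardown callback, invented for illustration only --
 * not an existing kernel function.  It stands in for "someone waits
 * for a grace period in a way that blocks the hotplug notifiers".
 */
static int some_subsystem_dead(unsigned int cpu)
{
        /*
         * Runs on the control CPU after the outgoing CPU has reached
         * cpuhp_report_idle_dead() but before timers_dead_cpu().  The
         * grace period below cannot end: the GP kthread's timer is
         * still queued on the now-dead CPU, and the only thing that
         * would migrate it -- timers_dead_cpu() -- is blocked behind
         * this very wait.
         */
        synchronize_rcu();
        return 0;
}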

I currently have an extremely ugly workaround for this deadlock, which
is to periodically and (usually) redundantly wake up all the RCU
grace-period kthreads from the scheduling-interrupt handler.  This is
of course completely inappropriate for mainline, but it does reliably
prevent the "kthread starved for %ld jiffies!" type of RCU CPU stall
warning that I would otherwise see.

To mainline this, one approach would be to make the timers switch to
add_timer_on() to a surviving CPU once the offlining process starts.
Alternatively, I suppose that RCU could do the redundant-wakeup kludge,
but with checks to prevent it from happening unless (1) there is a CPU
in the process of going offline (2) there is an RCU grace period in
progress, and (3) the RCU grace period kthread has been blocked for
(say) three times longer than it should have.
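
In code, the gated version of that kludge would look something like the
following sketch, where every helper (cpu_going_offline(), gp_in_progress(),
gp_kthread_starved_for(), expected_gp_kthread_delay(),
rcu_force_gp_kthread_wakeups()) is a placeholder for illustration rather than
an existing kernel interface:

/*
 * Hypothetical sketch of the gated redundant-wakeup idea; every helper
 * named below is a placeholder, not a real kernel interface.  Meant to
 * be called from the scheduling-clock interrupt path.
 */
static void rcu_redundant_gp_wakeup_check(void)
{
        /* (1) Only while some CPU is partway through going offline. */
        if (!cpu_going_offline())
                return;

        /* (2) Only while an RCU grace period is actually in progress. */
        if (!gp_in_progress())
                return;

        /* (3) Only once the GP kthread is well overdue for its wakeup. */
        if (!gp_kthread_starved_for(3 * expected_gp_kthread_delay()))
                return;

        rcu_force_gp_kthread_wakeups();  /* redundant wakeup; normally a no-op */
}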

Unfortunately, this is not sufficient to make rcutorture run reliably,
though it does help, which is of course to say that it makes debugging
slower.  ;-)

What happens now is that random rcutorture kthreads will hang waiting for
timeouts to complete.  This confused me for a while because I expected
that the timeouts would be delayed during offline processing, but that
my crude deadlock-resolution approach would eventually get things going.
My current suspicion is that the problem is due to a potential delay
between the time an outgoing CPU hits cpuhp_report_idle_dead() and the
timers get migrated from timers_dead_cpu().  This means that the CPU
adopting the timers might be a few ticks ahead of where the outgoing CPU
last processed timers.  My current guess is that any timers queued in
intervening indexes are going to wait one good long time.  And I don't see
any code in timers_dead_cpu() that would account for this possibility,
though I of course cannot claim to fully understand this code.

Is this plausible, or am I confused?  (Either way, -something- besides
just me is rather thoroughly confused!)

If this is plausible, my guess is that timers_dead_cpu() needs to check
for mismatched indexes (in timer->flags?) and force any intervening
timers to expire if so.
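
One way to picture that check, purely as an illustration (forward_dead_base()
and splice_pending_timers() are made-up helpers, and the real migration code
in kernel/time/timer.c is structured differently):

/*
 * Illustrative sketch only, NOT the actual timers_dead_cpu() code:
 * before handing the dead CPU's pending timers to a surviving CPU,
 * account for the surviving CPU's wheel clock having advanced past
 * the point where the dead CPU stopped processing its own wheel.
 * forward_dead_base() and splice_pending_timers() are made-up names.
 */
static void migrate_timers_sketch(struct timer_base *old_base,
                                  struct timer_base *new_base)
{
        /*
         * Timers queued relative to old_base->clk may land in buckets
         * the surviving CPU has already swept past; without a fix-up
         * they are not looked at again until the wheel wraps back
         * around to them.
         */
        if (time_after(new_base->clk, old_base->clk))
                forward_dead_base(old_base, new_base->clk); /* or expire them now */

        splice_pending_timers(old_base, new_base);
}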

Thoughts?

							Thanx, Paul


^ permalink raw reply	[flat|nested] 241+ messages in thread

end of thread, other threads:[~2017-09-06 12:28 UTC | newest]

Thread overview: 241+ messages
2017-07-25 11:32 RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this? Jonathan Cameron
2017-07-25 12:26 ` Nicholas Piggin
2017-07-25 12:26   ` Nicholas Piggin
2017-07-25 13:46   ` Paul E. McKenney
2017-07-25 13:46     ` Paul E. McKenney
2017-07-25 13:46     ` Paul E. McKenney
2017-07-25 14:42     ` Jonathan Cameron
2017-07-25 14:42       ` Jonathan Cameron
2017-07-25 14:42       ` Jonathan Cameron
2017-07-25 15:12       ` Paul E. McKenney
2017-07-25 15:12         ` Paul E. McKenney
2017-07-25 15:12         ` Paul E. McKenney
2017-07-25 16:52         ` Jonathan Cameron
2017-07-25 16:52           ` Jonathan Cameron
2017-07-25 16:52           ` Jonathan Cameron
2017-07-25 21:10           ` David Miller
2017-07-25 21:10             ` David Miller
2017-07-25 21:10             ` David Miller
2017-07-26  3:55             ` Paul E. McKenney
2017-07-26  3:55               ` Paul E. McKenney
2017-07-26  3:55               ` Paul E. McKenney
2017-07-26  4:02               ` David Miller
2017-07-26  4:02                 ` David Miller
2017-07-26  4:02                 ` David Miller
2017-07-26  4:12                 ` Paul E. McKenney
2017-07-26  4:12                   ` Paul E. McKenney
2017-07-26  4:12                   ` Paul E. McKenney
2017-07-26  8:16                   ` Jonathan Cameron
2017-07-26  8:16                     ` Jonathan Cameron
2017-07-26  8:16                     ` Jonathan Cameron
2017-07-26  9:32                     ` Jonathan Cameron
2017-07-26  9:32                       ` Jonathan Cameron
2017-07-26  9:32                       ` Jonathan Cameron
2017-07-26 12:28                       ` Jonathan Cameron
2017-07-26 12:28                         ` Jonathan Cameron
2017-07-26 12:28                         ` Jonathan Cameron
2017-07-26 12:49                         ` Jonathan Cameron
2017-07-26 12:49                           ` Jonathan Cameron
2017-07-26 12:49                           ` Jonathan Cameron
2017-07-26 14:14                         ` Paul E. McKenney
2017-07-26 14:14                           ` Paul E. McKenney
2017-07-26 14:14                           ` Paul E. McKenney
2017-07-26 14:23                           ` Jonathan Cameron
2017-07-26 14:23                             ` Jonathan Cameron
2017-07-26 14:23                             ` Jonathan Cameron
2017-07-26 15:33                             ` Jonathan Cameron
2017-07-26 15:33                               ` Jonathan Cameron
2017-07-26 15:33                               ` Jonathan Cameron
2017-07-26 15:49                               ` Paul E. McKenney
2017-07-26 15:49                                 ` Paul E. McKenney
2017-07-26 15:49                                 ` Paul E. McKenney
2017-07-26 16:54                                 ` David Miller
2017-07-26 16:54                                   ` David Miller
2017-07-26 16:54                                   ` David Miller
2017-07-26 17:13                                   ` Jonathan Cameron
2017-07-26 17:13                                     ` Jonathan Cameron
2017-07-26 17:13                                     ` Jonathan Cameron
2017-07-27  7:41                                     ` Jonathan Cameron
2017-07-27  7:41                                       ` Jonathan Cameron
2017-07-27  7:41                                       ` Jonathan Cameron
2017-07-26 17:50                                   ` Paul E. McKenney
2017-07-26 17:50                                     ` Paul E. McKenney
2017-07-26 17:50                                     ` Paul E. McKenney
2017-07-26 22:36                                     ` Paul E. McKenney
2017-07-26 22:36                                       ` Paul E. McKenney
2017-07-26 22:36                                       ` Paul E. McKenney
2017-07-26 22:45                                       ` David Miller
2017-07-26 22:45                                         ` David Miller
2017-07-26 22:45                                         ` David Miller
2017-07-26 23:15                                         ` Paul E. McKenney
2017-07-26 23:15                                           ` Paul E. McKenney
2017-07-26 23:15                                           ` Paul E. McKenney
2017-07-26 23:22                                           ` David Miller
2017-07-26 23:22                                             ` David Miller
2017-07-26 23:22                                             ` David Miller
2017-07-27  1:42                                             ` Paul E. McKenney
2017-07-27  1:42                                               ` Paul E. McKenney
2017-07-27  1:42                                               ` Paul E. McKenney
2017-07-27  4:34                                               ` Nicholas Piggin
2017-07-27  4:34                                                 ` Nicholas Piggin
2017-07-27  4:34                                                 ` Nicholas Piggin
2017-07-27 12:49                                                 ` Paul E. McKenney
2017-07-27 12:49                                                   ` Paul E. McKenney
2017-07-27 12:49                                                   ` Paul E. McKenney
2017-07-27 13:49                                                   ` Jonathan Cameron
2017-07-27 13:49                                                     ` Jonathan Cameron
2017-07-27 13:49                                                     ` Jonathan Cameron
2017-07-27 16:39                                                     ` Jonathan Cameron
2017-07-27 16:39                                                       ` Jonathan Cameron
2017-07-27 16:39                                                       ` Jonathan Cameron
2017-07-27 16:52                                                       ` Paul E. McKenney
2017-07-27 16:52                                                         ` Paul E. McKenney
2017-07-27 16:52                                                         ` Paul E. McKenney
2017-07-28  7:44                                                         ` Jonathan Cameron
2017-07-28  7:44                                                           ` Jonathan Cameron
2017-07-28  7:44                                                           ` Jonathan Cameron
2017-07-28 12:54                                                           ` Boqun Feng
2017-07-28 12:54                                                             ` Boqun Feng
2017-07-28 12:54                                                             ` Boqun Feng
2017-07-28 13:13                                                             ` Jonathan Cameron
2017-07-28 13:13                                                               ` Jonathan Cameron
2017-07-28 13:13                                                               ` Jonathan Cameron
2017-07-28 14:55                                                             ` Paul E. McKenney
2017-07-28 14:55                                                               ` Paul E. McKenney
2017-07-28 14:55                                                               ` Paul E. McKenney
2017-07-28 18:41                                                               ` Paul E. McKenney
2017-07-28 18:41                                                                 ` Paul E. McKenney
2017-07-28 18:41                                                                 ` Paul E. McKenney
2017-07-28 19:09                                                                 ` Paul E. McKenney
2017-07-28 19:09                                                                   ` Paul E. McKenney
2017-07-28 19:09                                                                   ` Paul E. McKenney
2017-07-30 13:37                                                                   ` Boqun Feng
2017-07-30 13:37                                                                     ` Boqun Feng
2017-07-30 13:37                                                                     ` Boqun Feng
2017-07-30 16:59                                                                     ` Paul E. McKenney
2017-07-30 16:59                                                                       ` Paul E. McKenney
2017-07-30 16:59                                                                       ` Paul E. McKenney
2017-07-29  1:20                                                                 ` Boqun Feng
2017-07-29  1:20                                                                   ` Boqun Feng
2017-07-29  1:20                                                                   ` Boqun Feng
2017-07-28 18:42                                                             ` David Miller
2017-07-28 18:42                                                               ` David Miller
2017-07-28 18:42                                                               ` David Miller
2017-07-28 13:08                                                           ` Jonathan Cameron
2017-07-28 13:08                                                             ` Jonathan Cameron
2017-07-28 13:24                                                           ` Jonathan Cameron
2017-07-28 13:24                                                             ` Jonathan Cameron
2017-07-28 13:24                                                             ` Jonathan Cameron
2017-07-28 16:55                                                             ` Paul E. McKenney
2017-07-28 16:55                                                               ` Paul E. McKenney
2017-07-28 17:27                                                               ` Jonathan Cameron
2017-07-28 17:27                                                                 ` Jonathan Cameron
2017-07-28 17:27                                                                 ` Jonathan Cameron
2017-07-28 19:03                                                                 ` Paul E. McKenney
2017-07-28 19:03                                                                   ` Paul E. McKenney
2017-07-28 19:03                                                                   ` Paul E. McKenney
2017-07-31 11:08                                                                   ` Jonathan Cameron
2017-07-31 11:08                                                                     ` Jonathan Cameron
2017-07-31 11:08                                                                     ` Jonathan Cameron
2017-07-31 15:04                                                                     ` Paul E. McKenney
2017-07-31 15:04                                                                       ` Paul E. McKenney
2017-07-31 15:04                                                                       ` Paul E. McKenney
2017-07-31 15:27                                                                       ` Jonathan Cameron
2017-07-31 15:27                                                                         ` Jonathan Cameron
2017-07-31 15:27                                                                         ` Jonathan Cameron
2017-08-01 18:46                                                                         ` Paul E. McKenney
2017-08-01 18:46                                                                           ` Paul E. McKenney
2017-08-01 18:46                                                                           ` Paul E. McKenney
2017-08-02 16:25                                                                           ` Jonathan Cameron
2017-08-02 16:25                                                                             ` Jonathan Cameron
2017-08-02 16:25                                                                             ` Jonathan Cameron
2017-08-15 15:47                                                                             ` Paul E. McKenney
2017-08-15 15:47                                                                               ` Paul E. McKenney
2017-08-15 15:47                                                                               ` Paul E. McKenney
2017-08-16  1:24                                                                               ` Jonathan Cameron
2017-08-16  1:24                                                                                 ` Jonathan Cameron
2017-08-16  1:24                                                                                 ` Jonathan Cameron
2017-08-16 12:43                                                                               ` Michael Ellerman
2017-08-16 12:43                                                                                 ` Michael Ellerman
2017-08-16 12:43                                                                                 ` Michael Ellerman
2017-08-16 12:56                                                                                 ` Paul E. McKenney
2017-08-16 12:56                                                                                   ` Paul E. McKenney
2017-08-16 12:56                                                                                   ` Paul E. McKenney
2017-08-16 15:31                                                                                   ` Nicholas Piggin
2017-08-16 15:31                                                                                     ` Nicholas Piggin
2017-08-16 15:31                                                                                     ` Nicholas Piggin
2017-08-16 16:27                                                                                   ` Paul E. McKenney
2017-08-16 16:27                                                                                     ` Paul E. McKenney
2017-08-16 16:27                                                                                     ` Paul E. McKenney
2017-08-17 13:55                                                                                     ` Michael Ellerman
2017-08-17 13:55                                                                                       ` Michael Ellerman
2017-08-17 13:55                                                                                       ` Michael Ellerman
2017-08-20  4:45                                                                                     ` Nicholas Piggin
2017-08-20  4:45                                                                                       ` Nicholas Piggin
2017-08-20  4:45                                                                                       ` Nicholas Piggin
2017-08-20  5:01                                                                                       ` David Miller
2017-08-20  5:01                                                                                         ` David Miller
2017-08-20  5:01                                                                                         ` David Miller
2017-08-20  5:04                                                                                       ` Paul E. McKenney
2017-08-20  5:04                                                                                         ` Paul E. McKenney
2017-08-20  5:04                                                                                         ` Paul E. McKenney
2017-08-20 13:00                                                                                       ` Nicholas Piggin
2017-08-20 13:00                                                                                         ` Nicholas Piggin
2017-08-20 13:00                                                                                         ` Nicholas Piggin
2017-08-20 18:35                                                                                         ` Paul E. McKenney
2017-08-20 18:35                                                                                           ` Paul E. McKenney
2017-08-20 18:35                                                                                           ` Paul E. McKenney
2017-08-20 21:14                                                                                           ` Paul E. McKenney
2017-08-20 21:14                                                                                             ` Paul E. McKenney
2017-08-20 21:14                                                                                             ` Paul E. McKenney
2017-08-21  0:52                                                                                             ` Nicholas Piggin
2017-08-21  0:52                                                                                               ` Nicholas Piggin
2017-08-21  0:52                                                                                               ` Nicholas Piggin
2017-08-21  6:06                                                                                               ` Nicholas Piggin
2017-08-21  6:06                                                                                                 ` Nicholas Piggin
2017-08-21  6:06                                                                                                 ` Nicholas Piggin
2017-08-21 10:18                                                                                                 ` Jonathan Cameron
2017-08-21 10:18                                                                                                   ` Jonathan Cameron
2017-08-21 10:18                                                                                                   ` Jonathan Cameron
2017-08-21 14:19                                                                                                   ` Nicholas Piggin
2017-08-21 14:19                                                                                                     ` Nicholas Piggin
2017-08-21 14:19                                                                                                     ` Nicholas Piggin
2017-08-21 15:02                                                                                                     ` Jonathan Cameron
2017-08-21 15:02                                                                                                       ` Jonathan Cameron
2017-08-21 15:02                                                                                                       ` Jonathan Cameron
2017-08-21 20:55                                                                                                     ` David Miller
2017-08-21 20:55                                                                                                       ` David Miller
2017-08-21 20:55                                                                                                       ` David Miller
2017-08-22  7:49                                                                                                       ` Jonathan Cameron
2017-08-22  7:49                                                                                                         ` Jonathan Cameron
2017-08-22  7:49                                                                                                         ` Jonathan Cameron
2017-08-22  8:51                                                                                                         ` Abdul Haleem
2017-08-22  8:51                                                                                                           ` Abdul Haleem
2017-08-22  8:51                                                                                                           ` Abdul Haleem
2017-08-22 15:26                                                                                                           ` Paul E. McKenney
2017-08-22 15:26                                                                                                             ` Paul E. McKenney
2017-08-22 15:26                                                                                                             ` Paul E. McKenney
2017-09-06 12:28                                                                                                             ` Paul E. McKenney
2017-09-06 12:28                                                                                                               ` Paul E. McKenney
2017-09-06 12:28                                                                                                               ` Paul E. McKenney
2017-08-22  0:38                                                                                               ` Paul E. McKenney
2017-08-22  0:38                                                                                                 ` Paul E. McKenney
2017-08-22  0:38                                                                                                 ` Paul E. McKenney
2017-07-31 11:09                                           ` Jonathan Cameron
2017-07-31 11:09                                             ` Jonathan Cameron
2017-07-31 11:09                                             ` Jonathan Cameron
2017-07-31 11:55                                             ` Jonathan Cameron
2017-07-31 11:55                                               ` Jonathan Cameron
2017-07-31 11:55                                               ` Jonathan Cameron
2017-08-01 10:53                                               ` Jonathan Cameron
2017-08-01 10:53                                                 ` Jonathan Cameron
2017-08-01 10:53                                                 ` Jonathan Cameron
2017-07-26 16:48                           ` David Miller
2017-07-26 16:48                             ` David Miller
2017-07-26 16:48                             ` David Miller
2017-07-26  3:53           ` Paul E. McKenney
2017-07-26  3:53             ` Paul E. McKenney
2017-07-26  3:53             ` Paul E. McKenney
2017-07-26  7:51             ` Jonathan Cameron
2017-07-26  7:51               ` Jonathan Cameron
2017-07-26  7:51               ` Jonathan Cameron
