* Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
@ 2018-04-12  9:18 ` Niklas Söderlund
  0 siblings, 0 replies; 33+ messages in thread
From: Niklas Söderlund @ 2018-04-12  9:18 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Peter Zijlstra, Paul E. McKenney, Ingo Molnar, linux-kernel,
	linux-renesas-soc

Hi Vincent,

I have observed issues running on linus/master from a few days back [1].
I'm running on a Renesas Koelsch board (arm32) and I can trigger an issue
by X forwarding the v4l2 test application qv4l2 over ssh and moving the
cursor around in the GUI (best test case description award...). I'm
sorry about the really bad way I trigger this, but I can't do it in any
other way; I'm happy to try other methods if you have some ideas. The
symptom of the issue is a complete hang of the system for more than 30
seconds, after which this information is printed on the console:

[  142.849390] INFO: rcu_sched detected stalls on CPUs/tasks:
[  142.854972]  1-...!: (1 GPs behind) idle=7a4/0/0 softirq=3214/3217 fqs=0
[  142.861976]  (detected by 0, t=8232 jiffies, g=930, c=929, q=11)
[  142.868042] Sending NMI from CPU 0 to CPUs 1:
[  142.872443] NMI backtrace for cpu 1
[  142.872452] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-05506-g28aba11c1393691a #14
[  142.872455] Hardware name: Generic R8A7791 (Flattened Device Tree)
[  142.872473] PC is at arch_cpu_idle+0x28/0x44
[  142.872484] LR is at trace_hardirqs_on_caller+0x1a4/0x1d4
[  142.872488] pc : [<c01088a4>]    lr : [<c01747a8>]    psr: 20070013
[  142.872491] sp : eb0b9f90  ip : eb0b9f60  fp : eb0b9f9c
[  142.872495] r10: 00000000  r9 : 413fc0f2  r8 : 4000406a
[  142.872498] r7 : c0c08478  r6 : c0c0842c  r5 : ffffe000  r4 : 00000002
[  142.872502] r3 : eb0b6ec0  r2 : 00000000  r1 : 00000004  r0 : 00000001
[  142.872507] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[  142.872511] Control: 10c5387d  Table: 6a61406a  DAC: 00000051
[  142.872516] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-05506-g28aba11c1393691a #14
[  142.872519] Hardware name: Generic R8A7791 (Flattened Device Tree)
[  142.872522] Backtrace:
[  142.872534] [<c010cbf8>] (dump_backtrace) from [<c010ce78>] (show_stack+0x18/0x1c)
[  142.872540]  r7:c0c81388 r6:00000000 r5:60070193 r4:c0c81388
[  142.872550] [<c010ce60>] (show_stack) from [<c07dec20>] (dump_stack+0xa4/0xd8)
[  142.872557] [<c07deb7c>] (dump_stack) from [<c0108b38>] (show_regs+0x14/0x18)
[  142.872563]  r9:00000001 r8:00000000 r7:c0c4f678 r6:eb0b9f40 r5:00000001 r4:c13e1130
[  142.872571] [<c0108b24>] (show_regs) from [<c07e4cd0>] (nmi_cpu_backtrace+0xfc/0x118)
[  142.872578] [<c07e4bd4>] (nmi_cpu_backtrace) from [<c010fb28>] (handle_IPI+0x2a8/0x320)
[  142.872583]  r7:c0c4f678 r6:eb0b9f40 r5:00000007 r4:c0b75b68
[  142.872594] [<c010f880>] (handle_IPI) from [<c03cb528>] (gic_handle_irq+0x8c/0x98)
[  142.872599]  r10:00000000 r9:eb0b8000 r8:f0803000 r7:c0c4f678 r6:eb0b9f40 r5:c0c08a90
[  142.872602]  r4:f0802000
[  142.872608] [<c03cb49c>] (gic_handle_irq) from [<c01019f0>] (__irq_svc+0x70/0x98)
[  142.872612] Exception stack(0xeb0b9f40 to 0xeb0b9f88)
[  142.872618] 9f40: 00000001 00000004 00000000 eb0b6ec0 00000002 ffffe000 c0c0842c c0c08478
[  142.872624] 9f60: 4000406a 413fc0f2 00000000 eb0b9f9c eb0b9f60 eb0b9f90 c01747a8 c01088a4
[  142.872627] 9f80: 20070013 ffffffff
[  142.872632]  r9:eb0b8000 r8:4000406a r7:eb0b9f74 r6:ffffffff r5:20070013 r4:c01088a4
[  142.872642] [<c010887c>] (arch_cpu_idle) from [<c07fb1a8>] (default_idle_call+0x34/0x38)
[  142.872650] [<c07fb174>] (default_idle_call) from [<c0155fe0>] (do_idle+0xe0/0x134)
[  142.872656] [<c0155f00>] (do_idle) from [<c0156398>] (cpu_startup_entry+0x20/0x24)
[  142.872660]  r7:c0c8e9d0 r6:10c0387d r5:00000051 r4:00000085
[  142.872667] [<c0156378>] (cpu_startup_entry) from [<c010f644>] (secondary_start_kernel+0x114/0x134)
[  142.872673] [<c010f530>] (secondary_start_kernel) from [<401026ec>] (0x401026ec)
[  142.872676]  r5:00000051 r4:6b0a406a
[  142.873456] rcu_sched kthread starved for 8235 jiffies! g930 c929 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
[  143.135040] RCU grace-period kthread stack dump:
[  143.139695] rcu_sched       I    0     9      2 0x00000000
[  143.145234] Backtrace:
[  143.147719] [<c07f5120>] (__schedule) from [<c07f5b3c>] (schedule+0x94/0xb8)
[  143.154823]  r10:c0b714c0 r9:c0c85f8a r8:ffffffff r7:eb0abec4 r6:ffffa274 r5:00000000
[  143.162712]  r4:ffffe000
[  143.165273] [<c07f5aa8>] (schedule) from [<c07fa940>] (schedule_timeout+0x440/0x4b0)
[  143.173076]  r5:00000000 r4:eb79c4c0
[  143.176692] [<c07fa500>] (schedule_timeout) from [<c0196e08>] (rcu_gp_kthread+0x958/0x150c)
[  143.185108]  r10:c0c87274 r9:00000000 r8:c0c165b8 r7:00000001 r6:00000000 r5:c0c16590
[  143.192997]  r4:c0c16300
[  143.195560] [<c01964b0>] (rcu_gp_kthread) from [<c0145eb8>] (kthread+0x148/0x160)
[  143.203099]  r7:c0c16300
[  143.205660] [<c0145d70>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
[  143.212938] Exception stack(0xeb0abfb0 to 0xeb0abff8)
[  143.218030] bfa0:                                     00000000 00000000 00000000 00000000
[  143.226271] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  143.234511] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[  143.241177]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c0145d70
[  143.249065]  r4:eb037b00

After the freeze the system becomes responsive again, and I can sometimes
trigger the hang multiple times. I tried to bisect the problem and
found that by reverting [2] I can no longer reproduce the issue. I also
cannot reproduce the issue on v4.16. I can't figure out if reverting
[2] is just treating a symptom or addressing the root cause of my
troubles and would appreciate your input. Also, my "test-case" does not
trigger every time, but I have tested this scenario quite a lot and the
result seems to be consistent.

My test setup involves an NFS root filesystem; I ssh in and launch the
GUI application over X forwarding. From what I know the application is
not doing any ioctl calls to the v4l2 framework; it's just sitting there
idle as I move the cursor around, showing tool tips and such as I hover
over elements, and then it freezes up. I have not observed this issue by
just booting the system and leaving it idle; movement in the GUI seems
to be the key to triggering this.

I'm a bit lost on how to progress with this issue and would appreciate 
any help you can provide to help me figure this out.

1. c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
2. 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")

-- 
Regards,
Niklas Söderlund

* Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
  2018-04-12  9:18 ` Niklas Söderlund
  (?)
@ 2018-04-12 10:33 ` Vincent Guittot
  2018-04-12 11:15     ` Niklas Söderlund
  2018-04-12 22:06     ` Niklas Söderlund
  -1 siblings, 2 replies; 33+ messages in thread
From: Vincent Guittot @ 2018-04-12 10:33 UTC (permalink / raw)
  To: Niklas Söderlund
  Cc: Peter Zijlstra, Paul E. McKenney, Ingo Molnar, linux-kernel,
	linux-renesas-soc, Heiner Kallweit

Hi Niklas,

On 12 April 2018 at 11:18, Niklas Söderlund
<niklas.soderlund@ragnatech.se> wrote:
> Hi Vincent,
>
> I have observed issues running on linus/master from a few days back [1].
> I'm running on a Renesas Koelsch board (arm32) and I can trigger an issue
> by X forwarding the v4l2 test application qv4l2 over ssh and moving the
> cursor around in the GUI (best test case description award...). I'm
> sorry about the really bad way I trigger this, but I can't do it in any
> other way; I'm happy to try other methods if you have some ideas. The
> symptom of the issue is a complete hang of the system for more than 30
> seconds, after which this information is printed on the console:

Heiner (added to cc) also reported a similar problem with his platform:
a dual-core Celeron.

Can you confirm that your platform is a dual Cortex-A15? At least that's
what I have seen on the web. This would confirm that a dual-core system
is a key point.

The ssh connection is also common to Heiner's setup.

>
> [  142.849390] INFO: rcu_sched detected stalls on CPUs/tasks:
> [  142.854972]  1-...!: (1 GPs behind) idle=7a4/0/0 softirq=3214/3217 fqs=0
> [  142.861976]  (detected by 0, t=8232 jiffies, g=930, c=929, q=11)
> [  142.868042] Sending NMI from CPU 0 to CPUs 1:
> [  142.872443] NMI backtrace for cpu 1
> [  142.872452] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-05506-g28aba11c1393691a #14
> [  142.872455] Hardware name: Generic R8A7791 (Flattened Device Tree)
> [  142.872473] PC is at arch_cpu_idle+0x28/0x44
> [  142.872484] LR is at trace_hardirqs_on_caller+0x1a4/0x1d4
> [  142.872488] pc : [<c01088a4>]    lr : [<c01747a8>]    psr: 20070013
> [  142.872491] sp : eb0b9f90  ip : eb0b9f60  fp : eb0b9f9c
> [  142.872495] r10: 00000000  r9 : 413fc0f2  r8 : 4000406a
> [  142.872498] r7 : c0c08478  r6 : c0c0842c  r5 : ffffe000  r4 : 00000002
> [  142.872502] r3 : eb0b6ec0  r2 : 00000000  r1 : 00000004  r0 : 00000001
> [  142.872507] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> [  142.872511] Control: 10c5387d  Table: 6a61406a  DAC: 00000051
> [  142.872516] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-05506-g28aba11c1393691a #14
> [  142.872519] Hardware name: Generic R8A7791 (Flattened Device Tree)
> [  142.872522] Backtrace:
> [  142.872534] [<c010cbf8>] (dump_backtrace) from [<c010ce78>] (show_stack+0x18/0x1c)
> [  142.872540]  r7:c0c81388 r6:00000000 r5:60070193 r4:c0c81388
> [  142.872550] [<c010ce60>] (show_stack) from [<c07dec20>] (dump_stack+0xa4/0xd8)
> [  142.872557] [<c07deb7c>] (dump_stack) from [<c0108b38>] (show_regs+0x14/0x18)
> [  142.872563]  r9:00000001 r8:00000000 r7:c0c4f678 r6:eb0b9f40 r5:00000001 r4:c13e1130
> [  142.872571] [<c0108b24>] (show_regs) from [<c07e4cd0>] (nmi_cpu_backtrace+0xfc/0x118)
> [  142.872578] [<c07e4bd4>] (nmi_cpu_backtrace) from [<c010fb28>] (handle_IPI+0x2a8/0x320)
> [  142.872583]  r7:c0c4f678 r6:eb0b9f40 r5:00000007 r4:c0b75b68
> [  142.872594] [<c010f880>] (handle_IPI) from [<c03cb528>] (gic_handle_irq+0x8c/0x98)
> [  142.872599]  r10:00000000 r9:eb0b8000 r8:f0803000 r7:c0c4f678 r6:eb0b9f40 r5:c0c08a90
> [  142.872602]  r4:f0802000
> [  142.872608] [<c03cb49c>] (gic_handle_irq) from [<c01019f0>] (__irq_svc+0x70/0x98)
> [  142.872612] Exception stack(0xeb0b9f40 to 0xeb0b9f88)
> [  142.872618] 9f40: 00000001 00000004 00000000 eb0b6ec0 00000002 ffffe000 c0c0842c c0c08478
> [  142.872624] 9f60: 4000406a 413fc0f2 00000000 eb0b9f9c eb0b9f60 eb0b9f90 c01747a8 c01088a4
> [  142.872627] 9f80: 20070013 ffffffff
> [  142.872632]  r9:eb0b8000 r8:4000406a r7:eb0b9f74 r6:ffffffff r5:20070013 r4:c01088a4
> [  142.872642] [<c010887c>] (arch_cpu_idle) from [<c07fb1a8>] (default_idle_call+0x34/0x38)
> [  142.872650] [<c07fb174>] (default_idle_call) from [<c0155fe0>] (do_idle+0xe0/0x134)
> [  142.872656] [<c0155f00>] (do_idle) from [<c0156398>] (cpu_startup_entry+0x20/0x24)
> [  142.872660]  r7:c0c8e9d0 r6:10c0387d r5:00000051 r4:00000085
> [  142.872667] [<c0156378>] (cpu_startup_entry) from [<c010f644>] (secondary_start_kernel+0x114/0x134)
> [  142.872673] [<c010f530>] (secondary_start_kernel) from [<401026ec>] (0x401026ec)
> [  142.872676]  r5:00000051 r4:6b0a406a
> [  142.873456] rcu_sched kthread starved for 8235 jiffies! g930 c929 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
> [  143.135040] RCU grace-period kthread stack dump:
> [  143.139695] rcu_sched       I    0     9      2 0x00000000
> [  143.145234] Backtrace:
> [  143.147719] [<c07f5120>] (__schedule) from [<c07f5b3c>] (schedule+0x94/0xb8)
> [  143.154823]  r10:c0b714c0 r9:c0c85f8a r8:ffffffff r7:eb0abec4 r6:ffffa274 r5:00000000
> [  143.162712]  r4:ffffe000
> [  143.165273] [<c07f5aa8>] (schedule) from [<c07fa940>] (schedule_timeout+0x440/0x4b0)
> [  143.173076]  r5:00000000 r4:eb79c4c0
> [  143.176692] [<c07fa500>] (schedule_timeout) from [<c0196e08>] (rcu_gp_kthread+0x958/0x150c)
> [  143.185108]  r10:c0c87274 r9:00000000 r8:c0c165b8 r7:00000001 r6:00000000 r5:c0c16590
> [  143.192997]  r4:c0c16300
> [  143.195560] [<c01964b0>] (rcu_gp_kthread) from [<c0145eb8>] (kthread+0x148/0x160)
> [  143.203099]  r7:c0c16300
> [  143.205660] [<c0145d70>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
> [  143.212938] Exception stack(0xeb0abfb0 to 0xeb0abff8)
> [  143.218030] bfa0:                                     00000000 00000000 00000000 00000000
> [  143.226271] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [  143.234511] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
> [  143.241177]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c0145d70
> [  143.249065]  r4:eb037b00
>
> After the freeze the system becomes responsive again, and I can sometimes
> trigger the hang multiple times. I tried to bisect the problem and
> found that by reverting [2] I can no longer reproduce the issue. I also
> cannot reproduce the issue on v4.16. I can't figure out if reverting
> [2] is just treating a symptom or addressing the root cause of my
> troubles and would appreciate your input. Also, my "test-case" does not
> trigger every time, but I have tested this scenario quite a lot and the
> result seems to be consistent.
>
> My test setup involves an NFS root filesystem; I ssh in and launch the
> GUI application over X forwarding. From what I know the application is
> not doing any ioctl calls to the v4l2 framework; it's just sitting there
> idle as I move the cursor around, showing tool tips and such as I hover
> over elements, and then it freezes up. I have not observed this issue by
> just booting the system and leaving it idle; movement in the GUI seems
> to be the key to triggering this.
>
> I'm a bit lost on how to progress with this issue and would appreciate
> any help you can provide to help me figure this out.

Can you send me your config?

I'm going to prepare a debug patch to spy on what's happening when entering idle.
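
(For illustration: a minimal form of such a spy patch could be a single
trace_printk() at the top of nohz_newidle_balance() in kernel/sched/fair.c,
whose output lands in the ftrace buffer at /sys/kernel/debug/tracing/trace.
The sketch below is written against the v4.16 code quoted later in this
thread and assumes the cpu_of() helper from kernel/sched/sched.h; it is
not necessarily the debug patch that was eventually sent.)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9739,6 +9739,8 @@ static void nohz_newidle_balance(struct rq *this_rq)
 	 * candidate for ilb instead of waking up another idle CPU.
 	 * Kick an normal ilb if we failed to do the update.
 	 */
+	/* Sketch: log which CPU performs the newly-idle stats update. */
+	trace_printk("newidle balance on cpu%d\n", cpu_of(this_rq));
 	if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
 		kick_ilb(NOHZ_STATS_KICK);
 	raw_spin_lock(&this_rq->lock);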

Regards,
Vincent
>
> 1. c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
> 2. 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
>
> --
> Regards,
> Niklas Söderlund


* Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
  2018-04-12 10:33 ` Vincent Guittot
@ 2018-04-12 11:15     ` Niklas Söderlund
  2018-04-12 22:06     ` Niklas Söderlund
  1 sibling, 0 replies; 33+ messages in thread
From: Niklas Söderlund @ 2018-04-12 11:15 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Peter Zijlstra, Paul E. McKenney, Ingo Molnar, linux-kernel,
	linux-renesas-soc, Heiner Kallweit

Hi Vincent,

Thanks for your feedback.

On 2018-04-12 12:33:27 +0200, Vincent Guittot wrote:
> Hi Niklas,
> 
> On 12 April 2018 at 11:18, Niklas Söderlund
> <niklas.soderlund@ragnatech.se> wrote:
> > Hi Vincent,
> >
> > I have observed issues running on linus/master from a few days back [1].
> > I'm running on a Renesas Koelsch board (arm32) and I can trigger an issue
> > by X forwarding the v4l2 test application qv4l2 over ssh and moving the
> > cursor around in the GUI (best test case description award...). I'm
> > sorry about the really bad way I trigger this, but I can't do it in any
> > other way; I'm happy to try other methods if you have some ideas. The
> > symptom of the issue is a complete hang of the system for more than 30
> > seconds, after which this information is printed on the console:
> 
> Heiner (added to cc) also reported a similar problem with his platform:
> a dual-core Celeron.
> 
> Can you confirm that your platform is a dual Cortex-A15? At least that's
> what I have seen on the web. This would confirm that a dual-core system
> is a key point.

I can confirm that my platform is a dual-core system.

> 
> The ssh connection is also common to Heiner's setup.

Interesting, I found Heiner's mail and I can confirm that I too
experience ssh session lockups. I ssh into the system and by repeatedly
hitting the return key I can lock up the board; while it is locked up,
starting another ssh session unblocks the first. If I don't start
another ssh session but keep hitting the return key sporadically in the
first one, I can get the trace I reported in my first mail printed on
the serial console.

When locked up, both the single ssh session and the serial console are
dead, but attempting another ssh connection immediately unblocks both.
And if I allow enough time before starting the second ssh connection, I
can trigger a trace to be printed on the serial console; it's similar
to, but different from, the first one I reported.

[  207.548610] 	1-...!: (0 ticks this GP) idle=79a/1/1073741824 softirq=2146/2146 fqs=0 
[  207.556442] 	(detected by 0, t=12645 jiffies, g=333, c=332, q=20)
[  207.562546] rcu_sched kthread starved for 12645 jiffies! g333 c332 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0
[  207.572548] RCU grace-period kthread stack dump:

[  207.577166] rcu_sched       R  running task        0     9      2 0x00000000
[  207.584389] Backtrace: 
[  207.586849] [<c0760450>] (__schedule) from [<c0760ba4>] (schedule+0x94/0xb8)
[  207.593901]  r10:e77813c0 r9:e77813c0 r8:ffffffff r7:e709bed4 r6:ffffaa80 r5:00000000
[  207.601732]  r4:ffffe000
[  207.604269] [<c0760b10>] (schedule) from [<c0764560>] (schedule_timeout+0x380/0x3dc)
[  207.612013]  r5:00000000 r4:00000000
[  207.615596] [<c07641e0>] (schedule_timeout) from [<c017b698>] (rcu_gp_kthread+0x668/0xe2c)
[  207.623863]  r10:c0b79018 r9:0000014d r8:0000014c r7:00000001 r6:00000000 r5:c0b10ad0
[  207.631693]  r4:c0b10980
[  207.634230] [<c017b030>] (rcu_gp_kthread) from [<c013ddd8>] (kthread+0x148/0x160)
[  207.641712]  r7:c0b10980
[  207.644249] [<c013dc90>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
[  207.651472] Exception stack(0xe709bfb0 to 0xe709bff8)
[  207.656527] bfa0:                                     00000000 00000000 00000000 00000000
[  207.664709] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  207.672890] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[  207.679508]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c013dc90
[  207.687340]  r4:e7026f4

Continuing the anecdotal testing: I can't seem to trigger the lockup if
I have ever had two ssh sessions open to the system. And about half the
time I can't trigger it at all, but after a reset of the system it
triggers within 2-5 hits of the return key after opening an ssh
session. But please take this part with a grain of salt, as it's done
by the monkey testing method :-)

All tests above have been run based on c18bb396d3d261eb ("Merge
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").

> 
> >
> > [  142.849390] INFO: rcu_sched detected stalls on CPUs/tasks:
> > [  142.854972]  1-...!: (1 GPs behind) idle=7a4/0/0 softirq=3214/3217 fqs=0
> > [  142.861976]  (detected by 0, t=8232 jiffies, g=930, c=929, q=11)
> > [  142.868042] Sending NMI from CPU 0 to CPUs 1:
> > [  142.872443] NMI backtrace for cpu 1
> > [  142.872452] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-05506-g28aba11c1393691a #14
> > [  142.872455] Hardware name: Generic R8A7791 (Flattened Device Tree)
> > [  142.872473] PC is at arch_cpu_idle+0x28/0x44
> > [  142.872484] LR is at trace_hardirqs_on_caller+0x1a4/0x1d4
> > [  142.872488] pc : [<c01088a4>]    lr : [<c01747a8>]    psr: 20070013
> > [  142.872491] sp : eb0b9f90  ip : eb0b9f60  fp : eb0b9f9c
> > [  142.872495] r10: 00000000  r9 : 413fc0f2  r8 : 4000406a
> > [  142.872498] r7 : c0c08478  r6 : c0c0842c  r5 : ffffe000  r4 : 00000002
> > [  142.872502] r3 : eb0b6ec0  r2 : 00000000  r1 : 00000004  r0 : 00000001
> > [  142.872507] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> > [  142.872511] Control: 10c5387d  Table: 6a61406a  DAC: 00000051
> > [  142.872516] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-05506-g28aba11c1393691a #14
> > [  142.872519] Hardware name: Generic R8A7791 (Flattened Device Tree)
> > [  142.872522] Backtrace:
> > [  142.872534] [<c010cbf8>] (dump_backtrace) from [<c010ce78>] (show_stack+0x18/0x1c)
> > [  142.872540]  r7:c0c81388 r6:00000000 r5:60070193 r4:c0c81388
> > [  142.872550] [<c010ce60>] (show_stack) from [<c07dec20>] (dump_stack+0xa4/0xd8)
> > [  142.872557] [<c07deb7c>] (dump_stack) from [<c0108b38>] (show_regs+0x14/0x18)
> > [  142.872563]  r9:00000001 r8:00000000 r7:c0c4f678 r6:eb0b9f40 r5:00000001 r4:c13e1130
> > [  142.872571] [<c0108b24>] (show_regs) from [<c07e4cd0>] (nmi_cpu_backtrace+0xfc/0x118)
> > [  142.872578] [<c07e4bd4>] (nmi_cpu_backtrace) from [<c010fb28>] (handle_IPI+0x2a8/0x320)
> > [  142.872583]  r7:c0c4f678 r6:eb0b9f40 r5:00000007 r4:c0b75b68
> > [  142.872594] [<c010f880>] (handle_IPI) from [<c03cb528>] (gic_handle_irq+0x8c/0x98)
> > [  142.872599]  r10:00000000 r9:eb0b8000 r8:f0803000 r7:c0c4f678 r6:eb0b9f40 r5:c0c08a90
> > [  142.872602]  r4:f0802000
> > [  142.872608] [<c03cb49c>] (gic_handle_irq) from [<c01019f0>] (__irq_svc+0x70/0x98)
> > [  142.872612] Exception stack(0xeb0b9f40 to 0xeb0b9f88)
> > [  142.872618] 9f40: 00000001 00000004 00000000 eb0b6ec0 00000002 ffffe000 c0c0842c c0c08478
> > [  142.872624] 9f60: 4000406a 413fc0f2 00000000 eb0b9f9c eb0b9f60 eb0b9f90 c01747a8 c01088a4
> > [  142.872627] 9f80: 20070013 ffffffff
> > [  142.872632]  r9:eb0b8000 r8:4000406a r7:eb0b9f74 r6:ffffffff r5:20070013 r4:c01088a4
> > [  142.872642] [<c010887c>] (arch_cpu_idle) from [<c07fb1a8>] (default_idle_call+0x34/0x38)
> > [  142.872650] [<c07fb174>] (default_idle_call) from [<c0155fe0>] (do_idle+0xe0/0x134)
> > [  142.872656] [<c0155f00>] (do_idle) from [<c0156398>] (cpu_startup_entry+0x20/0x24)
> > [  142.872660]  r7:c0c8e9d0 r6:10c0387d r5:00000051 r4:00000085
> > [  142.872667] [<c0156378>] (cpu_startup_entry) from [<c010f644>] (secondary_start_kernel+0x114/0x134)
> > [  142.872673] [<c010f530>] (secondary_start_kernel) from [<401026ec>] (0x401026ec)
> > [  142.872676]  r5:00000051 r4:6b0a406a
> > [  142.873456] rcu_sched kthread starved for 8235 jiffies! g930 c929 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
> > [  143.135040] RCU grace-period kthread stack dump:
> > [  143.139695] rcu_sched       I    0     9      2 0x00000000
> > [  143.145234] Backtrace:
> > [  143.147719] [<c07f5120>] (__schedule) from [<c07f5b3c>] (schedule+0x94/0xb8)
> > [  143.154823]  r10:c0b714c0 r9:c0c85f8a r8:ffffffff r7:eb0abec4 r6:ffffa274 r5:00000000
> > [  143.162712]  r4:ffffe000
> > [  143.165273] [<c07f5aa8>] (schedule) from [<c07fa940>] (schedule_timeout+0x440/0x4b0)
> > [  143.173076]  r5:00000000 r4:eb79c4c0
> > [  143.176692] [<c07fa500>] (schedule_timeout) from [<c0196e08>] (rcu_gp_kthread+0x958/0x150c)
> > [  143.185108]  r10:c0c87274 r9:00000000 r8:c0c165b8 r7:00000001 r6:00000000 r5:c0c16590
> > [  143.192997]  r4:c0c16300
> > [  143.195560] [<c01964b0>] (rcu_gp_kthread) from [<c0145eb8>] (kthread+0x148/0x160)
> > [  143.203099]  r7:c0c16300
> > [  143.205660] [<c0145d70>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
> > [  143.212938] Exception stack(0xeb0abfb0 to 0xeb0abff8)
> > [  143.218030] bfa0:                                     00000000 00000000 00000000 00000000
> > [  143.226271] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > [  143.234511] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
> > [  143.241177]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c0145d70
> > [  143.249065]  r4:eb037b00
> >
> > After the freeze the system becomes responsive again, and I can sometimes
> > trigger the hang multiple times. I tried to bisect the problem and
> > found that by reverting [2] I can no longer reproduce the issue. I also
> > cannot reproduce the issue on v4.16. I can't figure out if reverting
> > [2] is just treating a symptom or addressing the root cause of my
> > troubles and would appreciate your input. Also, my "test-case" does not
> > trigger every time, but I have tested this scenario quite a lot and the
> > result seems to be consistent.
> >
> > My test setup involves an NFS root filesystem; I ssh in and launch the
> > GUI application over X forwarding. From what I know the application is
> > not doing any ioctl calls to the v4l2 framework; it's just sitting there
> > idle as I move the cursor around, showing tool tips and such as I hover
> > over elements, and then it freezes up. I have not observed this issue by
> > just booting the system and leaving it idle; movement in the GUI seems
> > to be the key to triggering this.
> >
> > I'm a bit lost on how to progress with this issue and would appreciate
> > any help you can provide to help me figure this out.
> 
> Can you send me your config?
> 
> I'm going to prepare a debug patch to spy on what's happening when entering idle.
> 
> Regards,
> Vincent
> >
> > 1. c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
> > 2. 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
> >
> > --
> > Regards,
> > Niklas Söderlund

-- 
Regards,
Niklas Söderlund

* Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
       [not found]     ` <20180412133031.GA551@linaro.org>
@ 2018-04-12 19:43       ` Heiner Kallweit
  2018-04-14 11:21         ` Vincent Guittot
  2018-04-12 22:39         ` Niklas Söderlund
  1 sibling, 1 reply; 33+ messages in thread
From: Heiner Kallweit @ 2018-04-12 19:43 UTC (permalink / raw)
  To: Vincent Guittot, Niklas Söderlund, i
  Cc: Peter Zijlstra, Paul E. McKenney, Ingo Molnar, linux-kernel,
	linux-renesas-soc

On 12.04.2018 at 15:30, Vincent Guittot wrote:
> Heiner, Niklas,
> 
> On Thursday 12 Apr 2018 at 13:15:19 (+0200), Niklas Söderlund wrote:
>> Hi Vincent,
>>
>> Thanks for your feedback.
>>
>> On 2018-04-12 12:33:27 +0200, Vincent Guittot wrote:
>>> Hi Niklas,
>>>
>>> On 12 April 2018 at 11:18, Niklas Söderlund
>>> <niklas.soderlund@ragnatech.se> wrote:
>>>> Hi Vincent,
>>>>
>>>> I have observed issues running on linus/master from a few days back [1].
>>>> I'm running on a Renesas Koelsch board (arm32) and I can trigger an issue
>>>> by X forwarding the v4l2 test application qv4l2 over ssh and moving the
>>>> cursor around in the GUI (best test case description award...). I'm
>>>> sorry about the really bad way I trigger this, but I can't do it in any
>>>> other way; I'm happy to try other methods if you have some ideas. The
>>>> symptom of the issue is a complete hang of the system for more than 30
>>>> seconds, after which this information is printed on the console:
>>>
>>> Heiner (added to cc) also reported a similar problem with his platform:
>>> a dual-core Celeron.
>>>
>>> Can you confirm that your platform is a dual Cortex-A15? At least that's
>>> what I have seen on the web. This would confirm that a dual-core system
>>> is a key point.
>>
>> I can confirm that my platform is a dual-core system.
>>
>>>
>>> The ssh connection is also common to Heiner's setup.
>>
>> Interesting, I found Heiner's mail and I can confirm that I too
>> experience ssh session lockups. I ssh into the system and by repeatedly
>> hitting the return key I can lock up the board; while it is locked up,
>> starting another ssh session unblocks the first. If I don't start
>> another ssh session but keep hitting the return key sporadically in the
>> first one, I can get the trace I reported in my first mail printed on
>> the serial console.
>>
>> When locked up, both the single ssh session and the serial console are
>> dead, but attempting another ssh connection immediately unblocks both.
>> And if I allow enough time before starting the second ssh connection, I
>> can trigger a trace to be printed on the serial console; it's similar
>> to, but different from, the first one I reported.
>>
>> [  207.548610] 	1-...!: (0 ticks this GP) idle=79a/1/1073741824 softirq=2146/2146 fqs=0 
>> [  207.556442] 	(detected by 0, t=12645 jiffies, g=333, c=332, q=20)
>> [  207.562546] rcu_sched kthread starved for 12645 jiffies! g333 c332 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0
>> [  207.572548] RCU grace-period kthread stack dump:
>>
>> [  207.577166] rcu_sched       R  running task        0     9      2 0x00000000
>> [  207.584389] Backtrace: 
>> [  207.586849] [<c0760450>] (__schedule) from [<c0760ba4>] (schedule+0x94/0xb8)
>> [  207.593901]  r10:e77813c0 r9:e77813c0 r8:ffffffff r7:e709bed4 r6:ffffaa80 r5:00000000
>> [  207.601732]  r4:ffffe000
>> [  207.604269] [<c0760b10>] (schedule) from [<c0764560>] (schedule_timeout+0x380/0x3dc)
>> [  207.612013]  r5:00000000 r4:00000000
>> [  207.615596] [<c07641e0>] (schedule_timeout) from [<c017b698>] (rcu_gp_kthread+0x668/0xe2c)
>> [  207.623863]  r10:c0b79018 r9:0000014d r8:0000014c r7:00000001 r6:00000000 r5:c0b10ad0
>> [  207.631693]  r4:c0b10980
>> [  207.634230] [<c017b030>] (rcu_gp_kthread) from [<c013ddd8>] (kthread+0x148/0x160)
>> [  207.641712]  r7:c0b10980
>> [  207.644249] [<c013dc90>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
>> [  207.651472] Exception stack(0xe709bfb0 to 0xe709bff8)
>> [  207.656527] bfa0:                                     00000000 00000000 00000000 00000000
>> [  207.664709] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>> [  207.672890] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
>> [  207.679508]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c013dc90
>> [  207.687340]  r4:e7026f4
>>
>> Continuing the anecdotal testing: I can't seem to trigger the lockup if
>> I have ever had two ssh sessions open to the system. And about half the
>> time I can't trigger it at all, but after a reset of the system it
>> triggers within 2-5 hits of the return key after opening an ssh
>> session. But please take this part with a grain of salt, as it's done
>> by the monkey testing method :-)
>>
>> All tests above have been run based on c18bb396d3d261eb ("Merge
>> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
>>
>>>
>>>>
> 
> [snip]
> 
>>>>
>>>> I'm a bit lost on how to progress with this issue and would appreciate
>>>> any help you can provide to help me figure this out.
>>>
>>> Can you send me your config?
>>>
>>> I'm going to prepare a debug patch to spy on what's happening when entering idle.
> 
> I'd like to narrow the problem down a bit more with the two patches
> below. Can you try them separately on top of c18bb396d3d261eb ("Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net") and check if
> one of them fixes the problem?
> 
> (They should apply on linux-next as well)
> 
> The first patch always kicks ilb instead of doing ilb on the local CPU
> before entering idle:
> 
> ---
>  kernel/sched/fair.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0951d1c..b21925b 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9739,8 +9739,7 @@ static void nohz_newidle_balance(struct rq *this_rq)
>  	 * candidate for ilb instead of waking up another idle CPU.
>  	 * Kick an normal ilb if we failed to do the update.
>  	 */
> -	if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
> -		kick_ilb(NOHZ_STATS_KICK);
> +	kick_ilb(NOHZ_STATS_KICK);
>  	raw_spin_lock(&this_rq->lock);
>  }
>  
> 
I tested both patches; with both of them the issue still occurs.
However, on top of linux-next from yesterday I have the impression that
it happens less frequently with the second patch. On top of the commit
you mentioned I don't see a change in system behavior with either
patch.

Regards, Heiner


* Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
  2018-04-12 10:33 ` Vincent Guittot
@ 2018-04-12 22:06     ` Niklas Söderlund
  2018-04-12 22:06     ` Niklas Söderlund
  1 sibling, 0 replies; 33+ messages in thread
From: Niklas Söderlund @ 2018-04-12 22:06 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Peter Zijlstra, Paul E. McKenney, Ingo Molnar, linux-kernel,
	linux-renesas-soc, Heiner Kallweit

Hi Vincent,

On 2018-04-12 12:33:27 +0200, Vincent Guittot wrote:
[snip]

> 
> Can you send me your config?
> 
> I'm going to prepare a debug patch to spy on what's happening when entering idle.

I'm sorry I missed this request when first reading your reply. I'm
using arch/arm/configs/shmobile_defconfig for my tests.

-- 
Regards,
Niklas Söderlund

* Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
       [not found]     ` <20180412133031.GA551@linaro.org>
@ 2018-04-12 22:39         ` Niklas Söderlund
  2018-04-12 22:39         ` Niklas Söderlund
  1 sibling, 0 replies; 33+ messages in thread
From: Niklas Söderlund @ 2018-04-12 22:39 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: i, Heiner Kallweit, Peter Zijlstra, Paul E. McKenney,
	Ingo Molnar, linux-kernel, linux-renesas-soc

Hi Vincent,

Thanks for helping me try to figure this out.

On 2018-04-12 15:30:31 +0200, Vincent Guittot wrote:

[snip]

> 
> I'd like to narrow the problem down a bit more with the two patches
> below. Can you try them separately on top of c18bb396d3d261eb ("Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net") and check if
> one of them fixes the problem?

I tried your suggested changes on top of c18bb396d3d261eb.

> 
> (They should apply on linux-next as well)
> 
> The first patch always kicks ilb instead of doing ilb on the local CPU before entering idle:
> 
> ---
>  kernel/sched/fair.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0951d1c..b21925b 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9739,8 +9739,7 @@ static void nohz_newidle_balance(struct rq *this_rq)
>  	 * candidate for ilb instead of waking up another idle CPU.
>  	 * Kick an normal ilb if we failed to do the update.
>  	 */
> -	if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
> -		kick_ilb(NOHZ_STATS_KICK);
> +	kick_ilb(NOHZ_STATS_KICK);
>  	raw_spin_lock(&this_rq->lock);
>  }

This change don't seem to effect the issue. I can still get the single 
ssh session and the system to lockup by hitting the return key. And 
opening a second ssh session immediately unblocks both the first ssh 
session and the serial console. And I can still trigger the console 
warning by just letting the system be once it locks-up. I do have 
just as before reset the system a few times to trigger the issue.

[  245.351693] INFO: rcu_sched detected stalls on CPUs/tasks:
[  245.357199] 	0-...!: (1 GPs behind) idle=93c/0/0 softirq=2224/2225 fqs=0 
[  245.363988] 	(detected by 1, t=3025 jiffies, g=337, c=336, q=10)
[  245.370003] Sending NMI from CPU 1 to CPUs 0:
[  245.374368] NMI backtrace for cpu 0
[  245.374377] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.16.0-10930-ged741fb4567c816f #42
[  245.374379] Hardware name: Generic R8A7791 (Flattened Device Tree)
[  245.374393] PC is at arch_cpu_idle+0x24/0x40
[  245.374397] LR is at arch_cpu_idle+0x34/0x40
[  245.374400] pc : [<c0108554>]    lr : [<c0108564>]    psr: 60050013
[  245.374403] sp : c0b01f40  ip : c0b01f50  fp : c0b01f4c
[  245.374405] r10: c0a56a38  r9 : e7fffbc0  r8 : c0b04c00
[  245.374407] r7 : c0b04c78  r6 : c0b04c2c  r5 : ffffe000  r4 : 00000001
[  245.374410] r3 : c0119100  r2 : e77813a8  r1 : 0002d93c  r0 : 00000000
[  245.374414] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[  245.374417] Control: 10c5387d  Table: 6662006a  DAC: 00000051
[  245.374421] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.16.0-10930-ged741fb4567c816f #42
[  245.374423] Hardware name: Generic R8A7791 (Flattened Device Tree)
[  245.374425] Backtrace: 
[  245.374435] [<c010c170>] (dump_backtrace) from [<c010c3f0>] (show_stack+0x18/0x1c)
[  245.374440]  r7:c0b47278 r6:60050193 r5:00000000 r4:c0b73d80
[  245.374450] [<c010c3d8>] (show_stack) from [<c074b824>] (dump_stack+0x84/0xa4)
[  245.374456] [<c074b7a0>] (dump_stack) from [<c01087e8>] (show_regs+0x14/0x18)
[  245.374460]  r7:c0b47278 r6:c0b01ef0 r5:00000000 r4:c0bc62c8
[  245.374468] [<c01087d4>] (show_regs) from [<c0751938>] (nmi_cpu_backtrace+0xfc/0x118)
[  245.374475] [<c075183c>] (nmi_cpu_backtrace) from [<c010ef34>] (handle_IPI+0x22c/0x294)
[  245.374479]  r7:c0b47278 r6:c0b01ef0 r5:00000007 r4:c0a775fc
[  245.374488] [<c010ed08>] (handle_IPI) from [<c037cd68>] (gic_handle_irq+0x8c/0x98)
[  245.374492]  r10:c0a56a38 r9:c0b00000 r8:f0803000 r7:c0b47278 r6:c0b01ef0 r5:c0b05244
[  245.374495]  r4:f0802000 r3:00000407
[  245.374501] [<c037ccdc>] (gic_handle_irq) from [<c0101a0c>] (__irq_svc+0x6c/0x90)
[  245.374504] Exception stack(0xc0b01ef0 to 0xc0b01f38)
[  245.374507] 1ee0:                                     00000000 0002d93c e77813a8 c0119100
[  245.374512] 1f00: 00000001 ffffe000 c0b04c2c c0b04c78 c0b04c00 e7fffbc0 c0a56a38 c0b01f4c
[  245.374516] 1f20: c0b01f50 c0b01f40 c0108564 c0108554 60050013 ffffffff
[  245.374521]  r9:c0b00000 r8:c0b04c00 r7:c0b01f24 r6:ffffffff r5:60050013 r4:c0108554
[  245.374528] [<c0108530>] (arch_cpu_idle) from [<c076512c>] (default_idle_call+0x30/0x34)
[  245.374535] [<c07650fc>] (default_idle_call) from [<c014b834>] (do_idle+0xd8/0x128)
[  245.374540] [<c014b75c>] (do_idle) from [<c014baf0>] (cpu_startup_entry+0x20/0x24)
[  245.374543]  r7:c0b04c08 r6:ffffffff r5:c0b80380 r4:000000c2
[  245.374549] [<c014bad0>] (cpu_startup_entry) from [<c075fb14>] (rest_init+0x9c/0xbc)
[  245.374555] [<c075fa78>] (rest_init) from [<c0a00dc4>] (start_kernel+0x368/0x3ec)
[  245.374558]  r5:c0b80380 r4:c0b803c0
[  245.374563] [<c0a00a5c>] (start_kernel) from [<00000000>] (  (null))
[  245.375369] rcu_sched kthread starved for 3028 jiffies! g337 c336 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=1
[  245.641242] RCU grace-period kthread stack dump:
[  245.645859] rcu_sched       I    0     9      2 0x00000000
[  245.651350] Backtrace: 
[  245.653804] [<c07603f0>] (__schedule) from [<c0760b44>] (schedule+0x94/0xb8)
[  245.660857]  r10:e77903c0 r9:e77903c0 r8:ffffffff r7:e709bed4 r6:ffffded7 r5:00000000
[  245.668689]  r4:ffffe000
[  245.671227] [<c0760ab0>] (schedule) from [<c0764500>] (schedule_timeout+0x380/0x3dc)
[  245.678973]  r5:00000000 r4:00000000
[  245.682557] [<c0764180>] (schedule_timeout) from [<c017b620>] (rcu_gp_kthread+0x668/0xe2c)
[  245.690826]  r10:c0b79018 r9:00000151 r8:00000150 r7:00000001 r6:00000000 r5:c0b10ad0
[  245.698658]  r4:c0b10980
[  245.701197] [<c017afb8>] (rcu_gp_kthread) from [<c013ddd8>] (kthread+0x148/0x160)
[  245.708682]  r7:c0b10980
[  245.711218] [<c013dc90>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
[  245.718443] Exception stack(0xe709bfb0 to 0xe709bff8)
[  245.723495] bfa0:                                     00000000 00000000 00000000 00000000
[  245.731677] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  245.739858] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[  245.746476]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c013dc90
[  245.754307]  r4:e7026f40
[  335.351162] INFO: rcu_sched detected stalls on CPUs/tasks:
[  335.356682] 	0-...!: (0 ticks this GP) idle=b0a/1/1073741824 softirq=2226/2226 fqs=0 
[  335.364521] 	(detected by 1, t=6027 jiffies, g=340, c=339, q=77)
[  335.370545] Sending NMI from CPU 1 to CPUs 0:
[  345.374803] NMI backtrace for cpu 1
[  345.378295] CPU: 1 PID: 1926 Comm: bash Not tainted 4.16.0-10930-ged741fb4567c816f #42
[  345.386215] Hardware name: Generic R8A7791 (Flattened Device Tree)
[  345.392395] Backtrace: 
[  345.394855] [<c010c170>] (dump_backtrace) from [<c010c3f0>] (show_stack+0x18/0x1c)
[  345.402430]  r7:00000000 r6:600f0193 r5:00000000 r4:c0b73d80
[  345.408097] [<c010c3d8>] (show_stack) from [<c074b824>] (dump_stack+0x84/0xa4)
[  345.415329] [<c074b7a0>] (dump_stack) from [<c0751940>] (nmi_cpu_backtrace+0x104/0x118)
[  345.423337]  r7:00000000 r6:00000000 r5:00000001 r4:c0bc62c8
[  345.429001] [<c075183c>] (nmi_cpu_backtrace) from [<c07519e0>] (nmi_trigger_cpumask_backtrace+0x8c/0x10c)
[  345.438572]  r7:00000000 r6:c010e4d4 r5:00000001 r4:c0b08158
[  345.444237] [<c0751954>] (nmi_trigger_cpumask_backtrace) from [<c010f0ec>] (arch_trigger_cpumask_backtrace+0x14/0x1c)
[  345.454850]  r7:c0b10980 r6:c0b04cd0 r5:00000001 r4:c0b10980
[  345.460519] [<c010f0d8>] (arch_trigger_cpumask_backtrace) from [<c017f9ac>] (rcu_dump_cpu_stacks+0xb0/0xd0)
[  345.470268] [<c017f8fc>] (rcu_dump_cpu_stacks) from [<c017e73c>] (rcu_check_callbacks+0x584/0x71c)
[  345.479232]  r10:0000004d r9:c0b03d00 r8:ffffe000 r7:00000001 r6:c0b10940 r5:e7795080
[  345.487065]  r4:c0b10980 r3:f181863e
[  345.490648] [<c017e1b8>] (rcu_check_callbacks) from [<c0184a60>] (update_process_times+0x38/0x64)
[  345.499524]  r10:e7fa9ea0 r9:e67e8000 r8:e7795e00 r7:0000004e r6:00000000 r5:e72d0640
[  345.507356]  r4:ffffe000
[  345.509903] [<c0184a28>] (update_process_times) from [<c0195bf0>] (tick_nohz_handler+0xc8/0x114)
[  345.518693]  r7:0000004e r6:146afc64 r5:e67e9d70 r4:e77917a0
[  345.524362] [<c0195b28>] (tick_nohz_handler) from [<c05c16ac>] (arch_timer_handler_virt+0x30/0x40)
[  345.533325]  r8:c0b05244 r7:00000013 r6:e7017a40 r5:00000000 r4:e7015c00
[  345.540033] [<c05c167c>] (arch_timer_handler_virt) from [<c016ecd4>] (handle_percpu_devid_irq+0xf0/0x224)
[  345.549607] [<c016ebe4>] (handle_percpu_devid_irq) from [<c016a208>] (generic_handle_irq+0x20/0x30)
[  345.558657]  r10:e7fa9ea0 r9:e67e8000 r8:e7008800 r7:00000013 r6:c0a775fc r5:00000000
[  345.566489]  r4:00000000 r3:c016ebe4
[  345.570067] [<c016a1e8>] (generic_handle_irq) from [<c016a2c0>] (__handle_domain_irq+0xa8/0xbc)
[  345.578777] [<c016a218>] (__handle_domain_irq) from [<c037cd3c>] (gic_handle_irq+0x60/0x98)
[  345.587132]  r9:e67e8000 r8:f0803000 r7:c0b47278 r6:e67e9d70 r5:c0b05244 r4:f0802000
[  345.594883] [<c037ccdc>] (gic_handle_irq) from [<c0101a0c>] (__irq_svc+0x6c/0x90)
[  345.602368] Exception stack(0xe67e9d70 to 0xe67e9db8)
[  345.607421] 9d60:                                     00000000 e6c433e8 00182b08 00000000
[  345.615603] 9d80: e6c429b4 00000087 e67e9e34 e6c434c4 00000000 00000087 e7fa9ea0 e67e9e14
[  345.623784] 9da0: 00000000 e67e9dc0 c01e0f34 c01e0f88 600f0113 ffffffff
[  345.630401]  r9:e67e8000 r8:00000000 r7:e67e9da4 r6:ffffffff r5:600f0113 r4:c01e0f88
[  345.638156] [<c01e0d84>] (filemap_map_pages) from [<c02141f0>] (handle_mm_fault+0x5c8/0x93c)
[  345.646599]  r10:e5dc5380 r9:80000007 r8:00000080 r7:b6e02000 r6:c0b04c08 r5:0000008f
[  345.654431]  r4:00000000
[  345.656968] [<c0213c28>] (handle_mm_fault) from [<c011404c>] (do_page_fault+0x17c/0x324)
[  345.665063]  r10:e5dc5380 r9:80000007 r8:e72d0640 r7:00000054 r6:e5d030c0 r5:b6e023dc
[  345.672895]  r4:e67e9fb0
[  345.675431] [<c0113ed0>] (do_page_fault) from [<c0114470>] (do_PrefetchAbort+0x4c/0xc8)
[  345.683439]  r10:00000010 r9:00179207 r8:c0b0a434 r7:e67e9fb0 r6:b6e023dc r5:c0b04c08
[  345.691271]  r4:00000007
[  345.693807] [<c0114424>] (do_PrefetchAbort) from [<c010204c>] (ret_from_exception+0x0/0x14)
[  345.702160] Exception stack(0xe67e9fb0 to 0xe67e9ff8)
[  345.707213] 9fa0:                                     00000000 000e9600 00000001 bec86544
[  345.715395] 9fc0: 00000001 bec86544 00000000 00000000 bec8654c 00179207 00000010 000e9600
[  345.723576] 9fe0: 000c8140 bec86538 0002847c b6e023dc 200f0010 ffffffff
[  345.730192]  r8:10c5387d r7:10c5387d r6:ffffffff r5:200f0010 r4:b6e023dc
[  345.737364] nfs: server 10.0.1.1 not responding, still trying
[  345.743294] nfs: server 10.0.1.1 not responding, still trying
[  345.744534] nfs: server 10.0.1.1 not responding, still trying
[  345.749262] nfs: server 10.0.1.1 OK
[  345.755391] nfs: server 10.0.1.1 OK
[  345.758641] nfs: server 10.0.1.1 OK

>  
> -- 
> 2.7.4
> 
> 
> The 2nd patch does the opposite and never kicks ilb. I have always made
> the assumption that the problem was coming from _nohz_idle_balance() and
> I just wanted to check that that's the case.
> 
> ---
>  kernel/sched/fair.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0951d1c..1766594 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9739,8 +9739,7 @@ static void nohz_newidle_balance(struct rq *this_rq)
>  	 * candidate for ilb instead of waking up another idle CPU.
>  	 * Kick an normal ilb if we failed to do the update.
>  	 */
> -	if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
> -		kick_ilb(NOHZ_STATS_KICK);
> +	_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE);
>  	raw_spin_lock(&this_rq->lock);
>  }
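
For anyone following this in the archive: with this second change
applied, the whole function ends up looking roughly like the sketch
below in my tree (typed out from memory of the patched source, so treat
it as a sketch rather than a verbatim quote):

static void nohz_newidle_balance(struct rq *this_rq)
{
	int this_cpu = this_rq->cpu;

	/* This CPU doesn't want to be disturbed by scheduler housekeeping */
	if (!housekeeping_cpu(this_cpu, HK_FLAG_SCHED))
		return;

	/* Will wake up very soon. No time for doing anything else */
	if (this_rq->avg_idle < sysctl_sched_migration_cost)
		return;

	/* Don't need to update blocked load of idle CPUs */
	if (!READ_ONCE(nohz.has_blocked) ||
	    time_before(jiffies, READ_ONCE(nohz.next_blocked)))
		return;

	/*
	 * This CPU is going to be idle and the blocked load of idle
	 * CPUs needs to be updated. Run the ilb locally, with the rq
	 * lock dropped, instead of waking up another idle CPU.
	 */
	raw_spin_unlock(&this_rq->lock);
	_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE);
	raw_spin_lock(&this_rq->lock);
}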

With this change it seems more difficult to trigger the issue: I can 
usually trigger it within 2-3 resets of the system, but with this 
applied I had to reset it over 5 times to trigger it. I can still 
reproduce the lockup reliably multiple times, though. I would not trust 
my feeling that it's harder to reproduce with this change, as the method 
to trigger it is kind of spotty and any difference is most likely 
circumstantial :-)

I can still unlock the system by starting a second ssh session, and the 
output is still triggered on the serial console.

[  193.741481] INFO: rcu_sched detected stalls on CPUs/tasks:
[  193.746991] 	1-...!: (0 ticks this GP) idle=83c/0/0 softirq=2445/2445 fqs=0 
[  193.754041] 	(detected by 0, t=4096 jiffies, g=611, c=610, q=31)
[  193.760054] NMI backtrace for cpu 0
[  193.763546] CPU: 0 PID: 2193 Comm: bash Not tainted 4.16.0-10929-gc18bb396d3d261eb-dirty #43
[  193.771984] Hardware name: Generic R8A7791 (Flattened Device Tree)
[  193.778163] Backtrace: 
[  193.780625] [<c010c170>] (dump_backtrace) from [<c010c3f0>] (show_stack+0x18/0x1c)
[  193.788198]  r7:00000000 r6:60010193 r5:00000000 r4:c0b73d80
[  193.793865] [<c010c3d8>] (show_stack) from [<c074b864>] (dump_stack+0x84/0xa4)
[  193.801093] [<c074b7e0>] (dump_stack) from [<c0751980>] (nmi_cpu_backtrace+0x104/0x118)
[  193.809098]  r7:00000000 r6:00000000 r5:00000000 r4:c0bc62c8
[  193.814761] [<c075187c>] (nmi_cpu_backtrace) from [<c0751a20>] (nmi_trigger_cpumask_backtrace+0x8c/0x10c)
[  193.824329]  r7:00000000 r6:c010e4d4 r5:00000000 r4:c0b08158
[  193.829992] [<c0751994>] (nmi_trigger_cpumask_backtrace) from [<c010f0ec>] (arch_trigger_cpumask_backtrace+0x14/0x1c)
[  193.840603]  r7:c0b10980 r6:c0b04cd0 r5:00000000 r4:c0b10980
[  193.846268] [<c010f0d8>] (arch_trigger_cpumask_backtrace) from [<c017f9ec>] (rcu_dump_cpu_stacks+0xb0/0xd0)
[  193.856015] [<c017f93c>] (rcu_dump_cpu_stacks) from [<c017e77c>] (rcu_check_callbacks+0x584/0x71c)
[  193.864977]  r10:0000001f r9:c0b03d00 r8:ffffe000 r7:00000001 r6:c0b10940 r5:e7786080
[  193.872810]  r4:c0b10980 r3:f181863e
[  193.876392] [<c017e1f8>] (rcu_check_callbacks) from [<c0184aa0>] (update_process_times+0x38/0x64)
[  193.885268]  r10:00000000 r9:00180b98 r8:e7786e00 r7:0000002d r6:00000001 r5:e5d86180
[  193.893100]  r4:ffffe000
[  193.895642] [<c0184a68>] (update_process_times) from [<c0195c30>] (tick_nohz_handler+0xc8/0x114)
[  193.904431]  r7:0000002d r6:1bcdda00 r5:e66c7fb0 r4:e77827a0
[  193.910099] [<c0195b68>] (tick_nohz_handler) from [<c05c16ec>] (arch_timer_handler_virt+0x30/0x40)
[  193.919062]  r8:c0b05244 r7:00000013 r6:e7017a40 r5:00000000 r4:e7015c00
[  193.925769] [<c05c16bc>] (arch_timer_handler_virt) from [<c016ed14>] (handle_percpu_devid_irq+0xf0/0x224)
[  193.935343] [<c016ec24>] (handle_percpu_devid_irq) from [<c016a248>] (generic_handle_irq+0x20/0x30)
[  193.944394]  r10:00000000 r9:00180b98 r8:e7008800 r7:00000013 r6:c0a775fc r5:00000000
[  193.952225]  r4:00000000 r3:c016ec24
[  193.955803] [<c016a228>] (generic_handle_irq) from [<c016a300>] (__handle_domain_irq+0xa8/0xbc)
[  193.964513] [<c016a258>] (__handle_domain_irq) from [<c037cd7c>] (gic_handle_irq+0x60/0x98)
[  193.972869]  r9:00180b98 r8:f0803000 r7:c0b47278 r6:e66c7fb0 r5:c0b05244 r4:f0802000
[  193.980619] [<c037cd1c>] (gic_handle_irq) from [<c0101dd4>] (__irq_usr+0x54/0x80)
[  193.988105] Exception stack(0xe66c7fb0 to 0xe66c7ff8)
[  193.993157] 7fa0:                                     00000000 00000000 b6e8b79c 00000000
[  194.001339] 7fc0: 00178378 000134e0 b6e8b794 b6e8b000 b6e8b794 00180b98 00000000 b6e8b7c4
[  194.009520] 7fe0: be955190 be955170 00000000 b6db5c58 20010010 ffffffff
[  194.016137]  r9:00180b98 r8:10c5387d r7:10c5387d r6:ffffffff r5:20010010 r4:b6db5c58
[  194.023888] Sending NMI from CPU 0 to CPUs 1:
[  204.028147] NMI backtrace for cpu 1
[  204.028152] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-10929-gc18bb396d3d261eb-dirty #43
[  204.028155] Hardware name: Generic R8A7791 (Flattened Device Tree)
[  204.028164] PC is at _raw_spin_unlock_irqrestore+0x28/0x2c
[  204.028169] LR is at rcu_report_qs_rnp+0x4c/0x20c
[  204.028172] pc : [<c07653d8>]    lr : [<c01796d4>]    psr: 200e0113
[  204.028175] sp : e70a3db0  ip : e70a3dc0  fp : e70a3dbc
[  204.028177] r10: 00000264  r9 : c0b787eb  r8 : 00000001
[  204.028179] r7 : 00000002  r6 : c0b10980  r5 : c0b10980  r4 : c0b10980
[  204.028182] r3 : 000022a3  r2 : 00000264  r1 : 600e0113  r0 : c0b10980
[  204.028187] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[  204.028189] Control: 10c5387d  Table: 66b9806a  DAC: 00000051
[  204.028193] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-10929-gc18bb396d3d261eb-dirty #43
[  204.028195] Hardware name: Generic R8A7791 (Flattened Device Tree)
[  204.028197] Backtrace: 
[  204.028204] [<c010c170>] (dump_backtrace) from [<c010c3f0>] (show_stack+0x18/0x1c)
[  204.028209]  r7:c0b47278 r6:600e0193 r5:00000000 r4:c0b73d80
[  204.028215] [<c010c3d8>] (show_stack) from [<c074b864>] (dump_stack+0x84/0xa4)
[  204.028228] [<c074b7e0>] (dump_stack) from [<c01087e8>] (show_regs+0x14/0x18)
[  204.028231]  r7:c0b47278 r6:e70a3d60 r5:00000001 r4:c0bc62c8
[  204.028238] [<c01087d4>] (show_regs) from [<c0751978>] (nmi_cpu_backtrace+0xfc/0x118)
[  204.028244] [<c075187c>] (nmi_cpu_backtrace) from [<c010ef34>] (handle_IPI+0x22c/0x294)
[  204.028247]  r7:c0b47278 r6:e70a3d60 r5:00000007 r4:c0a775fc
[  204.028254] [<c010ed08>] (handle_IPI) from [<c037cda8>] (gic_handle_irq+0x8c/0x98)
[  204.028259]  r10:00000264 r9:e70a2000 r8:f0803000 r7:c0b47278 r6:e70a3d60 r5:c0b05244
[  204.028261]  r4:f0802000 r3:00000007
[  204.028267] [<c037cd1c>] (gic_handle_irq) from [<c0101a0c>] (__irq_svc+0x6c/0x90)
[  204.028270] Exception stack(0xe70a3d60 to 0xe70a3da8)
[  204.028275] 3d60: c0b10980 600e0113 00000264 000022a3 c0b10980 c0b10980 c0b10980 00000002
[  204.028280] 3d80: 00000001 c0b787eb 00000264 e70a3dbc e70a3dc0 e70a3db0 c01796d4 c07653d8
[  204.028283] 3da0: 200e0113 ffffffff
[  204.028287]  r9:e70a2000 r8:00000001 r7:e70a3d94 r6:ffffffff r5:200e0113 r4:c07653d8
[  204.028295] [<c07653b0>] (_raw_spin_unlock_irqrestore) from [<c01796d4>] (rcu_report_qs_rnp+0x4c/0x20c)
[  204.028301] [<c0179688>] (rcu_report_qs_rnp) from [<c017c188>] (rcu_process_callbacks+0x364/0x8cc)
[  204.028305]  r10:c0b03080 r9:00000000 r8:00000002 r7:600e0113 r6:c0b10980 r5:c0b10980
[  204.028306]  r4:e7795080
[  204.028312] [<c017be24>] (rcu_process_callbacks) from [<c010236c>] (__do_softirq+0x23c/0x338)
[  204.028316]  r10:c0b03080 r9:00000100 r8:c0b80ec0 r7:00000002 r6:00000009 r5:c0b030a4
[  204.028318]  r4:ffffe000
[  204.028325] [<c0102130>] (__do_softirq) from [<c0125b90>] (irq_exit+0x6c/0xd4)
[  204.028329]  r10:00000000 r9:e70a2000 r8:e7008800 r7:00000013 r6:c0a775fc r5:00000000
[  204.028331]  r4:00000000
[  204.028336] [<c0125b24>] (irq_exit) from [<c016a2e0>] (__handle_domain_irq+0x88/0xbc)
[  204.028338]  r5:00000000 r4:00000000
[  204.028344] [<c016a258>] (__handle_domain_irq) from [<c037cd7c>] (gic_handle_irq+0x60/0x98)
[  204.028348]  r9:e70a2000 r8:f0803000 r7:c0b47278 r6:e70a3f40 r5:c0b05244 r4:f0802000
[  204.028354] [<c037cd1c>] (gic_handle_irq) from [<c0101a0c>] (__irq_svc+0x6c/0x90)
[  204.028356] Exception stack(0xe70a3f40 to 0xe70a3f88)
[  204.028361] 3f40: 00000000 00013844 e77903a8 c0119100 00000002 ffffe000 c0b04c2c c0b04c78
[  204.028367] 3f60: 4000406a 413fc0f2 00000000 e70a3f9c e70a3fa0 e70a3f90 c0108564 c0108554
[  204.028370] 3f80: 600e0013 ffffffff
[  204.028374]  r9:e70a2000 r8:4000406a r7:e70a3f74 r6:ffffffff r5:600e0013 r4:c0108554
[  204.028381] [<c0108530>] (arch_cpu_idle) from [<c076516c>] (default_idle_call+0x30/0x34)
[  204.028388] [<c076513c>] (default_idle_call) from [<c014b834>] (do_idle+0xd8/0x128)
[  204.028392] [<c014b75c>] (do_idle) from [<c014baf0>] (cpu_startup_entry+0x20/0x24)
[  204.028396]  r7:c0b80750 r6:10c0387d r5:00000051 r4:00000085
[  204.028401] [<c014bad0>] (cpu_startup_entry) from [<c010eacc>] (secondary_start_kernel+0x10c/0x12c)
[  204.028406] [<c010e9c0>] (secondary_start_kernel) from [<4010250c>] (0x4010250c)
[  204.028408]  r5:00000051 r4:6709406a

>  
> -- 
> 2.7.4
> 
> 
> Thanks,
> Vincent
> 
> > >
> > > 
> > > Regards,
> > > Vincent
> > > >
> > > > 1. c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net"))
> > > > 2. 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
> > > >
> > > > --
> > > > Regards,
> > > > Niklas Söderlund
> > 
> > -- 
> > Regards,
> > Niklas Söderlund

-- 
Regards,
Niklas Söderlund

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
  2018-04-12 11:15     ` Niklas Söderlund
@ 2018-04-13 20:38       ` Niklas Söderlund
  -1 siblings, 0 replies; 33+ messages in thread
From: Niklas Söderlund @ 2018-04-13 20:38 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Peter Zijlstra, Paul E. McKenney, Ingo Molnar, linux-kernel,
	linux-renesas-soc, Heiner Kallweit

Hi Vincent,

On 2018-04-12 13:15:19 +0200, Niklas Söderlund wrote:
> Hi Vincent,
> 
> Thanks for your feedback.
> 
> On 2018-04-12 12:33:27 +0200, Vincent Guittot wrote:
> > Hi Niklas,
> > 
> > On 12 April 2018 at 11:18, Niklas Söderlund
> > <niklas.soderlund@ragnatech.se> wrote:
> > > Hi Vincent,
> > >
> > > I have observed issues running on linus/master from a few days back [1].
> > > I'm running on a Renesas Koelsch board (arm32) and I can trigger a issue
> > > by X forwarding the v4l2 test application qv4l2 over ssh and moving the
> > > courser around in the GUI (best test case description award...). I'm
> > > sorry about the really bad way I trigger this but I can't do it in any
> > > other way, I'm happy to try other methods if you got some ideas. The
> > > symptom of the issue is a complete hang of the system for more then 30
> > > seconds and then this information is printed in the console:
> > 
> > Heiner (added to cc) also reported a similar problem with his
> > platform: a dual core Celeron
> > 
> > Do you confirm that your platform is a dual Cortex-A15? At least
> > that's what I have seen on the web.
> > This would confirm that a dual core system is a key point.
> 
> I can confirm that my platform is a dual core.

I tested another dual core system today, a Renesas M3-W ARM64 system, 
and I can observe the same lockups on that system, if that helps you 
understand the problem. It seems to be much harder to trigger the issue 
on this system for some reason. Hitting return in an ssh session doesn't 
seem to produce the lockup, while starting a GUI using X forwarding over 
ssh makes it possible.

[  392.306441] INFO: rcu_preempt detected stalls on CPUs/tasks:
[  392.312201]  (detected by 0, t=19366 jiffies, g=7177, c=7176, q=35)
[  392.318555] All QSes seen, last rcu_preempt kthread activity 19368 (4294990375-4294971007), jiffies_till_next_fqs=1, root ->qsmask 0x0
[  392.330758] swapper/0       R  running task        0     0      0 0x00000022
[  392.337883] Call trace:
[  392.340365]  dump_backtrace+0x0/0x1c8
[  392.344065]  show_stack+0x14/0x20
[  392.347416]  sched_show_task+0x224/0x2e8
[  392.351377]  rcu_check_callbacks+0x8ac/0x8b0
[  392.355686]  update_process_times+0x2c/0x58
[  392.359908]  tick_sched_handle.isra.5+0x30/0x50
[  392.364479]  tick_sched_timer+0x40/0x90
[  392.368351]  __hrtimer_run_queues+0xfc/0x208
[  392.372659]  hrtimer_interrupt+0xd4/0x258
[  392.376710]  arch_timer_handler_virt+0x28/0x48
[  392.381194]  handle_percpu_devid_irq+0x80/0x138
[  392.385767]  generic_handle_irq+0x28/0x40
[  392.389813]  __handle_domain_irq+0x5c/0xb8
[  392.393946]  gic_handle_irq+0x58/0xa8
[  392.397640]  el1_irq+0xb4/0x130
[  392.400810]  arch_cpu_idle+0x14/0x20
[  392.404422]  default_idle_call+0x1c/0x38
[  392.408381]  do_idle+0x17c/0x1f8
[  392.411640]  cpu_startup_entry+0x20/0x28
[  392.415598]  rest_init+0x24c/0x260
[  392.419037]  start_kernel+0x3e8/0x414

I was running the same tests on another ARM64 platform earlier, using 
the same build, one which has more than two cores, and there I could not 
observe this issue.

-- 
Regards,
Niklas Söderlund

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
  2018-04-12 19:43       ` Heiner Kallweit
@ 2018-04-14 11:21         ` Vincent Guittot
  0 siblings, 0 replies; 33+ messages in thread
From: Vincent Guittot @ 2018-04-14 11:21 UTC (permalink / raw)
  To: Heiner Kallweit
  Cc: Niklas Söderlund, Peter Zijlstra, Paul E. McKenney,
	Ingo Molnar, linux-kernel, linux-renesas-soc

Heiner,

On 12 April 2018 at 21:43, Heiner Kallweit <hkallweit1@gmail.com> wrote:

>>>>
>>>> I'm going to prepare a debug patch to spy on what's happening when entering idle
>>
>> I'd like to narrow the problem a bit more with the 2 patches above. Can you try
>> them separately on top of c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
>> and check if one of them fixes the problem?
>>
>> (They should apply on linux-next as well)
>>
>> First patch always kick ilb instead of doing ilb on local cpu before entering idle
>>
>> ---
>>  kernel/sched/fair.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 0951d1c..b21925b 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -9739,8 +9739,7 @@ static void nohz_newidle_balance(struct rq *this_rq)
>>        * candidate for ilb instead of waking up another idle CPU.
>>        * Kick an normal ilb if we failed to do the update.
>>        */
>> -     if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
>> -             kick_ilb(NOHZ_STATS_KICK);
>> +     kick_ilb(NOHZ_STATS_KICK);
>>       raw_spin_lock(&this_rq->lock);
>>  }
>>
>>
> I tested both patches; with both of them the issue still occurs. However,
> on top of linux-next from yesterday I have the impression that it happens
> less frequently with the second patch.
> On top of the commit you mentioned I don't see a change in system behavior
> with either patch.

Thanks for the tests.
I was expecting more of a difference between the 2 patches, and
especially no problem with the 1st patch, which only sends an IPI
reschedule to the other CPU if it is idle.
It seems not to be related to what is done, but to the fact that it is
done at that place in the code.
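
To make "that place" concrete: nohz_newidle_balance() is called at the
end of idle_balance(), in the early-exit path taken when the CPU is
about to go idle and the full newidle balance is skipped. Roughly (a
sketch from memory of the code around 31e77c93e432dec7, so details may
differ):

	/* early-exit path of idle_balance(), kernel/sched/fair.c */
	if (this_rq->avg_idle < sysctl_sched_migration_cost ||
	    !this_rq->rd->overload) {
		rcu_read_lock();
		sd = rcu_dereference_check_sched_domain(this_rq->sd);
		if (sd)
			update_next_balance(sd, &next_balance);
		rcu_read_unlock();

		/* the hunk both debug patches modify is called from here */
		nohz_newidle_balance(this_rq);

		goto out;
	}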

Thanks
>
> Regards, Heiner

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
  2018-04-12 22:39         ` Niklas Söderlund
  (?)
@ 2018-04-14 11:24         ` Vincent Guittot
  2018-04-20 16:00             ` Vincent Guittot
  -1 siblings, 1 reply; 33+ messages in thread
From: Vincent Guittot @ 2018-04-14 11:24 UTC (permalink / raw)
  To: Niklas Söderlund
  Cc: i, Heiner Kallweit, Peter Zijlstra, Paul E. McKenney,
	Ingo Molnar, linux-kernel, linux-renesas-soc

Hi Niklas,

On 13 April 2018 at 00:39, Niklas Söderlund
<niklas.soderlund@ragnatech.se> wrote:
> Hi Vincent,
>
> Thanks for helping trying to figure this out.
>
> On 2018-04-12 15:30:31 +0200, Vincent Guittot wrote:
>
> [snip]
>
>>
>> I'd like to narrow the problem a bit more with the 2 patches above. Can you try
>> them separately on top of c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
>> and check if one of them fixes the problem?
>
> I tried your suggested changes based on top of c18bb396d3d261eb.
>
>>
>> (They should apply on linux-next as well)
>>
>> First patch always kick ilb instead of doing ilb on local cpu before entering idle
>>
>> ---
>>  kernel/sched/fair.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 0951d1c..b21925b 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -9739,8 +9739,7 @@ static void nohz_newidle_balance(struct rq *this_rq)
>>        * candidate for ilb instead of waking up another idle CPU.
>>        * Kick an normal ilb if we failed to do the update.
>>        */
>> -     if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
>> -             kick_ilb(NOHZ_STATS_KICK);
>> +     kick_ilb(NOHZ_STATS_KICK);
>>       raw_spin_lock(&this_rq->lock);
>>  }
>
> This change doesn't seem to affect the issue. I can still get the single
> ssh session and the system to lock up by hitting the return key. Opening
> a second ssh session immediately unblocks both the first ssh session and
> the serial console. I can also still trigger the console warning just by
> letting the system sit once it locks up. As before, I had to reset the
> system a few times to trigger the issue.

Your results are similar to Heiner's. The problem is still there even if
we only kick ilb, which mainly sends an IPI reschedule to the other CPU
if it is idle.
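
For reference, kick_ilb() in this tree does roughly the following (a
sketch from memory of the code around this commit, so details may
differ): it picks an idle CPU, sets the NOHZ kick flags on it, and
sends it a reschedule IPI so the nohz idle balance softirq runs there.

static void kick_ilb(unsigned int flags)
{
	int ilb_cpu;

	nohz.next_balance++;

	/* find an idle CPU that can run the idle load balance */
	ilb_cpu = find_new_ilb();
	if (ilb_cpu >= nr_cpu_ids)
		return;

	/* set the kick flags; bail out if a kick is already pending */
	flags = atomic_fetch_or(flags, nohz_flags(ilb_cpu));
	if (flags & NOHZ_KICK_MASK)
		return;

	/*
	 * The scheduler IPI wakes the idle CPU, and the softirq doing
	 * the nohz idle load balance runs before returning from it.
	 */
	smp_send_reschedule(ilb_cpu);
}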

>
> [  245.351693] INFO: rcu_sched detected stalls on CPUs/tasks:
> [  245.357199]  0-...!: (1 GPs behind) idle=93c/0/0 softirq=2224/2225 fqs=0
> [  245.363988]  (detected by 1, t=3025 jiffies, g=337, c=336, q=10)
> [  245.370003] Sending NMI from CPU 1 to CPUs 0:
> [  245.374368] NMI backtrace for cpu 0
> [  245.374377] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.16.0-10930-ged741fb4567c816f #42
> [  245.374379] Hardware name: Generic R8A7791 (Flattened Device Tree)
> [  245.374393] PC is at arch_cpu_idle+0x24/0x40
> [  245.374397] LR is at arch_cpu_idle+0x34/0x40
> [  245.374400] pc : [<c0108554>]    lr : [<c0108564>]    psr: 60050013
> [  245.374403] sp : c0b01f40  ip : c0b01f50  fp : c0b01f4c
> [  245.374405] r10: c0a56a38  r9 : e7fffbc0  r8 : c0b04c00
> [  245.374407] r7 : c0b04c78  r6 : c0b04c2c  r5 : ffffe000  r4 : 00000001
> [  245.374410] r3 : c0119100  r2 : e77813a8  r1 : 0002d93c  r0 : 00000000
> [  245.374414] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> [  245.374417] Control: 10c5387d  Table: 6662006a  DAC: 00000051
> [  245.374421] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.16.0-10930-ged741fb4567c816f #42
> [  245.374423] Hardware name: Generic R8A7791 (Flattened Device Tree)
> [  245.374425] Backtrace:
> [  245.374435] [<c010c170>] (dump_backtrace) from [<c010c3f0>] (show_stack+0x18/0x1c)
> [  245.374440]  r7:c0b47278 r6:60050193 r5:00000000 r4:c0b73d80
> [  245.374450] [<c010c3d8>] (show_stack) from [<c074b824>] (dump_stack+0x84/0xa4)
> [  245.374456] [<c074b7a0>] (dump_stack) from [<c01087e8>] (show_regs+0x14/0x18)
> [  245.374460]  r7:c0b47278 r6:c0b01ef0 r5:00000000 r4:c0bc62c8
> [  245.374468] [<c01087d4>] (show_regs) from [<c0751938>] (nmi_cpu_backtrace+0xfc/0x118)
> [  245.374475] [<c075183c>] (nmi_cpu_backtrace) from [<c010ef34>] (handle_IPI+0x22c/0x294)
> [  245.374479]  r7:c0b47278 r6:c0b01ef0 r5:00000007 r4:c0a775fc
> [  245.374488] [<c010ed08>] (handle_IPI) from [<c037cd68>] (gic_handle_irq+0x8c/0x98)
> [  245.374492]  r10:c0a56a38 r9:c0b00000 r8:f0803000 r7:c0b47278 r6:c0b01ef0 r5:c0b05244
> [  245.374495]  r4:f0802000 r3:00000407
> [  245.374501] [<c037ccdc>] (gic_handle_irq) from [<c0101a0c>] (__irq_svc+0x6c/0x90)
> [  245.374504] Exception stack(0xc0b01ef0 to 0xc0b01f38)
> [  245.374507] 1ee0:                                     00000000 0002d93c e77813a8 c0119100
> [  245.374512] 1f00: 00000001 ffffe000 c0b04c2c c0b04c78 c0b04c00 e7fffbc0 c0a56a38 c0b01f4c
> [  245.374516] 1f20: c0b01f50 c0b01f40 c0108564 c0108554 60050013 ffffffff
> [  245.374521]  r9:c0b00000 r8:c0b04c00 r7:c0b01f24 r6:ffffffff r5:60050013 r4:c0108554
> [  245.374528] [<c0108530>] (arch_cpu_idle) from [<c076512c>] (default_idle_call+0x30/0x34)
> [  245.374535] [<c07650fc>] (default_idle_call) from [<c014b834>] (do_idle+0xd8/0x128)
> [  245.374540] [<c014b75c>] (do_idle) from [<c014baf0>] (cpu_startup_entry+0x20/0x24)
> [  245.374543]  r7:c0b04c08 r6:ffffffff r5:c0b80380 r4:000000c2
> [  245.374549] [<c014bad0>] (cpu_startup_entry) from [<c075fb14>] (rest_init+0x9c/0xbc)
> [  245.374555] [<c075fa78>] (rest_init) from [<c0a00dc4>] (start_kernel+0x368/0x3ec)
> [  245.374558]  r5:c0b80380 r4:c0b803c0
> [  245.374563] [<c0a00a5c>] (start_kernel) from [<00000000>] (  (null))
> [  245.375369] rcu_sched kthread starved for 3028 jiffies! g337 c336 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=1
> [  245.641242] RCU grace-period kthread stack dump:
> [  245.645859] rcu_sched       I    0     9      2 0x00000000
> [  245.651350] Backtrace:
> [  245.653804] [<c07603f0>] (__schedule) from [<c0760b44>] (schedule+0x94/0xb8)
> [  245.660857]  r10:e77903c0 r9:e77903c0 r8:ffffffff r7:e709bed4 r6:ffffded7 r5:00000000
> [  245.668689]  r4:ffffe000
> [  245.671227] [<c0760ab0>] (schedule) from [<c0764500>] (schedule_timeout+0x380/0x3dc)
> [  245.678973]  r5:00000000 r4:00000000
> [  245.682557] [<c0764180>] (schedule_timeout) from [<c017b620>] (rcu_gp_kthread+0x668/0xe2c)
> [  245.690826]  r10:c0b79018 r9:00000151 r8:00000150 r7:00000001 r6:00000000 r5:c0b10ad0
> [  245.698658]  r4:c0b10980
> [  245.701197] [<c017afb8>] (rcu_gp_kthread) from [<c013ddd8>] (kthread+0x148/0x160)
> [  245.708682]  r7:c0b10980
> [  245.711218] [<c013dc90>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
> [  245.718443] Exception stack(0xe709bfb0 to 0xe709bff8)
> [  245.723495] bfa0:                                     00000000 00000000 00000000 00000000
> [  245.731677] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [  245.739858] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
> [  245.746476]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c013dc90
> [  245.754307]  r4:e7026f40
> [  335.351162] INFO: rcu_sched detected stalls on CPUs/tasks:
> [  335.356682]  0-...!: (0 ticks this GP) idle=b0a/1/1073741824 softirq=2226/2226 fqs=0
> [  335.364521]  (detected by 1, t=6027 jiffies, g=340, c=339, q=77)
> [  335.370545] Sending NMI from CPU 1 to CPUs 0:
> [  345.374803] NMI backtrace for cpu 1
> [  345.378295] CPU: 1 PID: 1926 Comm: bash Not tainted 4.16.0-10930-ged741fb4567c816f #42
> [  345.386215] Hardware name: Generic R8A7791 (Flattened Device Tree)
> [  345.392395] Backtrace:
> [  345.394855] [<c010c170>] (dump_backtrace) from [<c010c3f0>] (show_stack+0x18/0x1c)
> [  345.402430]  r7:00000000 r6:600f0193 r5:00000000 r4:c0b73d80
> [  345.408097] [<c010c3d8>] (show_stack) from [<c074b824>] (dump_stack+0x84/0xa4)
> [  345.415329] [<c074b7a0>] (dump_stack) from [<c0751940>] (nmi_cpu_backtrace+0x104/0x118)
> [  345.423337]  r7:00000000 r6:00000000 r5:00000001 r4:c0bc62c8
> [  345.429001] [<c075183c>] (nmi_cpu_backtrace) from [<c07519e0>] (nmi_trigger_cpumask_backtrace+0x8c/0x10c)
> [  345.438572]  r7:00000000 r6:c010e4d4 r5:00000001 r4:c0b08158
> [  345.444237] [<c0751954>] (nmi_trigger_cpumask_backtrace) from [<c010f0ec>] (arch_trigger_cpumask_backtrace+0x14/0x1c)
> [  345.454850]  r7:c0b10980 r6:c0b04cd0 r5:00000001 r4:c0b10980
> [  345.460519] [<c010f0d8>] (arch_trigger_cpumask_backtrace) from [<c017f9ac>] (rcu_dump_cpu_stacks+0xb0/0xd0)
> [  345.470268] [<c017f8fc>] (rcu_dump_cpu_stacks) from [<c017e73c>] (rcu_check_callbacks+0x584/0x71c)
> [  345.479232]  r10:0000004d r9:c0b03d00 r8:ffffe000 r7:00000001 r6:c0b10940 r5:e7795080
> [  345.487065]  r4:c0b10980 r3:f181863e
> [  345.490648] [<c017e1b8>] (rcu_check_callbacks) from [<c0184a60>] (update_process_times+0x38/0x64)
> [  345.499524]  r10:e7fa9ea0 r9:e67e8000 r8:e7795e00 r7:0000004e r6:00000000 r5:e72d0640
> [  345.507356]  r4:ffffe000
> [  345.509903] [<c0184a28>] (update_process_times) from [<c0195bf0>] (tick_nohz_handler+0xc8/0x114)
> [  345.518693]  r7:0000004e r6:146afc64 r5:e67e9d70 r4:e77917a0
> [  345.524362] [<c0195b28>] (tick_nohz_handler) from [<c05c16ac>] (arch_timer_handler_virt+0x30/0x40)
> [  345.533325]  r8:c0b05244 r7:00000013 r6:e7017a40 r5:00000000 r4:e7015c00
> [  345.540033] [<c05c167c>] (arch_timer_handler_virt) from [<c016ecd4>] (handle_percpu_devid_irq+0xf0/0x224)
> [  345.549607] [<c016ebe4>] (handle_percpu_devid_irq) from [<c016a208>] (generic_handle_irq+0x20/0x30)
> [  345.558657]  r10:e7fa9ea0 r9:e67e8000 r8:e7008800 r7:00000013 r6:c0a775fc r5:00000000
> [  345.566489]  r4:00000000 r3:c016ebe4
> [  345.570067] [<c016a1e8>] (generic_handle_irq) from [<c016a2c0>] (__handle_domain_irq+0xa8/0xbc)
> [  345.578777] [<c016a218>] (__handle_domain_irq) from [<c037cd3c>] (gic_handle_irq+0x60/0x98)
> [  345.587132]  r9:e67e8000 r8:f0803000 r7:c0b47278 r6:e67e9d70 r5:c0b05244 r4:f0802000
> [  345.594883] [<c037ccdc>] (gic_handle_irq) from [<c0101a0c>] (__irq_svc+0x6c/0x90)
> [  345.602368] Exception stack(0xe67e9d70 to 0xe67e9db8)
> [  345.607421] 9d60:                                     00000000 e6c433e8 00182b08 00000000
> [  345.615603] 9d80: e6c429b4 00000087 e67e9e34 e6c434c4 00000000 00000087 e7fa9ea0 e67e9e14
> [  345.623784] 9da0: 00000000 e67e9dc0 c01e0f34 c01e0f88 600f0113 ffffffff
> [  345.630401]  r9:e67e8000 r8:00000000 r7:e67e9da4 r6:ffffffff r5:600f0113 r4:c01e0f88
> [  345.638156] [<c01e0d84>] (filemap_map_pages) from [<c02141f0>] (handle_mm_fault+0x5c8/0x93c)
> [  345.646599]  r10:e5dc5380 r9:80000007 r8:00000080 r7:b6e02000 r6:c0b04c08 r5:0000008f
> [  345.654431]  r4:00000000
> [  345.656968] [<c0213c28>] (handle_mm_fault) from [<c011404c>] (do_page_fault+0x17c/0x324)
> [  345.665063]  r10:e5dc5380 r9:80000007 r8:e72d0640 r7:00000054 r6:e5d030c0 r5:b6e023dc
> [  345.672895]  r4:e67e9fb0
> [  345.675431] [<c0113ed0>] (do_page_fault) from [<c0114470>] (do_PrefetchAbort+0x4c/0xc8)
> [  345.683439]  r10:00000010 r9:00179207 r8:c0b0a434 r7:e67e9fb0 r6:b6e023dc r5:c0b04c08
> [  345.691271]  r4:00000007
> [  345.693807] [<c0114424>] (do_PrefetchAbort) from [<c010204c>] (ret_from_exception+0x0/0x14)
> [  345.702160] Exception stack(0xe67e9fb0 to 0xe67e9ff8)
> [  345.707213] 9fa0:                                     00000000 000e9600 00000001 bec86544
> [  345.715395] 9fc0: 00000001 bec86544 00000000 00000000 bec8654c 00179207 00000010 000e9600
> [  345.723576] 9fe0: 000c8140 bec86538 0002847c b6e023dc 200f0010 ffffffff
> [  345.730192]  r8:10c5387d r7:10c5387d r6:ffffffff r5:200f0010 r4:b6e023dc
> [  345.737364] nfs: server 10.0.1.1 not responding, still trying
> [  345.743294] nfs: server 10.0.1.1 not responding, still trying
> [  345.744534] nfs: server 10.0.1.1 not responding, still trying
> [  345.749262] nfs: server 10.0.1.1 OK
> [  345.755391] nfs: server 10.0.1.1 OK
> [  345.758641] nfs: server 10.0.1.1 OK
>
>>
>> --
>> 2.7.4
>>
>>
>> The 2nd patch does the opposite and never kicks ilb. I have always made the
>> assumption that the problem was coming from _nohz_idle_balance() and I just
>> wanted to check that that's the case.
>>
>> ---
>>  kernel/sched/fair.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 0951d1c..1766594 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -9739,8 +9739,7 @@ static void nohz_newidle_balance(struct rq *this_rq)
>>        * candidate for ilb instead of waking up another idle CPU.
>>        * Kick an normal ilb if we failed to do the update.
>>        */
>> -     if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
>> -             kick_ilb(NOHZ_STATS_KICK);
>> +     _nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE);
>>       raw_spin_lock(&this_rq->lock);
>>  }
>
> With this change it seems more difficult to trigger the issue, I can
> usually trigger it within 2-3 resets of the system but with this applied I
> had to reset it over 5 times to trigger it. But I can still reproduce
> the lockup reliably multiple times. I would not trust my feeling that
> it's harder to reproduce with this change, as the method to trigger it is
> kind of spotty and the difference is most likely down to chance :-)
>
> I can still unblock the system by starting a second ssh session, which
> triggers the output on the serial console.
>
> [  193.741481] INFO: rcu_sched detected stalls on CPUs/tasks:
> [  193.746991]  1-...!: (0 ticks this GP) idle=83c/0/0 softirq=2445/2445 fqs=0
> [  193.754041]  (detected by 0, t=4096 jiffies, g=611, c=610, q=31)
> [  193.760054] NMI backtrace for cpu 0
> [  193.763546] CPU: 0 PID: 2193 Comm: bash Not tainted 4.16.0-10929-gc18bb396d3d261eb-dirty #43
> [  193.771984] Hardware name: Generic R8A7791 (Flattened Device Tree)
> [  193.778163] Backtrace:
> [  193.780625] [<c010c170>] (dump_backtrace) from [<c010c3f0>] (show_stack+0x18/0x1c)
> [  193.788198]  r7:00000000 r6:60010193 r5:00000000 r4:c0b73d80
> [  193.793865] [<c010c3d8>] (show_stack) from [<c074b864>] (dump_stack+0x84/0xa4)
> [  193.801093] [<c074b7e0>] (dump_stack) from [<c0751980>] (nmi_cpu_backtrace+0x104/0x118)
> [  193.809098]  r7:00000000 r6:00000000 r5:00000000 r4:c0bc62c8
> [  193.814761] [<c075187c>] (nmi_cpu_backtrace) from [<c0751a20>] (nmi_trigger_cpumask_backtrace+0x8c/0x10c)
> [  193.824329]  r7:00000000 r6:c010e4d4 r5:00000000 r4:c0b08158
> [  193.829992] [<c0751994>] (nmi_trigger_cpumask_backtrace) from [<c010f0ec>] (arch_trigger_cpumask_backtrace+0x14/0x1c)
> [  193.840603]  r7:c0b10980 r6:c0b04cd0 r5:00000000 r4:c0b10980
> [  193.846268] [<c010f0d8>] (arch_trigger_cpumask_backtrace) from [<c017f9ec>] (rcu_dump_cpu_stacks+0xb0/0xd0)
> [  193.856015] [<c017f93c>] (rcu_dump_cpu_stacks) from [<c017e77c>] (rcu_check_callbacks+0x584/0x71c)
> [  193.864977]  r10:0000001f r9:c0b03d00 r8:ffffe000 r7:00000001 r6:c0b10940 r5:e7786080
> [  193.872810]  r4:c0b10980 r3:f181863e
> [  193.876392] [<c017e1f8>] (rcu_check_callbacks) from [<c0184aa0>] (update_process_times+0x38/0x64)
> [  193.885268]  r10:00000000 r9:00180b98 r8:e7786e00 r7:0000002d r6:00000001 r5:e5d86180
> [  193.893100]  r4:ffffe000
> [  193.895642] [<c0184a68>] (update_process_times) from [<c0195c30>] (tick_nohz_handler+0xc8/0x114)
> [  193.904431]  r7:0000002d r6:1bcdda00 r5:e66c7fb0 r4:e77827a0
> [  193.910099] [<c0195b68>] (tick_nohz_handler) from [<c05c16ec>] (arch_timer_handler_virt+0x30/0x40)
> [  193.919062]  r8:c0b05244 r7:00000013 r6:e7017a40 r5:00000000 r4:e7015c00
> [  193.925769] [<c05c16bc>] (arch_timer_handler_virt) from [<c016ed14>] (handle_percpu_devid_irq+0xf0/0x224)
> [  193.935343] [<c016ec24>] (handle_percpu_devid_irq) from [<c016a248>] (generic_handle_irq+0x20/0x30)
> [  193.944394]  r10:00000000 r9:00180b98 r8:e7008800 r7:00000013 r6:c0a775fc r5:00000000
> [  193.952225]  r4:00000000 r3:c016ec24
> [  193.955803] [<c016a228>] (generic_handle_irq) from [<c016a300>] (__handle_domain_irq+0xa8/0xbc)
> [  193.964513] [<c016a258>] (__handle_domain_irq) from [<c037cd7c>] (gic_handle_irq+0x60/0x98)
> [  193.972869]  r9:00180b98 r8:f0803000 r7:c0b47278 r6:e66c7fb0 r5:c0b05244 r4:f0802000
> [  193.980619] [<c037cd1c>] (gic_handle_irq) from [<c0101dd4>] (__irq_usr+0x54/0x80)
> [  193.988105] Exception stack(0xe66c7fb0 to 0xe66c7ff8)
> [  193.993157] 7fa0:                                     00000000 00000000 b6e8b79c 00000000
> [  194.001339] 7fc0: 00178378 000134e0 b6e8b794 b6e8b000 b6e8b794 00180b98 00000000 b6e8b7c4
> [  194.009520] 7fe0: be955190 be955170 00000000 b6db5c58 20010010 ffffffff
> [  194.016137]  r9:00180b98 r8:10c5387d r7:10c5387d r6:ffffffff r5:20010010 r4:b6db5c58
> [  194.023888] Sending NMI from CPU 0 to CPUs 1:
> [  204.028147] NMI backtrace for cpu 1
> [  204.028152] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-10929-gc18bb396d3d261eb-dirty #43
> [  204.028155] Hardware name: Generic R8A7791 (Flattened Device Tree)
> [  204.028164] PC is at _raw_spin_unlock_irqrestore+0x28/0x2c
> [  204.028169] LR is at rcu_report_qs_rnp+0x4c/0x20c
> [  204.028172] pc : [<c07653d8>]    lr : [<c01796d4>]    psr: 200e0113
> [  204.028175] sp : e70a3db0  ip : e70a3dc0  fp : e70a3dbc
> [  204.028177] r10: 00000264  r9 : c0b787eb  r8 : 00000001
> [  204.028179] r7 : 00000002  r6 : c0b10980  r5 : c0b10980  r4 : c0b10980
> [  204.028182] r3 : 000022a3  r2 : 00000264  r1 : 600e0113  r0 : c0b10980
> [  204.028187] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> [  204.028189] Control: 10c5387d  Table: 66b9806a  DAC: 00000051
> [  204.028193] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-10929-gc18bb396d3d261eb-dirty #43
> [  204.028195] Hardware name: Generic R8A7791 (Flattened Device Tree)
> [  204.028197] Backtrace:
> [  204.028204] [<c010c170>] (dump_backtrace) from [<c010c3f0>] (show_stack+0x18/0x1c)
> [  204.028209]  r7:c0b47278 r6:600e0193 r5:00000000 r4:c0b73d80
> [  204.028215] [<c010c3d8>] (show_stack) from [<c074b864>] (dump_stack+0x84/0xa4)
> [  204.028228] [<c074b7e0>] (dump_stack) from [<c01087e8>] (show_regs+0x14/0x18)
> [  204.028231]  r7:c0b47278 r6:e70a3d60 r5:00000001 r4:c0bc62c8
> [  204.028238] [<c01087d4>] (show_regs) from [<c0751978>] (nmi_cpu_backtrace+0xfc/0x118)
> [  204.028244] [<c075187c>] (nmi_cpu_backtrace) from [<c010ef34>] (handle_IPI+0x22c/0x294)
> [  204.028247]  r7:c0b47278 r6:e70a3d60 r5:00000007 r4:c0a775fc
> [  204.028254] [<c010ed08>] (handle_IPI) from [<c037cda8>] (gic_handle_irq+0x8c/0x98)
> [  204.028259]  r10:00000264 r9:e70a2000 r8:f0803000 r7:c0b47278 r6:e70a3d60 r5:c0b05244
> [  204.028261]  r4:f0802000 r3:00000007
> [  204.028267] [<c037cd1c>] (gic_handle_irq) from [<c0101a0c>] (__irq_svc+0x6c/0x90)
> [  204.028270] Exception stack(0xe70a3d60 to 0xe70a3da8)
> [  204.028275] 3d60: c0b10980 600e0113 00000264 000022a3 c0b10980 c0b10980 c0b10980 00000002
> [  204.028280] 3d80: 00000001 c0b787eb 00000264 e70a3dbc e70a3dc0 e70a3db0 c01796d4 c07653d8
> [  204.028283] 3da0: 200e0113 ffffffff
> [  204.028287]  r9:e70a2000 r8:00000001 r7:e70a3d94 r6:ffffffff r5:200e0113 r4:c07653d8
> [  204.028295] [<c07653b0>] (_raw_spin_unlock_irqrestore) from [<c01796d4>] (rcu_report_qs_rnp+0x4c/0x20c)
> [  204.028301] [<c0179688>] (rcu_report_qs_rnp) from [<c017c188>] (rcu_process_callbacks+0x364/0x8cc)
> [  204.028305]  r10:c0b03080 r9:00000000 r8:00000002 r7:600e0113 r6:c0b10980 r5:c0b10980
> [  204.028306]  r4:e7795080
> [  204.028312] [<c017be24>] (rcu_process_callbacks) from [<c010236c>] (__do_softirq+0x23c/0x338)
> [  204.028316]  r10:c0b03080 r9:00000100 r8:c0b80ec0 r7:00000002 r6:00000009 r5:c0b030a4
> [  204.028318]  r4:ffffe000
> [  204.028325] [<c0102130>] (__do_softirq) from [<c0125b90>] (irq_exit+0x6c/0xd4)
> [  204.028329]  r10:00000000 r9:e70a2000 r8:e7008800 r7:00000013 r6:c0a775fc r5:00000000
> [  204.028331]  r4:00000000
> [  204.028336] [<c0125b24>] (irq_exit) from [<c016a2e0>] (__handle_domain_irq+0x88/0xbc)
> [  204.028338]  r5:00000000 r4:00000000
> [  204.028344] [<c016a258>] (__handle_domain_irq) from [<c037cd7c>] (gic_handle_irq+0x60/0x98)
> [  204.028348]  r9:e70a2000 r8:f0803000 r7:c0b47278 r6:e70a3f40 r5:c0b05244 r4:f0802000
> [  204.028354] [<c037cd1c>] (gic_handle_irq) from [<c0101a0c>] (__irq_svc+0x6c/0x90)
> [  204.028356] Exception stack(0xe70a3f40 to 0xe70a3f88)
> [  204.028361] 3f40: 00000000 00013844 e77903a8 c0119100 00000002 ffffe000 c0b04c2c c0b04c78
> [  204.028367] 3f60: 4000406a 413fc0f2 00000000 e70a3f9c e70a3fa0 e70a3f90 c0108564 c0108554
> [  204.028370] 3f80: 600e0013 ffffffff
> [  204.028374]  r9:e70a2000 r8:4000406a r7:e70a3f74 r6:ffffffff r5:600e0013 r4:c0108554
> [  204.028381] [<c0108530>] (arch_cpu_idle) from [<c076516c>] (default_idle_call+0x30/0x34)
> [  204.028388] [<c076513c>] (default_idle_call) from [<c014b834>] (do_idle+0xd8/0x128)
> [  204.028392] [<c014b75c>] (do_idle) from [<c014baf0>] (cpu_startup_entry+0x20/0x24)
> [  204.028396]  r7:c0b80750 r6:10c0387d r5:00000051 r4:00000085
> [  204.028401] [<c014bad0>] (cpu_startup_entry) from [<c010eacc>] (secondary_start_kernel+0x10c/0x12c)
> [  204.028406] [<c010e9c0>] (secondary_start_kernel) from [<4010250c>] (0x4010250c)
> [  204.028408]  r5:00000051 r4:6709406a
>
>>
>> --
>> 2.7.4
>>
>>
>> Thanks,
>> Vincent
>>
>> > >
>> > >
>> > > Regards,
>> > > Vincent
>> > > >
>> > > > 1. c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
>> > > > 2. 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
>> > > >
>> > > > --
>> > > > Regards,
>> > > > Niklas Söderlund
>> >
>> > --
>> > Regards,
>> > Niklas Söderlund
>
> --
> Regards,
> Niklas Söderlund

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
  2018-04-13 20:38       ` Niklas Söderlund
  (?)
@ 2018-04-14 11:26       ` Vincent Guittot
  -1 siblings, 0 replies; 33+ messages in thread
From: Vincent Guittot @ 2018-04-14 11:26 UTC (permalink / raw)
  To: Niklas Söderlund
  Cc: Peter Zijlstra, Paul E. McKenney, Ingo Molnar, linux-kernel,
	linux-renesas-soc, Heiner Kallweit

On 13 April 2018 at 22:38, Niklas Söderlund
<niklas.soderlund@ragnatech.se> wrote:
> Hi Vincent,
>
> On 2018-04-12 13:15:19 +0200, Niklas Söderlund wrote:
>> Hi Vincent,
>>
>> Thanks for your feedback.
>>
>> On 2018-04-12 12:33:27 +0200, Vincent Guittot wrote:
>> > Hi Niklas,
>> >
>> > On 12 April 2018 at 11:18, Niklas Söderlund
>> > <niklas.soderlund@ragnatech.se> wrote:
>> > > Hi Vincent,
>> > >
>> > > I have observed issues running on linus/master from a few days back [1].
>> > > I'm running on a Renesas Koelsch board (arm32) and I can trigger a issue
>> > > by X forwarding the v4l2 test application qv4l2 over ssh and moving the
>> > > courser around in the GUI (best test case description award...). I'm
>> > > sorry about the really bad way I trigger this but I can't do it in any
>> > > other way, I'm happy to try other methods if you got some ideas. The
>> > > symptom of the issue is a complete hang of the system for more then 30
>> > > seconds and then this information is printed in the console:
>> >
>> > Heiner (added to cc) also reported a similar problem with his platform: a
>> > dual core Celeron.
>> >
>> > Do you confirm that your platform is a dual Cortex-A15? At least that's
>> > what I have seen on the web.
>> > This would confirm that a dual core system is a key point.
>>
>> I can confirm that my platform is a dual core.
>
> I tested another dual core system today, a Renesas M3-W ARM64 system, and
> I can observe the same lockups on that system, if it helps you understand
> the problem. It seems to be much harder to trigger the issue on this
> system for some reason. Hitting return in an ssh session doesn't seem to
> produce the lockup, while starting a GUI using X forwarding over ssh
> still triggers it.

Thanks for the test. That confirms it only happens on dual core systems.

>
> [  392.306441] INFO: rcu_preempt detected stalls on CPUs/tasks:
> [  392.312201]  (detected by 0, t=19366 jiffies, g=7177, c=7176, q=35)
> [  392.318555] All QSes seen, last rcu_preempt kthread activity 19368
> (4294990375-4294971007), jiffies_till_next_fqs=1, root ->qsmask 0x0
> [  392.330758] swapper/0       R  running task        0     0      0
> 0x00000022
> [  392.337883] Call trace:
> [  392.340365]  dump_backtrace+0x0/0x1c8
> [  392.344065]  show_stack+0x14/0x20
> [  392.347416]  sched_show_task+0x224/0x2e8
> [  392.351377]  rcu_check_callbacks+0x8ac/0x8b0
> [  392.355686]  update_process_times+0x2c/0x58
> [  392.359908]  tick_sched_handle.isra.5+0x30/0x50
> [  392.364479]  tick_sched_timer+0x40/0x90
> [  392.368351]  __hrtimer_run_queues+0xfc/0x208
> [  392.372659]  hrtimer_interrupt+0xd4/0x258
> [  392.376710]  arch_timer_handler_virt+0x28/0x48
> [  392.381194]  handle_percpu_devid_irq+0x80/0x138
> [  392.385767]  generic_handle_irq+0x28/0x40
> [  392.389813]  __handle_domain_irq+0x5c/0xb8
> [  392.393946]  gic_handle_irq+0x58/0xa8
> [  392.397640]  el1_irq+0xb4/0x130
> [  392.400810]  arch_cpu_idle+0x14/0x20
> [  392.404422]  default_idle_call+0x1c/0x38
> [  392.408381]  do_idle+0x17c/0x1f8
> [  392.411640]  cpu_startup_entry+0x20/0x28
> [  392.415598]  rest_init+0x24c/0x260
> [  392.419037]  start_kernel+0x3e8/0x414
>
> I was running the same tests on another ARM64 platform earlier using the
> same build, which has more than two cores, and there I could not observe
> this issue.
>
> --
> Regards,
> Niklas Söderlund

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
  2018-04-14 11:24         ` Vincent Guittot
@ 2018-04-20 16:00             ` Vincent Guittot
  0 siblings, 0 replies; 33+ messages in thread
From: Vincent Guittot @ 2018-04-20 16:00 UTC (permalink / raw)
  To: Niklas Söderlund, Heiner Kallweit
  Cc: Peter Zijlstra, Paul E. McKenney, Ingo Molnar, linux-kernel,
	linux-renesas-soc

Hi Heiner and Niklas,

On Saturday 14 Apr 2018 at 13:24:20 (+0200), Vincent Guittot wrote:
> Hi Niklas,
> 
> On 13 April 2018 at 00:39, Niklas Söderlund
> <niklas.soderlund@ragnatech.se> wrote:
> > Hi Vincent,
> >
> > Thanks for helping trying to figure this out.
> >
> > On 2018-04-12 15:30:31 +0200, Vincent Guittot wrote:
> >
> > [snip]
> >
> >>
> >> I'd like to narrow the problem a bit more with the 2 patches above. Can you try
> >> them separately on top of c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
> >> and check if one of them fixes the problem?
> >
> > I tried your suggested changes based on top of c18bb396d3d261eb.
> >
> >>
> >> (They should apply on linux-next as well)
> >>
> >> The first patch always kicks ilb instead of doing ilb on the local cpu before entering idle
> >>
> >> ---
> >>  kernel/sched/fair.c | 3 +--
> >>  1 file changed, 1 insertion(+), 2 deletions(-)
> >>
> >> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >> index 0951d1c..b21925b 100644
> >> --- a/kernel/sched/fair.c
> >> +++ b/kernel/sched/fair.c
> >> @@ -9739,8 +9739,7 @@ static void nohz_newidle_balance(struct rq *this_rq)
> >>        * candidate for ilb instead of waking up another idle CPU.
> >>        * Kick an normal ilb if we failed to do the update.
> >>        */
> >> -     if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
> >> -             kick_ilb(NOHZ_STATS_KICK);
> >> +     kick_ilb(NOHZ_STATS_KICK);
> >>       raw_spin_lock(&this_rq->lock);
> >>  }
> >
> > This change doesn't seem to affect the issue. I can still get the single
> > ssh session and the system to lock up by hitting the return key. And
> > opening a second ssh session immediately unblocks both the first ssh
> > session and the serial console. And I can still trigger the console
> > warning by just letting the system be once it locks up. As before, I did
> > have to reset the system a few times to trigger the issue.
> 
> Your results are similar to Heiner's. The problem is still there
> even if we only kick ilb, which mainly sends a reschedule IPI to the
> other CPU when it is idle.
> 
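
For context, kick_ilb() itself is small; here is a from-memory sketch of
roughly what it does in this kernel (the details of the real code may
differ slightly):

	static void kick_ilb(unsigned int flags)
	{
		int ilb_cpu;

		nohz.next_balance++;

		/* Pick an idle CPU from nohz.idle_cpus_mask */
		ilb_cpu = find_new_ilb();
		if (ilb_cpu >= nr_cpu_ids)
			return;

		/* Merge the kick flags; bail out if a kick is already pending */
		flags = atomic_fetch_or(flags, nohz_flags(ilb_cpu));
		if (flags & NOHZ_KICK_MASK)
			return;

		/*
		 * The reschedule IPI makes the idle CPU run the nohz balance
		 * work on its way out of the idle loop.
		 */
		smp_send_reschedule(ilb_cpu);
	}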

Would it be possible to record some traces of the problem to get a better view
of what happens?

I have a small patch that adds some traces in the functions that seem to
create the problem:

---
 kernel/sched/fair.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0951d1c..a951464 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9606,6 +9606,8 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
 	 */
 	WRITE_ONCE(nohz.has_blocked, 0);
 
+	trace_printk("_nohz_idle_balance cpu %d idle %d flag %x", this_cpu, idle, flags);
+
 	/*
 	 * Ensures that if we miss the CPU, we must see the has_blocked
 	 * store from nohz_balance_enter_idle().
@@ -9680,6 +9682,8 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
 	if (likely(update_next_balance))
 		nohz.next_balance = next_balance;
 
+	trace_printk("_nohz_idle_balance return %d", ret);
+
 	return ret;
 }
 
@@ -9732,6 +9736,8 @@ static void nohz_newidle_balance(struct rq *this_rq)
 	    time_before(jiffies, READ_ONCE(nohz.next_blocked)))
 		return;
 
+	trace_printk("nohz_newidle_balance start update");
+
 	raw_spin_unlock(&this_rq->lock);
 	/*
 	 * This CPU is going to be idle and blocked load of idle CPUs
-- 
2.7.4


Also, it would be good to stop tracing when the RCU stall happens.

In case you are not familiar with the trace tool, I have put below how to configure ftrace to trace scheduler, irq, timer ... events and stop tracing when the dump_backtrace function is called:

trace-cmd reset > /dev/null
trace-cmd start -b 40000 -p function -l dump_backtrace:traceoff -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
trace-cmd start -b 40000 -p function -l dump_backtrace

trace-cmd start has to be called twice to make sure that we do not trace all functions.

Once the dump happens and the trace is stopped, you can extract the traces with:

trace-cmd extract -o <trace file name>
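
You can then read the extracted file back with trace-cmd report, e.g.
(assuming you extracted to trace.dat):

trace-cmd report -i trace.dat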

Thanks
Vincent

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
  2018-04-20 16:00             ` Vincent Guittot
  (?)
@ 2018-04-20 16:30             ` Joel Fernandes
  -1 siblings, 0 replies; 33+ messages in thread
From: Joel Fernandes @ 2018-04-20 16:30 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Niklas Söderlund, Heiner Kallweit, Peter Zijlstra,
	Paul E. McKenney, Ingo Molnar, linux-kernel, linux-renesas-soc

On Fri, Apr 20, 2018 at 9:00 AM, Vincent Guittot
<vincent.guittot@linaro.org> wrote:
[..]
>
> On Saturday 14 Apr 2018 at 13:24:20 (+0200), Vincent Guittot wrote:
>> Hi Niklas,
>>
>> On 13 April 2018 at 00:39, Niklas Söderlund
>> <niklas.soderlund@ragnatech.se> wrote:
>> > Hi Vincent,
>> >
>> > Thanks for helping trying to figure this out.
>> >
>> > On 2018-04-12 15:30:31 +0200, Vincent Guittot wrote:
>> >
>> > [snip]
>> >
>> >>
>> >> I'd like to narrow the problem a bit more with the 2 patches above. Can you try
>> >> them separately on top of c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
>> >> and check if one of them fixes the problem?
>> >
>> > I tried your suggested changes based on top of c18bb396d3d261eb.
>> >
>> >>
>> >> (They should apply on linux-next as well)
>> >>
>> >> The first patch always kicks ilb instead of doing ilb on the local cpu before entering idle
>> >>
>> >> ---
>> >>  kernel/sched/fair.c | 3 +--
>> >>  1 file changed, 1 insertion(+), 2 deletions(-)
>> >>
>> >> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> >> index 0951d1c..b21925b 100644
>> >> --- a/kernel/sched/fair.c
>> >> +++ b/kernel/sched/fair.c
>> >> @@ -9739,8 +9739,7 @@ static void nohz_newidle_balance(struct rq *this_rq)
>> >>        * candidate for ilb instead of waking up another idle CPU.
>> >>        * Kick an normal ilb if we failed to do the update.
>> >>        */
>> >> -     if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
>> >> -             kick_ilb(NOHZ_STATS_KICK);
>> >> +     kick_ilb(NOHZ_STATS_KICK);
>> >>       raw_spin_lock(&this_rq->lock);
>> >>  }
>> >
>> > This change doesn't seem to affect the issue. I can still get the single
>> > ssh session and the system to lock up by hitting the return key. And
>> > opening a second ssh session immediately unblocks both the first ssh
>> > session and the serial console. And I can still trigger the console
>> > warning by just letting the system be once it locks up. As before, I did
>> > have to reset the system a few times to trigger the issue.
>>
>> Your results are similar to Heiner's. The problem is still there
>> even if we only kick ilb, which mainly sends a reschedule IPI to the
>> other CPU when it is idle.
>>
>
> Would it be possible to record some traces of the problem to get a better view
> of what happens?
>
> I have a small patch that adds some traces in the functions that seem to
> create the problem:
>
> ---
>  kernel/sched/fair.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0951d1c..a951464 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9606,6 +9606,8 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
>          */
>         WRITE_ONCE(nohz.has_blocked, 0);
>
> +       trace_printk("_nohz_idle_balance cpu %d idle %d flag %x", this_cpu, idle, flags);
> +
>         /*
>          * Ensures that if we miss the CPU, we must see the has_blocked
>          * store from nohz_balance_enter_idle().
> @@ -9680,6 +9682,8 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
>         if (likely(update_next_balance))
>                 nohz.next_balance = next_balance;
>
> +       trace_printk("_nohz_idle_balance return %d", ret);
> +
>         return ret;
>  }
>
> @@ -9732,6 +9736,8 @@ static void nohz_newidle_balance(struct rq *this_rq)
>             time_before(jiffies, READ_ONCE(nohz.next_blocked)))
>                 return;
>
> +       trace_printk("nohz_newidle_balance start update");
> +
>         raw_spin_unlock(&this_rq->lock);
>         /*
>          * This CPU is going to be idle and blocked load of idle CPUs
> --
> 2.7.4
>
>
> Also, it would be good to stop tracing when the RCU stall happens.
>
> In case you are not familiar with the trace tool, I have put below how to configure ftrace to trace scheduler, irq, timer ... events and stop tracing when the dump_backtrace function is called:
>
> trace-cmd reset > /dev/null
> trace-cmd start -b 40000 -p function -l dump_backtrace:traceoff -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
> trace-cmd start -b 40000 -p function -l dump_backtrace
>
> trace-cmd start has to be called twice to make sure that we do not trace all functions.
>
> Once the dump happens and the trace is stopped, you can extract the traces with:
>
> trace-cmd extract -o <trace file name>

Just adding to that (feel free to use or ignore these tips):
To prevent the trace_printk output from getting lost in the trace flood, you
can also call tracing_stop() from wherever the problem is detected and
look at the last set of messages, which is easier. Also, you can dump
the ftrace buffer to the kernel log before the lockup using
ftrace_dump(); when you do that you probably want to limit the
ftrace buffer size.
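
A minimal sketch of that idea (the hook point and condition are made up;
tracing_stop() and ftrace_dump() are the kernel APIs meant above, and
need CONFIG_TRACING):

	if (stall_was_detected) {	/* hypothetical detection site */
		tracing_stop();		/* freeze the ftrace ring buffer */
		ftrace_dump(DUMP_ALL);	/* spill its contents to the kernel log */
	}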

thanks,

- Joel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
  2018-04-20 16:00             ` Vincent Guittot
@ 2018-04-22 22:18               ` Niklas Söderlund
  -1 siblings, 0 replies; 33+ messages in thread
From: Niklas Söderlund @ 2018-04-22 22:18 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Heiner Kallweit, Peter Zijlstra, Paul E. McKenney, Ingo Molnar,
	linux-kernel, linux-renesas-soc

[-- Attachment #1: Type: text/plain, Size: 13687 bytes --]

Hi Vincent,

On 2018-04-20 18:00:13 +0200, Vincent Guittot wrote:

[snip]

> > 
> > Your results are similar to Heiner's. The problem is still there
> > even if we only kick ilb, which mainly sends a reschedule IPI to the
> > other CPU when it is idle.
> > 
> 
> Would it be possible to record some traces of the problem to get a better view
> of what happens?
> 
> I have a small patch that adds some traces in the functions that seem to
> create the problem:

Sure, I applied the patch below on top of c18bb396d3d261eb ("Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").

> 
> ---
>  kernel/sched/fair.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0951d1c..a951464 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9606,6 +9606,8 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
>  	 */
>  	WRITE_ONCE(nohz.has_blocked, 0);
>  
> +	trace_printk("_nohz_idle_balance cpu %d idle %d flag %x", this_cpu, idle, flags);
> +
>  	/*
>  	 * Ensures that if we miss the CPU, we must see the has_blocked
>  	 * store from nohz_balance_enter_idle().
> @@ -9680,6 +9682,8 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
>  	if (likely(update_next_balance))
>  		nohz.next_balance = next_balance;
>  
> +	trace_printk("_nohz_idle_balance return %d", ret);
> +
>  	return ret;
>  }
>  
> @@ -9732,6 +9736,8 @@ static void nohz_newidle_balance(struct rq *this_rq)
>  	    time_before(jiffies, READ_ONCE(nohz.next_blocked)))
>  		return;
>  
> +	trace_printk("nohz_newidle_balance start update");
> +
>  	raw_spin_unlock(&this_rq->lock);
>  	/*
>  	 * This CPU is going to be idle and blocked load of idle CPUs
> -- 
> 2.7.4
> 
> 
> Also, it would be good to stop tracing when the RCU stall happens.
> 
> In case you are not familiar with the trace tool, I have put below how to configure ftrace to trace scheduler, irq, timer ... events and stop tracing when the dump_backtrace function is called:
> 
> trace-cmd reset > /dev/null
> trace-cmd start -b 40000 -p function -l dump_backtrace:traceoff -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
> trace-cmd start -b 40000 -p function -l dump_backtrace
> 
> trace-cmd start has to be called twice to make sure that we do not trace all functions.
> 
> Once the dump happens and the trace is stopped, you can extract the traces with:
> 
> trace-cmd extract -o <trace file name>

Thanks for the help with trace-cmd, I have attached the full 
trace-cmd.dat extracted from running this. Here is a quick run of trace-cmd 
report:

       trace-cmd-1948  [000]   144.504072: bputs:                pick_next_task_fair: nohz_newidle_balance start update
       trace-cmd-1948  [000]   144.504079: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
       trace-cmd-1948  [000]   144.504082: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
       rcu_sched-9     [000]   144.572757: bputs:                pick_next_task_fair: nohz_newidle_balance start update
       rcu_sched-9     [000]   144.572765: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
       rcu_sched-9     [000]   144.572771: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
       rcu_sched-9     [000]   144.612750: bputs:                pick_next_task_fair: nohz_newidle_balance start update
       rcu_sched-9     [000]   144.612752: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
       rcu_sched-9     [000]   144.612757: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   145.032809: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   145.032813: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   145.032817: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
    kworker/u5:0-460   [001]   145.249412: bputs:                pick_next_task_fair: nohz_newidle_balance start update
    kworker/u5:0-460   [001]   145.249416: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
    kworker/u5:0-460   [001]   145.249418: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
       rcu_sched-9     [001]   145.362764: bputs:                pick_next_task_fair: nohz_newidle_balance start update
       rcu_sched-9     [001]   145.362773: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
       rcu_sched-9     [001]   145.362775: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
 systemd-journal-1090  [001]   145.582817: bputs:                pick_next_task_fair: nohz_newidle_balance start update
 systemd-journal-1090  [001]   145.582822: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
 systemd-journal-1090  [001]   145.582824: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   146.392753: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   146.392756: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   146.392760: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
   systemd-udevd-1723  [000]   146.832885: bputs:                pick_next_task_fair: nohz_newidle_balance start update
   systemd-udevd-1723  [000]   146.832891: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
   systemd-udevd-1723  [000]   146.832893: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
       rcu_sched-9     [001]   146.882739: bputs:                pick_next_task_fair: nohz_newidle_balance start update
       rcu_sched-9     [001]   146.882742: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
       rcu_sched-9     [001]   146.882747: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
         haveged-1823  [001]   146.952985: bputs:                pick_next_task_fair: nohz_newidle_balance start update
         haveged-1823  [001]   146.952988: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
         haveged-1823  [001]   146.952993: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
       rcu_sched-9     [000]   147.002743: bputs:                pick_next_task_fair: nohz_newidle_balance start update
       rcu_sched-9     [000]   147.002752: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
       rcu_sched-9     [000]   147.002755: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
       rcu_sched-9     [000]   147.042741: bputs:                pick_next_task_fair: nohz_newidle_balance start update
       rcu_sched-9     [000]   147.042744: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
       rcu_sched-9     [000]   147.042748: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
       rcu_sched-9     [000]   147.082741: bputs:                pick_next_task_fair: nohz_newidle_balance start update
       rcu_sched-9     [000]   147.082743: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
       rcu_sched-9     [000]   147.082748: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
            sshd-1881  [000]   147.277339: bputs:                pick_next_task_fair: nohz_newidle_balance start update
            sshd-1881  [000]   147.277343: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
            sshd-1881  [000]   147.277346: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
       rcu_sched-9     [000]   147.342741: bputs:                pick_next_task_fair: nohz_newidle_balance start update
       rcu_sched-9     [000]   147.342749: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
       rcu_sched-9     [000]   147.342754: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
       rcu_sched-9     [000]   147.382739: bputs:                pick_next_task_fair: nohz_newidle_balance start update
       rcu_sched-9     [000]   147.382741: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
       rcu_sched-9     [000]   147.382745: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
            sshd-1881  [000]   147.621105: bputs:                pick_next_task_fair: nohz_newidle_balance start update
            sshd-1881  [000]   147.621108: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
            sshd-1881  [000]   147.621110: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
          <idle>-0     [001]   147.662725: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 0 flag 2
          <idle>-0     [001]   147.662730: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   151.032830: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   151.032833: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   151.032835: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   155.032777: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   155.032780: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   155.032781: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
         haveged-1823  [001]   155.515789: bputs:                pick_next_task_fair: nohz_newidle_balance start update
         haveged-1823  [001]   155.515792: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
         haveged-1823  [001]   155.515794: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   157.032718: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   157.032720: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   157.032722: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   159.032724: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   159.032726: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   159.032728: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   161.032698: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   161.032701: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   161.032702: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   165.032671: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   165.032673: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   165.032675: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   167.032667: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   167.032669: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   167.032671: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   169.032654: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   169.032657: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   169.032658: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
         rpcbind-1847  [001]   169.472688: bputs:                pick_next_task_fair: nohz_newidle_balance start update
         rpcbind-1847  [001]   169.472691: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
         rpcbind-1847  [001]   169.472693: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   171.032660: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   171.032662: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   171.032664: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
         rpcbind-1847  [001]   199.482558: bputs:                pick_next_task_fair: nohz_newidle_balance start update
         rpcbind-1847  [001]   199.482561: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
         rpcbind-1847  [001]   199.482563: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
          <idle>-0     [000]   241.615152: function:             dump_backtrace

-- 
Regards,
Niklas Söderlund

[-- Attachment #2: trace-cmd.dat.gz --]
[-- Type: application/gzip, Size: 342838 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
@ 2018-04-22 22:18               ` Niklas Söderlund
  0 siblings, 0 replies; 33+ messages in thread
From: Niklas Söderlund @ 2018-04-22 22:18 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Heiner Kallweit, Peter Zijlstra, Paul E. McKenney, Ingo Molnar,
	linux-kernel, linux-renesas-soc

[-- Attachment #1: Type: text/plain, Size: 13689 bytes --]

Hi Vincent,

On 2018-04-20 18:00:13 +0200, Vincent Guittot wrote:

[snip]

> > 
> > You results are similar to Heiner's ones. The problem is still there
> > even if we only kick ilb which mainly send an IPI reschedule to the
> > other CPU if Idle
> > 
> 
> Could it be possible to record some traces of the problem to get a better view
> of what happens ?
> 
> I have a small patch that adds some traces in the functions that seems to create
> the problem

Sure, I applied the patch bellow on-top of c18bb396d3d261eb ("Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").

> 
> ---
>  kernel/sched/fair.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0951d1c..a951464 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9606,6 +9606,8 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
>  	 */
>  	WRITE_ONCE(nohz.has_blocked, 0);
>  
> +	trace_printk("_nohz_idle_balance cpu %d idle %d flag %x", this_cpu, idle, flags);
> +
>  	/*
>  	 * Ensures that if we miss the CPU, we must see the has_blocked
>  	 * store from nohz_balance_enter_idle().
> @@ -9680,6 +9682,8 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
>  	if (likely(update_next_balance))
>  		nohz.next_balance = next_balance;
>  
> +	trace_printk("_nohz_idle_balance return %d", ret);
> +
>  	return ret;
>  }
>  
> @@ -9732,6 +9736,8 @@ static void nohz_newidle_balance(struct rq *this_rq)
>  	    time_before(jiffies, READ_ONCE(nohz.next_blocked)))
>  		return;
>  
> +	trace_printk("nohz_newidle_balance start update");
> +
>  	raw_spin_unlock(&this_rq->lock);
>  	/*
>  	 * This CPU is going to be idle and blocked load of idle CPUs
> -- 
> 2.7.4
> 
> 
> Also that would be good to stop tracing when the RCU stall happens
> 
> In case you are not familiar with the trace tool, I have put below how to configure ftrace to trace scheudler, irq, timer ... events and stop tracing when dump_backtrace function is called
> 
> trace-cmd reset > /dev/null
> trace-cmd start -b 40000 -p function -l dump_backtrace:traceoff -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
> trace-cmd start -b 40000 -p function -l dump_backtrace
> 
> The trace-cmd start has to be called twice to make sure that we will not trace all function
> 
> Once the dump happen and the trace is stopped, you can extract the traces with 
> 
> trace-cmd extract -o <trace file name>

Thanks for the help with trace-cmd, I have attached the full 
trace-cmd.dat extracted from running this. But a quick run of trace-cmd 
report.

       trace-cmd-1948  [000]   144.504072: bputs:                pick_next_task_fair: nohz_newidle_balance start update
       trace-cmd-1948  [000]   144.504079: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
       trace-cmd-1948  [000]   144.504082: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
       rcu_sched-9     [000]   144.572757: bputs:                pick_next_task_fair: nohz_newidle_balance start update
       rcu_sched-9     [000]   144.572765: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
       rcu_sched-9     [000]   144.572771: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
       rcu_sched-9     [000]   144.612750: bputs:                pick_next_task_fair: nohz_newidle_balance start update
       rcu_sched-9     [000]   144.612752: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
       rcu_sched-9     [000]   144.612757: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   145.032809: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   145.032813: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   145.032817: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
    kworker/u5:0-460   [001]   145.249412: bputs:                pick_next_task_fair: nohz_newidle_balance start update
    kworker/u5:0-460   [001]   145.249416: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
    kworker/u5:0-460   [001]   145.249418: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
       rcu_sched-9     [001]   145.362764: bputs:                pick_next_task_fair: nohz_newidle_balance start update
       rcu_sched-9     [001]   145.362773: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
       rcu_sched-9     [001]   145.362775: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
 systemd-journal-1090  [001]   145.582817: bputs:                pick_next_task_fair: nohz_newidle_balance start update
 systemd-journal-1090  [001]   145.582822: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
 systemd-journal-1090  [001]   145.582824: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   146.392753: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   146.392756: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   146.392760: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
   systemd-udevd-1723  [000]   146.832885: bputs:                pick_next_task_fair: nohz_newidle_balance start update
   systemd-udevd-1723  [000]   146.832891: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
   systemd-udevd-1723  [000]   146.832893: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
       rcu_sched-9     [001]   146.882739: bputs:                pick_next_task_fair: nohz_newidle_balance start update
       rcu_sched-9     [001]   146.882742: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
       rcu_sched-9     [001]   146.882747: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
         haveged-1823  [001]   146.952985: bputs:                pick_next_task_fair: nohz_newidle_balance start update
         haveged-1823  [001]   146.952988: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
         haveged-1823  [001]   146.952993: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
       rcu_sched-9     [000]   147.002743: bputs:                pick_next_task_fair: nohz_newidle_balance start update
       rcu_sched-9     [000]   147.002752: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
       rcu_sched-9     [000]   147.002755: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
       rcu_sched-9     [000]   147.042741: bputs:                pick_next_task_fair: nohz_newidle_balance start update
       rcu_sched-9     [000]   147.042744: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
       rcu_sched-9     [000]   147.042748: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
       rcu_sched-9     [000]   147.082741: bputs:                pick_next_task_fair: nohz_newidle_balance start update
       rcu_sched-9     [000]   147.082743: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
       rcu_sched-9     [000]   147.082748: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
            sshd-1881  [000]   147.277339: bputs:                pick_next_task_fair: nohz_newidle_balance start update
            sshd-1881  [000]   147.277343: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
            sshd-1881  [000]   147.277346: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
       rcu_sched-9     [000]   147.342741: bputs:                pick_next_task_fair: nohz_newidle_balance start update
       rcu_sched-9     [000]   147.342749: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
       rcu_sched-9     [000]   147.342754: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
       rcu_sched-9     [000]   147.382739: bputs:                pick_next_task_fair: nohz_newidle_balance start update
       rcu_sched-9     [000]   147.382741: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
       rcu_sched-9     [000]   147.382745: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
            sshd-1881  [000]   147.621105: bputs:                pick_next_task_fair: nohz_newidle_balance start update
            sshd-1881  [000]   147.621108: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
            sshd-1881  [000]   147.621110: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
          <idle>-0     [001]   147.662725: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 0 flag 2
          <idle>-0     [001]   147.662730: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   151.032830: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   151.032833: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   151.032835: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   155.032777: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   155.032780: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   155.032781: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
         haveged-1823  [001]   155.515789: bputs:                pick_next_task_fair: nohz_newidle_balance start update
         haveged-1823  [001]   155.515792: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
         haveged-1823  [001]   155.515794: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   157.032718: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   157.032720: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   157.032722: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   159.032724: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   159.032726: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   159.032728: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   161.032698: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   161.032701: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   161.032702: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   165.032671: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   165.032673: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   165.032675: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   167.032667: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   167.032669: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   167.032671: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   169.032654: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   169.032657: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   169.032658: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
         rpcbind-1847  [001]   169.472688: bputs:                pick_next_task_fair: nohz_newidle_balance start update
         rpcbind-1847  [001]   169.472691: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
         rpcbind-1847  [001]   169.472693: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
     kworker/1:1-31    [001]   171.032660: bputs:                pick_next_task_fair: nohz_newidle_balance start update
     kworker/1:1-31    [001]   171.032662: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
     kworker/1:1-31    [001]   171.032664: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
         rpcbind-1847  [001]   199.482558: bputs:                pick_next_task_fair: nohz_newidle_balance start update
         rpcbind-1847  [001]   199.482561: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
         rpcbind-1847  [001]   199.482563: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
          <idle>-0     [000]   241.615152: function:             dump_backtrace

-- 
Regards,
Niklas Söderlund

[-- Attachment #2: trace-cmd.dat.gz --]
[-- Type: application/gzip, Size: 342838 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
  2018-04-22 22:18               ` Niklas Söderlund
@ 2018-04-23  9:54                 ` Vincent Guittot
  -1 siblings, 0 replies; 33+ messages in thread
From: Vincent Guittot @ 2018-04-23  9:54 UTC (permalink / raw)
  To: Niklas Söderlund
  Cc: Heiner Kallweit, Peter Zijlstra, Paul E. McKenney, Ingo Molnar,
	linux-kernel, linux-renesas-soc

Hi Niklas,

On Monday 23 Apr 2018 at 00:18:27 (+0200), Niklas Söderlund wrote:
> Hi Vincent,
> 
> On 2018-04-20 18:00:13 +0200, Vincent Guittot wrote:
> 
> [snip]
> 
> > > 
> > > Your results are similar to Heiner's. The problem is still there
> > > even if we only kick ILB, which mainly sends a reschedule IPI to the
> > > other CPU if it is idle.
> > > 
> > 
> > Would it be possible to record some traces of the problem to get a better view
> > of what happens?
> > 
> > I have a small patch that adds some traces in the functions that seem to cause
> > the problem.
> 
> Sure, I applied the patch below on top of c18bb396d3d261eb ("Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
> 
> > 
> > 
> > 
> > Also, it would be good to stop tracing when the RCU stall happens.
> > 
> > In case you are not familiar with the trace tools, I have put below how to configure ftrace to trace scheduler, irq, timer, etc. events and stop tracing when the dump_backtrace function is called:
> > 
> > trace-cmd reset > /dev/null
> > trace-cmd start -b 40000 -p function -l dump_backtrace:traceoff -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
> > trace-cmd start -b 40000 -p function -l dump_backtrace
> > 
> > The trace-cmd start has to be called twice to make sure that we will not trace all functions.
> > 
> > Once the dump happens and the trace is stopped, you can extract the traces with:
> > 
> > trace-cmd extract -o <trace file name>
> 
> Thanks for the help with trace-cmd, I have attached the full
> trace-cmd.dat extracted from running this. But here is a quick run of
> 'trace-cmd report':

Thanks for the report. Can you re-run with the following trace-cmd sequence? My previous sequence disabled the ftrace events:

trace-cmd reset > /dev/null
trace-cmd start -b 40000 -p function -l dump_backtrace:traceoff -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
trace-cmd start -b 40000 -p function -l dump_backtrace -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk

I have updated the patch and added traces to check that the scheduler returns from the idle_balance() function and doesn't stay stuck.

---
 kernel/sched/fair.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0951d1c..4285511 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9606,6 +9606,8 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
 	 */
 	WRITE_ONCE(nohz.has_blocked, 0);
 
+	trace_printk("_nohz_idle_balance cpu %d idle %d flag %x", this_cpu, idle, flags);
+
 	/*
 	 * Ensures that if we miss the CPU, we must see the has_blocked
 	 * store from nohz_balance_enter_idle().
@@ -9628,6 +9630,8 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
 
 		rq = cpu_rq(balance_cpu);
 
+		trace_printk("_nohz_idle_balance update cpu %d ", balance_cpu);
+
 		has_blocked_load |= update_nohz_stats(rq, true);
 
 		/*
@@ -9680,6 +9684,8 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
 	if (likely(update_next_balance))
 		nohz.next_balance = next_balance;
 
+	trace_printk("_nohz_idle_balance return %d", ret);
+
 	return ret;
 }
 
@@ -9732,6 +9738,8 @@ static void nohz_newidle_balance(struct rq *this_rq)
 	    time_before(jiffies, READ_ONCE(nohz.next_blocked)))
 		return;
 
+	trace_printk("nohz_newidle_balance start update");
+
 	raw_spin_unlock(&this_rq->lock);
 	/*
 	 * This CPU is going to be idle and blocked load of idle CPUs
@@ -9742,6 +9750,9 @@ static void nohz_newidle_balance(struct rq *this_rq)
 	if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
 		kick_ilb(NOHZ_STATS_KICK);
 	raw_spin_lock(&this_rq->lock);
+
+	trace_printk("nohz_newidle_balance lock back");
+
 }
 
 #else /* !CONFIG_NO_HZ_COMMON */
@@ -9869,6 +9880,8 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
 
 	rq_repin_lock(this_rq, rf);
 
+	trace_printk("idle_balance %d", pulled_task);
+
 	return pulled_task;
 }
 
-- 
2.7.4
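
As before, once the stall has triggered and tracing has stopped, the data
can be extracted and the new trace_printk lines filtered out of the report.
A minimal example of doing that (the grep pattern simply matches the
strings added by this patch):

trace-cmd extract -o trace-cmd.dat
trace-cmd report -i trace-cmd.dat | grep -E 'idle_balance|nohz_newidle_balance'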

>
> 

[snip]

>        rcu_sched-9     [000]   147.342741: bputs:                pick_next_task_fair: nohz_newidle_balance start update
>        rcu_sched-9     [000]   147.342749: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
>        rcu_sched-9     [000]   147.342754: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
>        rcu_sched-9     [000]   147.382739: bputs:                pick_next_task_fair: nohz_newidle_balance start update
>        rcu_sched-9     [000]   147.382741: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
>        rcu_sched-9     [000]   147.382745: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
>             sshd-1881  [000]   147.621105: bputs:                pick_next_task_fair: nohz_newidle_balance start update
>             sshd-1881  [000]   147.621108: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
>             sshd-1881  [000]   147.621110: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1

This is the last call to the function added by the patch on CPU0, and it returns completely.
But then nothing else is scheduled on CPU0.

>           <idle>-0     [001]   147.662725: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 0 flag 2
>           <idle>-0     [001]   147.662730: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
>      kworker/1:1-31    [001]   151.032830: bputs:                pick_next_task_fair: nohz_newidle_balance start update
>      kworker/1:1-31    [001]   151.032833: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
>      kworker/1:1-31    [001]   151.032835: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
>      kworker/1:1-31    [001]   155.032777: bputs:                pick_next_task_fair: nohz_newidle_balance start update
>      kworker/1:1-31    [001]   155.032780: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
>      kworker/1:1-31    [001]   155.032781: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
>          haveged-1823  [001]   155.515789: bputs:                pick_next_task_fair: nohz_newidle_balance start update
>          haveged-1823  [001]   155.515792: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
>          haveged-1823  [001]   155.515794: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
>      kworker/1:1-31    [001]   157.032718: bputs:                pick_next_task_fair: nohz_newidle_balance start update
>      kworker/1:1-31    [001]   157.032720: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
>      kworker/1:1-31    [001]   157.032722: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
>      kworker/1:1-31    [001]   159.032724: bputs:                pick_next_task_fair: nohz_newidle_balance start update
>      kworker/1:1-31    [001]   159.032726: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
>      kworker/1:1-31    [001]   159.032728: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
>      kworker/1:1-31    [001]   161.032698: bputs:                pick_next_task_fair: nohz_newidle_balance start update
>      kworker/1:1-31    [001]   161.032701: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
>      kworker/1:1-31    [001]   161.032702: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
>      kworker/1:1-31    [001]   165.032671: bputs:                pick_next_task_fair: nohz_newidle_balance start update
>      kworker/1:1-31    [001]   165.032673: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
>      kworker/1:1-31    [001]   165.032675: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
>      kworker/1:1-31    [001]   167.032667: bputs:                pick_next_task_fair: nohz_newidle_balance start update
>      kworker/1:1-31    [001]   167.032669: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
>      kworker/1:1-31    [001]   167.032671: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
>      kworker/1:1-31    [001]   169.032654: bputs:                pick_next_task_fair: nohz_newidle_balance start update
>      kworker/1:1-31    [001]   169.032657: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
>      kworker/1:1-31    [001]   169.032658: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
>          rpcbind-1847  [001]   169.472688: bputs:                pick_next_task_fair: nohz_newidle_balance start update
>          rpcbind-1847  [001]   169.472691: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
>          rpcbind-1847  [001]   169.472693: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
>      kworker/1:1-31    [001]   171.032660: bputs:                pick_next_task_fair: nohz_newidle_balance start update
>      kworker/1:1-31    [001]   171.032662: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
>      kworker/1:1-31    [001]   171.032664: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
>          rpcbind-1847  [001]   199.482558: bputs:                pick_next_task_fair: nohz_newidle_balance start update
>          rpcbind-1847  [001]   199.482561: bprint:               _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
>          rpcbind-1847  [001]   199.482563: bprint:               _nohz_idle_balance: _nohz_idle_balance return 1
>           <idle>-0     [000]   241.615152: function:             dump_backtrace
> 
> -- 
> Regards,
> Niklas Söderlund

Sorry for the inconvenience.

Vincent

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
       [not found]                 ` <20180425225603.GA26177@bigcity.dyn.berto.se>
@ 2018-04-26 10:31                     ` Vincent Guittot
  0 siblings, 0 replies; 33+ messages in thread
From: Vincent Guittot @ 2018-04-26 10:31 UTC (permalink / raw)
  To: Niklas Söderlund, Heiner Kallweit
  Cc: Peter Zijlstra, Paul E. McKenney, Ingo Molnar, linux-kernel,
	linux-renesas-soc

Hi Niklas,

On Thursday 26 Apr 2018 at 00:56:03 (+0200), Niklas Söderlund wrote:
> Hi Vincent,
> 
> Here are the results, sorry for the delay.
> 
> On 2018-04-23 11:54:20 +0200, Vincent Guittot wrote:
> 
> [snip]
> 
> > 
> > Thanks for the report. Can you re-run with the following trace-cmd sequence? My previous sequence disabled the ftrace events:
> > 
> > trace-cmd reset > /dev/null
> > trace-cmd start -b 40000 -p function -l dump_backtrace:traceoff -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
> > trace-cmd start -b 40000 -p function -l dump_backtrace -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
> > 
> > I have updated the patch and added traces to check that the scheduler returns from the idle_balance() function and doesn't stay stuck.
> 
> Once more I applied the change below on top of c18bb396d3d261eb ("Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
> 
> This time the result of 'trace-cmd report' is so large that I do not include
> it here, but I attach the trace.dat file. Not sure why, but the timing of
> sending the NMI for the backtrace print is different (the content is the
> same AFAIK), so on the off chance it can help figure this out:
> 

Thanks for the trace, I have been able to catch a problem with it.
Could you test the patch below to confirm that the problem is solved?
The patch applies on top of
c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").

From: Vincent Guittot <vincent.guittot@linaro.org>
Date: Thu, 26 Apr 2018 12:19:32 +0200
Subject: [PATCH] sched/fair: fix the update of blocked load when newly idle
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

With commit 31e77c93e432 ("sched/fair: Update blocked load when newly idle"),
we release the rq->lock when updating the blocked load of idle CPUs. This opens
a time window during which another CPU can add a task to this CPU's cfs_rq.
The check for a newly added task in idle_balance() is not in the common path.
Move the out label to include this check.

Fixes: 31e77c93e432 ("sched/fair: Update blocked load when newly idle")
Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
Reported-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0951d1c..15a9f5e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9847,6 +9847,7 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
 	if (curr_cost > this_rq->max_idle_balance_cost)
 		this_rq->max_idle_balance_cost = curr_cost;
 
+out:
 	/*
 	 * While browsing the domains, we released the rq lock, a task could
 	 * have been enqueued in the meantime. Since we're not going idle,
@@ -9855,7 +9856,6 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
 	if (this_rq->cfs.h_nr_running && !pulled_task)
 		pulled_task = 1;
 
-out:
 	/* Move the next balance forward */
 	if (time_after(this_rq->next_balance, next_balance))
 		this_rq->next_balance = next_balance;
-- 
2.7.4
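
To make the fix easier to follow, here is a minimal control-flow sketch of
idle_balance() after this patch. It is schematic, not the actual
kernel/sched/fair.c code: the helper blocked_load_update_only() is invented
for illustration.

static int idle_balance(struct rq *this_rq)
{
	int pulled_task = 0;

	if (blocked_load_update_only(this_rq)) {
		/*
		 * nohz_newidle_balance() drops and retakes this_rq->lock,
		 * so another CPU may enqueue a task on this rq meanwhile.
		 */
		nohz_newidle_balance(this_rq);
		goto out;	/* early exit that used to skip the recheck */
	}

	/* ... regular newly-idle load balancing over the sched domains ... */

out:
	/*
	 * The rq lock was released above; if a task was enqueued in the
	 * meantime, report it so this CPU does not go idle with a
	 * runnable task queued -- the hang seen in this thread.
	 */
	if (this_rq->cfs.h_nr_running && !pulled_task)
		pulled_task = 1;

	return pulled_task;
}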



[snip]

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
  2018-04-26 10:31                     ` Vincent Guittot
@ 2018-04-26 11:48                       ` Peter Zijlstra
  -1 siblings, 0 replies; 33+ messages in thread
From: Peter Zijlstra @ 2018-04-26 11:48 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Niklas Söderlund, Heiner Kallweit, Paul E. McKenney,
	Ingo Molnar, linux-kernel, linux-renesas-soc

On Thu, Apr 26, 2018 at 12:31:33PM +0200, Vincent Guittot wrote:
> From: Vincent Guittot <vincent.guittot@linaro.org>
> Date: Thu, 26 Apr 2018 12:19:32 +0200
> Subject: [PATCH] sched/fair: fix the update of blocked load when newly idle
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> With commit 31e77c93e432 ("sched/fair: Update blocked load when newly idle"),
> we release the rq->lock when updating the blocked load of idle CPUs. This opens
> a time window during which another CPU can add a task to this CPU's cfs_rq.
> The check for a newly added task in idle_balance() is not in the common path.
> Move the out label to include this check.

Ah quite so indeed. This could result in us running idle even though
there is a runnable task around -- which is bad.
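
A schematic interleaving of that window (illustrative only, not taken from
the captured trace):

  CPU0 (newly idle)                      CPU1
  idle_balance()
    raw_spin_unlock(&this_rq->lock)
    _nohz_idle_balance(...)              wakes a task, enqueues it on CPU0
    raw_spin_lock(&this_rq->lock)
    old 'out:' skips h_nr_running check
    returns 0 -> CPU0 goes idle          task sits runnable on CPU0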

> Fixes: 31e77c93e432 ("sched/fair: Update blocked load when newly idle")
> Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
> Reported-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  kernel/sched/fair.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0951d1c..15a9f5e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9847,6 +9847,7 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
>  	if (curr_cost > this_rq->max_idle_balance_cost)
>  		this_rq->max_idle_balance_cost = curr_cost;
>  
> +out:
>  	/*
>  	 * While browsing the domains, we released the rq lock, a task could
>  	 * have been enqueued in the meantime. Since we're not going idle,
> @@ -9855,7 +9856,6 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
>  	if (this_rq->cfs.h_nr_running && !pulled_task)
>  		pulled_task = 1;
>  
> -out:
>  	/* Move the next balance forward */
>  	if (time_after(this_rq->next_balance, next_balance))
>  		this_rq->next_balance = next_balance;

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
  2018-04-26 10:31                     ` Vincent Guittot
@ 2018-04-26 14:41                       ` Niklas Söderlund
  -1 siblings, 0 replies; 33+ messages in thread
From: Niklas Söderlund @ 2018-04-26 14:41 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Heiner Kallweit, Peter Zijlstra, Paul E. McKenney, Ingo Molnar,
	linux-kernel, linux-renesas-soc

Hi Vincent,

Thanks for all your help.

On 2018-04-26 12:31:33 +0200, Vincent Guittot wrote:
> Hi Niklas,
> 
> On Thursday 26 Apr 2018 at 00:56:03 (+0200), Niklas Söderlund wrote:
> > Hi Vincent,
> > 
> > Here are the results, sorry for the delay.
> > 
> > On 2018-04-23 11:54:20 +0200, Vincent Guittot wrote:
> > 
> > [snip]
> > 
> > > 
> > > Thanks for the report. Can you re-run with the following trace-cmd sequence? My previous sequence disabled the ftrace events:
> > > 
> > > trace-cmd reset > /dev/null
> > > trace-cmd start -b 40000 -p function -l dump_backtrace:traceoff -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
> > > trace-cmd start -b 40000 -p function -l dump_backtrace -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
> > > 
> > > I have updated the patch and added traces to check that the scheduler returns from the idle_balance() function and doesn't stay stuck.
> > 
> > Once more I applied the change below on top of c18bb396d3d261eb ("Merge
> > git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
> > 
> > This time the result of 'trace-cmd report' is so large that I do not include
> > it here, but I attach the trace.dat file. Not sure why, but the timing of
> > sending the NMI for the backtrace print is different (the content is the
> > same AFAIK), so on the off chance it can help figure this out:
> > 
> 
> Thanks for the trace, I have been able to catch a problem with it.
> Could you test the patch below to confirm that the problem is solved?
> The patch applies on top of
> c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").

I can confirm that with the patch below I can no longer reproduce the
problem. Thanks!

> 
> From: Vincent Guittot <vincent.guittot@linaro.org>
> Date: Thu, 26 Apr 2018 12:19:32 +0200
> Subject: [PATCH] sched/fair: fix the update of blocked load when newly idle
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> With commit 31e77c93e432 ("sched/fair: Update blocked load when newly idle"),
> we release the rq->lock when updating the blocked load of idle CPUs. This opens
> a time window during which another CPU can add a task to this CPU's cfs_rq.
> The check for a newly added task in idle_balance() is not in the common path.
> Move the out label to include this check.
> 
> Fixes: 31e77c93e432 ("sched/fair: Update blocked load when newly idle")
> Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
> Reported-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  kernel/sched/fair.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0951d1c..15a9f5e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9847,6 +9847,7 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
>  	if (curr_cost > this_rq->max_idle_balance_cost)
>  		this_rq->max_idle_balance_cost = curr_cost;
>  
> +out:
>  	/*
>  	 * While browsing the domains, we released the rq lock, a task could
>  	 * have been enqueued in the meantime. Since we're not going idle,
> @@ -9855,7 +9856,6 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
>  	if (this_rq->cfs.h_nr_running && !pulled_task)
>  		pulled_task = 1;
>  
> -out:
>  	/* Move the next balance forward */
>  	if (time_after(this_rq->next_balance, next_balance))
>  		this_rq->next_balance = next_balance;
> -- 
> 2.7.4
> 
> 
> 
> [snip]
> 

-- 
Regards,
Niklas Söderlund

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
  2018-04-26 14:41                       ` Niklas Söderlund
  (?)
@ 2018-04-26 15:27                       ` Vincent Guittot
  2018-04-26 15:38                           ` Niklas Söderlund
  -1 siblings, 1 reply; 33+ messages in thread
From: Vincent Guittot @ 2018-04-26 15:27 UTC (permalink / raw)
  To: Niklas Söderlund, Peter Zijlstra
  Cc: Heiner Kallweit, Paul E. McKenney, Ingo Molnar, linux-kernel,
	linux-renesas-soc

Hi Niklas,

>> Thanks for the trace, I have been able to catch a problem with it.
>> Could you test the patch below to confirm that the problem is solved?
>> The patch applies on top of
>> c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
>
> I can confirm that with the patch below I can no longer reproduce the
> problem. Thanks!

Thanks for testing.
Do you mind if I add:
Tested-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>

Peter, Ingo,
Do you want me to re-send the patch with all tags, or will you take
this version?

Regards,
Vincent

>
>>
>> From: Vincent Guittot <vincent.guittot@linaro.org>
>> Date: Thu, 26 Apr 2018 12:19:32 +0200
>> Subject: [PATCH] sched/fair: fix the update of blocked load when newly idle
>> MIME-Version: 1.0
>> Content-Type: text/plain; charset=UTF-8
>> Content-Transfer-Encoding: 8bit
>>
>> With commit 31e77c93e432 ("sched/fair: Update blocked load when newly idle"),
>> we release the rq->lock when updating the blocked load of idle CPUs. This opens
>> a time window during which another CPU can add a task to this CPU's cfs_rq.
>> The check for a newly added task in idle_balance() is not in the common path.
>> Move the out label to include this check.
>>
>> Fixes: 31e77c93e432 ("sched/fair: Update blocked load when newly idle")
>> Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
>> Reported-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>
>> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
>> ---
>>  kernel/sched/fair.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 0951d1c..15a9f5e 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -9847,6 +9847,7 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
>>       if (curr_cost > this_rq->max_idle_balance_cost)
>>               this_rq->max_idle_balance_cost = curr_cost;
>>
>> +out:
>>       /*
>>        * While browsing the domains, we released the rq lock, a task could
>>        * have been enqueued in the meantime. Since we're not going idle,
>> @@ -9855,7 +9856,6 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
>>       if (this_rq->cfs.h_nr_running && !pulled_task)
>>               pulled_task = 1;
>>
>> -out:
>>       /* Move the next balance forward */
>>       if (time_after(this_rq->next_balance, next_balance))
>>               this_rq->next_balance = next_balance;
>> --
>> 2.7.4
>>
>>
>>
>> [snip]
>>
>
> --
> Regards,
> Niklas Söderlund

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
  2018-04-26 15:27                       ` Vincent Guittot
@ 2018-04-26 15:38                           ` Niklas Söderlund
  0 siblings, 0 replies; 33+ messages in thread
From: Niklas Söderlund @ 2018-04-26 15:38 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Peter Zijlstra, Heiner Kallweit, Paul E. McKenney, Ingo Molnar,
	linux-kernel, linux-renesas-soc

Hi Vincent,

On 2018-04-26 17:27:24 +0200, Vincent Guittot wrote:
> Hi Niklas,
> 
> >> Thanks for the trace, I have been able to catch a problem with it.
> >> Could you test the patch below to confirm that the problem is solved?
> >> The patch applies on top of
> >> c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
> >
> > I can confirm that with the patch below I can no longer reproduce the
> > problem. Thanks!
> 
> Thanks for testing.
> Do you mind if I add:
> Tested-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>

Please do.

> 
> Peter, Ingo,
> Do you want me to re-send the patch with all tags, or will you take
> this version?
> 
> Regards,
> Vincent
> 
> >
> >>
> >> From: Vincent Guittot <vincent.guittot@linaro.org>
> >> Date: Thu, 26 Apr 2018 12:19:32 +0200
> >> Subject: [PATCH] sched/fair: fix the update of blocked load when newly idle
> >> MIME-Version: 1.0
> >> Content-Type: text/plain; charset=UTF-8
> >> Content-Transfer-Encoding: 8bit
> >>
> >> With commit 31e77c93e432 ("sched/fair: Update blocked load when newly idle"),
> >> we release the rq->lock when updating the blocked load of idle CPUs. This opens
> >> a time window during which another CPU can add a task to this CPU's cfs_rq.
> >> The check for a newly added task in idle_balance() is not in the common path.
> >> Move the out label to include this check.
> >>
> >> Fixes: 31e77c93e432 ("sched/fair: Update blocked load when newly idle")
> >> Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
> >> Reported-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>
> >> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> >> ---
> >>  kernel/sched/fair.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >> index 0951d1c..15a9f5e 100644
> >> --- a/kernel/sched/fair.c
> >> +++ b/kernel/sched/fair.c
> >> @@ -9847,6 +9847,7 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
> >>       if (curr_cost > this_rq->max_idle_balance_cost)
> >>               this_rq->max_idle_balance_cost = curr_cost;
> >>
> >> +out:
> >>       /*
> >>        * While browsing the domains, we released the rq lock, a task could
> >>        * have been enqueued in the meantime. Since we're not going idle,
> >> @@ -9855,7 +9856,6 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
> >>       if (this_rq->cfs.h_nr_running && !pulled_task)
> >>               pulled_task = 1;
> >>
> >> -out:
> >>       /* Move the next balance forward */
> >>       if (time_after(this_rq->next_balance, next_balance))
> >>               this_rq->next_balance = next_balance;
> >> --
> >> 2.7.4
> >>
> >>
> >>
> >> [snip]
> >>
> >
> > --
> > Regards,
> > Niklas Söderlund

-- 
Regards,
Niklas Söderlund

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
  2018-04-26 10:31                     ` Vincent Guittot
                                       ` (2 preceding siblings ...)
  (?)
@ 2018-05-02 13:40                     ` Geert Uytterhoeven
  -1 siblings, 0 replies; 33+ messages in thread
From: Geert Uytterhoeven @ 2018-05-02 13:40 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Niklas Söderlund, Heiner Kallweit, Peter Zijlstra,
	Paul E. McKenney, Ingo Molnar, linux-kernel, Linux-Renesas

Hi Vincent,

On Thu, Apr 26, 2018 at 12:31 PM, Vincent Guittot
<vincent.guittot@linaro.org> wrote:
> On Thursday 26 Apr 2018 at 00:56:03 (+0200), Niklas Söderlund wrote:
>> Here are the results, sorry for the delay.
>>
>> On 2018-04-23 11:54:20 +0200, Vincent Guittot wrote:
>>
>> [snip]
>>
>> >
>> > Thanks for the report. Can you re-run with the following trace-cmd sequence? My previous sequence disabled the ftrace events:
>> >
>> > trace-cmd reset > /dev/null
>> > trace-cmd start -b 40000 -p function -l dump_backtrace:traceoff -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
>> > trace-cmd start -b 40000 -p function -l dump_backtrace -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
>> >
>> > I have updated the patch and added traces to check that the scheduler returns from the idle_balance() function and doesn't stay stuck.
>>
>> Once more I applied the change below on top of c18bb396d3d261eb ("Merge
>> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
>>
>> This time the result of 'trace-cmd report' is so large that I do not include
>> it here, but I attach the trace.dat file. Not sure why, but the timing of
>> sending the NMI for the backtrace print is different (the content is the
>> same AFAIK), so on the off chance it can help figure this out:
>>
>
> Thanks for the trace, I have been able to catch a problem with it.
> Could you test the patch below to confirm that the problem is solved?
> The patch applies on top of
> c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
>
> From: Vincent Guittot <vincent.guittot@linaro.org>
> Date: Thu, 26 Apr 2018 12:19:32 +0200
> Subject: [PATCH] sched/fair: fix the update of blocked load when newly idle
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> With commit 31e77c93e432 ("sched/fair: Update blocked load when newly idle"),
> we release the rq->lock when updating the blocked load of idle CPUs. This opens
> a time window during which another CPU can add a task to this CPU's cfs_rq.
> The check for a newly added task in idle_balance() is not in the common path.
> Move the out label to include this check.
>
> Fixes: 31e77c93e432 ("sched/fair: Update blocked load when newly idle")
> Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
> Reported-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>

Thanks!
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [tip:sched/urgent] sched/fair: Fix the update of blocked load when newly idle
  2018-04-26 10:31                     ` Vincent Guittot
                                       ` (3 preceding siblings ...)
  (?)
@ 2018-05-03  9:25                     ` tip-bot for Vincent Guittot
  -1 siblings, 0 replies; 33+ messages in thread
From: tip-bot for Vincent Guittot @ 2018-05-03  9:25 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, hpa, paulmck, hkallweit1, linux-kernel, tglx,
	geert+renesas, vincent.guittot, peterz, torvalds

Commit-ID:  457be908c83637ee10bda085a23dc05afa3b14a0
Gitweb:     https://git.kernel.org/tip/457be908c83637ee10bda085a23dc05afa3b14a0
Author:     Vincent Guittot <vincent.guittot@linaro.org>
AuthorDate: Thu, 26 Apr 2018 12:19:32 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 3 May 2018 07:38:03 +0200

sched/fair: Fix the update of blocked load when newly idle

With commit:

  31e77c93e432 ("sched/fair: Update blocked load when newly idle")

... we release the rq->lock when updating the blocked load of idle CPUs.

This opens a time window during which another CPU can add a task to this
CPU's cfs_rq.

The check for a newly added task in idle_balance() is not in the common path.
Move the out label to include this check.

Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 31e77c93e432 ("sched/fair: Update blocked load when newly idle")
Link: http://lkml.kernel.org/r/20180426103133.GA6953@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 54dc31e7ab9b..e3002e5ada31 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9847,6 +9847,7 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
 	if (curr_cost > this_rq->max_idle_balance_cost)
 		this_rq->max_idle_balance_cost = curr_cost;
 
+out:
 	/*
 	 * While browsing the domains, we released the rq lock, a task could
 	 * have been enqueued in the meantime. Since we're not going idle,
@@ -9855,7 +9856,6 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
 	if (this_rq->cfs.h_nr_running && !pulled_task)
 		pulled_task = 1;
 
-out:
 	/* Move the next balance forward */
 	if (time_after(this_rq->next_balance, next_balance))
 		this_rq->next_balance = next_balance;

^ permalink raw reply related	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2018-05-03  9:26 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-12  9:18 Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle") Niklas Söderlund
2018-04-12  9:18 ` Niklas Söderlund
2018-04-12 10:33 ` Vincent Guittot
2018-04-12 11:15   ` Niklas Söderlund
2018-04-12 11:15     ` Niklas Söderlund
     [not found]     ` <20180412133031.GA551@linaro.org>
2018-04-12 19:43       ` Heiner Kallweit
2018-04-14 11:21         ` Vincent Guittot
2018-04-12 22:39       ` Niklas Söderlund
2018-04-12 22:39         ` Niklas Söderlund
2018-04-14 11:24         ` Vincent Guittot
2018-04-20 16:00           ` Vincent Guittot
2018-04-20 16:00             ` Vincent Guittot
2018-04-20 16:30             ` Joel Fernandes
2018-04-22 22:18             ` Niklas Söderlund
2018-04-22 22:18               ` Niklas Söderlund
2018-04-23  9:54               ` Vincent Guittot
2018-04-23  9:54                 ` Vincent Guittot
     [not found]                 ` <20180425225603.GA26177@bigcity.dyn.berto.se>
2018-04-26 10:31                   ` Vincent Guittot
2018-04-26 10:31                     ` Vincent Guittot
2018-04-26 11:48                     ` Peter Zijlstra
2018-04-26 11:48                       ` Peter Zijlstra
2018-04-26 14:41                     ` Niklas Söderlund
2018-04-26 14:41                       ` Niklas Söderlund
2018-04-26 15:27                       ` Vincent Guittot
2018-04-26 15:38                         ` Niklas Söderlund
2018-04-26 15:38                           ` Niklas Söderlund
2018-05-02 13:40                     ` Geert Uytterhoeven
2018-05-03  9:25                     ` [tip:sched/urgent] sched/fair: Fix the update of blocked load when newly idle tip-bot for Vincent Guittot
2018-04-13 20:38     ` Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle") Niklas Söderlund
2018-04-13 20:38       ` Niklas Söderlund
2018-04-14 11:26       ` Vincent Guittot
2018-04-12 22:06   ` Niklas Söderlund
2018-04-12 22:06     ` Niklas Söderlund
