* rcu_preempt detected stalls [not found] <20210831152144.VOyu0gjmwOCWZBtSzDaOYQE-YYszY9tK_z8p-fZZ_kM@z> @ 2021-08-31 15:21 ` Jorge Ramirez-Ortiz, Foundries [not found] ` <20210831155359.Lf_X-nNhopb2XqSc-L34l6LeYKbvMZK-etHxz_l8K68@z> [not found] ` <20210831170111.ftOr-K7l2idP8zoj3V-y_wsmeIJ_u1im_3zfoOffRAs@z> 0 siblings, 2 replies; 14+ messages in thread From: Jorge Ramirez-Ortiz, Foundries @ 2021-08-31 15:21 UTC (permalink / raw) To: paulmck Cc: josh, rostedt, mathieu.desnoyers, jiangshanlai, joel, rcu, soc, linux-arm-kernel Hi When enabling CONFIG_PREEMPT and running the stress-ng scheduler class tests on arm64 (xilinx zynqmp and imx imx8mm SoCs) we are observing the following. [ 62.578917] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [ 62.585015] (detected by 0, t=5253 jiffies, g=3017, q=2972) [ 62.590663] rcu: All QSes seen, last rcu_preempt kthread activity 5254 (4294907943-4294902689), jiffies_till_next_fqs=1, root +->qsmask 0x0 [ 62.603086] rcu: rcu_preempt kthread starved for 5258 jiffies! g3017 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1 [ 62.613246] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. [ 62.622359] rcu: RCU grace-period kthread stack dump: [ 62.627395] task:rcu_preempt state:R running task stack: 0 pid: 14 ppid: 2 flags:0x00000028 [ 62.637308] Call trace: [ 62.639748] __switch_to+0x11c/0x190 [ 62.643319] __schedule+0x3b8/0x8d8 [ 62.646796] schedule+0x4c/0x108 [ 62.650018] schedule_timeout+0x1ac/0x358 [ 62.654021] rcu_gp_kthread+0x6a8/0x12b8 [ 62.657933] kthread+0x14c/0x158 [ 62.661153] ret_from_fork+0x10/0x18 [ 62.682919] BUG: scheduling while atomic: stress-ng-hrtim/831/0x00000002 [ 62.689604] Preemption disabled at: [ 62.689614] [<ffffffc010059418>] irq_enter_rcu+0x30/0x58 [ 62.698393] CPU: 0 PID: 831 Comm: stress-ng-hrtim Not tainted 5.10.42+ #5 [ 62.706296] Hardware name: Zynqmp new (DT) [ 62.710115] Call trace: [ 62.712548] dump_backtrace+0x0/0x240 [ 62.716202] show_stack+0x2c/0x38 [ 62.719510] dump_stack+0xcc/0x104 [ 62.722904] __schedule_bug+0x78/0xc8 [ 62.726556] __schedule+0x70c/0x8d8 [ 62.730037] schedule+0x4c/0x108 [ 62.733259] do_notify_resume+0x224/0x5d8 [ 62.737259] work_pending+0xc/0x2a4 The error results in OOM eventually. RCU priority boosting does work around this issue but it seems to me a workaround more than a fix (otherwise boosting would be enabled by CONFIG_PREEMPT for arm64 I guess?). The question is: is this an arm64 bug that should be investigated? or is this some known corner case of running stress-ng that is already understood? thanks Jorge _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
[parent not found: <20210831155359.Lf_X-nNhopb2XqSc-L34l6LeYKbvMZK-etHxz_l8K68@z>]
* Re: rcu_preempt detected stalls [not found] ` <20210831155359.Lf_X-nNhopb2XqSc-L34l6LeYKbvMZK-etHxz_l8K68@z> @ 2021-08-31 15:53 ` Paul E. McKenney 0 siblings, 0 replies; 14+ messages in thread From: Paul E. McKenney @ 2021-08-31 15:53 UTC (permalink / raw) To: Jorge Ramirez-Ortiz, Foundries Cc: josh, rostedt, mathieu.desnoyers, jiangshanlai, joel, rcu, soc, linux-arm-kernel On Tue, Aug 31, 2021 at 05:21:44PM +0200, Jorge Ramirez-Ortiz, Foundries wrote: > Hi > > When enabling CONFIG_PREEMPT and running the stress-ng scheduler class > tests on arm64 (xilinx zynqmp and imx imx8mm SoCs) we are observing the following. > > [ 62.578917] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: > [ 62.585015] (detected by 0, t=5253 jiffies, g=3017, q=2972) > [ 62.590663] rcu: All QSes seen, last rcu_preempt kthread activity 5254 (4294907943-4294902689), jiffies_till_next_fqs=1, root > +->qsmask 0x0 > [ 62.603086] rcu: rcu_preempt kthread starved for 5258 jiffies! g3017 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1 > [ 62.613246] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. The message above really does mean what it says: If your workload prevents RCU's grace-period kthread ("rcu_preempt" in this case) from running, you just bought yourself an OOM. > [ 62.622359] rcu: RCU grace-period kthread stack dump: > [ 62.627395] task:rcu_preempt state:R running task stack: 0 pid: 14 ppid: 2 flags:0x00000028 > [ 62.637308] Call trace: > [ 62.639748] __switch_to+0x11c/0x190 > [ 62.643319] __schedule+0x3b8/0x8d8 > [ 62.646796] schedule+0x4c/0x108 > [ 62.650018] schedule_timeout+0x1ac/0x358 > [ 62.654021] rcu_gp_kthread+0x6a8/0x12b8 > [ 62.657933] kthread+0x14c/0x158 > [ 62.661153] ret_from_fork+0x10/0x18 > [ 62.682919] BUG: scheduling while atomic: stress-ng-hrtim/831/0x00000002 > [ 62.689604] Preemption disabled at: > [ 62.689614] [<ffffffc010059418>] irq_enter_rcu+0x30/0x58 > [ 62.698393] CPU: 0 PID: 831 Comm: stress-ng-hrtim Not tainted 5.10.42+ #5 > [ 62.706296] Hardware name: Zynqmp new (DT) > [ 62.710115] Call trace: > [ 62.712548] dump_backtrace+0x0/0x240 > [ 62.716202] show_stack+0x2c/0x38 > [ 62.719510] dump_stack+0xcc/0x104 > [ 62.722904] __schedule_bug+0x78/0xc8 > [ 62.726556] __schedule+0x70c/0x8d8 > [ 62.730037] schedule+0x4c/0x108 > [ 62.733259] do_notify_resume+0x224/0x5d8 > [ 62.737259] work_pending+0xc/0x2a4 > > The error results in OOM eventually. > > RCU priority boosting does work around this issue but it seems to me > a workaround more than a fix (otherwise boosting would be enabled > by CONFIG_PREEMPT for arm64 I guess?). RCU priority boosting sets the rcu_preempt kthread's scheduling priority to SCHED_FIFO priority level 1 instead of the normal SCHED_OTHER. Therefore, if you build with CONFIG_RCU_BOOST=n, but manually set the priority of rcu_preempt to SCHED_FIFO priority level 1, you might also see this RCU CPU stall warning go away. > The question is: is this an arm64 bug that should be investigated? or > is this some known corner case of running stress-ng that is already > understood? I have not looked at stress-ng, but it is possible to configure your system so that rcu_preempt gets little or no CPU time, for example, by placing it into a CPU-poor cgroup on the one hand or by disabling throttling and running a heavy real-time workload on the other. Is stress-ng doing something like this? There could of course also be an arm64 problem that affect scheduling, but I suggest looking closely at what stress-ng is doing first. Please let me know how it goes! Thanx, Paul _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
[parent not found: <20210831170111.ftOr-K7l2idP8zoj3V-y_wsmeIJ_u1im_3zfoOffRAs@z>]
* Re: rcu_preempt detected stalls [not found] ` <20210831170111.ftOr-K7l2idP8zoj3V-y_wsmeIJ_u1im_3zfoOffRAs@z> @ 2021-08-31 17:01 ` Zhouyi Zhou 2021-08-31 17:01 ` Zhouyi Zhou [not found] ` <20210831171159.L6GWokiqRdP6iR7ifyRwmK4o4zLRDfv695oAA6v1sWc@z> 0 siblings, 2 replies; 14+ messages in thread From: Zhouyi Zhou @ 2021-08-31 17:01 UTC (permalink / raw) To: Jorge Ramirez-Ortiz, Foundries Cc: paulmck, Josh Triplett, rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Google, rcu, soc, linux-arm-kernel I did an experiment just now on x86_64 virtual machines, rcu did not complain after 10 minutes's test, I hope my effort can provide some clue. 1. I clone the fresh new linux kernel (git clone https://kernel.source.codeaurora.cn/pub/scm/linux/kernel/git/torvalds/linux.git) 2. compile the kernel without CONFIG_RCU_BOOST (: # CONFIG_RCU_BOOST is not set) 3. boot the kernel on a x86_64 VM (kvm -cpu host -smp 16 -hda ./debian10.qcow2 -m 4096 -net user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000 -vnc :30) 4. run the test (stress-ng --sequential 16 --class scheduler -t 5m --times) 5. monitor the system by constantly typing top and dmesg 6. after 10 minutes, nothing else happens except that the dmesg report following two messages [ 672.528192] sched: DL replenish lagged too much [ 751.127790] hrtimer: interrupt took 12143 ns So, I guess CONFIG_RCU_BOOST is not necessary for x86_64 virtual machines Zhouyi On Tue, Aug 31, 2021 at 11:24 PM Jorge Ramirez-Ortiz, Foundries <jorge@foundries.io> wrote: > > Hi > > When enabling CONFIG_PREEMPT and running the stress-ng scheduler class > tests on arm64 (xilinx zynqmp and imx imx8mm SoCs) we are observing the following. > > [ 62.578917] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: > [ 62.585015] (detected by 0, t=5253 jiffies, g=3017, q=2972) > [ 62.590663] rcu: All QSes seen, last rcu_preempt kthread activity 5254 (4294907943-4294902689), jiffies_till_next_fqs=1, root > +->qsmask 0x0 > [ 62.603086] rcu: rcu_preempt kthread starved for 5258 jiffies! g3017 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1 > [ 62.613246] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. > [ 62.622359] rcu: RCU grace-period kthread stack dump: > [ 62.627395] task:rcu_preempt state:R running task stack: 0 pid: 14 ppid: 2 flags:0x00000028 > [ 62.637308] Call trace: > [ 62.639748] __switch_to+0x11c/0x190 > [ 62.643319] __schedule+0x3b8/0x8d8 > [ 62.646796] schedule+0x4c/0x108 > [ 62.650018] schedule_timeout+0x1ac/0x358 > [ 62.654021] rcu_gp_kthread+0x6a8/0x12b8 > [ 62.657933] kthread+0x14c/0x158 > [ 62.661153] ret_from_fork+0x10/0x18 > [ 62.682919] BUG: scheduling while atomic: stress-ng-hrtim/831/0x00000002 > [ 62.689604] Preemption disabled at: > [ 62.689614] [<ffffffc010059418>] irq_enter_rcu+0x30/0x58 > [ 62.698393] CPU: 0 PID: 831 Comm: stress-ng-hrtim Not tainted 5.10.42+ #5 > [ 62.706296] Hardware name: Zynqmp new (DT) > [ 62.710115] Call trace: > [ 62.712548] dump_backtrace+0x0/0x240 > [ 62.716202] show_stack+0x2c/0x38 > [ 62.719510] dump_stack+0xcc/0x104 > [ 62.722904] __schedule_bug+0x78/0xc8 > [ 62.726556] __schedule+0x70c/0x8d8 > [ 62.730037] schedule+0x4c/0x108 > [ 62.733259] do_notify_resume+0x224/0x5d8 > [ 62.737259] work_pending+0xc/0x2a4 > > The error results in OOM eventually. > > RCU priority boosting does work around this issue but it seems to me > a workaround more than a fix (otherwise boosting would be enabled > by CONFIG_PREEMPT for arm64 I guess?). > > The question is: is this an arm64 bug that should be investigated? or > is this some known corner case of running stress-ng that is already > understood? > > thanks > Jorge > > > _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rcu_preempt detected stalls 2021-08-31 17:01 ` Zhouyi Zhou @ 2021-08-31 17:01 ` Zhouyi Zhou [not found] ` <20210831171159.L6GWokiqRdP6iR7ifyRwmK4o4zLRDfv695oAA6v1sWc@z> 1 sibling, 0 replies; 14+ messages in thread From: Zhouyi Zhou @ 2021-08-31 17:01 UTC (permalink / raw) To: Jorge Ramirez-Ortiz, Foundries Cc: paulmck, Josh Triplett, rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Google, rcu, soc, linux-arm-kernel I did an experiment just now on x86_64 virtual machines, rcu did not complain after 10 minutes's test, I hope my effort can provide some clue. 1. I clone the fresh new linux kernel (git clone https://kernel.source.codeaurora.cn/pub/scm/linux/kernel/git/torvalds/linux.git) 2. compile the kernel without CONFIG_RCU_BOOST (: # CONFIG_RCU_BOOST is not set) 3. boot the kernel on a x86_64 VM (kvm -cpu host -smp 16 -hda ./debian10.qcow2 -m 4096 -net user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000 -vnc :30) 4. run the test (stress-ng --sequential 16 --class scheduler -t 5m --times) 5. monitor the system by constantly typing top and dmesg 6. after 10 minutes, nothing else happens except that the dmesg report following two messages [ 672.528192] sched: DL replenish lagged too much [ 751.127790] hrtimer: interrupt took 12143 ns So, I guess CONFIG_RCU_BOOST is not necessary for x86_64 virtual machines Zhouyi On Tue, Aug 31, 2021 at 11:24 PM Jorge Ramirez-Ortiz, Foundries <jorge@foundries.io> wrote: > > Hi > > When enabling CONFIG_PREEMPT and running the stress-ng scheduler class > tests on arm64 (xilinx zynqmp and imx imx8mm SoCs) we are observing the following. > > [ 62.578917] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: > [ 62.585015] (detected by 0, t=5253 jiffies, g=3017, q=2972) > [ 62.590663] rcu: All QSes seen, last rcu_preempt kthread activity 5254 (4294907943-4294902689), jiffies_till_next_fqs=1, root > +->qsmask 0x0 > [ 62.603086] rcu: rcu_preempt kthread starved for 5258 jiffies! g3017 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1 > [ 62.613246] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. > [ 62.622359] rcu: RCU grace-period kthread stack dump: > [ 62.627395] task:rcu_preempt state:R running task stack: 0 pid: 14 ppid: 2 flags:0x00000028 > [ 62.637308] Call trace: > [ 62.639748] __switch_to+0x11c/0x190 > [ 62.643319] __schedule+0x3b8/0x8d8 > [ 62.646796] schedule+0x4c/0x108 > [ 62.650018] schedule_timeout+0x1ac/0x358 > [ 62.654021] rcu_gp_kthread+0x6a8/0x12b8 > [ 62.657933] kthread+0x14c/0x158 > [ 62.661153] ret_from_fork+0x10/0x18 > [ 62.682919] BUG: scheduling while atomic: stress-ng-hrtim/831/0x00000002 > [ 62.689604] Preemption disabled at: > [ 62.689614] [<ffffffc010059418>] irq_enter_rcu+0x30/0x58 > [ 62.698393] CPU: 0 PID: 831 Comm: stress-ng-hrtim Not tainted 5.10.42+ #5 > [ 62.706296] Hardware name: Zynqmp new (DT) > [ 62.710115] Call trace: > [ 62.712548] dump_backtrace+0x0/0x240 > [ 62.716202] show_stack+0x2c/0x38 > [ 62.719510] dump_stack+0xcc/0x104 > [ 62.722904] __schedule_bug+0x78/0xc8 > [ 62.726556] __schedule+0x70c/0x8d8 > [ 62.730037] schedule+0x4c/0x108 > [ 62.733259] do_notify_resume+0x224/0x5d8 > [ 62.737259] work_pending+0xc/0x2a4 > > The error results in OOM eventually. > > RCU priority boosting does work around this issue but it seems to me > a workaround more than a fix (otherwise boosting would be enabled > by CONFIG_PREEMPT for arm64 I guess?). > > The question is: is this an arm64 bug that should be investigated? or > is this some known corner case of running stress-ng that is already > understood? > > thanks > Jorge > > > ^ permalink raw reply [flat|nested] 14+ messages in thread
[parent not found: <20210831171159.L6GWokiqRdP6iR7ifyRwmK4o4zLRDfv695oAA6v1sWc@z>]
* Re: rcu_preempt detected stalls [not found] ` <20210831171159.L6GWokiqRdP6iR7ifyRwmK4o4zLRDfv695oAA6v1sWc@z> @ 2021-08-31 17:11 ` Zhouyi Zhou 2021-08-31 17:11 ` Zhouyi Zhou [not found] ` <20210901010332.atavvn1KrA_ubxTIThV4OTrLSH5nQAB3vaUEFWHzgho@z> 0 siblings, 2 replies; 14+ messages in thread From: Zhouyi Zhou @ 2021-08-31 17:11 UTC (permalink / raw) To: Jorge Ramirez-Ortiz, Foundries Cc: paulmck, Josh Triplett, rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Google, rcu, soc, linux-arm-kernel followings are some of my kernel config options: CONFIG_PREEMPT=y CONFIG_PREEMPT_COUNT=y CONFIG_PREEMPTION=y CONFIG_PREEMPT_DYNAMIC=y CONFIG_PREEMPT_RCU=y CONFIG_HAVE_PREEMPT_DYNAMIC=y CONFIG_TREE_RCU=y CONFIG_PREEMPT_RCU=y CONFIG_RCU_EXPERT=y CONFIG_SRCU=y CONFIG_TREE_SRCU=y CONFIG_TASKS_RCU_GENERIC=y CONFIG_TASKS_RCU=y CONFIG_TASKS_RUDE_RCU=y CONFIG_TASKS_TRACE_RCU=y CONFIG_RCU_STALL_COMMON=y CONFIG_RCU_NEED_SEGCBLIST=y CONFIG_RCU_FANOUT=64 CONFIG_RCU_FANOUT_LEAF=16 # CONFIG_RCU_FAST_NO_HZ is not set # CONFIG_RCU_BOOST is not set CONFIG_RCU_NOCB_CPU=y # CONFIG_TASKS_TRACE_RCU_READ_MB is not set # end of RCU Subsystem CONFIG_MMU_GATHER_RCU_TABLE_FREE=y CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS=12 # RCU Debugging CONFIG_PROVE_RCU=y # CONFIG_PROVE_RCU_LIST is not set # CONFIG_RCU_SCALE_TEST is not set CONFIG_RCU_TORTURE_TEST=m # CONFIG_RCU_REF_SCALE_TEST is not set CONFIG_RCU_CPU_STALL_TIMEOUT=21 # CONFIG_RCU_TRACE is not set # CONFIG_RCU_EQS_DEBUG is not set Thanks Zhouyi On Wed, Sep 1, 2021 at 1:01 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > I did an experiment just now on x86_64 virtual machines, rcu did not > complain after 10 minutes's test, I hope my effort can provide some > clue. > > 1. I clone the fresh new linux kernel (git clone > https://kernel.source.codeaurora.cn/pub/scm/linux/kernel/git/torvalds/linux.git) > 2. compile the kernel without CONFIG_RCU_BOOST (: # CONFIG_RCU_BOOST is not set) > 3. boot the kernel on a x86_64 VM (kvm -cpu host -smp 16 -hda > ./debian10.qcow2 -m 4096 -net > user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000 > -vnc :30) > 4. run the test (stress-ng --sequential 16 --class scheduler -t 5m --times) > 5. monitor the system by constantly typing top and dmesg > 6. after 10 minutes, nothing else happens except that the dmesg report > following two messages > [ 672.528192] sched: DL replenish lagged too much > [ 751.127790] hrtimer: interrupt took 12143 ns > > So, I guess CONFIG_RCU_BOOST is not necessary for x86_64 virtual machines > > Zhouyi > > On Tue, Aug 31, 2021 at 11:24 PM Jorge Ramirez-Ortiz, Foundries > <jorge@foundries.io> wrote: > > > > Hi > > > > When enabling CONFIG_PREEMPT and running the stress-ng scheduler class > > tests on arm64 (xilinx zynqmp and imx imx8mm SoCs) we are observing the following. > > > > [ 62.578917] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: > > [ 62.585015] (detected by 0, t=5253 jiffies, g=3017, q=2972) > > [ 62.590663] rcu: All QSes seen, last rcu_preempt kthread activity 5254 (4294907943-4294902689), jiffies_till_next_fqs=1, root > > +->qsmask 0x0 > > [ 62.603086] rcu: rcu_preempt kthread starved for 5258 jiffies! g3017 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1 > > [ 62.613246] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. > > [ 62.622359] rcu: RCU grace-period kthread stack dump: > > [ 62.627395] task:rcu_preempt state:R running task stack: 0 pid: 14 ppid: 2 flags:0x00000028 > > [ 62.637308] Call trace: > > [ 62.639748] __switch_to+0x11c/0x190 > > [ 62.643319] __schedule+0x3b8/0x8d8 > > [ 62.646796] schedule+0x4c/0x108 > > [ 62.650018] schedule_timeout+0x1ac/0x358 > > [ 62.654021] rcu_gp_kthread+0x6a8/0x12b8 > > [ 62.657933] kthread+0x14c/0x158 > > [ 62.661153] ret_from_fork+0x10/0x18 > > [ 62.682919] BUG: scheduling while atomic: stress-ng-hrtim/831/0x00000002 > > [ 62.689604] Preemption disabled at: > > [ 62.689614] [<ffffffc010059418>] irq_enter_rcu+0x30/0x58 > > [ 62.698393] CPU: 0 PID: 831 Comm: stress-ng-hrtim Not tainted 5.10.42+ #5 > > [ 62.706296] Hardware name: Zynqmp new (DT) > > [ 62.710115] Call trace: > > [ 62.712548] dump_backtrace+0x0/0x240 > > [ 62.716202] show_stack+0x2c/0x38 > > [ 62.719510] dump_stack+0xcc/0x104 > > [ 62.722904] __schedule_bug+0x78/0xc8 > > [ 62.726556] __schedule+0x70c/0x8d8 > > [ 62.730037] schedule+0x4c/0x108 > > [ 62.733259] do_notify_resume+0x224/0x5d8 > > [ 62.737259] work_pending+0xc/0x2a4 > > > > The error results in OOM eventually. > > > > RCU priority boosting does work around this issue but it seems to me > > a workaround more than a fix (otherwise boosting would be enabled > > by CONFIG_PREEMPT for arm64 I guess?). > > > > The question is: is this an arm64 bug that should be investigated? or > > is this some known corner case of running stress-ng that is already > > understood? > > > > thanks > > Jorge > > > > > > _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rcu_preempt detected stalls 2021-08-31 17:11 ` Zhouyi Zhou @ 2021-08-31 17:11 ` Zhouyi Zhou [not found] ` <20210901010332.atavvn1KrA_ubxTIThV4OTrLSH5nQAB3vaUEFWHzgho@z> 1 sibling, 0 replies; 14+ messages in thread From: Zhouyi Zhou @ 2021-08-31 17:11 UTC (permalink / raw) To: Jorge Ramirez-Ortiz, Foundries Cc: paulmck, Josh Triplett, rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Google, rcu, soc, linux-arm-kernel followings are some of my kernel config options: CONFIG_PREEMPT=y CONFIG_PREEMPT_COUNT=y CONFIG_PREEMPTION=y CONFIG_PREEMPT_DYNAMIC=y CONFIG_PREEMPT_RCU=y CONFIG_HAVE_PREEMPT_DYNAMIC=y CONFIG_TREE_RCU=y CONFIG_PREEMPT_RCU=y CONFIG_RCU_EXPERT=y CONFIG_SRCU=y CONFIG_TREE_SRCU=y CONFIG_TASKS_RCU_GENERIC=y CONFIG_TASKS_RCU=y CONFIG_TASKS_RUDE_RCU=y CONFIG_TASKS_TRACE_RCU=y CONFIG_RCU_STALL_COMMON=y CONFIG_RCU_NEED_SEGCBLIST=y CONFIG_RCU_FANOUT=64 CONFIG_RCU_FANOUT_LEAF=16 # CONFIG_RCU_FAST_NO_HZ is not set # CONFIG_RCU_BOOST is not set CONFIG_RCU_NOCB_CPU=y # CONFIG_TASKS_TRACE_RCU_READ_MB is not set # end of RCU Subsystem CONFIG_MMU_GATHER_RCU_TABLE_FREE=y CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS=12 # RCU Debugging CONFIG_PROVE_RCU=y # CONFIG_PROVE_RCU_LIST is not set # CONFIG_RCU_SCALE_TEST is not set CONFIG_RCU_TORTURE_TEST=m # CONFIG_RCU_REF_SCALE_TEST is not set CONFIG_RCU_CPU_STALL_TIMEOUT=21 # CONFIG_RCU_TRACE is not set # CONFIG_RCU_EQS_DEBUG is not set Thanks Zhouyi On Wed, Sep 1, 2021 at 1:01 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > I did an experiment just now on x86_64 virtual machines, rcu did not > complain after 10 minutes's test, I hope my effort can provide some > clue. > > 1. I clone the fresh new linux kernel (git clone > https://kernel.source.codeaurora.cn/pub/scm/linux/kernel/git/torvalds/linux.git) > 2. compile the kernel without CONFIG_RCU_BOOST (: # CONFIG_RCU_BOOST is not set) > 3. boot the kernel on a x86_64 VM (kvm -cpu host -smp 16 -hda > ./debian10.qcow2 -m 4096 -net > user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000 > -vnc :30) > 4. run the test (stress-ng --sequential 16 --class scheduler -t 5m --times) > 5. monitor the system by constantly typing top and dmesg > 6. after 10 minutes, nothing else happens except that the dmesg report > following two messages > [ 672.528192] sched: DL replenish lagged too much > [ 751.127790] hrtimer: interrupt took 12143 ns > > So, I guess CONFIG_RCU_BOOST is not necessary for x86_64 virtual machines > > Zhouyi > > On Tue, Aug 31, 2021 at 11:24 PM Jorge Ramirez-Ortiz, Foundries > <jorge@foundries.io> wrote: > > > > Hi > > > > When enabling CONFIG_PREEMPT and running the stress-ng scheduler class > > tests on arm64 (xilinx zynqmp and imx imx8mm SoCs) we are observing the following. > > > > [ 62.578917] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: > > [ 62.585015] (detected by 0, t=5253 jiffies, g=3017, q=2972) > > [ 62.590663] rcu: All QSes seen, last rcu_preempt kthread activity 5254 (4294907943-4294902689), jiffies_till_next_fqs=1, root > > +->qsmask 0x0 > > [ 62.603086] rcu: rcu_preempt kthread starved for 5258 jiffies! g3017 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1 > > [ 62.613246] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. > > [ 62.622359] rcu: RCU grace-period kthread stack dump: > > [ 62.627395] task:rcu_preempt state:R running task stack: 0 pid: 14 ppid: 2 flags:0x00000028 > > [ 62.637308] Call trace: > > [ 62.639748] __switch_to+0x11c/0x190 > > [ 62.643319] __schedule+0x3b8/0x8d8 > > [ 62.646796] schedule+0x4c/0x108 > > [ 62.650018] schedule_timeout+0x1ac/0x358 > > [ 62.654021] rcu_gp_kthread+0x6a8/0x12b8 > > [ 62.657933] kthread+0x14c/0x158 > > [ 62.661153] ret_from_fork+0x10/0x18 > > [ 62.682919] BUG: scheduling while atomic: stress-ng-hrtim/831/0x00000002 > > [ 62.689604] Preemption disabled at: > > [ 62.689614] [<ffffffc010059418>] irq_enter_rcu+0x30/0x58 > > [ 62.698393] CPU: 0 PID: 831 Comm: stress-ng-hrtim Not tainted 5.10.42+ #5 > > [ 62.706296] Hardware name: Zynqmp new (DT) > > [ 62.710115] Call trace: > > [ 62.712548] dump_backtrace+0x0/0x240 > > [ 62.716202] show_stack+0x2c/0x38 > > [ 62.719510] dump_stack+0xcc/0x104 > > [ 62.722904] __schedule_bug+0x78/0xc8 > > [ 62.726556] __schedule+0x70c/0x8d8 > > [ 62.730037] schedule+0x4c/0x108 > > [ 62.733259] do_notify_resume+0x224/0x5d8 > > [ 62.737259] work_pending+0xc/0x2a4 > > > > The error results in OOM eventually. > > > > RCU priority boosting does work around this issue but it seems to me > > a workaround more than a fix (otherwise boosting would be enabled > > by CONFIG_PREEMPT for arm64 I guess?). > > > > The question is: is this an arm64 bug that should be investigated? or > > is this some known corner case of running stress-ng that is already > > understood? > > > > thanks > > Jorge > > > > > > ^ permalink raw reply [flat|nested] 14+ messages in thread
[parent not found: <20210901010332.atavvn1KrA_ubxTIThV4OTrLSH5nQAB3vaUEFWHzgho@z>]
* Re: rcu_preempt detected stalls [not found] ` <20210901010332.atavvn1KrA_ubxTIThV4OTrLSH5nQAB3vaUEFWHzgho@z> @ 2021-09-01 1:03 ` Zhouyi Zhou 2021-09-01 1:03 ` Zhouyi Zhou ` (2 more replies) 0 siblings, 3 replies; 14+ messages in thread From: Zhouyi Zhou @ 2021-09-01 1:03 UTC (permalink / raw) To: Jorge Ramirez-Ortiz, Foundries Cc: paulmck, Josh Triplett, rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Google, rcu, soc, linux-arm-kernel Hi, I perform following two new rounds of experiments: Test environment (x86_64 debian10 virtual machine: kvm -cpu host -smp 8 -hda ./debian10.qcow2 -m 4096 -net user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000 -vnc :30) 1. CONFIG_RCU_BOOST=y 1.1 as root, run #stress-ng --sequential 100 --class scheduler -t 5m --times 1.2 as regular user at the same time, run $stress-ng --sequential 100 --class scheduler -t 5m --times System begin OOM kill after 6 minutes: 31 19:41:12 debian kernel: [ 847.171884] task:kworker/1:0 state:D stack: 0 pid: 1634 ppid: 2 flag\ s:0x00004000 Aug 31 19:41:12 debian kernel: [ 847.171890] Workqueue: ipv6_addrconf addrconf_verify_work Aug 31 19:41:12 debian kernel: [ 847.171897] Call Trace: Aug 31 19:41:12 debian kernel: [ 847.171903] __schedule+0x368/0xa40 Aug 31 19:41:12 debian kernel: [ 847.171915] schedule+0x44/0xe0 Aug 31 19:41:12 debian kernel: [ 847.171921] schedule_preempt_disabled+0x14/0x20 Aug 31 19:41:12 debian kernel: [ 847.171924] __mutex_lock+0x4b1/0xa10 Aug 31 19:41:12 debian kernel: [ 847.171935] ? addrconf_verify_work+0xa/0x20 Aug 31 19:41:12 debian kernel: [ 847.171948] ? addrconf_verify_work+0xa/0x20 Aug 31 19:41:12 debian kernel: [ 847.171951] addrconf_verify_work+0xa/0x20 Aug 31 19:41:12 debian kernel: [ 847.171955] process_one_work+0x1fa/0x5b0 Aug 31 19:41:12 debian kernel: [ 847.171967] worker_thread+0x64/0x3d0 Aug 31 19:41:12 debian kernel: [ 847.171974] ? process_one_work+0x5b0/0x5b0 Aug 31 19:41:12 debian kernel: [ 847.171978] kthread+0x131/0x180 Aug 31 19:41:12 debian kernel: [ 847.171982] ? set_kthread_struct+0x40/0x40 Aug 31 19:41:12 debian kernel: [ 847.171989] ret_from_fork+0x1f/0x30 Aug 31 19:41:12 debian kernel: [ 847.176007] Aug 31 19:41:12 debian kernel: [ 847.176007] Showing all locks held in the system: Aug 31 19:41:12 debian kernel: [ 847.176016] 1 lock held by khungtaskd/56: Aug 31 19:41:12 debian kernel: [ 847.176018] #0: ffffffff82918b60 (rcu_read_lock){....}-{1:2}, at: debug_show_a\ ll_locks+0xe/0x1a0 2. # CONFIG_RCU_BOOST is not set 2.1 as root, run #stress-ng --sequential 100 --class scheduler -t 5m --times 2.2 as regular user at the same time, run $stress-ng --sequential 100 --class scheduler -t 5m --times System begin OOM kill after 6 minutes: The system is so dead, that I can't save the backtrace to file nor did kernel has a chance to save the log to /var/log/messages Thanks Zhouyi On Wed, Sep 1, 2021 at 1:11 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > followings are some of my kernel config options: > CONFIG_PREEMPT=y > CONFIG_PREEMPT_COUNT=y > CONFIG_PREEMPTION=y > CONFIG_PREEMPT_DYNAMIC=y > CONFIG_PREEMPT_RCU=y > CONFIG_HAVE_PREEMPT_DYNAMIC=y > CONFIG_TREE_RCU=y > CONFIG_PREEMPT_RCU=y > CONFIG_RCU_EXPERT=y > CONFIG_SRCU=y > CONFIG_TREE_SRCU=y > CONFIG_TASKS_RCU_GENERIC=y > CONFIG_TASKS_RCU=y > CONFIG_TASKS_RUDE_RCU=y > CONFIG_TASKS_TRACE_RCU=y > CONFIG_RCU_STALL_COMMON=y > CONFIG_RCU_NEED_SEGCBLIST=y > CONFIG_RCU_FANOUT=64 > CONFIG_RCU_FANOUT_LEAF=16 > # CONFIG_RCU_FAST_NO_HZ is not set > # CONFIG_RCU_BOOST is not set > CONFIG_RCU_NOCB_CPU=y > # CONFIG_TASKS_TRACE_RCU_READ_MB is not set > # end of RCU Subsystem > CONFIG_MMU_GATHER_RCU_TABLE_FREE=y > CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS=12 > # RCU Debugging > CONFIG_PROVE_RCU=y > # CONFIG_PROVE_RCU_LIST is not set > # CONFIG_RCU_SCALE_TEST is not set > CONFIG_RCU_TORTURE_TEST=m > # CONFIG_RCU_REF_SCALE_TEST is not set > CONFIG_RCU_CPU_STALL_TIMEOUT=21 > # CONFIG_RCU_TRACE is not set > # CONFIG_RCU_EQS_DEBUG is not set > > Thanks > Zhouyi > > On Wed, Sep 1, 2021 at 1:01 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > > > I did an experiment just now on x86_64 virtual machines, rcu did not > > complain after 10 minutes's test, I hope my effort can provide some > > clue. > > > > 1. I clone the fresh new linux kernel (git clone > > https://kernel.source.codeaurora.cn/pub/scm/linux/kernel/git/torvalds/linux.git) > > 2. compile the kernel without CONFIG_RCU_BOOST (: # CONFIG_RCU_BOOST is not set) > > 3. boot the kernel on a x86_64 VM (kvm -cpu host -smp 16 -hda > > ./debian10.qcow2 -m 4096 -net > > user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000 > > -vnc :30) > > 4. run the test (stress-ng --sequential 16 --class scheduler -t 5m --times) > > 5. monitor the system by constantly typing top and dmesg > > 6. after 10 minutes, nothing else happens except that the dmesg report > > following two messages > > [ 672.528192] sched: DL replenish lagged too much > > [ 751.127790] hrtimer: interrupt took 12143 ns > > > > So, I guess CONFIG_RCU_BOOST is not necessary for x86_64 virtual machines > > > > Zhouyi > > > > On Tue, Aug 31, 2021 at 11:24 PM Jorge Ramirez-Ortiz, Foundries > > <jorge@foundries.io> wrote: > > > > > > Hi > > > > > > When enabling CONFIG_PREEMPT and running the stress-ng scheduler class > > > tests on arm64 (xilinx zynqmp and imx imx8mm SoCs) we are observing the following. > > > > > > [ 62.578917] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: > > > [ 62.585015] (detected by 0, t=5253 jiffies, g=3017, q=2972) > > > [ 62.590663] rcu: All QSes seen, last rcu_preempt kthread activity 5254 (4294907943-4294902689), jiffies_till_next_fqs=1, root > > > +->qsmask 0x0 > > > [ 62.603086] rcu: rcu_preempt kthread starved for 5258 jiffies! g3017 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1 > > > [ 62.613246] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. > > > [ 62.622359] rcu: RCU grace-period kthread stack dump: > > > [ 62.627395] task:rcu_preempt state:R running task stack: 0 pid: 14 ppid: 2 flags:0x00000028 > > > [ 62.637308] Call trace: > > > [ 62.639748] __switch_to+0x11c/0x190 > > > [ 62.643319] __schedule+0x3b8/0x8d8 > > > [ 62.646796] schedule+0x4c/0x108 > > > [ 62.650018] schedule_timeout+0x1ac/0x358 > > > [ 62.654021] rcu_gp_kthread+0x6a8/0x12b8 > > > [ 62.657933] kthread+0x14c/0x158 > > > [ 62.661153] ret_from_fork+0x10/0x18 > > > [ 62.682919] BUG: scheduling while atomic: stress-ng-hrtim/831/0x00000002 > > > [ 62.689604] Preemption disabled at: > > > [ 62.689614] [<ffffffc010059418>] irq_enter_rcu+0x30/0x58 > > > [ 62.698393] CPU: 0 PID: 831 Comm: stress-ng-hrtim Not tainted 5.10.42+ #5 > > > [ 62.706296] Hardware name: Zynqmp new (DT) > > > [ 62.710115] Call trace: > > > [ 62.712548] dump_backtrace+0x0/0x240 > > > [ 62.716202] show_stack+0x2c/0x38 > > > [ 62.719510] dump_stack+0xcc/0x104 > > > [ 62.722904] __schedule_bug+0x78/0xc8 > > > [ 62.726556] __schedule+0x70c/0x8d8 > > > [ 62.730037] schedule+0x4c/0x108 > > > [ 62.733259] do_notify_resume+0x224/0x5d8 > > > [ 62.737259] work_pending+0xc/0x2a4 > > > > > > The error results in OOM eventually. > > > > > > RCU priority boosting does work around this issue but it seems to me > > > a workaround more than a fix (otherwise boosting would be enabled > > > by CONFIG_PREEMPT for arm64 I guess?). > > > > > > The question is: is this an arm64 bug that should be investigated? or > > > is this some known corner case of running stress-ng that is already > > > understood? > > > > > > thanks > > > Jorge > > > > > > > > > _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rcu_preempt detected stalls 2021-09-01 1:03 ` Zhouyi Zhou @ 2021-09-01 1:03 ` Zhouyi Zhou 2021-09-01 4:08 ` Neeraj Upadhyay [not found] ` <20210901082321.nj61nA4mXy_hgClhySzqE-uoWYVzA0aI3c0f_uguoIs@z> 2 siblings, 0 replies; 14+ messages in thread From: Zhouyi Zhou @ 2021-09-01 1:03 UTC (permalink / raw) To: Jorge Ramirez-Ortiz, Foundries Cc: paulmck, Josh Triplett, rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Google, rcu, soc, linux-arm-kernel Hi, I perform following two new rounds of experiments: Test environment (x86_64 debian10 virtual machine: kvm -cpu host -smp 8 -hda ./debian10.qcow2 -m 4096 -net user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000 -vnc :30) 1. CONFIG_RCU_BOOST=y 1.1 as root, run #stress-ng --sequential 100 --class scheduler -t 5m --times 1.2 as regular user at the same time, run $stress-ng --sequential 100 --class scheduler -t 5m --times System begin OOM kill after 6 minutes: 31 19:41:12 debian kernel: [ 847.171884] task:kworker/1:0 state:D stack: 0 pid: 1634 ppid: 2 flag\ s:0x00004000 Aug 31 19:41:12 debian kernel: [ 847.171890] Workqueue: ipv6_addrconf addrconf_verify_work Aug 31 19:41:12 debian kernel: [ 847.171897] Call Trace: Aug 31 19:41:12 debian kernel: [ 847.171903] __schedule+0x368/0xa40 Aug 31 19:41:12 debian kernel: [ 847.171915] schedule+0x44/0xe0 Aug 31 19:41:12 debian kernel: [ 847.171921] schedule_preempt_disabled+0x14/0x20 Aug 31 19:41:12 debian kernel: [ 847.171924] __mutex_lock+0x4b1/0xa10 Aug 31 19:41:12 debian kernel: [ 847.171935] ? addrconf_verify_work+0xa/0x20 Aug 31 19:41:12 debian kernel: [ 847.171948] ? addrconf_verify_work+0xa/0x20 Aug 31 19:41:12 debian kernel: [ 847.171951] addrconf_verify_work+0xa/0x20 Aug 31 19:41:12 debian kernel: [ 847.171955] process_one_work+0x1fa/0x5b0 Aug 31 19:41:12 debian kernel: [ 847.171967] worker_thread+0x64/0x3d0 Aug 31 19:41:12 debian kernel: [ 847.171974] ? process_one_work+0x5b0/0x5b0 Aug 31 19:41:12 debian kernel: [ 847.171978] kthread+0x131/0x180 Aug 31 19:41:12 debian kernel: [ 847.171982] ? set_kthread_struct+0x40/0x40 Aug 31 19:41:12 debian kernel: [ 847.171989] ret_from_fork+0x1f/0x30 Aug 31 19:41:12 debian kernel: [ 847.176007] Aug 31 19:41:12 debian kernel: [ 847.176007] Showing all locks held in the system: Aug 31 19:41:12 debian kernel: [ 847.176016] 1 lock held by khungtaskd/56: Aug 31 19:41:12 debian kernel: [ 847.176018] #0: ffffffff82918b60 (rcu_read_lock){....}-{1:2}, at: debug_show_a\ ll_locks+0xe/0x1a0 2. # CONFIG_RCU_BOOST is not set 2.1 as root, run #stress-ng --sequential 100 --class scheduler -t 5m --times 2.2 as regular user at the same time, run $stress-ng --sequential 100 --class scheduler -t 5m --times System begin OOM kill after 6 minutes: The system is so dead, that I can't save the backtrace to file nor did kernel has a chance to save the log to /var/log/messages Thanks Zhouyi On Wed, Sep 1, 2021 at 1:11 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > followings are some of my kernel config options: > CONFIG_PREEMPT=y > CONFIG_PREEMPT_COUNT=y > CONFIG_PREEMPTION=y > CONFIG_PREEMPT_DYNAMIC=y > CONFIG_PREEMPT_RCU=y > CONFIG_HAVE_PREEMPT_DYNAMIC=y > CONFIG_TREE_RCU=y > CONFIG_PREEMPT_RCU=y > CONFIG_RCU_EXPERT=y > CONFIG_SRCU=y > CONFIG_TREE_SRCU=y > CONFIG_TASKS_RCU_GENERIC=y > CONFIG_TASKS_RCU=y > CONFIG_TASKS_RUDE_RCU=y > CONFIG_TASKS_TRACE_RCU=y > CONFIG_RCU_STALL_COMMON=y > CONFIG_RCU_NEED_SEGCBLIST=y > CONFIG_RCU_FANOUT=64 > CONFIG_RCU_FANOUT_LEAF=16 > # CONFIG_RCU_FAST_NO_HZ is not set > # CONFIG_RCU_BOOST is not set > CONFIG_RCU_NOCB_CPU=y > # CONFIG_TASKS_TRACE_RCU_READ_MB is not set > # end of RCU Subsystem > CONFIG_MMU_GATHER_RCU_TABLE_FREE=y > CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS=12 > # RCU Debugging > CONFIG_PROVE_RCU=y > # CONFIG_PROVE_RCU_LIST is not set > # CONFIG_RCU_SCALE_TEST is not set > CONFIG_RCU_TORTURE_TEST=m > # CONFIG_RCU_REF_SCALE_TEST is not set > CONFIG_RCU_CPU_STALL_TIMEOUT=21 > # CONFIG_RCU_TRACE is not set > # CONFIG_RCU_EQS_DEBUG is not set > > Thanks > Zhouyi > > On Wed, Sep 1, 2021 at 1:01 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > > > I did an experiment just now on x86_64 virtual machines, rcu did not > > complain after 10 minutes's test, I hope my effort can provide some > > clue. > > > > 1. I clone the fresh new linux kernel (git clone > > https://kernel.source.codeaurora.cn/pub/scm/linux/kernel/git/torvalds/linux.git) > > 2. compile the kernel without CONFIG_RCU_BOOST (: # CONFIG_RCU_BOOST is not set) > > 3. boot the kernel on a x86_64 VM (kvm -cpu host -smp 16 -hda > > ./debian10.qcow2 -m 4096 -net > > user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000 > > -vnc :30) > > 4. run the test (stress-ng --sequential 16 --class scheduler -t 5m --times) > > 5. monitor the system by constantly typing top and dmesg > > 6. after 10 minutes, nothing else happens except that the dmesg report > > following two messages > > [ 672.528192] sched: DL replenish lagged too much > > [ 751.127790] hrtimer: interrupt took 12143 ns > > > > So, I guess CONFIG_RCU_BOOST is not necessary for x86_64 virtual machines > > > > Zhouyi > > > > On Tue, Aug 31, 2021 at 11:24 PM Jorge Ramirez-Ortiz, Foundries > > <jorge@foundries.io> wrote: > > > > > > Hi > > > > > > When enabling CONFIG_PREEMPT and running the stress-ng scheduler class > > > tests on arm64 (xilinx zynqmp and imx imx8mm SoCs) we are observing the following. > > > > > > [ 62.578917] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: > > > [ 62.585015] (detected by 0, t=5253 jiffies, g=3017, q=2972) > > > [ 62.590663] rcu: All QSes seen, last rcu_preempt kthread activity 5254 (4294907943-4294902689), jiffies_till_next_fqs=1, root > > > +->qsmask 0x0 > > > [ 62.603086] rcu: rcu_preempt kthread starved for 5258 jiffies! g3017 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1 > > > [ 62.613246] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. > > > [ 62.622359] rcu: RCU grace-period kthread stack dump: > > > [ 62.627395] task:rcu_preempt state:R running task stack: 0 pid: 14 ppid: 2 flags:0x00000028 > > > [ 62.637308] Call trace: > > > [ 62.639748] __switch_to+0x11c/0x190 > > > [ 62.643319] __schedule+0x3b8/0x8d8 > > > [ 62.646796] schedule+0x4c/0x108 > > > [ 62.650018] schedule_timeout+0x1ac/0x358 > > > [ 62.654021] rcu_gp_kthread+0x6a8/0x12b8 > > > [ 62.657933] kthread+0x14c/0x158 > > > [ 62.661153] ret_from_fork+0x10/0x18 > > > [ 62.682919] BUG: scheduling while atomic: stress-ng-hrtim/831/0x00000002 > > > [ 62.689604] Preemption disabled at: > > > [ 62.689614] [<ffffffc010059418>] irq_enter_rcu+0x30/0x58 > > > [ 62.698393] CPU: 0 PID: 831 Comm: stress-ng-hrtim Not tainted 5.10.42+ #5 > > > [ 62.706296] Hardware name: Zynqmp new (DT) > > > [ 62.710115] Call trace: > > > [ 62.712548] dump_backtrace+0x0/0x240 > > > [ 62.716202] show_stack+0x2c/0x38 > > > [ 62.719510] dump_stack+0xcc/0x104 > > > [ 62.722904] __schedule_bug+0x78/0xc8 > > > [ 62.726556] __schedule+0x70c/0x8d8 > > > [ 62.730037] schedule+0x4c/0x108 > > > [ 62.733259] do_notify_resume+0x224/0x5d8 > > > [ 62.737259] work_pending+0xc/0x2a4 > > > > > > The error results in OOM eventually. > > > > > > RCU priority boosting does work around this issue but it seems to me > > > a workaround more than a fix (otherwise boosting would be enabled > > > by CONFIG_PREEMPT for arm64 I guess?). > > > > > > The question is: is this an arm64 bug that should be investigated? or > > > is this some known corner case of running stress-ng that is already > > > understood? > > > > > > thanks > > > Jorge > > > > > > > > > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rcu_preempt detected stalls 2021-09-01 1:03 ` Zhouyi Zhou 2021-09-01 1:03 ` Zhouyi Zhou @ 2021-09-01 4:08 ` Neeraj Upadhyay [not found] ` <20210901064718.aKXabQBXfHO1cXDOrnGPmUo0Uikxa2W9WUpwaf6UQZs@z> [not found] ` <20210901082321.nj61nA4mXy_hgClhySzqE-uoWYVzA0aI3c0f_uguoIs@z> 2 siblings, 1 reply; 14+ messages in thread From: Neeraj Upadhyay @ 2021-09-01 4:08 UTC (permalink / raw) To: Zhouyi Zhou, Jorge Ramirez-Ortiz, Foundries Cc: paulmck, Josh Triplett, rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Google, rcu, soc, linux-arm-kernel Hi, >[ 62.603086] rcu: rcu_preempt kthread starved for 5258 jiffies! g3017 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1 >[ 62.613246] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. These logs tells us, rcu_preempt thread is in RUNNING state, however, the thread is not scheduled for long. So, can you please check the activity (scheduler traces, runqueue) on the CPU where rcu_preempt kthread is running (->cpu=1 in this case) , to see why rcu_preempt is starved of CPU cycles? Thanks Neeraj On 9/1/2021 6:33 AM, Zhouyi Zhou wrote: > Hi, > > I perform following two new rounds of experiments: > > > Test environment (x86_64 debian10 virtual machine: kvm -cpu host -smp > 8 -hda ./debian10.qcow2 -m 4096 -net > user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000 > -vnc :30) > > 1. CONFIG_RCU_BOOST=y > 1.1 as root, run #stress-ng --sequential 100 --class scheduler -t 5m --times > 1.2 as regular user at the same time, run $stress-ng --sequential 100 > --class scheduler -t 5m --times > > System begin OOM kill after 6 minutes: > 31 19:41:12 debian kernel: [ 847.171884] task:kworker/1:0 state:D > stack: 0 pid: 1634 ppid: 2 flag\ > s:0x00004000 > Aug 31 19:41:12 debian kernel: [ 847.171890] Workqueue: ipv6_addrconf > addrconf_verify_work > Aug 31 19:41:12 debian kernel: [ 847.171897] Call Trace: > Aug 31 19:41:12 debian kernel: [ 847.171903] __schedule+0x368/0xa40 > Aug 31 19:41:12 debian kernel: [ 847.171915] schedule+0x44/0xe0 > Aug 31 19:41:12 debian kernel: [ 847.171921] > schedule_preempt_disabled+0x14/0x20 > Aug 31 19:41:12 debian kernel: [ 847.171924] __mutex_lock+0x4b1/0xa10 > Aug 31 19:41:12 debian kernel: [ 847.171935] ? addrconf_verify_work+0xa/0x20 > Aug 31 19:41:12 debian kernel: [ 847.171948] ? addrconf_verify_work+0xa/0x20 > Aug 31 19:41:12 debian kernel: [ 847.171951] addrconf_verify_work+0xa/0x20 > Aug 31 19:41:12 debian kernel: [ 847.171955] process_one_work+0x1fa/0x5b0 > Aug 31 19:41:12 debian kernel: [ 847.171967] worker_thread+0x64/0x3d0 > Aug 31 19:41:12 debian kernel: [ 847.171974] ? process_one_work+0x5b0/0x5b0 > Aug 31 19:41:12 debian kernel: [ 847.171978] kthread+0x131/0x180 > Aug 31 19:41:12 debian kernel: [ 847.171982] ? set_kthread_struct+0x40/0x40 > Aug 31 19:41:12 debian kernel: [ 847.171989] ret_from_fork+0x1f/0x30 > Aug 31 19:41:12 debian kernel: [ 847.176007] > Aug 31 19:41:12 debian kernel: [ 847.176007] Showing all locks held > in the system: > Aug 31 19:41:12 debian kernel: [ 847.176016] 1 lock held by khungtaskd/56: > Aug 31 19:41:12 debian kernel: [ 847.176018] #0: ffffffff82918b60 > (rcu_read_lock){....}-{1:2}, at: debug_show_a\ > ll_locks+0xe/0x1a0 > > 2. # CONFIG_RCU_BOOST is not set > 2.1 as root, run #stress-ng --sequential 100 --class scheduler -t 5m --times > 2.2 as regular user at the same time, run $stress-ng --sequential 100 > --class scheduler -t 5m --times > System begin OOM kill after 6 minutes: > The system is so dead, that I can't save the backtrace to file nor did > kernel has a chance to save the log to /var/log/messages > > Thanks > Zhouyi > > On Wed, Sep 1, 2021 at 1:11 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: >> >> followings are some of my kernel config options: >> CONFIG_PREEMPT=y >> CONFIG_PREEMPT_COUNT=y >> CONFIG_PREEMPTION=y >> CONFIG_PREEMPT_DYNAMIC=y >> CONFIG_PREEMPT_RCU=y >> CONFIG_HAVE_PREEMPT_DYNAMIC=y >> CONFIG_TREE_RCU=y >> CONFIG_PREEMPT_RCU=y >> CONFIG_RCU_EXPERT=y >> CONFIG_SRCU=y >> CONFIG_TREE_SRCU=y >> CONFIG_TASKS_RCU_GENERIC=y >> CONFIG_TASKS_RCU=y >> CONFIG_TASKS_RUDE_RCU=y >> CONFIG_TASKS_TRACE_RCU=y >> CONFIG_RCU_STALL_COMMON=y >> CONFIG_RCU_NEED_SEGCBLIST=y >> CONFIG_RCU_FANOUT=64 >> CONFIG_RCU_FANOUT_LEAF=16 >> # CONFIG_RCU_FAST_NO_HZ is not set >> # CONFIG_RCU_BOOST is not set >> CONFIG_RCU_NOCB_CPU=y >> # CONFIG_TASKS_TRACE_RCU_READ_MB is not set >> # end of RCU Subsystem >> CONFIG_MMU_GATHER_RCU_TABLE_FREE=y >> CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS=12 >> # RCU Debugging >> CONFIG_PROVE_RCU=y >> # CONFIG_PROVE_RCU_LIST is not set >> # CONFIG_RCU_SCALE_TEST is not set >> CONFIG_RCU_TORTURE_TEST=m >> # CONFIG_RCU_REF_SCALE_TEST is not set >> CONFIG_RCU_CPU_STALL_TIMEOUT=21 >> # CONFIG_RCU_TRACE is not set >> # CONFIG_RCU_EQS_DEBUG is not set >> >> Thanks >> Zhouyi >> >> On Wed, Sep 1, 2021 at 1:01 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: >>> >>> I did an experiment just now on x86_64 virtual machines, rcu did not >>> complain after 10 minutes's test, I hope my effort can provide some >>> clue. >>> >>> 1. I clone the fresh new linux kernel (git clone >>> https://kernel.source.codeaurora.cn/pub/scm/linux/kernel/git/torvalds/linux.git) >>> 2. compile the kernel without CONFIG_RCU_BOOST (: # CONFIG_RCU_BOOST is not set) >>> 3. boot the kernel on a x86_64 VM (kvm -cpu host -smp 16 -hda >>> ./debian10.qcow2 -m 4096 -net >>> user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000 >>> -vnc :30) >>> 4. run the test (stress-ng --sequential 16 --class scheduler -t 5m --times) >>> 5. monitor the system by constantly typing top and dmesg >>> 6. after 10 minutes, nothing else happens except that the dmesg report >>> following two messages >>> [ 672.528192] sched: DL replenish lagged too much >>> [ 751.127790] hrtimer: interrupt took 12143 ns >>> >>> So, I guess CONFIG_RCU_BOOST is not necessary for x86_64 virtual machines >>> >>> Zhouyi >>> >>> On Tue, Aug 31, 2021 at 11:24 PM Jorge Ramirez-Ortiz, Foundries >>> <jorge@foundries.io> wrote: >>>> >>>> Hi >>>> >>>> When enabling CONFIG_PREEMPT and running the stress-ng scheduler class >>>> tests on arm64 (xilinx zynqmp and imx imx8mm SoCs) we are observing the following. >>>> >>>> [ 62.578917] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: >>>> [ 62.585015] (detected by 0, t=5253 jiffies, g=3017, q=2972) >>>> [ 62.590663] rcu: All QSes seen, last rcu_preempt kthread activity 5254 (4294907943-4294902689), jiffies_till_next_fqs=1, root >>>> +->qsmask 0x0 >>>> [ 62.603086] rcu: rcu_preempt kthread starved for 5258 jiffies! g3017 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1 >>>> [ 62.613246] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. >>>> [ 62.622359] rcu: RCU grace-period kthread stack dump: >>>> [ 62.627395] task:rcu_preempt state:R running task stack: 0 pid: 14 ppid: 2 flags:0x00000028 >>>> [ 62.637308] Call trace: >>>> [ 62.639748] __switch_to+0x11c/0x190 >>>> [ 62.643319] __schedule+0x3b8/0x8d8 >>>> [ 62.646796] schedule+0x4c/0x108 >>>> [ 62.650018] schedule_timeout+0x1ac/0x358 >>>> [ 62.654021] rcu_gp_kthread+0x6a8/0x12b8 >>>> [ 62.657933] kthread+0x14c/0x158 >>>> [ 62.661153] ret_from_fork+0x10/0x18 >>>> [ 62.682919] BUG: scheduling while atomic: stress-ng-hrtim/831/0x00000002 >>>> [ 62.689604] Preemption disabled at: >>>> [ 62.689614] [<ffffffc010059418>] irq_enter_rcu+0x30/0x58 >>>> [ 62.698393] CPU: 0 PID: 831 Comm: stress-ng-hrtim Not tainted 5.10.42+ #5 >>>> [ 62.706296] Hardware name: Zynqmp new (DT) >>>> [ 62.710115] Call trace: >>>> [ 62.712548] dump_backtrace+0x0/0x240 >>>> [ 62.716202] show_stack+0x2c/0x38 >>>> [ 62.719510] dump_stack+0xcc/0x104 >>>> [ 62.722904] __schedule_bug+0x78/0xc8 >>>> [ 62.726556] __schedule+0x70c/0x8d8 >>>> [ 62.730037] schedule+0x4c/0x108 >>>> [ 62.733259] do_notify_resume+0x224/0x5d8 >>>> [ 62.737259] work_pending+0xc/0x2a4 >>>> >>>> The error results in OOM eventually. >>>> >>>> RCU priority boosting does work around this issue but it seems to me >>>> a workaround more than a fix (otherwise boosting would be enabled >>>> by CONFIG_PREEMPT for arm64 I guess?). >>>> >>>> The question is: is this an arm64 bug that should be investigated? or >>>> is this some known corner case of running stress-ng that is already >>>> understood? >>>> >>>> thanks >>>> Jorge >>>> >>>> >>>> -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 14+ messages in thread
[parent not found: <20210901064718.aKXabQBXfHO1cXDOrnGPmUo0Uikxa2W9WUpwaf6UQZs@z>]
* Re: rcu_preempt detected stalls [not found] ` <20210901064718.aKXabQBXfHO1cXDOrnGPmUo0Uikxa2W9WUpwaf6UQZs@z> @ 2021-09-01 6:47 ` Zhouyi Zhou 2021-09-01 6:47 ` Zhouyi Zhou 0 siblings, 1 reply; 14+ messages in thread From: Zhouyi Zhou @ 2021-09-01 6:47 UTC (permalink / raw) To: Neeraj Upadhyay Cc: Jorge Ramirez-Ortiz, Foundries, paulmck, Josh Triplett, rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Google, rcu, soc, linux-arm-kernel Hi Neeraj Thank you for reviewing the original logs provided by jorge and my new test logs. On Wed, Sep 1, 2021 at 12:08 PM Neeraj Upadhyay <neeraju@codeaurora.org> wrote: > > Hi, > > >[ 62.603086] rcu: rcu_preempt kthread starved for 5258 jiffies! > g3017 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1 > >[ 62.613246] rcu: Unless rcu_preempt kthread gets sufficient CPU > time, OOM is now expected behavior. > > These logs tells us, rcu_preempt thread is in RUNNING state, however, > the thread is not scheduled for long. So, can you please check the > activity (scheduler traces, runqueue) on the CPU where rcu_preempt > kthread is running (->cpu=1 in this case) , to see why rcu_preempt > is starved of CPU cycles? I start the qemu virtual machine with gdb remote debugging kvm -cpu host -smp 8 -s -S -hda ./debian10.qcow2 -m 8192 -net user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000 -vnc :30& gdb ./vmlinux gdb)target remote :1234 gdb)break rcu_check_gp_kthread_starvation gdb) continue then rerun my second test case (2. # CONFIG_RCU_BOOST is not set) the gdb breakpoint does not get a hit! then I use gdb command info threads, the system seems to struggling with memory subsystem (gdb) info threads Id Target Id Frame 8 Thread 8 (CPU#7 [running]) check_preemption_disabled ( what1=0xffffffff82604e07 "__this_cpu_", what2=0xffffffff82589772 "add_return") at lib/smp_processor_id.c:14 7 Thread 7 (CPU#6 [running]) __lock_is_held (read=<optimized out>, lock=<optimized out>) at kernel/locking/lockdep.c:5366 6 Thread 6 (CPU#5 [running]) nfsd_reply_cache_count ( shrink=0xffff88804d10fd10, sc=0xffffc9000eb4f788) at fs/nfsd/nfscache.c:295 5 Thread 5 (CPU#4 [running]) preempt_count () at ./arch/x86/include/asm/preempt.h:27 4 Thread 4 (CPU#3 [running]) rcu_read_lock () at ./include/linux/rcupdate.h:688 3 Thread 3 (CPU#2 [running]) 0xffffffff812c3b34 in find_next_bit ( offset=<optimized out>, size=<optimized out>, addr=<optimized out>) at ./include/asm-generic/bitops/find.h:36 * 2 Thread 2 (CPU#1 [running]) do_shrink_slab ( shrinkctl=0xffffc9000031f820, shrinker=0xffff888191c2c510, priority=1) at mm/vmscan.c:659 1 Thread 1 (CPU#0 [running]) check_wait_context (next=<optimized out>, curr=<optimized out>) at kernel/locking/lockdep.c:4715 (gdb) bt #0 do_shrink_slab (shrinkctl=0xffffc9000031f820, shrinker=0xffff888191c2c510, priority=1) at mm/vmscan.c:659 #1 0xffffffff812c3c75 in shrink_slab (gfp_mask=17829066, nid=0, memcg=0xffff888100322000, priority=1) at mm/vmscan.c:872 #2 0xffffffff812c656f in shrink_node_memcgs (sc=<optimized out>, pgdat=<optimized out>) at mm/vmscan.c:2879 #3 shrink_node (pgdat=0xffffc9000031f820, sc=0xffffc9000031f978) at mm/vmscan.c:2994 #4 0xffffffff812c6b9d in shrink_zones (sc=<optimized out>, zonelist=<optimized out>) at mm/vmscan.c:3197 #5 do_try_to_free_pages (zonelist=0xffff88823fffba00, sc=0xffffc9000031f978) at mm/vmscan.c:3252 #6 0xffffffff812c8357 in try_to_free_pages (zonelist=0xffff88823fffba00, order=0, gfp_mask=<optimized out>, nodemask=<optimized out>) at mm/vmscan.c:3491 #7 0xffffffff81315a9e in __perform_reclaim (ac=<optimized out>, ac=<optimized out>, order=<optimized out>, gfp_mask=<optimized out>) at mm/page_alloc.c:4607 #8 __alloc_pages_direct_reclaim (did_some_progress=<optimized out>, ac=<optimized out>, alloc_flags=<optimized out>, order=<optimized out>, gfp_mask=<optimized out>) at mm/page_alloc.c:4628 #9 __alloc_pages_slowpath (gfp_mask=3274784, order=17829066, ac=0xffffc9000031faf8) at mm/page_alloc.c:5032 ---Type <return> to continue, or q <return> to quit--- #10 0xffffffff81316b53 in __alloc_pages (gfp=17829066, order=0, preferred_nid=<optimized out>, nodemask=0x0 <fixed_percpu_data>) at mm/page_alloc.c:5403 #11 0xffffffff81334f13 in alloc_pages_vma (gfp=17829066, order=0, vma=0xffff888107ed4cc0, addr=<optimized out>, node=0, hugepage=<optimized out>) at mm/mempolicy.c:2208 #12 0xffffffff8131ee1e in __read_swap_cache_async (entry=..., gfp_mask=17829066, vma=0x1 <fixed_percpu_data+1>, addr=0, new_page_allocated=0x1 <fixed_percpu_data+1>) at mm/swap_state.c:459 #13 0xffffffff8131f0b8 in read_swap_cache_async (entry=..., gfp_mask=<optimized out>, vma=<optimized out>, addr=<optimized out>, do_poll=true) at mm/swap_state.c:525 #14 0xffffffff8131f2bd in swap_cluster_readahead (entry=..., gfp_mask=17829066, vmf=0x6d3b4) at mm/swap_state.c:661 #15 0xffffffff8131f4c5 in swapin_readahead (entry=..., gfp_mask=2445460752, vmf=0x1 <fixed_percpu_data+1>) at mm/swap_state.c:853 #16 0xffffffff812efcf0 in do_swap_page (vmf=0xffffc9000031fe00) at mm/memory.c:3551 #17 0xffffffff812f3a7d in handle_pte_fault (vmf=<optimized out>) at mm/memory.c:4562 #18 __handle_mm_fault (vma=<optimized out>, address=<optimized out>, flags=<optimized out>) at mm/memory.c:4693 #19 0xffffffff812f4633 in handle_mm_fault (vma=0xffff888107ed4cc0, ---Type <return> to continue, or q <return> to quit--- address=140202217193320, flags=596, regs=0xffffc9000031ff58) at mm/memory.c:4791 #20 0xffffffff8110e36e in do_user_addr_fault (regs=0xffffc9000031f820, error_code=18446612688810460432, address=1) at arch/x86/mm/fault.c:1390 #21 0xffffffff81c13d69 in handle_page_fault (address=<optimized out>, error_code=<optimized out>, regs=<optimized out>) at arch/x86/mm/fault.c:1475 #22 exc_page_fault (regs=0xffffc9000031f820, error_code=18446612688810460432) at arch/x86/mm/fault.c:1531 #23 0xffffffff81e00ace in asm_exc_page_fault () at ./arch/x86/include/asm/idtentry.h:568 #24 0x0000000000000016 in fixed_percpu_data () #25 0x0000000000000000 in ?? () So, I guess jorge's test may be more instructive than mine ;-) > > > Thanks > Neeraj > > On 9/1/2021 6:33 AM, Zhouyi Zhou wrote: > > Hi, > > > > I perform following two new rounds of experiments: > > > > > > Test environment (x86_64 debian10 virtual machine: kvm -cpu host -smp > > 8 -hda ./debian10.qcow2 -m 4096 -net > > user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000 > > -vnc :30) > > > > 1. CONFIG_RCU_BOOST=y > > 1.1 as root, run #stress-ng --sequential 100 --class scheduler -t 5m --times > > 1.2 as regular user at the same time, run $stress-ng --sequential 100 > > --class scheduler -t 5m --times > > > > System begin OOM kill after 6 minutes: > > 31 19:41:12 debian kernel: [ 847.171884] task:kworker/1:0 state:D > > stack: 0 pid: 1634 ppid: 2 flag\ > > s:0x00004000 > > Aug 31 19:41:12 debian kernel: [ 847.171890] Workqueue: ipv6_addrconf > > addrconf_verify_work > > Aug 31 19:41:12 debian kernel: [ 847.171897] Call Trace: > > Aug 31 19:41:12 debian kernel: [ 847.171903] __schedule+0x368/0xa40 > > Aug 31 19:41:12 debian kernel: [ 847.171915] schedule+0x44/0xe0 > > Aug 31 19:41:12 debian kernel: [ 847.171921] > > schedule_preempt_disabled+0x14/0x20 > > Aug 31 19:41:12 debian kernel: [ 847.171924] __mutex_lock+0x4b1/0xa10 > > Aug 31 19:41:12 debian kernel: [ 847.171935] ? addrconf_verify_work+0xa/0x20 > > Aug 31 19:41:12 debian kernel: [ 847.171948] ? addrconf_verify_work+0xa/0x20 > > Aug 31 19:41:12 debian kernel: [ 847.171951] addrconf_verify_work+0xa/0x20 > > Aug 31 19:41:12 debian kernel: [ 847.171955] process_one_work+0x1fa/0x5b0 > > Aug 31 19:41:12 debian kernel: [ 847.171967] worker_thread+0x64/0x3d0 > > Aug 31 19:41:12 debian kernel: [ 847.171974] ? process_one_work+0x5b0/0x5b0 > > Aug 31 19:41:12 debian kernel: [ 847.171978] kthread+0x131/0x180 > > Aug 31 19:41:12 debian kernel: [ 847.171982] ? set_kthread_struct+0x40/0x40 > > Aug 31 19:41:12 debian kernel: [ 847.171989] ret_from_fork+0x1f/0x30 > > Aug 31 19:41:12 debian kernel: [ 847.176007] > > Aug 31 19:41:12 debian kernel: [ 847.176007] Showing all locks held > > in the system: > > Aug 31 19:41:12 debian kernel: [ 847.176016] 1 lock held by khungtaskd/56: > > Aug 31 19:41:12 debian kernel: [ 847.176018] #0: ffffffff82918b60 > > (rcu_read_lock){....}-{1:2}, at: debug_show_a\ > > ll_locks+0xe/0x1a0 > > > > 2. # CONFIG_RCU_BOOST is not set > > 2.1 as root, run #stress-ng --sequential 100 --class scheduler -t 5m --times > > 2.2 as regular user at the same time, run $stress-ng --sequential 100 > > --class scheduler -t 5m --times > > System begin OOM kill after 6 minutes: > > The system is so dead, that I can't save the backtrace to file nor did > > kernel has a chance to save the log to /var/log/messages > > > > Thanks > > Zhouyi > > > > On Wed, Sep 1, 2021 at 1:11 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > >> > >> followings are some of my kernel config options: > >> CONFIG_PREEMPT=y > >> CONFIG_PREEMPT_COUNT=y > >> CONFIG_PREEMPTION=y > >> CONFIG_PREEMPT_DYNAMIC=y > >> CONFIG_PREEMPT_RCU=y > >> CONFIG_HAVE_PREEMPT_DYNAMIC=y > >> CONFIG_TREE_RCU=y > >> CONFIG_PREEMPT_RCU=y > >> CONFIG_RCU_EXPERT=y > >> CONFIG_SRCU=y > >> CONFIG_TREE_SRCU=y > >> CONFIG_TASKS_RCU_GENERIC=y > >> CONFIG_TASKS_RCU=y > >> CONFIG_TASKS_RUDE_RCU=y > >> CONFIG_TASKS_TRACE_RCU=y > >> CONFIG_RCU_STALL_COMMON=y > >> CONFIG_RCU_NEED_SEGCBLIST=y > >> CONFIG_RCU_FANOUT=64 > >> CONFIG_RCU_FANOUT_LEAF=16 > >> # CONFIG_RCU_FAST_NO_HZ is not set > >> # CONFIG_RCU_BOOST is not set > >> CONFIG_RCU_NOCB_CPU=y > >> # CONFIG_TASKS_TRACE_RCU_READ_MB is not set > >> # end of RCU Subsystem > >> CONFIG_MMU_GATHER_RCU_TABLE_FREE=y > >> CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS=12 > >> # RCU Debugging > >> CONFIG_PROVE_RCU=y > >> # CONFIG_PROVE_RCU_LIST is not set > >> # CONFIG_RCU_SCALE_TEST is not set > >> CONFIG_RCU_TORTURE_TEST=m > >> # CONFIG_RCU_REF_SCALE_TEST is not set > >> CONFIG_RCU_CPU_STALL_TIMEOUT=21 > >> # CONFIG_RCU_TRACE is not set > >> # CONFIG_RCU_EQS_DEBUG is not set > >> > >> Thanks > >> Zhouyi > >> > >> On Wed, Sep 1, 2021 at 1:01 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > >>> > >>> I did an experiment just now on x86_64 virtual machines, rcu did not > >>> complain after 10 minutes's test, I hope my effort can provide some > >>> clue. > >>> > >>> 1. I clone the fresh new linux kernel (git clone > >>> https://kernel.source.codeaurora.cn/pub/scm/linux/kernel/git/torvalds/linux.git) > >>> 2. compile the kernel without CONFIG_RCU_BOOST (: # CONFIG_RCU_BOOST is not set) > >>> 3. boot the kernel on a x86_64 VM (kvm -cpu host -smp 16 -hda > >>> ./debian10.qcow2 -m 4096 -net > >>> user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000 > >>> -vnc :30) > >>> 4. run the test (stress-ng --sequential 16 --class scheduler -t 5m --times) > >>> 5. monitor the system by constantly typing top and dmesg > >>> 6. after 10 minutes, nothing else happens except that the dmesg report > >>> following two messages > >>> [ 672.528192] sched: DL replenish lagged too much > >>> [ 751.127790] hrtimer: interrupt took 12143 ns > >>> > >>> So, I guess CONFIG_RCU_BOOST is not necessary for x86_64 virtual machines > >>> > >>> Zhouyi > >>> > >>> On Tue, Aug 31, 2021 at 11:24 PM Jorge Ramirez-Ortiz, Foundries > >>> <jorge@foundries.io> wrote: > >>>> > >>>> Hi > >>>> > >>>> When enabling CONFIG_PREEMPT and running the stress-ng scheduler class > >>>> tests on arm64 (xilinx zynqmp and imx imx8mm SoCs) we are observing the following. > >>>> > >>>> [ 62.578917] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: > >>>> [ 62.585015] (detected by 0, t=5253 jiffies, g=3017, q=2972) > >>>> [ 62.590663] rcu: All QSes seen, last rcu_preempt kthread activity 5254 (4294907943-4294902689), jiffies_till_next_fqs=1, root > >>>> +->qsmask 0x0 > >>>> [ 62.603086] rcu: rcu_preempt kthread starved for 5258 jiffies! g3017 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1 > >>>> [ 62.613246] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. > >>>> [ 62.622359] rcu: RCU grace-period kthread stack dump: > >>>> [ 62.627395] task:rcu_preempt state:R running task stack: 0 pid: 14 ppid: 2 flags:0x00000028 > >>>> [ 62.637308] Call trace: > >>>> [ 62.639748] __switch_to+0x11c/0x190 > >>>> [ 62.643319] __schedule+0x3b8/0x8d8 > >>>> [ 62.646796] schedule+0x4c/0x108 > >>>> [ 62.650018] schedule_timeout+0x1ac/0x358 > >>>> [ 62.654021] rcu_gp_kthread+0x6a8/0x12b8 > >>>> [ 62.657933] kthread+0x14c/0x158 > >>>> [ 62.661153] ret_from_fork+0x10/0x18 > >>>> [ 62.682919] BUG: scheduling while atomic: stress-ng-hrtim/831/0x00000002 > >>>> [ 62.689604] Preemption disabled at: > >>>> [ 62.689614] [<ffffffc010059418>] irq_enter_rcu+0x30/0x58 > >>>> [ 62.698393] CPU: 0 PID: 831 Comm: stress-ng-hrtim Not tainted 5.10.42+ #5 > >>>> [ 62.706296] Hardware name: Zynqmp new (DT) > >>>> [ 62.710115] Call trace: > >>>> [ 62.712548] dump_backtrace+0x0/0x240 > >>>> [ 62.716202] show_stack+0x2c/0x38 > >>>> [ 62.719510] dump_stack+0xcc/0x104 > >>>> [ 62.722904] __schedule_bug+0x78/0xc8 > >>>> [ 62.726556] __schedule+0x70c/0x8d8 > >>>> [ 62.730037] schedule+0x4c/0x108 > >>>> [ 62.733259] do_notify_resume+0x224/0x5d8 > >>>> [ 62.737259] work_pending+0xc/0x2a4 > >>>> > >>>> The error results in OOM eventually. > >>>> > >>>> RCU priority boosting does work around this issue but it seems to me > >>>> a workaround more than a fix (otherwise boosting would be enabled > >>>> by CONFIG_PREEMPT for arm64 I guess?). > >>>> > >>>> The question is: is this an arm64 bug that should be investigated? or > >>>> is this some known corner case of running stress-ng that is already > >>>> understood? > >>>> > >>>> thanks > >>>> Jorge > >>>> > >>>> > >>>> > > -- > QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a > member of the Code Aurora Forum, hosted by The Linux Foundation Thanks Zhouyi _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rcu_preempt detected stalls 2021-09-01 6:47 ` Zhouyi Zhou @ 2021-09-01 6:47 ` Zhouyi Zhou 0 siblings, 0 replies; 14+ messages in thread From: Zhouyi Zhou @ 2021-09-01 6:47 UTC (permalink / raw) To: Neeraj Upadhyay Cc: Jorge Ramirez-Ortiz, Foundries, paulmck, Josh Triplett, rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Google, rcu, soc, linux-arm-kernel Hi Neeraj Thank you for reviewing the original logs provided by jorge and my new test logs. On Wed, Sep 1, 2021 at 12:08 PM Neeraj Upadhyay <neeraju@codeaurora.org> wrote: > > Hi, > > >[ 62.603086] rcu: rcu_preempt kthread starved for 5258 jiffies! > g3017 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1 > >[ 62.613246] rcu: Unless rcu_preempt kthread gets sufficient CPU > time, OOM is now expected behavior. > > These logs tells us, rcu_preempt thread is in RUNNING state, however, > the thread is not scheduled for long. So, can you please check the > activity (scheduler traces, runqueue) on the CPU where rcu_preempt > kthread is running (->cpu=1 in this case) , to see why rcu_preempt > is starved of CPU cycles? I start the qemu virtual machine with gdb remote debugging kvm -cpu host -smp 8 -s -S -hda ./debian10.qcow2 -m 8192 -net user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000 -vnc :30& gdb ./vmlinux gdb)target remote :1234 gdb)break rcu_check_gp_kthread_starvation gdb) continue then rerun my second test case (2. # CONFIG_RCU_BOOST is not set) the gdb breakpoint does not get a hit! then I use gdb command info threads, the system seems to struggling with memory subsystem (gdb) info threads Id Target Id Frame 8 Thread 8 (CPU#7 [running]) check_preemption_disabled ( what1=0xffffffff82604e07 "__this_cpu_", what2=0xffffffff82589772 "add_return") at lib/smp_processor_id.c:14 7 Thread 7 (CPU#6 [running]) __lock_is_held (read=<optimized out>, lock=<optimized out>) at kernel/locking/lockdep.c:5366 6 Thread 6 (CPU#5 [running]) nfsd_reply_cache_count ( shrink=0xffff88804d10fd10, sc=0xffffc9000eb4f788) at fs/nfsd/nfscache.c:295 5 Thread 5 (CPU#4 [running]) preempt_count () at ./arch/x86/include/asm/preempt.h:27 4 Thread 4 (CPU#3 [running]) rcu_read_lock () at ./include/linux/rcupdate.h:688 3 Thread 3 (CPU#2 [running]) 0xffffffff812c3b34 in find_next_bit ( offset=<optimized out>, size=<optimized out>, addr=<optimized out>) at ./include/asm-generic/bitops/find.h:36 * 2 Thread 2 (CPU#1 [running]) do_shrink_slab ( shrinkctl=0xffffc9000031f820, shrinker=0xffff888191c2c510, priority=1) at mm/vmscan.c:659 1 Thread 1 (CPU#0 [running]) check_wait_context (next=<optimized out>, curr=<optimized out>) at kernel/locking/lockdep.c:4715 (gdb) bt #0 do_shrink_slab (shrinkctl=0xffffc9000031f820, shrinker=0xffff888191c2c510, priority=1) at mm/vmscan.c:659 #1 0xffffffff812c3c75 in shrink_slab (gfp_mask=17829066, nid=0, memcg=0xffff888100322000, priority=1) at mm/vmscan.c:872 #2 0xffffffff812c656f in shrink_node_memcgs (sc=<optimized out>, pgdat=<optimized out>) at mm/vmscan.c:2879 #3 shrink_node (pgdat=0xffffc9000031f820, sc=0xffffc9000031f978) at mm/vmscan.c:2994 #4 0xffffffff812c6b9d in shrink_zones (sc=<optimized out>, zonelist=<optimized out>) at mm/vmscan.c:3197 #5 do_try_to_free_pages (zonelist=0xffff88823fffba00, sc=0xffffc9000031f978) at mm/vmscan.c:3252 #6 0xffffffff812c8357 in try_to_free_pages (zonelist=0xffff88823fffba00, order=0, gfp_mask=<optimized out>, nodemask=<optimized out>) at mm/vmscan.c:3491 #7 0xffffffff81315a9e in __perform_reclaim (ac=<optimized out>, ac=<optimized out>, order=<optimized out>, gfp_mask=<optimized out>) at mm/page_alloc.c:4607 #8 __alloc_pages_direct_reclaim (did_some_progress=<optimized out>, ac=<optimized out>, alloc_flags=<optimized out>, order=<optimized out>, gfp_mask=<optimized out>) at mm/page_alloc.c:4628 #9 __alloc_pages_slowpath (gfp_mask=3274784, order=17829066, ac=0xffffc9000031faf8) at mm/page_alloc.c:5032 ---Type <return> to continue, or q <return> to quit--- #10 0xffffffff81316b53 in __alloc_pages (gfp=17829066, order=0, preferred_nid=<optimized out>, nodemask=0x0 <fixed_percpu_data>) at mm/page_alloc.c:5403 #11 0xffffffff81334f13 in alloc_pages_vma (gfp=17829066, order=0, vma=0xffff888107ed4cc0, addr=<optimized out>, node=0, hugepage=<optimized out>) at mm/mempolicy.c:2208 #12 0xffffffff8131ee1e in __read_swap_cache_async (entry=..., gfp_mask=17829066, vma=0x1 <fixed_percpu_data+1>, addr=0, new_page_allocated=0x1 <fixed_percpu_data+1>) at mm/swap_state.c:459 #13 0xffffffff8131f0b8 in read_swap_cache_async (entry=..., gfp_mask=<optimized out>, vma=<optimized out>, addr=<optimized out>, do_poll=true) at mm/swap_state.c:525 #14 0xffffffff8131f2bd in swap_cluster_readahead (entry=..., gfp_mask=17829066, vmf=0x6d3b4) at mm/swap_state.c:661 #15 0xffffffff8131f4c5 in swapin_readahead (entry=..., gfp_mask=2445460752, vmf=0x1 <fixed_percpu_data+1>) at mm/swap_state.c:853 #16 0xffffffff812efcf0 in do_swap_page (vmf=0xffffc9000031fe00) at mm/memory.c:3551 #17 0xffffffff812f3a7d in handle_pte_fault (vmf=<optimized out>) at mm/memory.c:4562 #18 __handle_mm_fault (vma=<optimized out>, address=<optimized out>, flags=<optimized out>) at mm/memory.c:4693 #19 0xffffffff812f4633 in handle_mm_fault (vma=0xffff888107ed4cc0, ---Type <return> to continue, or q <return> to quit--- address=140202217193320, flags=596, regs=0xffffc9000031ff58) at mm/memory.c:4791 #20 0xffffffff8110e36e in do_user_addr_fault (regs=0xffffc9000031f820, error_code=18446612688810460432, address=1) at arch/x86/mm/fault.c:1390 #21 0xffffffff81c13d69 in handle_page_fault (address=<optimized out>, error_code=<optimized out>, regs=<optimized out>) at arch/x86/mm/fault.c:1475 #22 exc_page_fault (regs=0xffffc9000031f820, error_code=18446612688810460432) at arch/x86/mm/fault.c:1531 #23 0xffffffff81e00ace in asm_exc_page_fault () at ./arch/x86/include/asm/idtentry.h:568 #24 0x0000000000000016 in fixed_percpu_data () #25 0x0000000000000000 in ?? () So, I guess jorge's test may be more instructive than mine ;-) > > > Thanks > Neeraj > > On 9/1/2021 6:33 AM, Zhouyi Zhou wrote: > > Hi, > > > > I perform following two new rounds of experiments: > > > > > > Test environment (x86_64 debian10 virtual machine: kvm -cpu host -smp > > 8 -hda ./debian10.qcow2 -m 4096 -net > > user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000 > > -vnc :30) > > > > 1. CONFIG_RCU_BOOST=y > > 1.1 as root, run #stress-ng --sequential 100 --class scheduler -t 5m --times > > 1.2 as regular user at the same time, run $stress-ng --sequential 100 > > --class scheduler -t 5m --times > > > > System begin OOM kill after 6 minutes: > > 31 19:41:12 debian kernel: [ 847.171884] task:kworker/1:0 state:D > > stack: 0 pid: 1634 ppid: 2 flag\ > > s:0x00004000 > > Aug 31 19:41:12 debian kernel: [ 847.171890] Workqueue: ipv6_addrconf > > addrconf_verify_work > > Aug 31 19:41:12 debian kernel: [ 847.171897] Call Trace: > > Aug 31 19:41:12 debian kernel: [ 847.171903] __schedule+0x368/0xa40 > > Aug 31 19:41:12 debian kernel: [ 847.171915] schedule+0x44/0xe0 > > Aug 31 19:41:12 debian kernel: [ 847.171921] > > schedule_preempt_disabled+0x14/0x20 > > Aug 31 19:41:12 debian kernel: [ 847.171924] __mutex_lock+0x4b1/0xa10 > > Aug 31 19:41:12 debian kernel: [ 847.171935] ? addrconf_verify_work+0xa/0x20 > > Aug 31 19:41:12 debian kernel: [ 847.171948] ? addrconf_verify_work+0xa/0x20 > > Aug 31 19:41:12 debian kernel: [ 847.171951] addrconf_verify_work+0xa/0x20 > > Aug 31 19:41:12 debian kernel: [ 847.171955] process_one_work+0x1fa/0x5b0 > > Aug 31 19:41:12 debian kernel: [ 847.171967] worker_thread+0x64/0x3d0 > > Aug 31 19:41:12 debian kernel: [ 847.171974] ? process_one_work+0x5b0/0x5b0 > > Aug 31 19:41:12 debian kernel: [ 847.171978] kthread+0x131/0x180 > > Aug 31 19:41:12 debian kernel: [ 847.171982] ? set_kthread_struct+0x40/0x40 > > Aug 31 19:41:12 debian kernel: [ 847.171989] ret_from_fork+0x1f/0x30 > > Aug 31 19:41:12 debian kernel: [ 847.176007] > > Aug 31 19:41:12 debian kernel: [ 847.176007] Showing all locks held > > in the system: > > Aug 31 19:41:12 debian kernel: [ 847.176016] 1 lock held by khungtaskd/56: > > Aug 31 19:41:12 debian kernel: [ 847.176018] #0: ffffffff82918b60 > > (rcu_read_lock){....}-{1:2}, at: debug_show_a\ > > ll_locks+0xe/0x1a0 > > > > 2. # CONFIG_RCU_BOOST is not set > > 2.1 as root, run #stress-ng --sequential 100 --class scheduler -t 5m --times > > 2.2 as regular user at the same time, run $stress-ng --sequential 100 > > --class scheduler -t 5m --times > > System begin OOM kill after 6 minutes: > > The system is so dead, that I can't save the backtrace to file nor did > > kernel has a chance to save the log to /var/log/messages > > > > Thanks > > Zhouyi > > > > On Wed, Sep 1, 2021 at 1:11 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > >> > >> followings are some of my kernel config options: > >> CONFIG_PREEMPT=y > >> CONFIG_PREEMPT_COUNT=y > >> CONFIG_PREEMPTION=y > >> CONFIG_PREEMPT_DYNAMIC=y > >> CONFIG_PREEMPT_RCU=y > >> CONFIG_HAVE_PREEMPT_DYNAMIC=y > >> CONFIG_TREE_RCU=y > >> CONFIG_PREEMPT_RCU=y > >> CONFIG_RCU_EXPERT=y > >> CONFIG_SRCU=y > >> CONFIG_TREE_SRCU=y > >> CONFIG_TASKS_RCU_GENERIC=y > >> CONFIG_TASKS_RCU=y > >> CONFIG_TASKS_RUDE_RCU=y > >> CONFIG_TASKS_TRACE_RCU=y > >> CONFIG_RCU_STALL_COMMON=y > >> CONFIG_RCU_NEED_SEGCBLIST=y > >> CONFIG_RCU_FANOUT=64 > >> CONFIG_RCU_FANOUT_LEAF=16 > >> # CONFIG_RCU_FAST_NO_HZ is not set > >> # CONFIG_RCU_BOOST is not set > >> CONFIG_RCU_NOCB_CPU=y > >> # CONFIG_TASKS_TRACE_RCU_READ_MB is not set > >> # end of RCU Subsystem > >> CONFIG_MMU_GATHER_RCU_TABLE_FREE=y > >> CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS=12 > >> # RCU Debugging > >> CONFIG_PROVE_RCU=y > >> # CONFIG_PROVE_RCU_LIST is not set > >> # CONFIG_RCU_SCALE_TEST is not set > >> CONFIG_RCU_TORTURE_TEST=m > >> # CONFIG_RCU_REF_SCALE_TEST is not set > >> CONFIG_RCU_CPU_STALL_TIMEOUT=21 > >> # CONFIG_RCU_TRACE is not set > >> # CONFIG_RCU_EQS_DEBUG is not set > >> > >> Thanks > >> Zhouyi > >> > >> On Wed, Sep 1, 2021 at 1:01 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > >>> > >>> I did an experiment just now on x86_64 virtual machines, rcu did not > >>> complain after 10 minutes's test, I hope my effort can provide some > >>> clue. > >>> > >>> 1. I clone the fresh new linux kernel (git clone > >>> https://kernel.source.codeaurora.cn/pub/scm/linux/kernel/git/torvalds/linux.git) > >>> 2. compile the kernel without CONFIG_RCU_BOOST (: # CONFIG_RCU_BOOST is not set) > >>> 3. boot the kernel on a x86_64 VM (kvm -cpu host -smp 16 -hda > >>> ./debian10.qcow2 -m 4096 -net > >>> user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000 > >>> -vnc :30) > >>> 4. run the test (stress-ng --sequential 16 --class scheduler -t 5m --times) > >>> 5. monitor the system by constantly typing top and dmesg > >>> 6. after 10 minutes, nothing else happens except that the dmesg report > >>> following two messages > >>> [ 672.528192] sched: DL replenish lagged too much > >>> [ 751.127790] hrtimer: interrupt took 12143 ns > >>> > >>> So, I guess CONFIG_RCU_BOOST is not necessary for x86_64 virtual machines > >>> > >>> Zhouyi > >>> > >>> On Tue, Aug 31, 2021 at 11:24 PM Jorge Ramirez-Ortiz, Foundries > >>> <jorge@foundries.io> wrote: > >>>> > >>>> Hi > >>>> > >>>> When enabling CONFIG_PREEMPT and running the stress-ng scheduler class > >>>> tests on arm64 (xilinx zynqmp and imx imx8mm SoCs) we are observing the following. > >>>> > >>>> [ 62.578917] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: > >>>> [ 62.585015] (detected by 0, t=5253 jiffies, g=3017, q=2972) > >>>> [ 62.590663] rcu: All QSes seen, last rcu_preempt kthread activity 5254 (4294907943-4294902689), jiffies_till_next_fqs=1, root > >>>> +->qsmask 0x0 > >>>> [ 62.603086] rcu: rcu_preempt kthread starved for 5258 jiffies! g3017 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1 > >>>> [ 62.613246] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. > >>>> [ 62.622359] rcu: RCU grace-period kthread stack dump: > >>>> [ 62.627395] task:rcu_preempt state:R running task stack: 0 pid: 14 ppid: 2 flags:0x00000028 > >>>> [ 62.637308] Call trace: > >>>> [ 62.639748] __switch_to+0x11c/0x190 > >>>> [ 62.643319] __schedule+0x3b8/0x8d8 > >>>> [ 62.646796] schedule+0x4c/0x108 > >>>> [ 62.650018] schedule_timeout+0x1ac/0x358 > >>>> [ 62.654021] rcu_gp_kthread+0x6a8/0x12b8 > >>>> [ 62.657933] kthread+0x14c/0x158 > >>>> [ 62.661153] ret_from_fork+0x10/0x18 > >>>> [ 62.682919] BUG: scheduling while atomic: stress-ng-hrtim/831/0x00000002 > >>>> [ 62.689604] Preemption disabled at: > >>>> [ 62.689614] [<ffffffc010059418>] irq_enter_rcu+0x30/0x58 > >>>> [ 62.698393] CPU: 0 PID: 831 Comm: stress-ng-hrtim Not tainted 5.10.42+ #5 > >>>> [ 62.706296] Hardware name: Zynqmp new (DT) > >>>> [ 62.710115] Call trace: > >>>> [ 62.712548] dump_backtrace+0x0/0x240 > >>>> [ 62.716202] show_stack+0x2c/0x38 > >>>> [ 62.719510] dump_stack+0xcc/0x104 > >>>> [ 62.722904] __schedule_bug+0x78/0xc8 > >>>> [ 62.726556] __schedule+0x70c/0x8d8 > >>>> [ 62.730037] schedule+0x4c/0x108 > >>>> [ 62.733259] do_notify_resume+0x224/0x5d8 > >>>> [ 62.737259] work_pending+0xc/0x2a4 > >>>> > >>>> The error results in OOM eventually. > >>>> > >>>> RCU priority boosting does work around this issue but it seems to me > >>>> a workaround more than a fix (otherwise boosting would be enabled > >>>> by CONFIG_PREEMPT for arm64 I guess?). > >>>> > >>>> The question is: is this an arm64 bug that should be investigated? or > >>>> is this some known corner case of running stress-ng that is already > >>>> understood? > >>>> > >>>> thanks > >>>> Jorge > >>>> > >>>> > >>>> > > -- > QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a > member of the Code Aurora Forum, hosted by The Linux Foundation Thanks Zhouyi ^ permalink raw reply [flat|nested] 14+ messages in thread
[parent not found: <20210901082321.nj61nA4mXy_hgClhySzqE-uoWYVzA0aI3c0f_uguoIs@z>]
* Re: rcu_preempt detected stalls [not found] ` <20210901082321.nj61nA4mXy_hgClhySzqE-uoWYVzA0aI3c0f_uguoIs@z> @ 2021-09-01 8:23 ` Jorge Ramirez-Ortiz, Foundries [not found] ` <20210901091712.oSGeSTF7aXRLTs1rdT4GUH18sEFVlx_z4knvKekvi34@z> 0 siblings, 1 reply; 14+ messages in thread From: Jorge Ramirez-Ortiz, Foundries @ 2021-09-01 8:23 UTC (permalink / raw) To: Zhouyi Zhou Cc: Jorge Ramirez-Ortiz, Foundries, paulmck, Josh Triplett, rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Google, rcu, soc, linux-arm-kernel On 01/09/21, Zhouyi Zhou wrote: > Hi, > > I perform following two new rounds of experiments: > > > Test environment (x86_64 debian10 virtual machine: kvm -cpu host -smp > 8 -hda ./debian10.qcow2 -m 4096 -net > user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000 > -vnc :30) > > 1. CONFIG_RCU_BOOST=y > 1.1 as root, run #stress-ng --sequential 100 --class scheduler -t 5m --times > 1.2 as regular user at the same time, run $stress-ng --sequential 100 > --class scheduler -t 5m --times > > System begin OOM kill after 6 minutes: > 31 19:41:12 debian kernel: [ 847.171884] task:kworker/1:0 state:D > stack: 0 pid: 1634 ppid: 2 flag\ > s:0x00004000 > Aug 31 19:41:12 debian kernel: [ 847.171890] Workqueue: ipv6_addrconf > addrconf_verify_work > Aug 31 19:41:12 debian kernel: [ 847.171897] Call Trace: > Aug 31 19:41:12 debian kernel: [ 847.171903] __schedule+0x368/0xa40 > Aug 31 19:41:12 debian kernel: [ 847.171915] schedule+0x44/0xe0 > Aug 31 19:41:12 debian kernel: [ 847.171921] > schedule_preempt_disabled+0x14/0x20 > Aug 31 19:41:12 debian kernel: [ 847.171924] __mutex_lock+0x4b1/0xa10 > Aug 31 19:41:12 debian kernel: [ 847.171935] ? addrconf_verify_work+0xa/0x20 > Aug 31 19:41:12 debian kernel: [ 847.171948] ? addrconf_verify_work+0xa/0x20 > Aug 31 19:41:12 debian kernel: [ 847.171951] addrconf_verify_work+0xa/0x20 > Aug 31 19:41:12 debian kernel: [ 847.171955] process_one_work+0x1fa/0x5b0 > Aug 31 19:41:12 debian kernel: [ 847.171967] worker_thread+0x64/0x3d0 > Aug 31 19:41:12 debian kernel: [ 847.171974] ? process_one_work+0x5b0/0x5b0 > Aug 31 19:41:12 debian kernel: [ 847.171978] kthread+0x131/0x180 > Aug 31 19:41:12 debian kernel: [ 847.171982] ? set_kthread_struct+0x40/0x40 > Aug 31 19:41:12 debian kernel: [ 847.171989] ret_from_fork+0x1f/0x30 > Aug 31 19:41:12 debian kernel: [ 847.176007] > Aug 31 19:41:12 debian kernel: [ 847.176007] Showing all locks held > in the system: > Aug 31 19:41:12 debian kernel: [ 847.176016] 1 lock held by khungtaskd/56: > Aug 31 19:41:12 debian kernel: [ 847.176018] #0: ffffffff82918b60 > (rcu_read_lock){....}-{1:2}, at: debug_show_a\ > ll_locks+0xe/0x1a0 > > 2. # CONFIG_RCU_BOOST is not set > 2.1 as root, run #stress-ng --sequential 100 --class scheduler -t 5m --times > 2.2 as regular user at the same time, run $stress-ng --sequential 100 > --class scheduler -t 5m --times > System begin OOM kill after 6 minutes: > The system is so dead, that I can't save the backtrace to file nor did > kernel has a chance to save the log to /var/log/messages > all, Thanks for testing on x86. we can also reproduce on qemu arm64. So I think it will point out to the stress-ng test itself; I will debug it early next week - didnt expect so much support so fast TBH, it took me by surprise - and will report then (thanks again) _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
[parent not found: <20210901091712.oSGeSTF7aXRLTs1rdT4GUH18sEFVlx_z4knvKekvi34@z>]
* Re: rcu_preempt detected stalls [not found] ` <20210901091712.oSGeSTF7aXRLTs1rdT4GUH18sEFVlx_z4knvKekvi34@z> @ 2021-09-01 9:17 ` Zhouyi Zhou 2021-09-01 9:17 ` Zhouyi Zhou 0 siblings, 1 reply; 14+ messages in thread From: Zhouyi Zhou @ 2021-09-01 9:17 UTC (permalink / raw) To: Jorge Ramirez-Ortiz, Foundries, Neeraj Upadhyay Cc: paulmck, Josh Triplett, rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Google, rcu, soc, linux-arm-kernel On Wed, Sep 1, 2021 at 4:23 PM Jorge Ramirez-Ortiz, Foundries <jorge@foundries.io> wrote: > > On 01/09/21, Zhouyi Zhou wrote: > > Hi, > > > > I perform following two new rounds of experiments: > > > > > > Test environment (x86_64 debian10 virtual machine: kvm -cpu host -smp > > 8 -hda ./debian10.qcow2 -m 4096 -net > > user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000 > > -vnc :30) > > > > 1. CONFIG_RCU_BOOST=y > > 1.1 as root, run #stress-ng --sequential 100 --class scheduler -t 5m --times > > 1.2 as regular user at the same time, run $stress-ng --sequential 100 > > --class scheduler -t 5m --times > > > > System begin OOM kill after 6 minutes: > > 31 19:41:12 debian kernel: [ 847.171884] task:kworker/1:0 state:D > > stack: 0 pid: 1634 ppid: 2 flag\ > > s:0x00004000 > > Aug 31 19:41:12 debian kernel: [ 847.171890] Workqueue: ipv6_addrconf > > addrconf_verify_work > > Aug 31 19:41:12 debian kernel: [ 847.171897] Call Trace: > > Aug 31 19:41:12 debian kernel: [ 847.171903] __schedule+0x368/0xa40 > > Aug 31 19:41:12 debian kernel: [ 847.171915] schedule+0x44/0xe0 > > Aug 31 19:41:12 debian kernel: [ 847.171921] > > schedule_preempt_disabled+0x14/0x20 > > Aug 31 19:41:12 debian kernel: [ 847.171924] __mutex_lock+0x4b1/0xa10 > > Aug 31 19:41:12 debian kernel: [ 847.171935] ? addrconf_verify_work+0xa/0x20 > > Aug 31 19:41:12 debian kernel: [ 847.171948] ? addrconf_verify_work+0xa/0x20 > > Aug 31 19:41:12 debian kernel: [ 847.171951] addrconf_verify_work+0xa/0x20 > > Aug 31 19:41:12 debian kernel: [ 847.171955] process_one_work+0x1fa/0x5b0 > > Aug 31 19:41:12 debian kernel: [ 847.171967] worker_thread+0x64/0x3d0 > > Aug 31 19:41:12 debian kernel: [ 847.171974] ? process_one_work+0x5b0/0x5b0 > > Aug 31 19:41:12 debian kernel: [ 847.171978] kthread+0x131/0x180 > > Aug 31 19:41:12 debian kernel: [ 847.171982] ? set_kthread_struct+0x40/0x40 > > Aug 31 19:41:12 debian kernel: [ 847.171989] ret_from_fork+0x1f/0x30 > > Aug 31 19:41:12 debian kernel: [ 847.176007] > > Aug 31 19:41:12 debian kernel: [ 847.176007] Showing all locks held > > in the system: > > Aug 31 19:41:12 debian kernel: [ 847.176016] 1 lock held by khungtaskd/56: > > Aug 31 19:41:12 debian kernel: [ 847.176018] #0: ffffffff82918b60 > > (rcu_read_lock){....}-{1:2}, at: debug_show_a\ > > ll_locks+0xe/0x1a0 > > > > 2. # CONFIG_RCU_BOOST is not set > > 2.1 as root, run #stress-ng --sequential 100 --class scheduler -t 5m --times > > 2.2 as regular user at the same time, run $stress-ng --sequential 100 > > --class scheduler -t 5m --times > > System begin OOM kill after 6 minutes: > > The system is so dead, that I can't save the backtrace to file nor did > > kernel has a chance to save the log to /var/log/messages > > > > all, > > Thanks for testing on x86. we can also reproduce on qemu arm64. So I > think it will point out to the stress-ng test itself; I will debug it > early next week - didnt expect so much support so fast TBH, it took me > by surprise - and will report then (thanks again) You are very welcome ;-) I'm very glad that our effort can be of some help to you, I've learned a lot from both of you during this process. Looking forward to seeing your report. Thanks Zhouyi _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rcu_preempt detected stalls 2021-09-01 9:17 ` Zhouyi Zhou @ 2021-09-01 9:17 ` Zhouyi Zhou 0 siblings, 0 replies; 14+ messages in thread From: Zhouyi Zhou @ 2021-09-01 9:17 UTC (permalink / raw) To: Jorge Ramirez-Ortiz, Foundries, Neeraj Upadhyay Cc: paulmck, Josh Triplett, rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Google, rcu, soc, linux-arm-kernel On Wed, Sep 1, 2021 at 4:23 PM Jorge Ramirez-Ortiz, Foundries <jorge@foundries.io> wrote: > > On 01/09/21, Zhouyi Zhou wrote: > > Hi, > > > > I perform following two new rounds of experiments: > > > > > > Test environment (x86_64 debian10 virtual machine: kvm -cpu host -smp > > 8 -hda ./debian10.qcow2 -m 4096 -net > > user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000 > > -vnc :30) > > > > 1. CONFIG_RCU_BOOST=y > > 1.1 as root, run #stress-ng --sequential 100 --class scheduler -t 5m --times > > 1.2 as regular user at the same time, run $stress-ng --sequential 100 > > --class scheduler -t 5m --times > > > > System begin OOM kill after 6 minutes: > > 31 19:41:12 debian kernel: [ 847.171884] task:kworker/1:0 state:D > > stack: 0 pid: 1634 ppid: 2 flag\ > > s:0x00004000 > > Aug 31 19:41:12 debian kernel: [ 847.171890] Workqueue: ipv6_addrconf > > addrconf_verify_work > > Aug 31 19:41:12 debian kernel: [ 847.171897] Call Trace: > > Aug 31 19:41:12 debian kernel: [ 847.171903] __schedule+0x368/0xa40 > > Aug 31 19:41:12 debian kernel: [ 847.171915] schedule+0x44/0xe0 > > Aug 31 19:41:12 debian kernel: [ 847.171921] > > schedule_preempt_disabled+0x14/0x20 > > Aug 31 19:41:12 debian kernel: [ 847.171924] __mutex_lock+0x4b1/0xa10 > > Aug 31 19:41:12 debian kernel: [ 847.171935] ? addrconf_verify_work+0xa/0x20 > > Aug 31 19:41:12 debian kernel: [ 847.171948] ? addrconf_verify_work+0xa/0x20 > > Aug 31 19:41:12 debian kernel: [ 847.171951] addrconf_verify_work+0xa/0x20 > > Aug 31 19:41:12 debian kernel: [ 847.171955] process_one_work+0x1fa/0x5b0 > > Aug 31 19:41:12 debian kernel: [ 847.171967] worker_thread+0x64/0x3d0 > > Aug 31 19:41:12 debian kernel: [ 847.171974] ? process_one_work+0x5b0/0x5b0 > > Aug 31 19:41:12 debian kernel: [ 847.171978] kthread+0x131/0x180 > > Aug 31 19:41:12 debian kernel: [ 847.171982] ? set_kthread_struct+0x40/0x40 > > Aug 31 19:41:12 debian kernel: [ 847.171989] ret_from_fork+0x1f/0x30 > > Aug 31 19:41:12 debian kernel: [ 847.176007] > > Aug 31 19:41:12 debian kernel: [ 847.176007] Showing all locks held > > in the system: > > Aug 31 19:41:12 debian kernel: [ 847.176016] 1 lock held by khungtaskd/56: > > Aug 31 19:41:12 debian kernel: [ 847.176018] #0: ffffffff82918b60 > > (rcu_read_lock){....}-{1:2}, at: debug_show_a\ > > ll_locks+0xe/0x1a0 > > > > 2. # CONFIG_RCU_BOOST is not set > > 2.1 as root, run #stress-ng --sequential 100 --class scheduler -t 5m --times > > 2.2 as regular user at the same time, run $stress-ng --sequential 100 > > --class scheduler -t 5m --times > > System begin OOM kill after 6 minutes: > > The system is so dead, that I can't save the backtrace to file nor did > > kernel has a chance to save the log to /var/log/messages > > > > all, > > Thanks for testing on x86. we can also reproduce on qemu arm64. So I > think it will point out to the stress-ng test itself; I will debug it > early next week - didnt expect so much support so fast TBH, it took me > by surprise - and will report then (thanks again) You are very welcome ;-) I'm very glad that our effort can be of some help to you, I've learned a lot from both of you during this process. Looking forward to seeing your report. Thanks Zhouyi ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2021-09-01 9:49 UTC | newest] Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <20210831152144.VOyu0gjmwOCWZBtSzDaOYQE-YYszY9tK_z8p-fZZ_kM@z> 2021-08-31 15:21 ` rcu_preempt detected stalls Jorge Ramirez-Ortiz, Foundries [not found] ` <20210831155359.Lf_X-nNhopb2XqSc-L34l6LeYKbvMZK-etHxz_l8K68@z> 2021-08-31 15:53 ` Paul E. McKenney [not found] ` <20210831170111.ftOr-K7l2idP8zoj3V-y_wsmeIJ_u1im_3zfoOffRAs@z> 2021-08-31 17:01 ` Zhouyi Zhou 2021-08-31 17:01 ` Zhouyi Zhou [not found] ` <20210831171159.L6GWokiqRdP6iR7ifyRwmK4o4zLRDfv695oAA6v1sWc@z> 2021-08-31 17:11 ` Zhouyi Zhou 2021-08-31 17:11 ` Zhouyi Zhou [not found] ` <20210901010332.atavvn1KrA_ubxTIThV4OTrLSH5nQAB3vaUEFWHzgho@z> 2021-09-01 1:03 ` Zhouyi Zhou 2021-09-01 1:03 ` Zhouyi Zhou 2021-09-01 4:08 ` Neeraj Upadhyay [not found] ` <20210901064718.aKXabQBXfHO1cXDOrnGPmUo0Uikxa2W9WUpwaf6UQZs@z> 2021-09-01 6:47 ` Zhouyi Zhou 2021-09-01 6:47 ` Zhouyi Zhou [not found] ` <20210901082321.nj61nA4mXy_hgClhySzqE-uoWYVzA0aI3c0f_uguoIs@z> 2021-09-01 8:23 ` Jorge Ramirez-Ortiz, Foundries [not found] ` <20210901091712.oSGeSTF7aXRLTs1rdT4GUH18sEFVlx_z4knvKekvi34@z> 2021-09-01 9:17 ` Zhouyi Zhou 2021-09-01 9:17 ` Zhouyi Zhou
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).