From: Zhouyi Zhou <zhouzhouyi@gmail.com>
To: "Jorge Ramirez-Ortiz, Foundries" <jorge@foundries.io>
Cc: paulmck@kernel.org, Josh Triplett <josh@joshtriplett.org>,
	rostedt <rostedt@goodmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	"Joel Fernandes, Google" <joel@joelfernandes.org>,
	rcu <rcu@vger.kernel.org>,
	soc@kernel.org, linux-arm-kernel@lists.infradead.org
Subject: Re: rcu_preempt detected stalls
Date: Wed, 1 Sep 2021 01:01:11 +0800
Message-ID: <CAABZP2xGnSLVbgxqjKMq=Oj_H7rYfYjuCvmBpmZ4tRptGs3SEw@mail.gmail.com>
In-Reply-To: <20210831152144.GA28128@trex>

I just ran an experiment on an x86_64 virtual machine. RCU did not
complain during a 10-minute test. I hope this provides some clue.

1. Clone a fresh Linux kernel tree (git clone
https://kernel.source.codeaurora.cn/pub/scm/linux/kernel/git/torvalds/linux.git).
2. Compile the kernel without CONFIG_RCU_BOOST (.config contains
"# CONFIG_RCU_BOOST is not set").
3. Boot the kernel on an x86_64 VM (kvm -cpu host -smp 16  -hda
./debian10.qcow2 -m 4096 -net
user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000
-vnc :30).
4. Run the test (stress-ng --sequential 16  --class scheduler -t 5m --times).
5. Monitor the system by repeatedly running top and dmesg.
6. After 10 minutes, nothing unusual had happened except that dmesg
reported the following two messages:
[  672.528192] sched: DL replenish lagged too much
[  751.127790] hrtimer: interrupt took 12143 ns
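
Steps 2 and 5 could also be checked by a script instead of by eye. The
following is only a minimal sketch, not something I ran in this test: the
function names are hypothetical, and the two patterns are simply taken
from the stall report quoted below, so other kernel versions may word
these messages differently.

```python
import re

# Patterns copied from the arm64 stall report quoted in this thread.
# Assumption: other kernel versions may phrase these lines differently.
STALL_PATTERNS = (
    re.compile(r"rcu: INFO: \w+ detected stalls"),
    re.compile(r"rcu: \w+ kthread starved for \d+ jiffies"),
)


def rcu_boost_enabled(config_text):
    """Return True if the given .config text sets CONFIG_RCU_BOOST=y."""
    return any(line.strip() == "CONFIG_RCU_BOOST=y"
               for line in config_text.splitlines())


def find_rcu_stalls(dmesg_text):
    """Return the dmesg lines that look like RCU stall reports."""
    return [line for line in dmesg_text.splitlines()
            if any(p.search(line) for p in STALL_PATTERNS)]


# The two messages seen in the x86_64 run contain no stall report.
sample = (
    "[  672.528192] sched: DL replenish lagged too much\n"
    "[  751.127790] hrtimer: interrupt took 12143 ns\n"
)
print(find_rcu_stalls(sample))
```

To use it on a real system, pass the saved output of dmesg to
find_rcu_stalls() and the contents of the kernel .config to
rcu_boost_enabled().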

So I guess CONFIG_RCU_BOOST is not needed to avoid these stalls on
x86_64 virtual machines.

Zhouyi

On Tue, Aug 31, 2021 at 11:24 PM Jorge Ramirez-Ortiz, Foundries
<jorge@foundries.io> wrote:
>
> Hi
>
> When enabling CONFIG_PREEMPT and running the stress-ng scheduler class
> tests on arm64 (xilinx zynqmp and imx imx8mm SoCs) we are observing the following.
>
> [   62.578917] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> [   62.585015]  (detected by 0, t=5253 jiffies, g=3017, q=2972)
> [   62.590663] rcu: All QSes seen, last rcu_preempt kthread activity 5254 (4294907943-4294902689), jiffies_till_next_fqs=1, root ->qsmask 0x0
> [   62.603086] rcu: rcu_preempt kthread starved for 5258 jiffies! g3017 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1
> [   62.613246] rcu:     Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
> [   62.622359] rcu: RCU grace-period kthread stack dump:
> [   62.627395] task:rcu_preempt     state:R  running task     stack:    0 pid:   14 ppid:     2 flags:0x00000028
> [   62.637308] Call trace:
> [   62.639748]  __switch_to+0x11c/0x190
> [   62.643319]  __schedule+0x3b8/0x8d8
> [   62.646796]  schedule+0x4c/0x108
> [   62.650018]  schedule_timeout+0x1ac/0x358
> [   62.654021]  rcu_gp_kthread+0x6a8/0x12b8
> [   62.657933]  kthread+0x14c/0x158
> [   62.661153]  ret_from_fork+0x10/0x18
> [   62.682919] BUG: scheduling while atomic: stress-ng-hrtim/831/0x00000002
> [   62.689604] Preemption disabled at:
> [   62.689614] [<ffffffc010059418>] irq_enter_rcu+0x30/0x58
> [   62.698393] CPU: 0 PID: 831 Comm: stress-ng-hrtim Not tainted 5.10.42+ #5
> [   62.706296] Hardware name: Zynqmp new (DT)
> [   62.710115] Call trace:
> [   62.712548]  dump_backtrace+0x0/0x240
> [   62.716202]  show_stack+0x2c/0x38
> [   62.719510]  dump_stack+0xcc/0x104
> [   62.722904]  __schedule_bug+0x78/0xc8
> [   62.726556]  __schedule+0x70c/0x8d8
> [   62.730037]  schedule+0x4c/0x108
> [   62.733259]  do_notify_resume+0x224/0x5d8
> [   62.737259]  work_pending+0xc/0x2a4
>
> The error results in OOM eventually.
>
> RCU priority boosting does work around this issue but it seems to me
> a workaround more than a fix (otherwise boosting would be enabled
> by CONFIG_PREEMPT for arm64 I guess?).
>
> The question is: is this an arm64 bug that should be investigated? or
> is this some known corner case of running stress-ng that is already
> understood?
>
> thanks
> Jorge
