All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Zhouyi Zhou <zhouzhouyi@gmail.com>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	rcu <rcu@vger.kernel.org>
Subject: Re: rcu_sched self-detected stall on CPU
Date: Thu, 7 Apr 2022 18:43:01 -0700	[thread overview]
Message-ID: <20220408014301.GW4285@paulmck-ThinkPad-P17-Gen-1> (raw)
In-Reply-To: <CAABZP2yDe3dU0DtigvAE4CQLAipvT81Bw4LrF5WjLSiP5nq1UA@mail.gmail.com>

On Fri, Apr 08, 2022 at 07:14:20AM +0800, Zhouyi Zhou wrote:
> Dear Paul and Miguel
> 
> On Fri, Apr 8, 2022 at 1:55 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Thu, Apr 07, 2022 at 07:05:58PM +0200, Miguel Ojeda wrote:
> > > On Thu, Apr 7, 2022 at 5:15 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > >
> > > > Ah.  So you would instead look for boot to have completed within 10
> > > > seconds?  Either way, reliable automation might well more important than
> > > > reduction in time.
> > >
> > > No (although I guess that could be an option), I was only pointing out
> > > that when no stall is produced, the run should be much quicker than 30
> > > seconds (at least it was in my setup), which would be the majority of the runs.
> >
> > Ah, thank you for the clarification!
> Thank both of you for the information. In my setup (PPC cloud VM), the
> majority of the runs complete at least for 50 seconds. From last
> evening to this morning (Beijing Time), following experiments have
> been done:
> 1) torture mainline: the test quickly finished by hitting "rcu_sched
> self-detected stall" after 12 runs
> 2) torture v5.17: the test last 10 hours plus 14 minutes, 702 runs
> have been done without trigger the bug
> 
> Conclusion:
> There must be a commit that causes the bug as Paul has pointed out.
> I am going to do the bisect, and estimate to locate the bug within a
> week (at most).
> This is a good learning experience, thanks for the guidance ;-)

Very good, and looking forward to seeing what you find.

							Thanx, Paul

WARNING: multiple messages have this Message-ID (diff)
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Zhouyi Zhou <zhouzhouyi@gmail.com>
Cc: rcu <rcu@vger.kernel.org>,
	Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Subject: Re: rcu_sched self-detected stall on CPU
Date: Thu, 7 Apr 2022 18:43:01 -0700	[thread overview]
Message-ID: <20220408014301.GW4285@paulmck-ThinkPad-P17-Gen-1> (raw)
In-Reply-To: <CAABZP2yDe3dU0DtigvAE4CQLAipvT81Bw4LrF5WjLSiP5nq1UA@mail.gmail.com>

On Fri, Apr 08, 2022 at 07:14:20AM +0800, Zhouyi Zhou wrote:
> Dear Paul and Miguel
> 
> On Fri, Apr 8, 2022 at 1:55 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Thu, Apr 07, 2022 at 07:05:58PM +0200, Miguel Ojeda wrote:
> > > On Thu, Apr 7, 2022 at 5:15 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > >
> > > > Ah.  So you would instead look for boot to have completed within 10
> > > > seconds?  Either way, reliable automation might well more important than
> > > > reduction in time.
> > >
> > > No (although I guess that could be an option), I was only pointing out
> > > that when no stall is produced, the run should be much quicker than 30
> > > seconds (at least it was in my setup), which would be the majority of the runs.
> >
> > Ah, thank you for the clarification!
> Thank both of you for the information. In my setup (PPC cloud VM), the
> majority of the runs complete at least for 50 seconds. From last
> evening to this morning (Beijing Time), following experiments have
> been done:
> 1) torture mainline: the test quickly finished by hitting "rcu_sched
> self-detected stall" after 12 runs
> 2) torture v5.17: the test last 10 hours plus 14 minutes, 702 runs
> have been done without trigger the bug
> 
> Conclusion:
> There must be a commit that causes the bug as Paul has pointed out.
> I am going to do the bisect, and estimate to locate the bug within a
> week (at most).
> This is a good learning experience, thanks for the guidance ;-)

Very good, and looking forward to seeing what you find.

							Thanx, Paul

  reply	other threads:[~2022-04-08  1:43 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-05 21:41 rcu_sched self-detected stall on CPU Miguel Ojeda
2022-04-06  9:31 ` Zhouyi Zhou
2022-04-06  9:31   ` Zhouyi Zhou
2022-04-06 17:00   ` Paul E. McKenney
2022-04-06 17:00     ` Paul E. McKenney
2022-04-06 18:25     ` Zhouyi Zhou
2022-04-06 18:25       ` Zhouyi Zhou
2022-04-06 19:50       ` Paul E. McKenney
2022-04-06 19:50         ` Paul E. McKenney
2022-04-07  2:26         ` Zhouyi Zhou
2022-04-07  2:26           ` Zhouyi Zhou
2022-04-07 10:07           ` Miguel Ojeda
2022-04-07 10:07             ` Miguel Ojeda
2022-04-07 15:15             ` Paul E. McKenney
2022-04-07 15:15               ` Paul E. McKenney
2022-04-07 17:05               ` Miguel Ojeda
2022-04-07 17:05                 ` Miguel Ojeda
2022-04-07 17:55                 ` Paul E. McKenney
2022-04-07 17:55                   ` Paul E. McKenney
2022-04-07 23:14                   ` Zhouyi Zhou
2022-04-07 23:14                     ` Zhouyi Zhou
2022-04-08  1:43                     ` Paul E. McKenney [this message]
2022-04-08  1:43                       ` Paul E. McKenney
2022-04-08  7:23     ` Michael Ellerman
2022-04-08 10:02       ` Zhouyi Zhou
2022-04-08 10:02         ` Zhouyi Zhou
2022-04-08 14:07         ` Paul E. McKenney
2022-04-08 14:07           ` Paul E. McKenney
2022-04-08 14:25           ` Zhouyi Zhou
2022-04-08 14:25             ` Zhouyi Zhou
2022-04-10 11:33             ` Michael Ellerman
2022-04-11  3:05               ` Paul E. McKenney
2022-04-11  3:05                 ` Paul E. McKenney
2022-04-12  6:53                 ` Michael Ellerman
2022-04-12  6:53                   ` Michael Ellerman
2022-04-12 13:36                   ` Paul E. McKenney
2022-04-12 13:36                     ` Paul E. McKenney
2022-04-08 13:52       ` Miguel Ojeda
2022-04-08 13:52         ` Miguel Ojeda
2022-04-08 14:06       ` Paul E. McKenney
2022-04-08 14:06         ` Paul E. McKenney
2022-04-08 14:42       ` Michael Ellerman
2022-04-08 15:52         ` Paul E. McKenney
2022-04-08 15:52           ` Paul E. McKenney
2022-04-08 17:02         ` Miguel Ojeda
2022-04-08 17:02           ` Miguel Ojeda
2022-04-13  5:11         ` Nicholas Piggin
2022-04-13  5:11           ` Nicholas Piggin
2022-04-13  6:10           ` Low-res tick handler device not going to ONESHOT_STOPPED when tick is stopped (was: rcu_sched self-detected stall on CPU) Nicholas Piggin
2022-04-13  6:10             ` Nicholas Piggin
2022-04-14 17:15             ` Paul E. McKenney
2022-04-14 17:15               ` Paul E. McKenney
2022-04-22 15:53           ` Thomas Gleixner
2022-04-22 15:53             ` Re: Thomas Gleixner
2022-04-23  2:29             ` Re: Nicholas Piggin
2022-04-23  2:29               ` Re: Nicholas Piggin
  -- strict thread matches above, loose matches on Subject: below --
2016-09-15  4:02 rcu_sched self-detected stall on CPU NTU
2016-09-15  6:22 ` Mike Galbraith
2016-09-15 17:15   ` NTU
2016-09-20 14:34     ` NTU

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220408014301.GW4285@paulmck-ThinkPad-P17-Gen-1 \
    --to=paulmck@kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=miguel.ojeda.sandonis@gmail.com \
    --cc=rcu@vger.kernel.org \
    --cc=zhouzhouyi@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.