From: "Paul E. McKenney" <paulmck@kernel.org>
To: Dexuan-Linux Cui <dexuan.linux@gmail.com>
Cc: Qian Cai <cai@lca.pw>,
	"Joel Fernandes (Google)" <joel@joelfernandes.org>,
	Tejun Heo <tj@kernel.org>, Josh Triplett <josh@joshtriplett.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	rcu@vger.kernel.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Lili Deng <Lili.Deng@microsoft.com>,
	Dexuan Cui <decui@microsoft.com>,
	Baihua Lu <Baihua.Lu@microsoft.com>
Subject: Re: "rcu: React to callback overload by aggressively seeking quiescent states" hangs on boot
Date: Sun, 15 Dec 2019 12:20:23 -0800
Message-ID: <20191215202023.GM2889@paulmck-ThinkPad-P72>
In-Reply-To: <CAA42JLbBFkpYHXRVvyveYO76DnbkE3gyRW-=qmBGZcJTAiB6Uw@mail.gmail.com>

On Sun, Dec 15, 2019 at 11:18:43AM -0800, Dexuan-Linux Cui wrote:
> On Fri, Dec 13, 2019 at 10:41 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Fri, Dec 13, 2019 at 06:11:16PM -0500, Qian Cai wrote:
> > >
> > >
> > > > On Dec 13, 2019, at 5:46 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> > > >
> > > > I am running this on a number of x86 systems, but will try it on a
> > >
> > > > The config to reproduce includes several debugging options that might
> > > > be required to recreate it.
> >
> > If you run without those debugging options, do you still see the hangs?
> > > If not, please let me know which debugging options are involved.
> >
> > > > wider variety.  If I cannot reproduce it, would you be willing to
> > > > run diagnostics?
> > >
> > > Yes.
> >
> > Very good!  Let me see what I can put together.  (No luck reproducing
> > at my end thus far.)
> >
> > > > Just to double-check...  Are you running rcutorture built into the kernel?
> > > > (My guess is "no", but figured that I should ask.)
> > >
> > > > No, as you can see from the config I linked in the original email.
> >
> > Fair point, and please accept my apologies for the pointless question.
> >
> >                                                         Thanx, Paul
> 
> Hi,
> We're seeing the same hang issue with a recent Linux next-20191213
> kernel.  If we revert the same commit 82150cb53dcb ("rcu: React to
> callback overload by aggressively seeking quiescent states"), the
> issue will go away.
> 
> Note: we're running an x86-64 Linux VM on Hyper-V, and the torture
> test is not used:
> 
> $ grep  -i torture  .config
> CONFIG_LOCK_TORTURE_TEST=m
> CONFIG_TORTURE_TEST=m
> # CONFIG_RCU_TORTURE_TEST is not set
> 
> (FYI: the kernel config and the serial console log are attached).
> 
> When the issue happens, I force a kernel panic by NMI several times,
> and I can see that rcu_gp_kthread hangs at several places, but all of
> them appear to be in the loop below:
> 
> (The first panic log is in the attachment)
> (gdb) l *(rcu_gp_kthread+0x703)
> 0xffffffff811128c3 is in rcu_gp_kthread (kernel/rcu/tree.c:1763).
> 1758                    if (rnp == rdp->mynode)
> 1759                            needgp = __note_gp_changes(rnp, rdp) || needgp;
> 1760                    /* smp_mb() provided by prior unlock-lock pair. */
> 1761                    needgp = rcu_future_gp_cleanup(rnp) || needgp;
> 1762                    // Reset overload indication for CPUs no longer overloaded
> 1763                    for_each_leaf_node_cpu_mask(rnp, cpu, rnp->cbovldmask) {
> 1764                            rdp = per_cpu_ptr(&rcu_data, cpu);
> 1765                            check_cb_ovld_locked(rdp, rnp);
> 1766                    }
> 1767                    sq = rcu_nocb_gp_get(rnp);

This is consistent with what I saw in Qian Cai's report, FYI.  So I
am very interested in learning whether the first patch in my reply [1]
helps you.

							Thanx, Paul

[1]  https://lore.kernel.org/lkml/20191215201646.GK2889@paulmck-ThinkPad-P72/
