linux-kernel.vger.kernel.org archive mirror
From: Chao Zhou <chao@eero.com>
To: paulmck@kernel.org
Cc: rcu@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] rcu: allow multiple stalls before panic
Date: Fri, 4 Sep 2020 14:20:50 -0700
Message-ID: <CAOr4Z-u6yvcJZojrfMt_eeBxUfY8H7ede1rQRKi6PtPkKK9DxQ@mail.gmail.com>
In-Reply-To: <20200904203718.GO29330@paulmck-ThinkPad-P72>

Thanks Paul, sounds good!

Your guidance is much appreciated.

Chao

eero inc.

660 3rd St, 4th Floor

San Francisco, CA 94107



On Fri, Sep 4, 2020 at 1:37 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Fri, Sep 04, 2020 at 12:40:29PM -0700, Chao Zhou wrote:
> > Thanks, Paul. I appreciate it.
> >
> > The initial intent was to give users a way to make their systems more
> > tolerant of stalls, yet prudent enough to recover once suspicious
> > behavior reaches a watermark. If a system experiences multiple stalls
> > in one lifetime, no matter how healthy it looks or whether the stalls
> > come from different sources, we still want it to take the drastic
> > recovery path and panic. Could you share your guidance?
>
> I have no guidance in this case.  I was just wanting to verify that the
> patch was in fact doing what you want it to.  And it sounds like it does,
> so good!
>
> I have queued this for testing and further review.  If all goes well,
> I would submit it upstream for the v5.11 merge window, that is, not the
> upcoming merge window, but the one after that.
>
>                                                         Thanx, Paul
>
> > eero inc.
> >
> > 660 3rd St, 4th Floor
> >
> > San Francisco, CA 94107
> >
> >
> >
> > On Fri, Sep 4, 2020 at 11:05 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > On Sun, Aug 30, 2020 at 11:41:17PM -0700, chao wrote:
> > > > Some stalls are transient and the system can fully recover.
> > > > Allow users to configure the number of stalls that must occur
> > > > before a kernel panic is triggered.
> > > >
> > > > Signed-off-by: chao <chao@eero.com>
> > >
> > > Hearing no objections, I have queued this with wordsmithing as shown
> > > below.  Please let me know if I messed something up.
> > >
> > > One question, though.  It looks like setting this to (say) 5 would panic
> > > > after the fifth RCU CPU stall warning message, regardless of whether all
> > > five were reporting the same RCU CPU stall event or whether they instead
> > > were five widely separated transient RCU CPU stall events, where the
> > > system fully recovered from each event.  Is this the intent?
> > >
> > >                                                         Thanx, Paul
> > >
> > > ------------------------------------------------------------------------
> > >
> > > commit e710c928fb52d8e56bc6173515805301da6aa22b
> > > Author: chao <chao@eero.com>
> > > Date:   Sun Aug 30 23:41:17 2020 -0700
> > >
> > >     rcu: Panic after fixed number of stalls
> > >
> > > >     Some stalls are transient, so that the system fully recovers.  This
> > > >     commit therefore allows users to configure the number of stalls that
> > > >     must happen in order to trigger a kernel panic.
> > >
> > >     Signed-off-by: chao <chao@eero.com>
> > >     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > >
> > > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> > > index 500def6..fc2dd3f 100644
> > > --- a/include/linux/kernel.h
> > > +++ b/include/linux/kernel.h
> > > @@ -536,6 +536,7 @@ extern int panic_on_warn;
> > >  extern unsigned long panic_on_taint;
> > >  extern bool panic_on_taint_nousertaint;
> > >  extern int sysctl_panic_on_rcu_stall;
> > > +extern int sysctl_max_rcu_stall_to_panic;
> > >  extern int sysctl_panic_on_stackoverflow;
> > >
> > >  extern bool crash_kexec_post_notifiers;
> > > diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> > > index 0fde39b..228c55f 100644
> > > --- a/kernel/rcu/tree_stall.h
> > > +++ b/kernel/rcu/tree_stall.h
> > > @@ -13,6 +13,7 @@
> > >
> > >  /* panic() on RCU Stall sysctl. */
> > >  int sysctl_panic_on_rcu_stall __read_mostly;
> > > +int sysctl_max_rcu_stall_to_panic __read_mostly;
> > >
> > >  #ifdef CONFIG_PROVE_RCU
> > >  #define RCU_STALL_DELAY_DELTA          (5 * HZ)
> > > @@ -106,6 +107,11 @@ early_initcall(check_cpu_stall_init);
> > >  /* If so specified via sysctl, panic, yielding cleaner stall-warning output. */
> > >  static void panic_on_rcu_stall(void)
> > >  {
> > > +       static int cpu_stall;
> > > +
> > > +       if (++cpu_stall < sysctl_max_rcu_stall_to_panic)
> > > +               return;
> > > +
> > >         if (sysctl_panic_on_rcu_stall)
> > >                 panic("RCU Stall\n");
> > >  }
> > > diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> > > index 287862f..1bca490 100644
> > > --- a/kernel/sysctl.c
> > > +++ b/kernel/sysctl.c
> > > @@ -2651,6 +2651,17 @@ static struct ctl_table kern_table[] = {
> > >                 .extra2         = SYSCTL_ONE,
> > >         },
> > >  #endif
> > > +#if defined(CONFIG_TREE_RCU)
> > > +       {
> > > +               .procname       = "max_rcu_stall_to_panic",
> > > +               .data           = &sysctl_max_rcu_stall_to_panic,
> > > +               .maxlen         = sizeof(sysctl_max_rcu_stall_to_panic),
> > > +               .mode           = 0644,
> > > +               .proc_handler   = proc_dointvec_minmax,
> > > +               .extra1         = SYSCTL_ONE,
> > > +               .extra2         = SYSCTL_INT_MAX,
> > > +       },
> > > +#endif
> > >  #ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
> > >         {
> > >                 .procname       = "stack_erasing",

Thread overview: 5+ messages
2020-08-31  6:41 [PATCH] rcu: allow multiple stalls before panic chao
2020-09-04 17:50 ` Paul E. McKenney
2020-09-04 19:40   ` Chao Zhou
2020-09-04 20:37     ` Paul E. McKenney
2020-09-04 21:20       ` Chao Zhou [this message]