linux-kernel.vger.kernel.org archive mirror
From: Peter Zijlstra <peterz@infradead.org>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@kernel.org>,
	Donald Buczek <buczek@molgen.mpg.de>,
	Paul Menzel <pmenzel@molgen.mpg.de>,
	dvteam@molgen.mpg.de, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	Josh Triplett <josh@joshtriplett.org>
Subject: Re: INFO: rcu_sched detected stalls on CPUs/tasks with `kswapd` and `mem_cgroup_shrink_node`
Date: Thu, 1 Dec 2016 06:30:35 +0100
Message-ID: <20161201053035.GC3092@twins.programming.kicks-ass.net>
In-Reply-To: <20161130194019.GF3924@linux.vnet.ibm.com>

On Wed, Nov 30, 2016 at 11:40:19AM -0800, Paul E. McKenney wrote:

> > See commit:
> > 
> >   4a81e8328d37 ("rcu: Reduce overhead of cond_resched() checks for RCU")
> > 
> > Someone actually wrote down what the problem was.
> 
> Don't worry, it won't happen again.  ;-)
> 
> OK, so the regressions were in the "open1" test of Anton Blanchard's
> "will it scale" suite, and were due to faster (and thus more) grace
> periods rather than path length.
> 
> I could likely counter the grace-period speedup by regulating the rate
> at which the grace-period machinery pays attention to the rcu_qs_ctr
> per-CPU variable.  Actually, this looks pretty straightforward (famous
> last words).  But see patch below, which is untested and probably
> completely bogus.

Possible I suppose. Didn't look too hard at it.

> > > > Also, I seem to have missed, why are we going through this again?
> > > 
> > > Well, the reason I've brought that up is that having basically two
> > > APIs for cond_resched() is more than confusing. Basically all longer
> > > in-kernel loops do cond_resched(), but it seems that this will not help
> > > silence the RCU lockup detector in rare cases where nothing really wants
> > > to schedule. I am really not sure whether we want to sprinkle
> > > cond_resched_rcu_qs() at random places just to silence the RCU detector...
> > 
> > Right.. now, this is obviously all PREEMPT=n code, which therefore also
> > implies this is rcu-sched.
> > 
> > Paul, now doesn't rcu-sched, when the grace-period has been long in
> > coming, try and force it? And doesn't that forcing include prodding CPUs
> > with resched_cpu() ?
> 
> It does in the v4.8.4 kernel that Boris is running.  It still does in my
> -rcu tree, but only after an RCU CPU stall (something about people not
> liking IPIs).  I may need to do a resched_cpu() halfway to stall-warning
> time or some such.

Sure, we all dislike IPIs, but I'm thinking this half-way point is
sensible; there is no point in issuing a user-visible annoyance if we
can indeed prod things back to life, no?

Only if we utterly fail to make it respond should we bug the user with
our failure..
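
To make the ordering concrete, here is a minimal user-space sketch of
the escalation described above; it is not kernel code. The names
stall_action and gp_escalation(), and the idea of counting in abstract
"ticks", are hypothetical and only for illustration. Only resched_cpu()
and the stall warning itself come from the discussion.

```c
#include <stdbool.h>

/* Hypothetical escalation steps for a grace period that is late. */
enum stall_action {
	STALL_NONE,	/* grace period is still young; do nothing */
	STALL_RESCHED,	/* prod the holdout CPU, as resched_cpu() would */
	STALL_WARN,	/* prodding did not help; bug the user */
};

/*
 * Decide what to do for a grace period that has been running for
 * @elapsed ticks, given a stall-warning timeout of @stall_timeout
 * ticks. @prodded says whether the holdout CPU has already been
 * prodded during this grace period.
 */
static enum stall_action gp_escalation(unsigned long elapsed,
				       unsigned long stall_timeout,
				       bool prodded)
{
	if (elapsed >= stall_timeout)
		return STALL_WARN;
	if (elapsed >= stall_timeout / 2 && !prodded)
		return STALL_RESCHED;
	return STALL_NONE;
}
```

The point of the ordering is that the IPI is only paid once the grace
period is already badly overdue, and the warning is only printed if
that prod did not get things moving again.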

> > I'm thinking not, because if it did, that would make cond_resched()
> > actually schedule, which would then call into rcu_note_context_switch()
> > which would then make RCU progress, no?
> 
> Sounds plausible, but from what I can see some of the loops pointed
> out by Boris's stall-warning messages don't have cond_resched().
> There was another workload that apparently worked better when moved from
> cond_resched() to cond_resched_rcu_qs(), but I don't know what kernel
> version was running.

Egads.. cursed if you do, cursed if you don't, eh..
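
For readers following along, the difference between the two calls can
be modelled in user space roughly as below. This is a toy model under
stated assumptions, not the kernel implementation: struct cpu_model and
both model_*() functions are hypothetical, and a simple counter stands
in for RCU noticing a quiescent state.

```c
#include <stdbool.h>

/* Toy per-CPU state; both fields are illustrative only. */
struct cpu_model {
	bool need_resched;	/* set when something wants this CPU to schedule */
	unsigned long qs_count;	/* quiescent states RCU has been told about */
};

/*
 * Models cond_resched(): it only schedules when a reschedule was
 * requested, so RCU only hears about a quiescent state in that case.
 */
static bool model_cond_resched(struct cpu_model *cpu)
{
	if (!cpu->need_resched)
		return false;
	cpu->need_resched = false;
	cpu->qs_count++;	/* stands in for rcu_note_context_switch() */
	return true;
}

/*
 * Models cond_resched_rcu_qs(): if no reschedule happened, report a
 * quiescent state to RCU anyway.
 */
static void model_cond_resched_rcu_qs(struct cpu_model *cpu)
{
	if (!model_cond_resched(cpu))
		cpu->qs_count++;	/* explicit quiescent-state report */
}
```

In this model a loop that calls model_cond_resched() with nothing
wanting to run never bumps qs_count, which is the stall case under
discussion; setting need_resched (what a resched_cpu() prod amounts to
here) makes the very next call report one.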
