From: Peter Zijlstra <peterz@infradead.org>
To: Mike Galbraith <efault@gmx.de>
Cc: Sachin Sant <sachinp@linux.vnet.ibm.com>,
	Ross Zwisler <zwisler@gmail.com>,
	Matt Fleming <matt@codeblueprint.co.uk>,
	Michael Ellerman <mpe@ellerman.id.au>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
	"linux-next@vger.kernel.org" <linux-next@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Paul McKenney <paulmck@linux.vnet.ibm.com>
Subject: Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls
Date: Fri, 3 Feb 2017 14:37:48 +0100
Message-ID: <20170203133748.GB6515@twins.programming.kicks-ass.net>
In-Reply-To: <1486126774.4277.6.camel@gmx.de>

On Fri, Feb 03, 2017 at 01:59:34PM +0100, Mike Galbraith wrote:
> On Fri, 2017-02-03 at 09:53 +0100, Peter Zijlstra wrote:
> > On Fri, Feb 03, 2017 at 10:03:14AM +0530, Sachin Sant wrote:
> 
> > > I ran few cycles of cpu hot(un)plug tests. In most cases it works except one
> > > where I ran into rcu stall:
> > > 
> > > [  173.493453] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > [  173.493473] 	8-...: (2 GPs behind) idle=006/140000000000000/0 softirq=0/0 fqs=2996
> > > [  173.493476] 	(detected by 0, t=6002 jiffies, g=885, c=884, q=6350)
> > 
> > Right, I actually saw that too, but I don't think that would be related
> > to my patch. I'll see if I can dig into this though, ought to get fixed
> > regardless.
> 
> FWIW, I'm not seeing stalls/hangs while beating hotplug up in tip. (so
> next grew a wart?)

I've seen it on tip. It looks like hot unplug goes really slow when
there's running tasks on the CPU being taken down.

What I did was something like:

  taskset -p $((1<<1)) $$
  for ((i=0; i<20; i++)) do while :; do :; done & done

  taskset -p $((1<<0)) $$
  echo 0 > /sys/devices/system/cpu/cpu1/online

And with those 20 tasks stuck sucking cycles on CPU1, the unplug goes
_really_ slow and the RCU stall triggers. What I suspect happens is that
hotplug stops participating in the RCU state machine early, but only
tells RCU about it really late, and in between RCU gets suspicious that
it's taking too long.

I've yet to dig through the RCU code to figure out the exact sequence of
events, but found the above to be fairly reliable in triggering the
issue.
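
For anyone wanting to repeat this, the steps above can be wrapped up
with cleanup, roughly as below. This is only a sketch under some
assumptions: the function names (cpu_mask, start_spinners,
stop_spinners, unplug_under_load) are made up for illustration, the
busy loops are pinned directly with taskset instead of re-pinning the
calling shell, and it needs root plus a CPU that can actually be taken
offline.

```shell
# Print the hex affinity mask (the form taskset expects) selecting only CPU $1.
cpu_mask() {
    printf '%x\n' "$((1 << $1))"
}

# PIDs of the busy loops started by start_spinners, space separated.
SPINNER_PIDS=""

# Start $1 busy-loop tasks pinned to CPU $2.
start_spinners() {
    n=$1
    cpu=$2
    i=0
    while [ "$i" -lt "$n" ]; do
        taskset "$(cpu_mask "$cpu")" sh -c 'while :; do :; done' &
        SPINNER_PIDS="$SPINNER_PIDS $!"
        i=$((i + 1))
    done
}

# Kill the busy loops again; word splitting of the PID list is intentional.
stop_spinners() {
    [ -n "$SPINNER_PIDS" ] && kill $SPINNER_PIDS 2>/dev/null
    SPINNER_PIDS=""
}

# Load CPU $1 with ${2:-20} spinners, take it offline, then restore it.
unplug_under_load() {
    cpu=$1
    n=${2:-20}
    start_spinners "$n" "$cpu"
    echo 0 > "/sys/devices/system/cpu/cpu$cpu/online"
    stop_spinners
    echo 1 > "/sys/devices/system/cpu/cpu$cpu/online"
}
```

Sourcing this and running `unplug_under_load 1` as root should
correspond to the sequence above; whether the stall shows up in dmesg
will presumably depend on the machine and the kernel.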


Thread overview: 44+ messages
2016-09-21 13:38 [PATCH v2 0/7] sched: Diagnostic checks for missing rq clock updates Matt Fleming
2016-09-21 13:38 ` [PATCH v2 1/7] sched/fair: Update the rq clock before detaching tasks Matt Fleming
2016-10-03 12:49   ` Peter Zijlstra
2016-10-03 14:37     ` Matt Fleming
2016-10-03 14:42       ` Peter Zijlstra
2016-09-21 13:38 ` [PATCH v2 2/7] sched/fair: Update rq clock before waking up new task Matt Fleming
2016-09-21 13:38 ` [PATCH v2 3/7] sched/fair: Update rq clock in task_hot() Matt Fleming
2016-09-21 13:38 ` [PATCH v2 4/7] sched: Add wrappers for lockdep_(un)pin_lock() Matt Fleming
2017-01-14 12:40   ` [tip:sched/core] sched/core: " tip-bot for Matt Fleming
2016-09-21 13:38 ` [PATCH v2 5/7] sched/core: Reset RQCF_ACT_SKIP before unpinning rq->lock Matt Fleming
2017-01-14 12:41   ` [tip:sched/core] " tip-bot for Matt Fleming
2016-09-21 13:38 ` [PATCH v2 6/7] sched/fair: Push rq lock pin/unpin into idle_balance() Matt Fleming
2017-01-14 12:41   ` [tip:sched/core] " tip-bot for Matt Fleming
2016-09-21 13:38 ` [PATCH v2 7/7] sched/core: Add debug code to catch missing update_rq_clock() Matt Fleming
2016-09-21 15:58   ` Petr Mladek
2016-09-21 19:08     ` Matt Fleming
2016-09-21 19:46       ` Thomas Gleixner
2016-09-22  0:44       ` Sergey Senozhatsky
2016-09-22  8:04     ` Peter Zijlstra
2016-09-22  8:36       ` Jan Kara
2016-09-22  9:39         ` Peter Zijlstra
2016-09-22 10:17           ` Peter Zijlstra
2017-01-14 12:44   ` [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls tip-bot for Matt Fleming
     [not found]     ` <87tw8gutp6.fsf@concordia.ellerman.id.au>
2017-01-30 21:34       ` Matt Fleming
2017-01-31  8:35         ` Michael Ellerman
2017-01-31 11:00         ` Sachin Sant
2017-01-31 11:48           ` Mike Galbraith
2017-01-31 17:22             ` Ross Zwisler
2017-02-02 15:55               ` Peter Zijlstra
2017-02-02 22:01                 ` Matt Fleming
2017-02-03  3:05                 ` Mike Galbraith
2017-02-03  4:33                 ` Sachin Sant
2017-02-03  8:53                   ` Peter Zijlstra
2017-02-03 12:59                     ` Mike Galbraith
2017-02-03 13:37                       ` Peter Zijlstra [this message]
2017-02-03 13:52                         ` Mike Galbraith
2017-02-03 15:44                         ` Paul E. McKenney
2017-02-03 15:54                           ` Paul E. McKenney
2017-02-06  6:23                             ` Sachin Sant
2017-02-06 15:10                               ` Paul E. McKenney
2017-02-06 15:14                                 ` Paul E. McKenney
2017-02-03 13:04                 ` Borislav Petkov
2017-02-22  9:03                 ` Wanpeng Li
2017-02-24  9:16                 ` [tip:sched/urgent] sched/core: Fix update_rq_clock() splat on hotplug (and suspend/resume) tip-bot for Peter Zijlstra
