linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Quentin Perret <qperret@google.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: mingo@kernel.org, vincent.guittot@linaro.org,
	dietmar.eggemann@arm.com, juri.lelli@redhat.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	linux-kernel@vger.kernel.org, valentin.schneider@arm.com,
	qais.yousef@arm.com, ktkhai@virtuozzo.com
Subject: Re: [PATCH 1/7] sched: Fix pick_next_task() vs change pattern race
Date: Fri, 8 Nov 2019 14:28:43 +0000	[thread overview]
Message-ID: <20191108142843.GA123156@google.com> (raw)
In-Reply-To: <20191108131909.428842459@infradead.org>

On Friday 08 Nov 2019 at 14:15:54 (+0100), Peter Zijlstra wrote:
> Commit 67692435c411 ("sched: Rework pick_next_task() slow-path")
> inadvertly introduced a race because it changed a previously
> unexplored dependency between dropping the rq->lock and
> sched_class::put_prev_task().
> 
> The comments about dropping rq->lock, in for example
> newidle_balance(), only mentions the task being current and ->on_cpu
> being set. But when we look at the 'change' pattern (in for example
> sched_setnuma()):
> 
> 	queued = task_on_rq_queued(p); /* p->on_rq == TASK_ON_RQ_QUEUED */
> 	running = task_current(rq, p); /* rq->curr == p */
> 
> 	if (queued)
> 		dequeue_task(...);
> 	if (running)
> 		put_prev_task(...);
> 
> 	/* change task properties */
> 
> 	if (queued)
> 		enqueue_task(...);
> 	if (running)
> 		set_next_task(...);
> 
> It becomes obvious that if we do this after put_prev_task() has
> already been called on @p, things go sideways. This is exactly what
> the commit in question allows to happen when it does:
> 
> 	prev->sched_class->put_prev_task(rq, prev, rf);
> 	if (!rq->nr_running)
> 		newidle_balance(rq, rf);
> 
> The newidle_balance() call will drop rq->lock after we've called
> put_prev_task() and that allows the above 'change' pattern to
> interleave and mess up the state.
> 
> Furthermore, it turns out we lost the RT-pull when we put the last DL
> task.
> 
> Fix both problems by extracting the balancing from put_prev_task() and
> doing a multi-class balance() pass before put_prev_task().
> 
> Fixes: 67692435c411 ("sched: Rework pick_next_task() slow-path")
> Reported-by: Quentin Perret <qperret@google.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

The reproducer that triggered in 30sec or so has now been running for
3 hours:

   Tested-by: Quentin Perret <qperret@google.com>

Thanks for fix,
Quentin

  reply	other threads:[~2019-11-08 14:28 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-08 13:15 [PATCH 0/7] scheduler patches Peter Zijlstra
2019-11-08 13:15 ` [PATCH 1/7] sched: Fix pick_next_task() vs change pattern race Peter Zijlstra
2019-11-08 14:28   ` Quentin Perret [this message]
2019-11-08 16:05   ` Valentin Schneider
2019-11-08 20:49     ` Peter Zijlstra
2019-11-08 17:03   ` Qais Yousef
2019-11-08 13:15 ` [PATCH 2/7] sched/fair: Better document newidle_balance() Peter Zijlstra
2019-11-11  9:32   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2019-11-08 13:15 ` [PATCH 3/7] sched: Make pick_next_task_idle() more consistent Peter Zijlstra
2019-11-11  9:32   ` [tip: sched/core] sched/core: " tip-bot2 for Peter Zijlstra
2019-11-08 13:15 ` [PATCH 4/7] sched: Optimize pick_next_task() Peter Zijlstra
2019-11-08 14:33   ` Quentin Perret
2019-11-08 16:46     ` Vincent Guittot
2019-11-08 20:46     ` Peter Zijlstra
2019-11-11  9:32   ` [tip: sched/core] sched/core: " tip-bot2 for Peter Zijlstra
2019-11-08 13:15 ` [PATCH 5/7] sched: Simplify sched_class::pick_next_task() Peter Zijlstra
2019-11-11  9:32   ` [tip: sched/core] sched/core: " tip-bot2 for Peter Zijlstra
2019-11-08 13:15 ` [PATCH 6/7] sched/fair: Use mul_u32_u32() Peter Zijlstra
2019-11-11  9:32   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2019-11-08 13:16 ` [PATCH 7/7] sched: Further clarify sched_class::set_next_task() Peter Zijlstra
2019-11-11  9:32   ` [tip: sched/core] sched/core: " tip-bot2 for Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191108142843.GA123156@google.com \
    --to=qperret@google.com \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=juri.lelli@redhat.com \
    --cc=ktkhai@virtuozzo.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=qais.yousef@arm.com \
    --cc=rostedt@goodmis.org \
    --cc=valentin.schneider@arm.com \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).