LKML Archive on lore.kernel.org
From: tip-bot for Peter Zijlstra <tipbot@zytor.com>
To: linux-tip-commits@vger.kernel.org
Cc: mingo@kernel.org, torvalds@linux-foundation.org, hpa@zytor.com,
	linux-kernel@vger.kernel.org, peterz@infradead.org,
	tglx@linutronix.de
Subject: [tip:sched/urgent] sched/fair: Plug hole between hotplug and active_load_balance()
Date: Tue, 12 Sep 2017 11:05:09 -0700
Message-ID: <tip-edd8e41d2e3cbd6ebe13ead30eb1adc6f48cbb33@git.kernel.org> (raw)
In-Reply-To: <20170907150614.044460912@infradead.org>

Commit-ID:  edd8e41d2e3cbd6ebe13ead30eb1adc6f48cbb33
Gitweb:     http://git.kernel.org/tip/edd8e41d2e3cbd6ebe13ead30eb1adc6f48cbb33
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Thu, 7 Sep 2017 17:03:51 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 12 Sep 2017 17:41:04 +0200

sched/fair: Plug hole between hotplug and active_load_balance()

The load balancer applies cpu_active_mask to whatever sched_domains it
finds. However, in the case of active_balance there is a hole between
setting rq->{active_balance,push_cpu} and running the stop_machine
work that does the actual migration.

The @push_cpu can go offline in this window, which would result in us
moving a task onto a dead CPU, which is a fairly bad thing.

Double-check the active mask before the stop work does the migration.

  CPU0					CPU1

  <SoftIRQ>
					stop_machine(takedown_cpu)
    load_balance()			cpu_stopper_thread()
      ...				  work = multi_cpu_stop
      stop_one_cpu_nowait(		    /* wait for CPU0 */
	.func = active_load_balance_cpu_stop
      );
  </SoftIRQ>

  cpu_stopper_thread()
    work = multi_cpu_stop
      /* sync with CPU1 */
					    take_cpu_down()
					<idle>
					  play_dead();

    work = active_load_balance_cpu_stop
      set_task_cpu(p, CPU1); /* oops!! */

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170907150614.044460912@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3bcea40..efeebed 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8560,6 +8560,13 @@ static int active_load_balance_cpu_stop(void *data)
 	struct rq_flags rf;
 
 	rq_lock_irq(busiest_rq, &rf);
+	/*
+	 * Between queueing the stop-work and running it is a hole in which
+	 * CPUs can become inactive. We should not move tasks from or to
+	 * inactive CPUs.
+	 */
+	if (!cpu_active(busiest_cpu) || !cpu_active(target_cpu))
+		goto out_unlock;
 
 	/* make sure the requested cpu hasn't gone down in the meantime */
 	if (unlikely(busiest_cpu != smp_processor_id() ||
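
The check added above follows a common pattern: a request is recorded
at queue time, and its preconditions are validated again at run time
because the world may have changed in between. What follows is only a
minimal userspace sketch of that pattern, not the kernel code. Every
model_* name, the bitmask standing in for cpu_active_mask, and the
MODEL_NR_CPUS constant are assumptions made for illustration. The model
is single-threaded, so it has no counterpart to rq_lock_irq().

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for this model */

/* Hypothetical stand-in for cpu_active_mask: bit N set means CPU N is active. */
static unsigned int model_active_mask = (1u << MODEL_NR_CPUS) - 1;

static bool model_cpu_active(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return (model_active_mask >> cpu) & 1u;
}

/* Models the hotplug side: the CPU drops out of the active mask. */
static void model_cpu_deactivate(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_active_mask &= ~(1u << cpu);
}

/* Models the request that is recorded when the stop work is queued. */
struct model_push_work {
	int busiest_cpu;
	int target_cpu;
};

/*
 * Models the stop work running some time after it was queued.
 * Re-check both CPUs first; migrate (update *task_cpu) only if both
 * are still active. Returns true if the task was moved.
 */
static bool model_run_push_work(const struct model_push_work *work,
				int *task_cpu)
{
	if (!model_cpu_active(work->busiest_cpu) ||
	    !model_cpu_active(work->target_cpu))
		return false;

	*task_cpu = work->target_cpu;
	return true;
}
```

In this model, queueing a request with busiest_cpu = 0 and
target_cpu = 1, then calling model_cpu_deactivate(1) before
model_run_push_work(), makes the work return false and leaves the task
on CPU 0. That mirrors the interleaving in the commit message, where
the target goes down between queueing and execution.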

Thread overview: 18+ messages
2017-09-07 15:03 [PATCH 0/4] sched: Fix some load-balancer vs hotplug holes Peter Zijlstra
2017-09-07 15:03 ` [PATCH 1/4] sched/fair: Avoid newidle balance for !active CPUs Peter Zijlstra
2017-09-12 18:04   ` [tip:sched/urgent] " tip-bot for Peter Zijlstra
2017-09-07 15:03 ` [PATCH 2/4] sched/fair: Plug hole between hotplug and active load_balance Peter Zijlstra
2017-09-12 18:05   ` tip-bot for Peter Zijlstra [this message]
2017-09-07 15:03 ` [PATCH 3/4] sched: WARN when migrating to an offline CPU Peter Zijlstra
2017-09-12 18:05   ` [tip:sched/urgent] sched/core: WARN() " tip-bot for Peter Zijlstra
2017-09-28  9:14   ` [PATCH 3/4] sched: WARN " Sasha Levin
2017-09-28 10:35     ` Peter Zijlstra
2017-09-28 11:03       ` Levin, Alexander (Sasha Levin)
2017-09-28 11:42         ` Peter Zijlstra
2017-09-29 11:11           ` Peter Zijlstra
2017-10-07  2:07             ` Levin, Alexander (Sasha Levin)
2017-10-07  9:15               ` Peter Zijlstra
     [not found]                 ` <20171007174327.ky6g5viokxg5ysdm@sasha-lappy>
2017-10-09  8:04                   ` Peter Zijlstra
2017-10-10  1:18                     ` Levin, Alexander (Sasha Levin)
2017-09-07 15:03 ` [PATCH 4/4] sched/debug: Add debugfs knob for "sched_debug" Peter Zijlstra
2017-09-12 18:05   ` [tip:sched/urgent] " tip-bot for Peter Zijlstra
