All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: tglx@linutronix.de, mingo@kernel.org
Cc: linux-kernel@vger.kernel.org, bigeasy@linutronix.de,
	qais.yousef@arm.com, swood@redhat.com, peterz@infradead.org,
	valentin.schneider@arm.com, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	bristot@redhat.com, vincent.donnefort@arm.com, tj@kernel.org,
	ouwen210@hotmail.com
Subject: [PATCH v3 04/19] sched/core: Wait for tasks being pushed away on hotplug
Date: Thu, 15 Oct 2020 13:05:36 +0200	[thread overview]
Message-ID: <20201015110923.546159636@infradead.org> (raw)
In-Reply-To: 20201015110532.738127234@infradead.org

From: Thomas Gleixner <tglx@linutronix.de>

RT kernels need to ensure that all tasks which are not per CPU kthreads
have left the outgoing CPU to guarantee that no tasks are force migrated
within a migrate disabled section.

There is also some desire to (ab)use fine grained CPU hotplug control to
clear a CPU from active state to force migrate tasks which are not per CPU
kthreads away for power control purposes.

Add a mechanism which waits until all tasks which should leave the CPU
after the CPU active flag is cleared have moved to a different online CPU.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/sched/core.c  |   40 +++++++++++++++++++++++++++++++++++++++-
 kernel/sched/sched.h |    4 ++++
 2 files changed, 43 insertions(+), 1 deletion(-)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6894,8 +6894,21 @@ static void balance_push(struct rq *rq)
 	 * Both the cpu-hotplug and stop task are in this case and are
 	 * required to complete the hotplug process.
 	 */
-	if (is_per_cpu_kthread(push_task))
+	if (is_per_cpu_kthread(push_task)) {
+		/*
+		 * If this is the idle task on the outgoing CPU try to wake
+		 * up the hotplug control thread which might wait for the
+		 * last task to vanish. The rcuwait_active() check is
+		 * accurate here because the waiter is pinned on this CPU
+		 * and can't obviously be running in parallel.
+		 */
+		if (!rq->nr_running && rcuwait_active(&rq->hotplug_wait)) {
+			raw_spin_unlock(&rq->lock);
+			rcuwait_wake_up(&rq->hotplug_wait);
+			raw_spin_lock(&rq->lock);
+		}
 		return;
+	}
 
 	get_task_struct(push_task);
 	/*
@@ -6926,6 +6939,20 @@ static void balance_push_set(int cpu, bo
 	rq_unlock_irqrestore(rq, &rf);
 }
 
+/*
+ * Invoked from a CPUs hotplug control thread after the CPU has been marked
+ * inactive. All tasks which are not per CPU kernel threads are either
+ * pushed off this CPU now via balance_push() or placed on a different CPU
+ * during wakeup. Wait until the CPU is quiescent.
+ */
+static void balance_hotplug_wait(void)
+{
+	struct rq *rq = this_rq();
+
+	rcuwait_wait_event(&rq->hotplug_wait, rq->nr_running == 1,
+			   TASK_UNINTERRUPTIBLE);
+}
+
 #else
 
 static inline void balance_push(struct rq *rq)
@@ -6936,6 +6963,10 @@ static inline void balance_push_set(int
 {
 }
 
+static inline void balance_hotplug_wait(void)
+{
+}
+
 #endif /* CONFIG_HOTPLUG_CPU */
 
 void set_rq_online(struct rq *rq)
@@ -7090,6 +7121,10 @@ int sched_cpu_deactivate(unsigned int cp
 		return ret;
 	}
 	sched_domains_numa_masks_clear(cpu);
+
+	/* Wait for all non per CPU kernel threads to vanish. */
+	balance_hotplug_wait();
+
 	return 0;
 }
 
@@ -7330,6 +7365,9 @@ void __init sched_init(void)
 
 		rq_csd_init(rq, &rq->nohz_csd, nohz_csd_func);
 #endif
+#ifdef CONFIG_HOTPLUG_CPU
+		rcuwait_init(&rq->hotplug_wait);
+#endif
 #endif /* CONFIG_SMP */
 		hrtick_rq_init(rq);
 		atomic_set(&rq->nr_iowait, 0);
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1004,6 +1004,10 @@ struct rq {
 
 	/* This is used to determine avg_idle's max value */
 	u64			max_idle_balance_cost;
+
+#ifdef CONFIG_HOTPLUG_CPU
+	struct rcuwait		hotplug_wait;
+#endif
 #endif /* CONFIG_SMP */
 
 #ifdef CONFIG_IRQ_TIME_ACCOUNTING



  parent reply	other threads:[~2020-10-15 11:11 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-15 11:05 [PATCH v3 00/19] sched: Migrate disable support Peter Zijlstra
2020-10-15 11:05 ` [PATCH v3 01/19] stop_machine: Add function and caller debug info Peter Zijlstra
2020-10-15 11:05 ` [PATCH v3 02/19] sched: Fix balance_callback() Peter Zijlstra
2020-10-15 11:05 ` [PATCH v3 03/19] sched/hotplug: Ensure only per-cpu kthreads run during hotplug Peter Zijlstra
2020-10-15 11:05 ` Peter Zijlstra [this message]
2020-10-15 11:05 ` [PATCH v3 05/19] workqueue: Manually break affinity on hotplug Peter Zijlstra
2020-10-15 11:05 ` [PATCH v3 06/19] sched/hotplug: Consolidate task migration on CPU unplug Peter Zijlstra
2020-10-15 11:05 ` [PATCH v3 07/19] sched: Fix hotplug vs CPU bandwidth control Peter Zijlstra
2020-10-15 11:05 ` [PATCH v3 08/19] sched: Massage set_cpus_allowed() Peter Zijlstra
2020-10-15 11:05 ` [PATCH v3 09/19] sched: Add migrate_disable() Peter Zijlstra
2020-10-15 11:05 ` [PATCH v3 10/19] sched: Fix migrate_disable() vs set_cpus_allowed_ptr() Peter Zijlstra
2020-10-15 13:54   ` Valentin Schneider
2020-10-15 14:19     ` Peter Zijlstra
2020-10-16 12:48   ` Valentin Schneider
     [not found]     ` <BN8PR12MB29784D239007D0D6CA3F4F2A9A010@BN8PR12MB2978.namprd12.prod.outlook.com>
2020-10-18 15:51       ` Valentin Schneider
     [not found]         ` <BN8PR12MB2978F76887133CCA2102B7589A1E0@BN8PR12MB2978.namprd12.prod.outlook.com>
2020-10-19 17:36           ` Valentin Schneider
     [not found]             ` <BN8PR12MB2978D36FF4A81C344DCF37FD9A1F0@BN8PR12MB2978.namprd12.prod.outlook.com>
2020-10-20  7:56               ` Peter Zijlstra
2020-10-15 11:05 ` [PATCH v3 11/19] sched/core: Make migrate disable and CPU hotplug cooperative Peter Zijlstra
     [not found]   ` <BN8PR12MB2978A3BB8062DF4CEEB184109A030@BN8PR12MB2978.namprd12.prod.outlook.com>
2020-10-16  9:35     ` Peter Zijlstra
2020-10-15 11:05 ` [PATCH v3 12/19] sched,rt: Use cpumask_any*_distribute() Peter Zijlstra
2020-10-15 11:05 ` [PATCH v3 13/19] sched,rt: Use the full cpumask for balancing Peter Zijlstra
2020-10-15 11:05 ` [PATCH v3 14/19] sched, lockdep: Annotate ->pi_lock recursion Peter Zijlstra
2020-10-15 11:05 ` [PATCH v3 15/19] sched: Fix migrate_disable() vs rt/dl balancing Peter Zijlstra
2020-10-15 11:05 ` [PATCH v3 16/19] sched/proc: Print accurate cpumask vs migrate_disable() Peter Zijlstra
2020-10-15 11:05 ` [PATCH v3 17/19] sched: Add migrate_disable() tracepoints Peter Zijlstra
2020-10-15 11:05 ` [PATCH v3 18/19] sched: Deny self-issued __set_cpus_allowed_ptr() when migrate_disable() Peter Zijlstra
2020-10-15 11:05 ` [PATCH v3 19/19] sched: Comment affine_move_task() Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201015110923.546159636@infradead.org \
    --to=peterz@infradead.org \
    --cc=bigeasy@linutronix.de \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@kernel.org \
    --cc=ouwen210@hotmail.com \
    --cc=qais.yousef@arm.com \
    --cc=rostedt@goodmis.org \
    --cc=swood@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=valentin.schneider@arm.com \
    --cc=vincent.donnefort@arm.com \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.