From: "tip-bot2 for Morten Rasmussen" <tip-bot2@linutronix.de>
To: linux-tip-commits@vger.kernel.org
Cc: Morten Rasmussen <morten.rasmussen@arm.com>,
	Valentin Schneider <valentin.schneider@arm.com>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Quentin Perret <qperret@google.com>, x86 <x86@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: [tip: sched/core] sched/fair: Add asymmetric CPU capacity wakeup scan
Date: Tue, 11 Feb 2020 12:47:49 -0000
Message-ID: <158142526956.411.9603515034414305045.tip-bot2@tip-bot2>
In-Reply-To: <20200206191957.12325-2-valentin.schneider@arm.com>

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     913c310c8e8abb6a9eb8b3c8bfc33bd1dddded04
Gitweb:        https://git.kernel.org/tip/913c310c8e8abb6a9eb8b3c8bfc33bd1dddded04
Author:        Morten Rasmussen <morten.rasmussen@arm.com>
AuthorDate:    Thu, 06 Feb 2020 19:19:54 
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Tue, 11 Feb 2020 13:01:43 +01:00

sched/fair: Add asymmetric CPU capacity wakeup scan

Issue
=====

On asymmetric CPU capacity topologies, we currently rely on wake_cap() to
drive select_task_rq_fair() towards either:

- its slow-path (find_idlest_cpu()) if either the previous or
  current (waking) CPU has too little capacity for the waking task
- its fast-path (select_idle_sibling()) otherwise

Commit:

  3273163c6775 ("sched/fair: Let asymmetric CPU configurations balance at wake-up")

points out that this relies on the assumption that "[...]the CPU capacities
within an SD_SHARE_PKG_RESOURCES domain (sd_llc) are homogeneous".

This assumption no longer holds on newer generations of big.LITTLE
systems (DynamIQ), which can accommodate CPUs of different compute capacity
within a single LLC domain. To illustrate, a regular big.LITTLE topology
looks like this:

  +---------+ +---------+
  |   L2    | |   L2    |
  +----+----+ +----+----+
  |CPU0|CPU1| |CPU2|CPU3|
  +----+----+ +----+----+
      ^^^         ^^^
    LITTLEs      bigs

which would result in the following scheduler topology:

  DIE [         ] <- sd_asym_cpucapacity
  MC  [   ] [   ] <- sd_llc
       0 1   2 3

Conversely, a DynamIQ topology could look like:

  +-------------------+
  |        L3         |
  +----+----+----+----+
  | L2 | L2 | L2 | L2 |
  +----+----+----+----+
  |CPU0|CPU1|CPU2|CPU3|
  +----+----+----+----+
     ^^^^^     ^^^^^
    LITTLEs    bigs

which would result in the following scheduler topology:

  MC [       ] <- sd_llc, sd_asym_cpucapacity
      0 1 2 3

What this means is that, on DynamIQ systems, we could pass the wake_cap()
test (IOW presume the waking task fits on the CPU capacities of some LLC
domain), and thus go through select_idle_sibling().
This function operates on an LLC domain, which here spans both bigs and
LITTLEs, so it could very well pick a CPU of too small capacity for the
task despite there being fitting idle CPUs. The outcome depends on the
CPU iteration order, which gives us no guarantees capacity-wise.
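
To make the iteration-order problem concrete, here is a rough userspace
sketch of a capacity-blind scan. It is not the kernel's
select_idle_sibling(), which applies several more heuristics; the function
name first_idle_cpu() and the capacity values in the usage note are made up
for illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical capacity-blind scan: walk the CPUs starting at 'target',
 * wrapping around, and return the first idle one. Capacity is never
 * consulted, so the result depends only on where the walk starts.
 */
static int first_idle_cpu(const bool *idle, int nr_cpus, int target)
{
	for (int n = 0; n < nr_cpus; n++) {
		int cpu = (target + n) % nr_cpus;

		if (idle[cpu])
			return cpu;
	}

	return -1;	/* no idle CPU found */
}
```

With four idle CPUs of (assumed) capacities 446, 446, 1024, 1024 and a
wakeup targeting CPU0, this scan returns CPU0, a LITTLE, even though both
bigs are idle. Starting from CPU2 instead would have returned a big.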

Implementation
==============

Introduce yet another select_idle_sibling() helper function that takes CPU
capacity into account. The policy is to pick the first idle CPU which is
big enough for the task (task_util * margin < cpu_capacity). If no
idle CPU is big enough, we pick the idle one with the highest capacity.
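
As a rough illustration of that policy outside the kernel, the sketch below
models it over a plain array. struct cpu_model, model_fits() and
model_select_idle_capacity() are hypothetical names, and the 1280/1024
factor is an assumed margin of roughly 20%; the actual patch uses
task_fits_capacity(), capacity_of() and a cpumask walk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-CPU scheduler state; not a kernel type. */
struct cpu_model {
	unsigned long capacity;	/* models capacity_of(cpu) */
	bool idle;		/* models "idle or only running SCHED_IDLE tasks" */
};

/* Assumed margin: the task fits if util * 1280 < capacity * 1024. */
static bool model_fits(unsigned long util, unsigned long capacity)
{
	return util * 1280 < capacity * 1024;
}

/*
 * Walk the CPUs starting at 'target', wrapping around. Return the first
 * idle CPU the task fits on; failing that, the first-seen idle CPU with
 * the highest capacity; failing that, -1.
 */
static int model_select_idle_capacity(const struct cpu_model *cpus, int nr_cpus,
				      unsigned long task_util, int target)
{
	unsigned long best_cap = 0;
	int best_cpu = -1;

	assert(nr_cpus > 0 && target >= 0 && target < nr_cpus);

	for (int n = 0; n < nr_cpus; n++) {
		int cpu = (target + n) % nr_cpus;

		if (!cpus[cpu].idle)
			continue;
		if (model_fits(task_util, cpus[cpu].capacity))
			return cpu;

		if (cpus[cpu].capacity > best_cap) {
			best_cap = cpus[cpu].capacity;
			best_cpu = cpu;
		}
	}

	return best_cpu;
}
```

On a model with two idle LITTLEs (capacity 446) at CPU0-1 and two idle bigs
(capacity 1024) at CPU2-3, a task of utilization 400 waking on CPU0 skips
the LITTLEs and lands on CPU2. A task of utilization 900 fits nowhere under
this margin, so the fallback also picks CPU2 as the highest-capacity idle
CPU.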

Unlike other select_idle_sibling() helpers, this one operates on the
sd_asym_cpucapacity sched_domain pointer, which is guaranteed to span all
known CPU capacities in the system. As such, this will work for both
"legacy" big.LITTLE (LITTLEs & bigs split at MC, joined at DIE) and for
newer DynamIQ systems (e.g. LITTLEs and bigs in the same MC domain).

Note that this limits the scope of select_idle_sibling() to
select_idle_capacity() for asymmetric CPU capacity systems - the LLC domain
will not be scanned, and no further heuristic will be applied.
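
The resulting control flow can be summed up with the small model below. It
is only a sketch: model_wakeup_cpu(), its parameters and MODEL_NR_CPUS are
hypothetical, with the two booleans standing in for the
sched_asym_cpucapacity static key and for whether the target CPU has an
sd_asym_cpucapacity domain at all.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* hypothetical stand-in for nr_cpumask_bits */

/*
 * scan_result:      what the capacity-aware scan returned (-1 if no idle CPU)
 * symmetric_result: what the usual LLC-based path would have returned
 */
static int model_wakeup_cpu(bool asym_key, bool has_asym_domain,
			    int scan_result, int symmetric_result, int target)
{
	if (asym_key && has_asym_domain) {
		/*
		 * Casting -1 to unsigned yields a huge value, so a single
		 * comparison rejects "not found" and falls back to target.
		 * The symmetric path is never consulted here.
		 */
		return ((unsigned)scan_result < MODEL_NR_CPUS) ? scan_result : target;
	}

	/* Symmetric systems, or a symmetric island inside an asymmetric one. */
	return symmetric_result;
}
```

In this model, a successful scan wins outright, a failed scan yields the
target CPU, and the LLC-based result is only used when the asymmetric
domain is absent.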

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Link: https://lkml.kernel.org/r/20200206191957.12325-2-valentin.schneider@arm.com
---
 kernel/sched/fair.c | 56 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 56 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1a0ce83..6fb47a2 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5897,6 +5897,40 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 }
 
 /*
+ * Scan the asym_capacity domain for idle CPUs; pick the first idle one on which
+ * the task fits. If no CPU is big enough, but there are idle ones, try to
+ * maximize capacity.
+ */
+static int
+select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
+{
+	unsigned long best_cap = 0;
+	int cpu, best_cpu = -1;
+	struct cpumask *cpus;
+
+	sync_entity_load_avg(&p->se);
+
+	cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
+	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+
+	for_each_cpu_wrap(cpu, cpus, target) {
+		unsigned long cpu_cap = capacity_of(cpu);
+
+		if (!available_idle_cpu(cpu) && !sched_idle_cpu(cpu))
+			continue;
+		if (task_fits_capacity(p, cpu_cap))
+			return cpu;
+
+		if (cpu_cap > best_cap) {
+			best_cap = cpu_cap;
+			best_cpu = cpu;
+		}
+	}
+
+	return best_cpu;
+}
+
+/*
  * Try and locate an idle core/thread in the LLC cache domain.
  */
 static int select_idle_sibling(struct task_struct *p, int prev, int target)
@@ -5904,6 +5938,28 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
 	struct sched_domain *sd;
 	int i, recent_used_cpu;
 
+	/*
+	 * For asymmetric CPU capacity systems, our domain of interest is
+	 * sd_asym_cpucapacity rather than sd_llc.
+	 */
+	if (static_branch_unlikely(&sched_asym_cpucapacity)) {
+		sd = rcu_dereference(per_cpu(sd_asym_cpucapacity, target));
+		/*
+		 * On an asymmetric CPU capacity system where an exclusive
+		 * cpuset defines a symmetric island (i.e. one unique
+		 * capacity_orig value through the cpuset), the key will be set
+		 * but the CPUs within that cpuset will not have a domain with
+		 * SD_ASYM_CPUCAPACITY. These should follow the usual symmetric
+		 * capacity path.
+		 */
+		if (!sd)
+			goto symmetric;
+
+		i = select_idle_capacity(p, sd, target);
+		return ((unsigned)i < nr_cpumask_bits) ? i : target;
+	}
+
+symmetric:
 	if (available_idle_cpu(target) || sched_idle_cpu(target))
 		return target;
 

