LKML Archive on lore.kernel.org
From: Valentin Schneider <valentin.schneider@arm.com>
To: Barry Song <song.bao.hua@hisilicon.com>
Cc: catalin.marinas@arm.com, will@kernel.org, rjw@rjwysocki.net,
	lenb@kernel.org, gregkh@linuxfoundation.org,
	Jonathan.Cameron@huawei.com, mingo@redhat.com,
	peterz@infradead.org, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	mark.rutland@arm.com, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
	linuxarm@huawei.com, xuwei5@huawei.com, prime.zeng@hisilicon.com
Subject: Re: [RFC PATCH v2 2/2] scheduler: add scheduler level for clusters
Date: Tue, 01 Dec 2020 16:04:04 +0000
Message-ID: <jhj1rg9v7gr.mognet@arm.com>
In-Reply-To: <20201201025944.18260-3-song.bao.hua@hisilicon.com>


On 01/12/20 02:59, Barry Song wrote:
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 1a68a05..ae8ec910 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6106,6 +6106,37 @@ static inline int select_idle_smt(struct task_struct *p, int target)
>  
>  #endif /* CONFIG_SCHED_SMT */
>  
> +#ifdef CONFIG_SCHED_CLUSTER
> +/*
> + * Scan the local CLUSTER mask for idle CPUs.
> + */
> +static int select_idle_cluster(struct task_struct *p, int target)
> +{
> +	int cpu;
> +
> +	/* right now, no hardware with both cluster and smt to run */
> +	if (sched_smt_active())
> +		return -1;
> +
> +	for_each_cpu_wrap(cpu, cpu_cluster_mask(target), target) {

Gating this behind a new config option that only arm64 uses doesn't make it
very generic. Note that powerpc also has a newish "CACHE" level which
seems to overlap in function with your "CLUSTER" one (both are
arch-specific, though).

I think what you are after here is an SD_SHARE_PKG_RESOURCES domain walk,
i.e. scanning CPUs by increasing cache "distance". We already have that in
some form, as we scan the SMT & LLC domains; AFAICT LLC always maps to MC,
except for said powerpc CACHE level.

*If* we are to generally support more levels with SD_SHARE_PKG_RESOURCES,
we could, say, frob something into select_idle_cpu(). I'm thinking of
something like the incomplete, untested diff below:

---
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ae7ceba8fd4f..70692888db00 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6120,7 +6120,7 @@ static inline int select_idle_smt(struct task_struct *p, struct sched_domain *sd
 static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int target)
 {
 	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
-	struct sched_domain *this_sd;
+	struct sched_domain *this_sd, *child = NULL;
 	u64 avg_cost, avg_idle;
 	u64 time;
 	int this = smp_processor_id();
@@ -6150,14 +6150,22 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 
 	time = cpu_clock(this);
 
-	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+	do {
+		/* XXX: sd should start as SMT's parent */
+		cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+		if (child)
+			cpumask_andnot(cpus, cpus, sched_domain_span(child));
+
+		for_each_cpu_wrap(cpu, cpus, target) {
+			if (!--nr)
+				return -1;
+			if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
+				break;
+		}
 
-	for_each_cpu_wrap(cpu, cpus, target) {
-		if (!--nr)
-			return -1;
-		if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
-			break;
-	}
+		child = sd;
+		sd = sd->parent;
+	} while (sd && sd->flags & SD_SHARE_PKG_RESOURCES);
 
 	time = cpu_clock(this) - time;
 	update_avg(&this_sd->avg_scan_cost, time);
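
To make the intent of the walk above easier to poke at outside the kernel,
here is a minimal userspace sketch of the same idea. It is not kernel code:
struct toy_domain, toy_scan_by_cache_distance() and the 64-bit span masks
are all made up for illustration, and shares_cache merely stands in for
SD_SHARE_PKG_RESOURCES. One deliberate difference from the diff: in the
diff, the "break" only leaves the inner for_each_cpu_wrap() loop, so the
outer loop would keep climbing; the sketch returns as soon as it finds an
idle CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 64

/* Hypothetical stand-in for struct sched_domain: one bit per CPU in span. */
struct toy_domain {
	uint64_t span;             /* CPUs covered by this level */
	int shares_cache;          /* stand-in for SD_SHARE_PKG_RESOURCES */
	struct toy_domain *parent; /* next (wider) level, or NULL */
};

/*
 * Walk outward from @sd. At each level, look only at the CPUs that the
 * previous (child) level did not already cover, so every CPU is visited
 * at most once and in order of increasing cache "distance".
 *
 * Returns the first idle CPU found, or -1 if there is none or the scan
 * budget @nr runs out.
 */
static int toy_scan_by_cache_distance(const struct toy_domain *sd,
				      uint64_t allowed, uint64_t idle,
				      int target, int nr)
{
	const struct toy_domain *child = NULL;

	assert(sd != NULL);
	assert(target >= 0 && target < NR_CPUS);

	do {
		uint64_t cpus = sd->span & allowed;

		if (child)
			cpus &= ~child->span;

		/* Wrap-around walk starting at @target, like for_each_cpu_wrap(). */
		for (int i = 0; i < NR_CPUS; i++) {
			int cpu = (target + i) % NR_CPUS;

			if (!(cpus & (UINT64_C(1) << cpu)))
				continue;
			if (!--nr)
				return -1;
			if (idle & (UINT64_C(1) << cpu))
				return cpu; /* ends the whole walk, not just this level */
		}

		child = sd;
		sd = sd->parent;
	} while (sd && sd->shares_cache);

	return -1;
}
```

With a toy topology of a 4-CPU cluster (span 0x0f) inside an 8-CPU MC level
(span 0xff), a scan from CPU 1 first exhausts CPUs 0-3 and only then looks
at CPUs 4-7; a parent level without shares_cache set is never scanned.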
