* [PATCH 0/4] sched/numa: remove unused code
@ 2018-08-29 13:19 Vincent Guittot
  2018-08-29 13:19 ` [PATCH 1/4] sched/numa: remove unused code from update_numa_stats() Vincent Guittot
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Vincent Guittot @ 2018-08-29 13:19 UTC (permalink / raw)
  To: peterz, mingo, linux-kernel
  Cc: Vincent Guittot, Srikar Dronamraju, Rik van Riel

With:
  commit 2d4056fafa19 ("sched/numa: Remove numa_has_capacity()")
some code and fields are no longer used.

Patches 1-2 remove the unused code.

Patch 3 goes a bit further and removes the smt_gain field from sched_domain.

Patch 4 removes the sd parameter from arch_scale_cpu_capacity() once its last
user has been removed by patch 3.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: linux-kernel@vger.kernel.org (open list)

Vincent Guittot (4):
  sched/numa: remove unused code from update_numa_stats()
  sched/numa: remove unused nr_running field
  sched/numa: remove smt_gain
  sched/topology: remove unused sd param from arch_scale_cpu_capacity()

 arch/arm/kernel/topology.c       |  2 +-
 drivers/base/arch_topology.c     |  6 +++---
 include/linux/arch_topology.h    |  2 +-
 include/linux/sched/topology.h   |  1 -
 kernel/sched/cpufreq_schedutil.c |  2 +-
 kernel/sched/deadline.c          |  2 +-
 kernel/sched/fair.c              | 32 +++++---------------------------
 kernel/sched/pelt.c              |  2 +-
 kernel/sched/sched.h             |  7 ++-----
 kernel/sched/topology.c          |  2 --
 10 files changed, 15 insertions(+), 43 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/4] sched/numa: remove unused code from update_numa_stats()
  2018-08-29 13:19 [PATCH 0/4] sched/numa: remove unused code Vincent Guittot
@ 2018-08-29 13:19 ` Vincent Guittot
  2018-09-04  6:39   ` Srikar Dronamraju
  2018-09-10 10:19   ` [tip:sched/core] sched/numa: Remove " tip-bot for Vincent Guittot
  2018-08-29 13:19 ` [PATCH 2/4] sched/numa: remove unused nr_running field Vincent Guittot
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 25+ messages in thread
From: Vincent Guittot @ 2018-08-29 13:19 UTC (permalink / raw)
  To: peterz, mingo, linux-kernel
  Cc: Vincent Guittot, Srikar Dronamraju, Rik van Riel

With:
  commit 2d4056fafa19 ("sched/numa: Remove numa_has_capacity()")

the local variables smt, cpus and capacity in update_numa_stats(), and the
results they compute, are no longer used.

Remove this unused code.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: linux-kernel@vger.kernel.org (open list)
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 21 +--------------------
 1 file changed, 1 insertion(+), 20 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 309c93f..c2b8bf4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1463,8 +1463,7 @@ struct numa_stats {
  */
 static void update_numa_stats(struct numa_stats *ns, int nid)
 {
-	int smt, cpu, cpus = 0;
-	unsigned long capacity;
+	int cpu;
 
 	memset(ns, 0, sizeof(*ns));
 	for_each_cpu(cpu, cpumask_of_node(nid)) {
@@ -1473,26 +1472,8 @@ static void update_numa_stats(struct numa_stats *ns, int nid)
 		ns->nr_running += rq->nr_running;
 		ns->load += weighted_cpuload(rq);
 		ns->compute_capacity += capacity_of(cpu);
-
-		cpus++;
 	}
 
-	/*
-	 * If we raced with hotplug and there are no CPUs left in our mask
-	 * the @ns structure is NULL'ed and task_numa_compare() will
-	 * not find this node attractive.
-	 *
-	 * We'll detect a huge imbalance and bail there.
-	 */
-	if (!cpus)
-		return;
-
-	/* smt := ceil(cpus / capacity), assumes: 1 < smt_power < 2 */
-	smt = DIV_ROUND_UP(SCHED_CAPACITY_SCALE * cpus, ns->compute_capacity);
-	capacity = cpus / smt; /* cores */
-
-	capacity = min_t(unsigned, capacity,
-		DIV_ROUND_CLOSEST(ns->compute_capacity, SCHED_CAPACITY_SCALE));
 }
 
 struct task_numa_env {
-- 
2.7.4



* [PATCH 2/4] sched/numa: remove unused nr_running field
  2018-08-29 13:19 [PATCH 0/4] sched/numa: remove unused code Vincent Guittot
  2018-08-29 13:19 ` [PATCH 1/4] sched/numa: remove unused code from update_numa_stats() Vincent Guittot
@ 2018-08-29 13:19 ` Vincent Guittot
  2018-09-04  6:40   ` Srikar Dronamraju
  2018-09-10 10:20   ` [tip:sched/core] sched/numa: Remove unused numa_stats::nr_running field tip-bot for Vincent Guittot
  2018-08-29 13:19 ` [RFC PATCH 3/4] sched/topology: remove smt_gain Vincent Guittot
  2018-08-29 13:19   ` Vincent Guittot
  3 siblings, 2 replies; 25+ messages in thread
From: Vincent Guittot @ 2018-08-29 13:19 UTC (permalink / raw)
  To: peterz, mingo, linux-kernel
  Cc: Vincent Guittot, Srikar Dronamraju, Rik van Riel

nr_running in struct numa_stats is not used anywhere in the code.

Remove it.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: linux-kernel@vger.kernel.org (open list)
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c2b8bf4..cff1682 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1454,8 +1454,6 @@ struct numa_stats {
 
 	/* Total compute capacity of CPUs on a node */
 	unsigned long compute_capacity;
-
-	unsigned int nr_running;
 };
 
 /*
@@ -1469,7 +1467,6 @@ static void update_numa_stats(struct numa_stats *ns, int nid)
 	for_each_cpu(cpu, cpumask_of_node(nid)) {
 		struct rq *rq = cpu_rq(cpu);
 
-		ns->nr_running += rq->nr_running;
 		ns->load += weighted_cpuload(rq);
 		ns->compute_capacity += capacity_of(cpu);
 	}
-- 
2.7.4



* [RFC PATCH 3/4] sched/topology: remove smt_gain
  2018-08-29 13:19 [PATCH 0/4] sched/numa: remove unused code Vincent Guittot
  2018-08-29 13:19 ` [PATCH 1/4] sched/numa: remove unused code from update_numa_stats() Vincent Guittot
  2018-08-29 13:19 ` [PATCH 2/4] sched/numa: remove unused nr_running field Vincent Guittot
@ 2018-08-29 13:19 ` Vincent Guittot
  2018-08-29 14:08   ` Qais Yousef
                     ` (2 more replies)
  2018-08-29 13:19   ` Vincent Guittot
  3 siblings, 3 replies; 25+ messages in thread
From: Vincent Guittot @ 2018-08-29 13:19 UTC (permalink / raw)
  To: peterz, mingo, linux-kernel; +Cc: Vincent Guittot

smt_gain is used to compute the capacity of the CPUs of an SMT core with the
constraint 1 < smt_gain < 2, in order to be able to compute the number of CPUs
per core. The field has_free_capacity of struct numa_stats, which was the
last user of this computation of the number of CPUs per core, has been removed
by:

  commit 2d4056fafa19 ("sched/numa: Remove numa_has_capacity()")

We can now remove this constraint on core capacity and use the default value
SCHED_CAPACITY_SCALE for SMT CPUs. With this removal, SCHED_CAPACITY_SCALE
becomes the maximum compute capacity of CPUs on every system. This should
help to simplify some code and remove fields like rd->max_cpu_capacity.

Furthermore, arch_scale_cpu_capacity() is used with a NULL sd in several other
places in the code when it wants the capacity of a CPU in order to scale
some metrics, like in pelt, deadline or schedutil. In the SMT case, the value
returned is then not the capacity of SMT CPUs but the default
SCHED_CAPACITY_SCALE.

Remove the smt_gain field from the sched_domain struct.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kernel@vger.kernel.org (open list)
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 include/linux/sched/topology.h | 1 -
 kernel/sched/sched.h           | 3 ---
 kernel/sched/topology.c        | 2 --
 3 files changed, 6 deletions(-)

diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index 2634774..212792b 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -89,7 +89,6 @@ struct sched_domain {
 	unsigned int newidle_idx;
 	unsigned int wake_idx;
 	unsigned int forkexec_idx;
-	unsigned int smt_gain;
 
 	int nohz_idle;			/* NOHZ IDLE status */
 	int flags;			/* See SD_* */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 4a2e8ca..b1715b8 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1758,9 +1758,6 @@ unsigned long arch_scale_freq_capacity(int cpu)
 static __always_inline
 unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
 {
-	if (sd && (sd->flags & SD_SHARE_CPUCAPACITY) && (sd->span_weight > 1))
-		return sd->smt_gain / sd->span_weight;
-
 	return SCHED_CAPACITY_SCALE;
 }
 #endif
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 56a0fed..069c924 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1129,7 +1129,6 @@ sd_init(struct sched_domain_topology_level *tl,
 
 		.last_balance		= jiffies,
 		.balance_interval	= sd_weight,
-		.smt_gain		= 0,
 		.max_newidle_lb_cost	= 0,
 		.next_decay_max_lb_cost	= jiffies,
 		.child			= child,
@@ -1155,7 +1154,6 @@ sd_init(struct sched_domain_topology_level *tl,
 	if (sd->flags & SD_SHARE_CPUCAPACITY) {
 		sd->flags |= SD_PREFER_SIBLING;
 		sd->imbalance_pct = 110;
-		sd->smt_gain = 1178; /* ~15% */
 
 	} else if (sd->flags & SD_SHARE_PKG_RESOURCES) {
 		sd->flags |= SD_PREFER_SIBLING;
-- 
2.7.4



* [RFC PATCH 4/4] sched/topology: remove unused sd param from arch_scale_cpu_capacity()
  2018-08-29 13:19 [PATCH 0/4] sched/numa: remove unused code Vincent Guittot
@ 2018-08-29 13:19   ` Vincent Guittot
  2018-08-29 13:19 ` [PATCH 2/4] sched/numa: remove unused nr_running field Vincent Guittot
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 25+ messages in thread
From: Vincent Guittot @ 2018-08-29 13:19 UTC (permalink / raw)
  To: peterz, mingo, linux-kernel
  Cc: Vincent Guittot, Russell King, Greg Kroah-Hartman,
	Rafael J. Wysocki, linux-arm-kernel

The struct sched_domain *sd parameter is no longer used in
arch_scale_cpu_capacity(), so we can remove it.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>

---
 arch/arm/kernel/topology.c       | 2 +-
 drivers/base/arch_topology.c     | 6 +++---
 include/linux/arch_topology.h    | 2 +-
 kernel/sched/cpufreq_schedutil.c | 2 +-
 kernel/sched/deadline.c          | 2 +-
 kernel/sched/fair.c              | 8 ++++----
 kernel/sched/pelt.c              | 2 +-
 kernel/sched/sched.h             | 4 ++--
 8 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index 24ac3ca..d3d75c5 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -175,7 +175,7 @@ static void update_cpu_capacity(unsigned int cpu)
 	topology_set_cpu_scale(cpu, cpu_capacity(cpu) / middle_capacity);
 
 	pr_info("CPU%u: update cpu_capacity %lu\n",
-		cpu, topology_get_cpu_scale(NULL, cpu));
+		cpu, topology_get_cpu_scale(cpu));
 }
 
 #else
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index e7cb0c6..6dc9339 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -44,7 +44,7 @@ static ssize_t cpu_capacity_show(struct device *dev,
 {
 	struct cpu *cpu = container_of(dev, struct cpu, dev);
 
-	return sprintf(buf, "%lu\n", topology_get_cpu_scale(NULL, cpu->dev.id));
+	return sprintf(buf, "%lu\n", topology_get_cpu_scale(cpu->dev.id));
 }
 
 static ssize_t cpu_capacity_store(struct device *dev,
@@ -124,7 +124,7 @@ void topology_normalize_cpu_scale(void)
 			/ capacity_scale;
 		topology_set_cpu_scale(cpu, capacity);
 		pr_debug("cpu_capacity: CPU%d cpu_capacity=%lu\n",
-			cpu, topology_get_cpu_scale(NULL, cpu));
+			cpu, topology_get_cpu_scale(cpu));
 	}
 	mutex_unlock(&cpu_scale_mutex);
 }
@@ -194,7 +194,7 @@ init_cpu_capacity_callback(struct notifier_block *nb,
 	cpumask_andnot(cpus_to_visit, cpus_to_visit, policy->related_cpus);
 
 	for_each_cpu(cpu, policy->related_cpus) {
-		raw_capacity[cpu] = topology_get_cpu_scale(NULL, cpu) *
+		raw_capacity[cpu] = topology_get_cpu_scale(cpu) *
 				    policy->cpuinfo.max_freq / 1000UL;
 		capacity_scale = max(raw_capacity[cpu], capacity_scale);
 	}
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index 2b70941..5df6773 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -17,7 +17,7 @@ DECLARE_PER_CPU(unsigned long, cpu_scale);
 
 struct sched_domain;
 static inline
-unsigned long topology_get_cpu_scale(struct sched_domain *sd, int cpu)
+unsigned long topology_get_cpu_scale(int cpu)
 {
 	return per_cpu(cpu_scale, cpu);
 }
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 3fffad3..01b95057 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -202,7 +202,7 @@ static unsigned long sugov_get_util(struct sugov_cpu *sg_cpu)
 	struct rq *rq = cpu_rq(sg_cpu->cpu);
 	unsigned long util, irq, max;
 
-	sg_cpu->max = max = arch_scale_cpu_capacity(NULL, sg_cpu->cpu);
+	sg_cpu->max = max = arch_scale_cpu_capacity(sg_cpu->cpu);
 	sg_cpu->bw_dl = cpu_bw_dl(rq);
 
 	if (rt_rq_is_runnable(&rq->rt))
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 997ea7b..5f763b1 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1196,7 +1196,7 @@ static void update_curr_dl(struct rq *rq)
 						 &curr->dl);
 	} else {
 		unsigned long scale_freq = arch_scale_freq_capacity(cpu);
-		unsigned long scale_cpu = arch_scale_cpu_capacity(NULL, cpu);
+		unsigned long scale_cpu = arch_scale_cpu_capacity(cpu);
 
 		scaled_delta_exec = cap_scale(delta_exec, scale_freq);
 		scaled_delta_exec = cap_scale(scaled_delta_exec, scale_cpu);
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index cff1682..2eeac7c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -748,7 +748,7 @@ void post_init_entity_util_avg(struct sched_entity *se)
 {
 	struct cfs_rq *cfs_rq = cfs_rq_of(se);
 	struct sched_avg *sa = &se->avg;
-	long cpu_scale = arch_scale_cpu_capacity(NULL, cpu_of(rq_of(cfs_rq)));
+	long cpu_scale = arch_scale_cpu_capacity(cpu_of(rq_of(cfs_rq)));
 	long cap = (long)(cpu_scale - cfs_rq->avg.util_avg) / 2;
 
 	if (cap > 0) {
@@ -3175,7 +3175,7 @@ update_tg_cfs_runnable(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cf
 	 * is not we rescale running_sum 1st
 	 */
 	running_sum = se->avg.util_sum /
-		arch_scale_cpu_capacity(NULL, cpu_of(rq_of(cfs_rq)));
+		arch_scale_cpu_capacity(cpu_of(rq_of(cfs_rq)));
 	runnable_sum = max(runnable_sum, running_sum);
 
 	load_sum = (s64)se_weight(se) * runnable_sum;
@@ -7462,7 +7462,7 @@ static inline int get_sd_load_idx(struct sched_domain *sd,
 static unsigned long scale_rt_capacity(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
-	unsigned long max = arch_scale_cpu_capacity(NULL, cpu);
+	unsigned long max = arch_scale_cpu_capacity(cpu);
 	unsigned long used, free;
 	unsigned long irq;
 
@@ -7487,7 +7487,7 @@ static void update_cpu_capacity(struct sched_domain *sd, int cpu)
 	unsigned long capacity = scale_rt_capacity(cpu);
 	struct sched_group *sdg = sd->groups;
 
-	cpu_rq(cpu)->cpu_capacity_orig = arch_scale_cpu_capacity(sd, cpu);
+	cpu_rq(cpu)->cpu_capacity_orig = arch_scale_cpu_capacity(cpu);
 
 	if (!capacity)
 		capacity = 1;
diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c
index 35475c0..5efa152 100644
--- a/kernel/sched/pelt.c
+++ b/kernel/sched/pelt.c
@@ -114,7 +114,7 @@ accumulate_sum(u64 delta, int cpu, struct sched_avg *sa,
 	u64 periods;
 
 	scale_freq = arch_scale_freq_capacity(cpu);
-	scale_cpu = arch_scale_cpu_capacity(NULL, cpu);
+	scale_cpu = arch_scale_cpu_capacity(cpu);
 
 	delta += sa->period_contrib;
 	periods = delta / 1024; /* A period is 1024us (~1ms) */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index b1715b8..8b306ce 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1756,7 +1756,7 @@ unsigned long arch_scale_freq_capacity(int cpu)
 #ifdef CONFIG_SMP
 #ifndef arch_scale_cpu_capacity
 static __always_inline
-unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
+unsigned long arch_scale_cpu_capacity(int cpu)
 {
 	return SCHED_CAPACITY_SCALE;
 }
@@ -1764,7 +1764,7 @@ unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
 #else
 #ifndef arch_scale_cpu_capacity
 static __always_inline
-unsigned long arch_scale_cpu_capacity(void __always_unused *sd, int cpu)
+unsigned long arch_scale_cpu_capacity(int cpu)
 {
 	return SCHED_CAPACITY_SCALE;
 }
-- 
2.7.4



* Re: [RFC PATCH 3/4] sched/topology: remove smt_gain
  2018-08-29 13:19 ` [RFC PATCH 3/4] sched/topology: remove smt_gain Vincent Guittot
@ 2018-08-29 14:08   ` Qais Yousef
  2018-08-29 14:42     ` Vincent Guittot
  2018-09-07 12:42     ` Peter Zijlstra
  2018-09-04  8:24   ` Srikar Dronamraju
  2018-12-11 15:31   ` [tip:sched/core] sched/topology: Remove the ::smt_gain field from 'struct sched_domain' tip-bot for Vincent Guittot
  2 siblings, 2 replies; 25+ messages in thread
From: Qais Yousef @ 2018-08-29 14:08 UTC (permalink / raw)
  To: Vincent Guittot, peterz, mingo, linux-kernel

On 29/08/18 14:19, Vincent Guittot wrote:
> smt_gain is used to compute the capacity of the CPUs of an SMT core with the
> constraint 1 < smt_gain < 2, in order to be able to compute the number of CPUs
> per core. The field has_free_capacity of struct numa_stats, which was the
> last user of this computation of the number of CPUs per core, has been removed
> by:
>
>    commit 2d4056fafa19 ("sched/numa: Remove numa_has_capacity()")
>
> We can now remove this constraint on core capacity and use the default value
> SCHED_CAPACITY_SCALE for SMT CPUs. With this removal, SCHED_CAPACITY_SCALE
> becomes the maximum compute capacity of CPUs on every system. This should
> help to simplify some code and remove fields like rd->max_cpu_capacity.
>
> Furthermore, arch_scale_cpu_capacity() is used with a NULL sd in several other
> places in the code when it wants the capacity of a CPU in order to scale
> some metrics, like in pelt, deadline or schedutil. In the SMT case, the value
> returned is then not the capacity of SMT CPUs but the default
> SCHED_CAPACITY_SCALE.
>
> Remove the smt_gain field from the sched_domain struct.

You beat me to it. I got confused by smt_gain recently when I stumbled 
on it: I found out it's not used on ARM and had to spend some time 
convincing myself it's not really necessary to use it.

It was hard to track the history of this and *why* it's needed.

The only 'theoretical' case I found where smt_gain can be useful is when 
you have an asymmetric system, for example:

Node_A: 1 Core 2 Threads
Node_B: 1 Core 4 Threads

Then with smt_gain the group_capacity at the core level will be limited 
to smt_gain. But without smt_gain Node_B will look twice as powerful as 
Node_A - which will affect balancing AFAICT, causing Node_B's single core 
to be oversubscribed as the 4 threads will still have to share the same 
underlying hardware resources. I don't think such systems exist in 
practice, or even make sense, though.

So +1 from my side for the removal.

> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: linux-kernel@vger.kernel.org (open list)
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>   include/linux/sched/topology.h | 1 -
>   kernel/sched/sched.h           | 3 ---
>   kernel/sched/topology.c        | 2 --
>   3 files changed, 6 deletions(-)
>
> diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
> index 2634774..212792b 100644
> --- a/include/linux/sched/topology.h
> +++ b/include/linux/sched/topology.h
> @@ -89,7 +89,6 @@ struct sched_domain {
>   	unsigned int newidle_idx;
>   	unsigned int wake_idx;
>   	unsigned int forkexec_idx;
> -	unsigned int smt_gain;
>   
>   	int nohz_idle;			/* NOHZ IDLE status */
>   	int flags;			/* See SD_* */
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 4a2e8ca..b1715b8 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1758,9 +1758,6 @@ unsigned long arch_scale_freq_capacity(int cpu)
>   static __always_inline
>   unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
>   {
> -	if (sd && (sd->flags & SD_SHARE_CPUCAPACITY) && (sd->span_weight > 1))
> -		return sd->smt_gain / sd->span_weight;
> -
>   	return SCHED_CAPACITY_SCALE;
>   }
>   #endif
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 56a0fed..069c924 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1129,7 +1129,6 @@ sd_init(struct sched_domain_topology_level *tl,
>   
>   		.last_balance		= jiffies,
>   		.balance_interval	= sd_weight,
> -		.smt_gain		= 0,
>   		.max_newidle_lb_cost	= 0,
>   		.next_decay_max_lb_cost	= jiffies,
>   		.child			= child,
> @@ -1155,7 +1154,6 @@ sd_init(struct sched_domain_topology_level *tl,
>   	if (sd->flags & SD_SHARE_CPUCAPACITY) {
>   		sd->flags |= SD_PREFER_SIBLING;
>   		sd->imbalance_pct = 110;
> -		sd->smt_gain = 1178; /* ~15% */
>   
>   	} else if (sd->flags & SD_SHARE_PKG_RESOURCES) {
>   		sd->flags |= SD_PREFER_SIBLING;


-- 
Qais Yousef



* Re: [RFC PATCH 3/4] sched/topology: remove smt_gain
  2018-08-29 14:08   ` Qais Yousef
@ 2018-08-29 14:42     ` Vincent Guittot
  2018-09-07 12:42     ` Peter Zijlstra
  1 sibling, 0 replies; 25+ messages in thread
From: Vincent Guittot @ 2018-08-29 14:42 UTC (permalink / raw)
  To: qais.yousef; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

On Wed, 29 Aug 2018 at 16:08, Qais Yousef <qais.yousef@arm.com> wrote:
>

> You beat me to it. I got confused by smt_gain recently when I stumbled
> on it: I found out it's not used on ARM and had to spend some time
> convincing myself it's not really necessary to use it.
>
> It was hard to track the history of this and *why* it's needed.
>
> The only 'theoretical' case I found where smt_gain can be useful is when
> you have an asymmetric system, for example:
>
> Node_A: 1 Core 2 Threads
> Node_B: 1 Core 4 Threads
>
> Then with smt_gain the group_capacity at the core level will be limited
> to smt_gain. But without smt_gain Node_B will look twice as powerful as
> Node_A - which will affect balancing AFAICT, causing Node_B's single core
> to be oversubscribed as the 4 threads will still have to share the same
> underlying hardware resources. I don't think such systems exist in
> practice, or even make sense, though.

Yes, I came to the same conclusion.

>
> So +1 from my side for the removal.

Thanks

>
>
> --
> Qais Yousef
>


* Re: [PATCH 1/4] sched/numa: remove unused code from update_numa_stats()
  2018-08-29 13:19 ` [PATCH 1/4] sched/numa: remove unused code from update_numa_stats() Vincent Guittot
@ 2018-09-04  6:39   ` Srikar Dronamraju
  2018-09-10 10:19   ` [tip:sched/core] sched/numa: Remove " tip-bot for Vincent Guittot
  1 sibling, 0 replies; 25+ messages in thread
From: Srikar Dronamraju @ 2018-09-04  6:39 UTC (permalink / raw)
  To: Vincent Guittot; +Cc: peterz, mingo, linux-kernel, Rik van Riel

> With:
>   commit 2d4056fafa19 ("sched/numa: Remove numa_has_capacity()")
> 
> the local variables smt, cpus and capacity in update_numa_stats(), and the
> results they compute, are no longer used.
> 
> Remove this unused code.
> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> Cc: Rik van Riel <riel@surriel.com>
> Cc: linux-kernel@vger.kernel.org (open list)
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  kernel/sched/fair.c | 21 +--------------------
>  1 file changed, 1 insertion(+), 20 deletions(-)

Looks good to me.

Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>

-- 
Thanks and Regards
Srikar Dronamraju



* Re: [PATCH 2/4] sched/numa: remove unused nr_running field
  2018-08-29 13:19 ` [PATCH 2/4] sched/numa: remove unused nr_running field Vincent Guittot
@ 2018-09-04  6:40   ` Srikar Dronamraju
  2018-09-10 10:20   ` [tip:sched/core] sched/numa: Remove unused numa_stats::nr_running field tip-bot for Vincent Guittot
  1 sibling, 0 replies; 25+ messages in thread
From: Srikar Dronamraju @ 2018-09-04  6:40 UTC (permalink / raw)
  To: Vincent Guittot; +Cc: peterz, mingo, linux-kernel, Rik van Riel

* Vincent Guittot <vincent.guittot@linaro.org> [2018-08-29 15:19:10]:

> nr_running in struct numa_stats is not used anywhere in the code.
> 
> Remove it.
> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> Cc: Rik van Riel <riel@surriel.com>
> Cc: linux-kernel@vger.kernel.org (open list)
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  kernel/sched/fair.c | 3 ---
>  1 file changed, 3 deletions(-)

Looks good to me.

Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>

-- 
Thanks and Regards
Srikar Dronamraju



* Re: [RFC PATCH 3/4] sched/topology: remove smt_gain
  2018-08-29 13:19 ` [RFC PATCH 3/4] sched/topology: remove smt_gain Vincent Guittot
  2018-08-29 14:08   ` Qais Yousef
@ 2018-09-04  8:24   ` Srikar Dronamraju
  2018-09-04  9:36     ` Vincent Guittot
  2018-12-11 15:31   ` [tip:sched/core] sched/topology: Remove the ::smt_gain field from 'struct sched_domain' tip-bot for Vincent Guittot
  2 siblings, 1 reply; 25+ messages in thread
From: Srikar Dronamraju @ 2018-09-04  8:24 UTC (permalink / raw)
  To: Vincent Guittot; +Cc: peterz, mingo, linux-kernel

> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: linux-kernel@vger.kernel.org (open list)
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 4a2e8ca..b1715b8 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1758,9 +1758,6 @@ unsigned long arch_scale_freq_capacity(int cpu)
>  static __always_inline
>  unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
>  {
> -	if (sd && (sd->flags & SD_SHARE_CPUCAPACITY) && (sd->span_weight > 1))
> -		return sd->smt_gain / sd->span_weight;
> -
>  	return SCHED_CAPACITY_SCALE;

Without this change, the capacity_orig of an SMT thread would have been
based on the number of threads.
For example, on SMT 2 capacity_orig would have been 589, and
on SMT 8 capacity_orig would have been 148.

However, after this change, the capacity_orig of each SMT thread would be
1024. For example, an SMT 8 core's capacity_orig would now be 8192.
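
The per-thread numbers follow from the old smt_gain scaling; a minimal
sketch, assuming the default smt_gain of 1178 (~1.15 * 1024). Note that
plain integer division gives 147 rather than 148 for SMT 8:

```c
#include <assert.h>

#define SCHED_CAPACITY_SCALE 1024
#define DEFAULT_SMT_GAIN     1178	/* default value of sd->smt_gain */

/* Old behaviour: each SMT sibling gets an equal share of smt_gain. */
static unsigned long old_thread_capacity(unsigned int span_weight)
{
	if (span_weight > 1)
		return DEFAULT_SMT_GAIN / span_weight;
	return SCHED_CAPACITY_SCALE;
}
```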

smt_gain was supposed to make a multi-threaded core look slightly more
powerful than a single-threaded core. I suspect that this sometimes hurt
us when doing load balance between 2 cores, i.e. at the MC or DIE sched
domain. Even with 2 threads running on a core, the core might look
lightly loaded (2048/8192), which might dissuade movement to an idle core.

I have always wondered why arch_scale_cpu_capacity() is called with a NULL
sched_domain in scale_rt_capacity(). This way capacity might actually
be more than capacity_orig. I have always been under the impression that
capacity_orig > capacity.  Or am I misunderstanding that?

-- 
Thanks and Regards
Srikar Dronamraju



* Re: [RFC PATCH 3/4] sched/topology: remove smt_gain
  2018-09-04  8:24   ` Srikar Dronamraju
@ 2018-09-04  9:36     ` Vincent Guittot
  2018-09-04 10:37       ` Srikar Dronamraju
  2018-09-10 10:07       ` [tip:sched/core] sched/fair: Fix scale_rt_capacity() for SMT tip-bot for Vincent Guittot
  0 siblings, 2 replies; 25+ messages in thread
From: Vincent Guittot @ 2018-09-04  9:36 UTC (permalink / raw)
  To: Srikar Dronamraju; +Cc: peterz, mingo, linux-kernel

Hi Srikar,

On Tuesday 04 Sep 2018 at 01:24:24 (-0700), Srikar Dronamraju wrote:
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: linux-kernel@vger.kernel.org (open list)
> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> > index 4a2e8ca..b1715b8 100644
> > --- a/kernel/sched/sched.h
> > +++ b/kernel/sched/sched.h
> > @@ -1758,9 +1758,6 @@ unsigned long arch_scale_freq_capacity(int cpu)
> >  static __always_inline
> >  unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
> >  {
> > -	if (sd && (sd->flags & SD_SHARE_CPUCAPACITY) && (sd->span_weight > 1))
> > -		return sd->smt_gain / sd->span_weight;
> > -
> >  	return SCHED_CAPACITY_SCALE;
> 
> Without this change, the capacity_orig of an SMT would have been based
> on the number of threads.
> For example on SMT2, capacity_orig would have been 589 and
> for SMT 8, capacity_orig would have been 148.
> 
> However after this change, capacity_orig of each SMT thread would be
> 1024. For example SMT 8 core capacity_orig would now be 8192.
> 
> smt_gain was suppose to make a multi threaded core was slightly more
> powerful than a single threaded core. I suspect if that sometimes hurt

Are there systems with both single-threaded and multi-threaded cores?
That was the main open point for me (and for Qais too).


> us when doing load balance between 2 cores i.e at MC or DIE sched
> domain. Even with 2 threads running on a core, the core might look
> lightly loaded 2048/8192. Hence might dissuade movement to a idle core.

Then, there is the sibling flag at the SMT level that normally ensures 1 task
per core for such a use case.

> 
> I always wonder why arch_scale_cpu_capacity() is called with NULL
> sched_domain, in scale_rt_capacity(). This way capacity might actually

Probably because, until this v4.19-rcxx version, the rt scaling was done
relative to the local cpu capacity:
  capacity  = arch_scale_cpu() * scale_rt_capacity / SCHED_CAPACITY_SCALE

Whereas now, it directly returns the remaining capacity.
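
The semantic change can be sketched as follows (function names and numbers
are made up for illustration; 589 is the old SMT 2 thread capacity). The
third assertion shows the shape of the bug: starting from 1024 instead of
the real SMT capacity lets "capacity" exceed capacity_orig:

```c
#include <assert.h>

#define SCHED_CAPACITY_SCALE 1024UL

/* Old scheme: a scale factor in [0..1024] applied to the cpu capacity. */
static unsigned long old_scheme(unsigned long cpu_cap, unsigned long factor)
{
	return cpu_cap * factor / SCHED_CAPACITY_SCALE;
}

/* New scheme: rt/irq pressure subtracted directly from the cpu capacity. */
static unsigned long new_scheme(unsigned long cpu_cap, unsigned long pressure)
{
	return cpu_cap > pressure ? cpu_cap - pressure : 0;
}
```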

> be more than the capacity_orig. I am always under an impression that
> capacity_orig > capacity.  Or am I misunderstanding that?

You are right, there is a bug for SMT and the patch below should fix it.
Nevertheless, we still have the problem in some other places in the code.

Subject: [PATCH] sched/fair: fix scale_rt_capacity() for SMT

Since commit:
  commit 523e979d3164 ("sched/core: Use PELT for scale_rt_capacity()")
scale_rt_capacity() returns the remaining capacity and not a scale factor
to apply on cpu_capacity_orig. arch_scale_cpu() is directly called by
scale_rt_capacity() so we must take the sched_domain argument

Fixes: 523e979d3164 ("sched/core: Use PELT for scale_rt_capacity()")
Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 309c93f..c73e1fa 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7481,10 +7481,10 @@ static inline int get_sd_load_idx(struct sched_domain *sd,
 	return load_idx;
 }
 
-static unsigned long scale_rt_capacity(int cpu)
+static unsigned long scale_rt_capacity(struct sched_domain *sd, int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
-	unsigned long max = arch_scale_cpu_capacity(NULL, cpu);
+	unsigned long max = arch_scale_cpu_capacity(sd, cpu);
 	unsigned long used, free;
 	unsigned long irq;
 
@@ -7506,7 +7506,7 @@ static unsigned long scale_rt_capacity(int cpu)
 
 static void update_cpu_capacity(struct sched_domain *sd, int cpu)
 {
-	unsigned long capacity = scale_rt_capacity(cpu);
+	unsigned long capacity = scale_rt_capacity(sd, cpu);
 	struct sched_group *sdg = sd->groups;
 
 	cpu_rq(cpu)->cpu_capacity_orig = arch_scale_cpu_capacity(sd, cpu);
-- 
2.7.4

> 
> -- 
> Thanks and Regards
> Srikar Dronamraju
> 


* Re: [RFC PATCH 3/4] sched/topology: remove smt_gain
  2018-09-04  9:36     ` Vincent Guittot
@ 2018-09-04 10:37       ` Srikar Dronamraju
  2018-09-05  7:36         ` Vincent Guittot
  2018-09-10 11:05         ` Qais Yousef
  2018-09-10 10:07       ` [tip:sched/core] sched/fair: Fix scale_rt_capacity() for SMT tip-bot for Vincent Guittot
  1 sibling, 2 replies; 25+ messages in thread
From: Srikar Dronamraju @ 2018-09-04 10:37 UTC (permalink / raw)
  To: Vincent Guittot; +Cc: peterz, mingo, linux-kernel

* Vincent Guittot <vincent.guittot@linaro.org> [2018-09-04 11:36:26]:

> Hi Srikar,
> 
> On Tuesday 04 Sep 2018 at 01:24:24 (-0700), Srikar Dronamraju wrote:
> > However after this change, capacity_orig of each SMT thread would be
> > 1024. For example SMT 8 core capacity_orig would now be 8192.
> > 
> > smt_gain was suppose to make a multi threaded core was slightly more
> > powerful than a single threaded core. I suspect if that sometimes hurt
> 
> Is there system with both single threaded and multi threaded core ?
> That was the main open point for me (and for Qais too)
> 

I don't know of any systems that have shipped with both single-threaded
and multi-threaded cores. However, a user can still offline a few threads
in a core while leaving other cores untouched. I don't really know why
somebody would want to do it.  For example, some customer was toying with
SMT 3 mode on an SMT 8 POWER8 box.

> 
> > us when doing load balance between 2 cores i.e at MC or DIE sched
> > domain. Even with 2 threads running on a core, the core might look
> > lightly loaded 2048/8192. Hence might dissuade movement to a idle core.
> 
> Then, there is the sibling flag at SMT level that normally ensures 1 task per
> core for such UC
> 

Agree.

> > 
> > I always wonder why arch_scale_cpu_capacity() is called with NULL
> > sched_domain, in scale_rt_capacity(). This way capacity might actually
> 
> Probably because until this v4.19-rcxx version, the rt scaling was done
> relatively to local cpu capacity:
>   capacity  = arch_scale_cpu() * scale_rt_capacity / SCHED_CAPACITY_SCALE
> 
> Whereas now, it directly returns the remaining capacity
> 
> > be more than the capacity_orig. I am always under an impression that
> > capacity_orig > capacity.  Or am I misunderstanding that?
> 
> You are right, there is a bug for SMT and the patch below should fix it.
> Nevertheless, we still have the problem in some other places in the code.
> 
> Subject: [PATCH] sched/fair: fix scale_rt_capacity() for SMT
> 
> Since commit:
>   commit 523e979d3164 ("sched/core: Use PELT for scale_rt_capacity()")
> scale_rt_capacity() returns the remaining capacity and not a scale factor
> to apply on cpu_capacity_orig. arch_scale_cpu() is directly called by
> scale_rt_capacity() so we must take the sched_domain argument
> 
> Fixes: 523e979d3164 ("sched/core: Use PELT for scale_rt_capacity()")
> Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>

Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>

-- 
Thanks and Regards
Srikar Dronamraju



* Re: [RFC PATCH 3/4] sched/topology: remove smt_gain
  2018-09-04 10:37       ` Srikar Dronamraju
@ 2018-09-05  7:36         ` Vincent Guittot
  2018-09-05  8:50           ` Srikar Dronamraju
  2018-09-10 11:05         ` Qais Yousef
  1 sibling, 1 reply; 25+ messages in thread
From: Vincent Guittot @ 2018-09-05  7:36 UTC (permalink / raw)
  To: Srikar Dronamraju; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

On Tue, 4 Sep 2018 at 12:37, Srikar Dronamraju
<srikar@linux.vnet.ibm.com> wrote:
>
> * Vincent Guittot <vincent.guittot@linaro.org> [2018-09-04 11:36:26]:
>
> > Hi Srikar,
> >
> > On Tuesday 04 Sep 2018 at 01:24:24 (-0700), Srikar Dronamraju wrote:
> > > However after this change, capacity_orig of each SMT thread would be
> > > 1024. For example SMT 8 core capacity_orig would now be 8192.
> > >
> > > smt_gain was suppose to make a multi threaded core was slightly more
> > > powerful than a single threaded core. I suspect if that sometimes hurt
> >
> > Is there system with both single threaded and multi threaded core ?
> > That was the main open point for me (and for Qais too)
> >
>
> I dont know of any systems that have come with single threaded and
> multithreaded. However some user can still offline few threads in a core
> while leaving other cores untouched. I dont really know why somebody
> would want to do it.  For example, some customer was toying with SMT 3
> mode in a SMT 8 power8 box.

In this case, it means that we have the same core capacity whatever the
number of CPUs, and a core with SMT 3 will be set to the same compute
capacity as a core with SMT 8.
Does it still make sense?

>
> >
> > > us when doing load balance between 2 cores i.e at MC or DIE sched
> > > domain. Even with 2 threads running on a core, the core might look
> > > lightly loaded 2048/8192. Hence might dissuade movement to a idle core.
> >
> > Then, there is the sibling flag at SMT level that normally ensures 1 task per
> > core for such UC
> >
>
> Agree.
>
> > >
> > > I always wonder why arch_scale_cpu_capacity() is called with NULL
> > > sched_domain, in scale_rt_capacity(). This way capacity might actually
> >
> > Probably because until this v4.19-rcxx version, the rt scaling was done
> > relatively to local cpu capacity:
> >   capacity  = arch_scale_cpu() * scale_rt_capacity / SCHED_CAPACITY_SCALE
> >
> > Whereas now, it directly returns the remaining capacity
> >
> > > be more than the capacity_orig. I am always under an impression that
> > > capacity_orig > capacity.  Or am I misunderstanding that?
> >
> > You are right, there is a bug for SMT and the patch below should fix it.
> > Nevertheless, we still have the problem in some other places in the code.
> >
> > Subject: [PATCH] sched/fair: fix scale_rt_capacity() for SMT
> >
> > Since commit:
> >   commit 523e979d3164 ("sched/core: Use PELT for scale_rt_capacity()")
> > scale_rt_capacity() returns the remaining capacity and not a scale factor
> > to apply on cpu_capacity_orig. arch_scale_cpu() is directly called by
> > scale_rt_capacity() so we must take the sched_domain argument
> >
> > Fixes: 523e979d3164 ("sched/core: Use PELT for scale_rt_capacity()")
> > Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
>
> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
>
> --
> Thanks and Regards
> Srikar Dronamraju
>


* Re: [RFC PATCH 3/4] sched/topology: remove smt_gain
  2018-09-05  7:36         ` Vincent Guittot
@ 2018-09-05  8:50           ` Srikar Dronamraju
  2018-09-05  9:11             ` Vincent Guittot
  0 siblings, 1 reply; 25+ messages in thread
From: Srikar Dronamraju @ 2018-09-05  8:50 UTC (permalink / raw)
  To: Vincent Guittot; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

* Vincent Guittot <vincent.guittot@linaro.org> [2018-09-05 09:36:42]:

> >
> > I dont know of any systems that have come with single threaded and
> > multithreaded. However some user can still offline few threads in a core
> > while leaving other cores untouched. I dont really know why somebody
> > would want to do it.  For example, some customer was toying with SMT 3
> > mode in a SMT 8 power8 box.
> 
> In this case, it means that we have the same core capacity whatever
> the number of CPUs
> and a core with SMT 3 will be set with the same compute capacity as
> the core with SMT 8.
> Does it still make sense ?
> 

To me it makes sense, at least from a POWER8 perspective, because per-thread
performance goes SMT 1 > SMT 2 > SMT 4 > SMT 8. So if one core is configured
for SMT 2 and another core is configured for SMT 4, all threads being busy,
the individual threads running on the SMT 2 core will complete more work
than the SMT 4 core's threads.

-- 
Thanks and Regards
Srikar Dronamraju



* Re: [RFC PATCH 3/4] sched/topology: remove smt_gain
  2018-09-05  8:50           ` Srikar Dronamraju
@ 2018-09-05  9:11             ` Vincent Guittot
  2018-09-05 11:14               ` Srikar Dronamraju
  0 siblings, 1 reply; 25+ messages in thread
From: Vincent Guittot @ 2018-09-05  9:11 UTC (permalink / raw)
  To: Srikar Dronamraju; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

On Wed, 5 Sep 2018 at 10:50, Srikar Dronamraju
<srikar@linux.vnet.ibm.com> wrote:
>
> * Vincent Guittot <vincent.guittot@linaro.org> [2018-09-05 09:36:42]:
>
> > >
> > > I dont know of any systems that have come with single threaded and
> > > multithreaded. However some user can still offline few threads in a core
> > > while leaving other cores untouched. I dont really know why somebody
> > > would want to do it.  For example, some customer was toying with SMT 3
> > > mode in a SMT 8 power8 box.
> >
> > In this case, it means that we have the same core capacity whatever
> > the number of CPUs
> > and a core with SMT 3 will be set with the same compute capacity as
> > the core with SMT 8.
> > Does it still make sense ?
> >
>
> To me it make sense atleast from  a power 8 perspective, because SMT 1 >
> SMT 2 > SMT 4 > SMT8. So if one core is configured for SMT 2 and other
> core is configured for SMT4; all threads being busy, the  individual
> threads running on SMT2 core will complete more work than SMT 4 core
> threads.

I agree for individual thread capacity, but at the core group level the
SMT 1 core will have the same capacity as the SMT 8 core group, so load
balance will try to balance the tasks evenly between the 2 cores,
whereas core SMT 8 > core SMT 1, isn't it?
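
A sketch of the core-group arithmetic behind this question (assuming the
default smt_gain of 1178; values are purely illustrative):

```c
#include <assert.h>

#define SCHED_CAPACITY_SCALE 1024UL
#define DEFAULT_SMT_GAIN     1178UL

/* Core-group capacity: per-thread capacity summed over the siblings. */
static unsigned long core_capacity(unsigned int nr_threads, int use_smt_gain)
{
	unsigned long thread_cap = SCHED_CAPACITY_SCALE;

	if (use_smt_gain && nr_threads > 1)
		thread_cap = DEFAULT_SMT_GAIN / nr_threads;

	return thread_cap * nr_threads;
}
```

With smt_gain, an SMT 8 core group is only ~15% above a single-threaded
core; without it, the SMT 8 core group reports 8x the capacity.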

>
> --
> Thanks and Regards
> Srikar Dronamraju
>


* Re: [RFC PATCH 3/4] sched/topology: remove smt_gain
  2018-09-05  9:11             ` Vincent Guittot
@ 2018-09-05 11:14               ` Srikar Dronamraju
  2018-09-06  9:21                 ` Vincent Guittot
  0 siblings, 1 reply; 25+ messages in thread
From: Srikar Dronamraju @ 2018-09-05 11:14 UTC (permalink / raw)
  To: Vincent Guittot; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

* Vincent Guittot <vincent.guittot@linaro.org> [2018-09-05 11:11:35]:

> On Wed, 5 Sep 2018 at 10:50, Srikar Dronamraju
> <srikar@linux.vnet.ibm.com> wrote:
> >
> > * Vincent Guittot <vincent.guittot@linaro.org> [2018-09-05 09:36:42]:
> >
> > > >
> > > > I dont know of any systems that have come with single threaded and
> > > > multithreaded. However some user can still offline few threads in a core
> > > > while leaving other cores untouched. I dont really know why somebody
> > > > would want to do it.  For example, some customer was toying with SMT 3
> > > > mode in a SMT 8 power8 box.
> > >
> > > In this case, it means that we have the same core capacity whatever
> > > the number of CPUs
> > > and a core with SMT 3 will be set with the same compute capacity as
> > > the core with SMT 8.
> > > Does it still make sense ?
> > >
> >
> > To me it make sense atleast from  a power 8 perspective, because SMT 1 >
> > SMT 2 > SMT 4 > SMT8. So if one core is configured for SMT 2 and other
> > core is configured for SMT4; all threads being busy, the  individual
> > threads running on SMT2 core will complete more work than SMT 4 core
> > threads.
> 
> I agree for individual thread capacity but at core group level, the
> core SMT 1 will have the same capacity as core group SMT 8 so load
> balance will try to balance evenly the tasks between the 2 cores
> whereas core SMT 8 > core SMT1 , isn't it ?
> 

I believe that core capacity, irrespective of the number of threads,
should be similar. We wanted to give a small benefit if the core has
multiple threads, and that was smt_gain. Let's say we have 8 equal SW
threads running on 2 cores, one being SMT 2 and the other being SMT 4;
then 4 threads should be spread to each core, so that we would be fair
to each of the 8 SW threads.

-- 
Thanks and Regards
Srikar Dronamraju



* Re: [RFC PATCH 3/4] sched/topology: remove smt_gain
  2018-09-05 11:14               ` Srikar Dronamraju
@ 2018-09-06  9:21                 ` Vincent Guittot
  0 siblings, 0 replies; 25+ messages in thread
From: Vincent Guittot @ 2018-09-06  9:21 UTC (permalink / raw)
  To: Srikar Dronamraju; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

On Wed, 5 Sep 2018 at 13:14, Srikar Dronamraju
<srikar@linux.vnet.ibm.com> wrote:
>
> * Vincent Guittot <vincent.guittot@linaro.org> [2018-09-05 11:11:35]:
>
> > On Wed, 5 Sep 2018 at 10:50, Srikar Dronamraju
> > <srikar@linux.vnet.ibm.com> wrote:
> > >
> > > * Vincent Guittot <vincent.guittot@linaro.org> [2018-09-05 09:36:42]:
> > >
> > > > >
> > > > > I dont know of any systems that have come with single threaded and
> > > > > multithreaded. However some user can still offline few threads in a core
> > > > > while leaving other cores untouched. I dont really know why somebody
> > > > > would want to do it.  For example, some customer was toying with SMT 3
> > > > > mode in a SMT 8 power8 box.
> > > >
> > > > In this case, it means that we have the same core capacity whatever
> > > > the number of CPUs
> > > > and a core with SMT 3 will be set with the same compute capacity as
> > > > the core with SMT 8.
> > > > Does it still make sense ?
> > > >
> > >
> > > To me it make sense atleast from  a power 8 perspective, because SMT 1 >
> > > SMT 2 > SMT 4 > SMT8. So if one core is configured for SMT 2 and other
> > > core is configured for SMT4; all threads being busy, the  individual
> > > threads running on SMT2 core will complete more work than SMT 4 core
> > > threads.
> >
> > I agree for individual thread capacity but at core group level, the
> > core SMT 1 will have the same capacity as core group SMT 8 so load
> > balance will try to balance evenly the tasks between the 2 cores
> > whereas core SMT 8 > core SMT1 , isn't it ?
> >
>
> I believe that Core capacity irrespective of the number of threads
> should be similar. We wanted to give a small benefit if the core has
> multiple threads and that was smt_gain. Lets say we have 8 equal sw
> threads running on 2 cores; one being SMT 2 and other being SMT4.
> then 4 threads should be spread to each core. So that we would be fair
> to each of the 8 SW threads.

Do you mean that it would be the same with SMT 2 and SMT 8?
Evenly spreading the 8 SW threads between the 2 cores would be better
than 2 SW threads on the SMT 2 core and 6 on the SMT 8 core?

>
> --
> Thanks and Regards
> Srikar Dronamraju
>


* Re: [RFC PATCH 3/4] sched/topology: remove smt_gain
  2018-08-29 14:08   ` Qais Yousef
  2018-08-29 14:42     ` Vincent Guittot
@ 2018-09-07 12:42     ` Peter Zijlstra
  2018-09-10 11:23       ` Qais Yousef
  1 sibling, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2018-09-07 12:42 UTC (permalink / raw)
  To: Qais Yousef; +Cc: Vincent Guittot, mingo, linux-kernel

On Wed, Aug 29, 2018 at 03:08:38PM +0100, Qais Yousef wrote:
> It was hard to track the history of this and *why* it's needed.

Yeah, my bad..

So at some point I wanted to do dynamic capacity and dynamic smt gain by
using the x86 APERF/MPERF stuff. But it never quite worked and we got
stuck with the remnants.


* [tip:sched/core] sched/fair: Fix scale_rt_capacity() for SMT
  2018-09-04  9:36     ` Vincent Guittot
  2018-09-04 10:37       ` Srikar Dronamraju
@ 2018-09-10 10:07       ` tip-bot for Vincent Guittot
  1 sibling, 0 replies; 25+ messages in thread
From: tip-bot for Vincent Guittot @ 2018-09-10 10:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, linux-kernel, vincent.guittot, srikar, tglx, torvalds,
	hpa, peterz

Commit-ID:  287cdaac5700c5b8970d739f73d742d863d3e2ca
Gitweb:     https://git.kernel.org/tip/287cdaac5700c5b8970d739f73d742d863d3e2ca
Author:     Vincent Guittot <vincent.guittot@linaro.org>
AuthorDate: Tue, 4 Sep 2018 11:36:26 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 10 Sep 2018 10:13:47 +0200

sched/fair: Fix scale_rt_capacity() for SMT

Since commit:

  523e979d3164 ("sched/core: Use PELT for scale_rt_capacity()")

scale_rt_capacity() returns the remaining capacity and not a scale factor
to apply on cpu_capacity_orig. arch_scale_cpu() is directly called by
scale_rt_capacity() so we must take the sched_domain argument.

Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 523e979d3164 ("sched/core: Use PELT for scale_rt_capacity()")
Link: http://lkml.kernel.org/r/20180904093626.GA23936@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c6b7d6daab20..f12d004be6a1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7488,10 +7488,10 @@ static inline int get_sd_load_idx(struct sched_domain *sd,
 	return load_idx;
 }
 
-static unsigned long scale_rt_capacity(int cpu)
+static unsigned long scale_rt_capacity(struct sched_domain *sd, int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
-	unsigned long max = arch_scale_cpu_capacity(NULL, cpu);
+	unsigned long max = arch_scale_cpu_capacity(sd, cpu);
 	unsigned long used, free;
 	unsigned long irq;
 
@@ -7513,7 +7513,7 @@ static unsigned long scale_rt_capacity(int cpu)
 
 static void update_cpu_capacity(struct sched_domain *sd, int cpu)
 {
-	unsigned long capacity = scale_rt_capacity(cpu);
+	unsigned long capacity = scale_rt_capacity(sd, cpu);
 	struct sched_group *sdg = sd->groups;
 
 	cpu_rq(cpu)->cpu_capacity_orig = arch_scale_cpu_capacity(sd, cpu);


* [tip:sched/core] sched/numa: Remove unused code from update_numa_stats()
  2018-08-29 13:19 ` [PATCH 1/4] sched/numa: remove unused code from update_numa_stats() Vincent Guittot
  2018-09-04  6:39   ` Srikar Dronamraju
@ 2018-09-10 10:19   ` tip-bot for Vincent Guittot
  1 sibling, 0 replies; 25+ messages in thread
From: tip-bot for Vincent Guittot @ 2018-09-10 10:19 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, linux-kernel, mingo, mgorman, peterz, riel, tglx, srikar,
	torvalds, vincent.guittot

Commit-ID:  d90707ebebe03596e19de3abbf79b766e72a3465
Gitweb:     https://git.kernel.org/tip/d90707ebebe03596e19de3abbf79b766e72a3465
Author:     Vincent Guittot <vincent.guittot@linaro.org>
AuthorDate: Wed, 29 Aug 2018 15:19:09 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 10 Sep 2018 11:05:55 +0200

sched/numa: Remove unused code from update_numa_stats()

With:

  commit 2d4056fafa19 ("sched/numa: Remove numa_has_capacity()")

the local variables 'smt', 'cpus' and 'capacity' and their results are not used
anymore in numa_has_capacity()

Remove this unused code.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1535548752-4434-2-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 21 +--------------------
 1 file changed, 1 insertion(+), 20 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 06ff75f4ac7b..b65596fae06b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1463,8 +1463,7 @@ struct numa_stats {
  */
 static void update_numa_stats(struct numa_stats *ns, int nid)
 {
-	int smt, cpu, cpus = 0;
-	unsigned long capacity;
+	int cpu;
 
 	memset(ns, 0, sizeof(*ns));
 	for_each_cpu(cpu, cpumask_of_node(nid)) {
@@ -1473,26 +1472,8 @@ static void update_numa_stats(struct numa_stats *ns, int nid)
 		ns->nr_running += rq->nr_running;
 		ns->load += weighted_cpuload(rq);
 		ns->compute_capacity += capacity_of(cpu);
-
-		cpus++;
 	}
 
-	/*
-	 * If we raced with hotplug and there are no CPUs left in our mask
-	 * the @ns structure is NULL'ed and task_numa_compare() will
-	 * not find this node attractive.
-	 *
-	 * We'll detect a huge imbalance and bail there.
-	 */
-	if (!cpus)
-		return;
-
-	/* smt := ceil(cpus / capacity), assumes: 1 < smt_power < 2 */
-	smt = DIV_ROUND_UP(SCHED_CAPACITY_SCALE * cpus, ns->compute_capacity);
-	capacity = cpus / smt; /* cores */
-
-	capacity = min_t(unsigned, capacity,
-		DIV_ROUND_CLOSEST(ns->compute_capacity, SCHED_CAPACITY_SCALE));
 }
 
 struct task_numa_env {


* [tip:sched/core] sched/numa: Remove unused numa_stats::nr_running field
  2018-08-29 13:19 ` [PATCH 2/4] sched/numa: remove unused nr_running field Vincent Guittot
  2018-09-04  6:40   ` Srikar Dronamraju
@ 2018-09-10 10:20   ` tip-bot for Vincent Guittot
  1 sibling, 0 replies; 25+ messages in thread
From: tip-bot for Vincent Guittot @ 2018-09-10 10:20 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, linux-kernel, peterz, hpa, mgorman, riel, tglx,
	vincent.guittot, srikar, torvalds

Commit-ID:  7477a3504e619768c9e972dafe2907e6b8ed9823
Gitweb:     https://git.kernel.org/tip/7477a3504e619768c9e972dafe2907e6b8ed9823
Author:     Vincent Guittot <vincent.guittot@linaro.org>
AuthorDate: Wed, 29 Aug 2018 15:19:10 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 10 Sep 2018 11:05:56 +0200

sched/numa: Remove unused numa_stats::nr_running field

nr_running in struct numa_stats is not used anywhere in the code.

Remove it.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1535548752-4434-3-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b65596fae06b..6bd142d19549 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1454,8 +1454,6 @@ struct numa_stats {
 
 	/* Total compute capacity of CPUs on a node */
 	unsigned long compute_capacity;
-
-	unsigned int nr_running;
 };
 
 /*
@@ -1469,7 +1467,6 @@ static void update_numa_stats(struct numa_stats *ns, int nid)
 	for_each_cpu(cpu, cpumask_of_node(nid)) {
 		struct rq *rq = cpu_rq(cpu);
 
-		ns->nr_running += rq->nr_running;
 		ns->load += weighted_cpuload(rq);
 		ns->compute_capacity += capacity_of(cpu);
 	}


* Re: [RFC PATCH 3/4] sched/topology: remove smt_gain
  2018-09-04 10:37       ` Srikar Dronamraju
  2018-09-05  7:36         ` Vincent Guittot
@ 2018-09-10 11:05         ` Qais Yousef
  1 sibling, 0 replies; 25+ messages in thread
From: Qais Yousef @ 2018-09-10 11:05 UTC (permalink / raw)
  To: Srikar Dronamraju, Vincent Guittot; +Cc: peterz, mingo, linux-kernel

On 04/09/18 11:37, Srikar Dronamraju wrote:
> * Vincent Guittot <vincent.guittot@linaro.org> [2018-09-04 11:36:26]:
>
>> Hi Srikar,
>>
>> On Tuesday 04 Sep 2018 at 01:24:24 (-0700), Srikar Dronamraju wrote:
>>> However after this change, capacity_orig of each SMT thread would be
>>> 1024. For example SMT 8 core capacity_orig would now be 8192.
>>>
>>> smt_gain was supposed to make a multi-threaded core slightly more
>>> powerful than a single-threaded core. I suspect that sometimes hurt
>> Is there a system with both single-threaded and multi-threaded cores?
>> That was the main open point for me (and for Qais too).
>>
> I don't know of any systems that have shipped with both single-threaded
> and multi-threaded cores. However, a user can still offline a few threads
> in a core while leaving other cores untouched. I don't really know why
> somebody would want to do that. For example, one customer was toying with
> SMT 3 mode on an SMT 8 POWER8 box.

What was the customer trying to achieve by this? Did this end up being 
useful?

If we get mixed SMT 3 and SMT 8 in the system we might have issues, but 
again I don't see how this makes sense from a system point of view. I 
don't think the power savings from turning off a few hardware threads 
will be significant, since the core, which I'd expect to be the main 
energy consumer, is still on. From a performance perspective, SMT 3 
might have less contention (hence better 'performance'), but we don't do 
anything special when scheduling work to take advantage of that. So I 
don't think this setup is worth catering for or encouraging.

-- 
Qais Yousef


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 3/4] sched/topology: remove smt_gain
  2018-09-07 12:42     ` Peter Zijlstra
@ 2018-09-10 11:23       ` Qais Yousef
  0 siblings, 0 replies; 25+ messages in thread
From: Qais Yousef @ 2018-09-10 11:23 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Vincent Guittot, mingo, linux-kernel

On 07/09/18 13:42, Peter Zijlstra wrote:
> On Wed, Aug 29, 2018 at 03:08:38PM +0100, Qais Yousef wrote:
>> It was hard to track the history of this and *why* it's needed.
> Yeah, my bad..
>
> So at some point I wanted to do dynamic capacity and dynamic smt gain by
> using the x86 APERF/MPERF stuff. But it never quite worked and we got
> stuck with the remnants.


OK, thanks for the info. I think that was 10 years ago? With all the code 
moves and renaming, git log -L was getting more difficult to use as I dug 
deeper into the history :p

Is this work officially dead then? If yes, did it end up not being useful, 
or did you just never get around to finishing it?

Thanks

-- 
Qais Yousef


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [tip:sched/core] sched/topology: Remove the ::smt_gain field from 'struct sched_domain'
  2018-08-29 13:19 ` [RFC PATCH 3/4] sched/topology: remove smt_gain Vincent Guittot
  2018-08-29 14:08   ` Qais Yousef
  2018-09-04  8:24   ` Srikar Dronamraju
@ 2018-12-11 15:31   ` tip-bot for Vincent Guittot
  2 siblings, 0 replies; 25+ messages in thread
From: tip-bot for Vincent Guittot @ 2018-12-11 15:31 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, mingo, vincent.guittot, linux-kernel, peterz, torvalds,
	hpa, efault

Commit-ID:  765d0af19f5f388a34bf4533378f8398b72ded46
Gitweb:     https://git.kernel.org/tip/765d0af19f5f388a34bf4533378f8398b72ded46
Author:     Vincent Guittot <vincent.guittot@linaro.org>
AuthorDate: Wed, 29 Aug 2018 15:19:11 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 11 Dec 2018 15:16:57 +0100

sched/topology: Remove the ::smt_gain field from 'struct sched_domain'

::smt_gain is used to compute the capacity of the CPUs of an SMT core, with
the constraint 1 < ::smt_gain < 2 (in units of SCHED_CAPACITY_SCALE) so that
the number of CPUs per core can be derived from it. The has_free_capacity
field of struct numa_stats, which was the last user of this computation of
the number of CPUs per core, has been removed by:

  2d4056fafa19 ("sched/numa: Remove numa_has_capacity()")

We can now remove this constraint on core capacity and use the default value
SCHED_CAPACITY_SCALE for SMT CPUs. With this removal, SCHED_CAPACITY_SCALE
becomes the maximum compute capacity of CPUs on every system. This should
help to simplify some code and remove fields like rd->max_cpu_capacity.

Furthermore, arch_scale_cpu_capacity() is called with a NULL sd in several
other places in the code when the capacity of a CPU is wanted to scale some
metrics, e.g. in PELT, deadline or schedutil. In the case of SMT, the value
returned is then not the capacity of the SMT CPU but the default
SCHED_CAPACITY_SCALE.

So remove it.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1535548752-4434-4-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/sched/topology.h | 1 -
 kernel/sched/sched.h           | 3 ---
 kernel/sched/topology.c        | 2 --
 3 files changed, 6 deletions(-)

diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index 6b9976180c1e..7fa0bc17cd8c 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -89,7 +89,6 @@ struct sched_domain {
 	unsigned int newidle_idx;
 	unsigned int wake_idx;
 	unsigned int forkexec_idx;
-	unsigned int smt_gain;
 
 	int nohz_idle;			/* NOHZ IDLE status */
 	int flags;			/* See SD_* */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 9bde60a11805..ceb896404869 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1864,9 +1864,6 @@ unsigned long arch_scale_freq_capacity(int cpu)
 static __always_inline
 unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
 {
-	if (sd && (sd->flags & SD_SHARE_CPUCAPACITY) && (sd->span_weight > 1))
-		return sd->smt_gain / sd->span_weight;
-
 	return SCHED_CAPACITY_SCALE;
 }
 #endif
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 8d7f15ba5916..7364e0b427b7 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1133,7 +1133,6 @@ sd_init(struct sched_domain_topology_level *tl,
 
 		.last_balance		= jiffies,
 		.balance_interval	= sd_weight,
-		.smt_gain		= 0,
 		.max_newidle_lb_cost	= 0,
 		.next_decay_max_lb_cost	= jiffies,
 		.child			= child,
@@ -1164,7 +1163,6 @@ sd_init(struct sched_domain_topology_level *tl,
 
 	if (sd->flags & SD_SHARE_CPUCAPACITY) {
 		sd->imbalance_pct = 110;
-		sd->smt_gain = 1178; /* ~15% */
 
 	} else if (sd->flags & SD_SHARE_PKG_RESOURCES) {
 		sd->imbalance_pct = 117;

^ permalink raw reply related	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2018-12-11 15:31 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-29 13:19 [PATCH 0/4] sched/numa: remove unused code Vincent Guittot
2018-08-29 13:19 ` [PATCH 1/4] sched/numa: remove unused code from update_numa_stats() Vincent Guittot
2018-09-04  6:39   ` Srikar Dronamraju
2018-09-10 10:19   ` [tip:sched/core] sched/numa: Remove " tip-bot for Vincent Guittot
2018-08-29 13:19 ` [PATCH 2/4] sched/numa: remove unused nr_running field Vincent Guittot
2018-09-04  6:40   ` Srikar Dronamraju
2018-09-10 10:20   ` [tip:sched/core] sched/numa: Remove unused numa_stats::nr_running field tip-bot for Vincent Guittot
2018-08-29 13:19 ` [RFC PATCH 3/4] sched/topology: remove smt_gain Vincent Guittot
2018-08-29 14:08   ` Qais Yousef
2018-08-29 14:42     ` Vincent Guittot
2018-09-07 12:42     ` Peter Zijlstra
2018-09-10 11:23       ` Qais Yousef
2018-09-04  8:24   ` Srikar Dronamraju
2018-09-04  9:36     ` Vincent Guittot
2018-09-04 10:37       ` Srikar Dronamraju
2018-09-05  7:36         ` Vincent Guittot
2018-09-05  8:50           ` Srikar Dronamraju
2018-09-05  9:11             ` Vincent Guittot
2018-09-05 11:14               ` Srikar Dronamraju
2018-09-06  9:21                 ` Vincent Guittot
2018-09-10 11:05         ` Qais Yousef
2018-09-10 10:07       ` [tip:sched/core] sched/fair: Fix scale_rt_capacity() for SMT tip-bot for Vincent Guittot
2018-12-11 15:31   ` [tip:sched/core] sched/topology: Remove the ::smt_gain field from 'struct sched_domain' tip-bot for Vincent Guittot
2018-08-29 13:19 ` [RFC PATCH 4/4] sched/topology: remove unused sd param from arch_scale_cpu_capacity() Vincent Guittot
2018-08-29 13:19   ` Vincent Guittot
