* [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems
@ 2016-01-08 14:09 ` Juri Lelli
  0 siblings, 0 replies; 49+ messages in thread
From: Juri Lelli @ 2016-01-08 14:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, linux-arm-kernel, peterz, vincent.guittot, robh+dt,
	mark.rutland, linux, sudeep.holla, lorenzo.pieralisi,
	catalin.marinas, will.deacon, morten.rasmussen, dietmar.eggemann,
	juri.lelli, broonie

Hi all,

this is take 2 of https://lkml.org/lkml/2015/11/23/391; some context follows.

ARM systems may be configured to have CPUs with different power/performance
characteristics within the same chip. In this case, additional information has
to be made available to the kernel (the scheduler in particular) for it to be
aware of such differences and make decisions accordingly. This RFC stems from
the ongoing discussion about introducing a simple platform energy cost model to
guide scheduling decisions (a.k.a. Energy Aware Scheduling [1]), but it is also
meant as an independent track to standardise the way we make the scheduler
aware of heterogeneous CPU systems. With these patches, together with the
patches from [1] (which make the scheduler wakeup paths aware of heterogeneous
CPU systems), we enable the scheduler to have good default performance on such
systems. In addition, we get a clearly defined way of providing the scheduler
with the information it needs about CPU capacity on such systems.

CPU capacity is defined in this context as a number that provides the scheduler
with information about CPU heterogeneity. Such heterogeneity can come from
micro-architectural differences (e.g., ARM big.LITTLE systems) or maximum
frequency at which CPUs can run (e.g., SMP systems with multiple frequency
domains and different max frequencies). Heterogeneity in this context is about
differing performance characteristics; in practice, the binding that we propose
in this RFC tries to capture a first-order approximation of the relative
performance of CPUs.

The second version of this RFC proposes an alternative solution (w.r.t. v1) to
the problem of how to initialize the CPUs' original capacity: we run a bogus
benchmark (for this RFC I simply stole int_sqrt from lib/ and run it in a loop
to perform some integer computation; I'm sure there are better benchmarks
around) on the first CPU of each frequency domain (assuming no u-arch
differences inside domains), measure the time to complete a fixed number of
iterations and then normalize the results to SCHED_CAPACITY_SCALE (1024). I
didn't spend much time polishing this up or thinking about a better benchmark,
as this is an RFC and I'd like discussion to happen before we make this
solution work or look better. However, surprisingly, the results are not that
bad already:

                          LITTLE      big

 TC2-userspace_profile     430        1024
 TC2-dynamic_profile      ~490        1024

 JUNO-userspace_profile    446        1024
 JUNO-dynamic_profile     ~424        1024
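
To make the normalization step concrete, here is a minimal userspace sketch
(not the kernel code from this series). It assumes we already have one elapsed
time per CPU; the helper names min_elapsed() and capacity_from_elapsed() are
made up for this illustration. The fastest CPU gets 1024 and every other CPU
gets a proportionally smaller value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCHED_CAPACITY_SHIFT 10
#define SCHED_CAPACITY_SCALE (1UL << SCHED_CAPACITY_SHIFT)

/* Hypothetical helper: smallest elapsed time, i.e. the fastest CPU (n > 0). */
static uint64_t min_elapsed(const uint64_t *elapsed, size_t n)
{
	uint64_t min = UINT64_MAX;

	for (size_t i = 0; i < n; i++)
		if (elapsed[i] < min)
			min = elapsed[i];

	return min;
}

/*
 * Hypothetical helper: capacity = 1024 * t_min / t_cpu.
 * Integer division truncates, so results are rounded down.
 */
static unsigned long capacity_from_elapsed(uint64_t elapsed_min,
					   uint64_t elapsed_cpu)
{
	assert(elapsed_cpu != 0 && elapsed_min <= elapsed_cpu);

	return (unsigned long)((elapsed_min << SCHED_CAPACITY_SHIFT) /
			       elapsed_cpu);
}
```

With made-up timings of 2000 and 1000 time units, the slower CPU would end up
at 512 and the faster one at SCHED_CAPACITY_SCALE.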

Considering the v1 approach as well, there are currently three proposals for
providing CPU capacity information; each one of course has pros and cons. I'll
try to summarize the long discussion we had about v1 in the list that follows
(mixing in my personal viewpoints :-)). Please don't hesitate to add/comment
(and thanks a lot for the time spent reviewing v1!):

 1. DT (v1)

    pros: - clean and easy to implement
          - standard for both arm and arm64 (and possibly other archs)
          - requires profiling only once and in userspace

    cons: - capacity is not a physical, unequivocally definable property
          - might be incorrectly used for tuning purposes
          - it's a standard, so it requires additional care when defining it

 2. Dynamic profiling at boot (v2)

    pros: - does not require a standardized definition of capacity
          - cannot be incorrectly tuned (once benchmark is fixed)
          - does not require user/integrator work

    cons: - not easy to come up with a clean solution, as it seems interaction
            with several subsystems (e.g., cpufreq) is required
          - not easy to agree upon a single benchmark (that has to be both
            representative and simple enough to run at boot)
          - numbers might (and do) vary from boot to boot

 3. sysfs (v1)

    pros: - clean and super easy to implement
          - values are not required to be physical properties, so defining them
            is probably easier

    cons: - CPU capacities have to be provided after boot (by some init script?)
          - API is modified, still some discussion/review is needed
          - values can still be incorrectly used for runtime tuning purposes

High-level description of the patches:

 o 01/04 cleans up how cpu_scale is initialized in arm (already landed on
   Russell's patch system)
 o 02/04 introduces dynamic profiling of CPU capacity at boot
 o [03-04]/04 enable dynamic profiling for arm and arm64.

The patchset is based on top of tip/sched/core as of today (4.4.0-rc8).

In case you would like to test this out, I pushed a branch here:

 git://linux-arm.org/linux-jl.git upstream/default_caps_v2

This branch contains additional patches, useful to better understand how CPU
capacity information is actually used by the scheduler. Discussion regarding
these additional patches will be started with a different posting in the
future. We just didn't want to make discussion too broad, as we realize that
this set can be controversial already on its own.

Comments, concerns and rants are more than welcome!

Best,

- Juri

Juri Lelli (4):
  ARM: initialize cpu_scale to its default
  drivers/cpufreq: implement init_cpu_capacity_default()
  arm: Enable dynamic CPU capacity initialization
  arm64: Enable dynamic CPU capacity initialization

 arch/arm/kernel/topology.c         |  11 ++-
 arch/arm64/kernel/topology.c       |  17 ++++
 drivers/cpufreq/Makefile           |   2 +-
 drivers/cpufreq/cpufreq.c          |   1 +
 drivers/cpufreq/cpufreq_capacity.c | 161 +++++++++++++++++++++++++++++++++++++
 include/linux/cpufreq.h            |   2 +
 6 files changed, 189 insertions(+), 5 deletions(-)
 create mode 100644 drivers/cpufreq/cpufreq_capacity.c

-- 
2.2.2

* [RFC PATCH v2 1/4] ARM: initialize cpu_scale to its default
  2016-01-08 14:09 ` Juri Lelli
  (?)
@ 2016-01-08 14:09   ` Juri Lelli
  -1 siblings, 0 replies; 49+ messages in thread
From: Juri Lelli @ 2016-01-08 14:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, linux-arm-kernel, peterz, vincent.guittot, robh+dt,
	mark.rutland, linux, sudeep.holla, lorenzo.pieralisi,
	catalin.marinas, will.deacon, morten.rasmussen, dietmar.eggemann,
	juri.lelli, broonie

Instead of looping through all CPUs calling set_capacity_scale(), we can
initialise the cpu_scale per-cpu variables to SCHED_CAPACITY_SCALE at their
definition.
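
As a rough userspace analogue of this change (an illustration only, not the
kernel per-cpu machinery), a plain array can stand in for the per-cpu
variable. NR_CPUS_DEMO, cpu_scale_demo and demo_cpu_capacity() are made-up
names, and the range designator is a GNU C extension: every slot gets its
default at definition, so no init-time loop is needed.

```c
#include <assert.h>

#define SCHED_CAPACITY_SCALE 1024UL
#define NR_CPUS_DEMO 8 /* hypothetical CPU count for this sketch */

/* Every element starts at the default; no loop at init time. */
static unsigned long cpu_scale_demo[NR_CPUS_DEMO] = {
	[0 ... NR_CPUS_DEMO - 1] = SCHED_CAPACITY_SCALE,
};

/* Hypothetical accessor, loosely mirroring arch_scale_cpu_capacity(). */
static unsigned long demo_cpu_capacity(unsigned int cpu)
{
	assert(cpu < NR_CPUS_DEMO);
	return cpu_scale_demo[cpu];
}
```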

Cc: Russell King <linux@arm.linux.org.uk>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 arch/arm/kernel/topology.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index 08b7847..ec279d1 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -40,7 +40,7 @@
  * to run the rebalance_domains for all idle cores and the cpu_capacity can be
  * updated during this sequence.
  */
-static DEFINE_PER_CPU(unsigned long, cpu_scale);
+static DEFINE_PER_CPU(unsigned long, cpu_scale) = SCHED_CAPACITY_SCALE;
 
 unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
 {
@@ -306,8 +306,6 @@ void __init init_cpu_topology(void)
 		cpu_topo->socket_id = -1;
 		cpumask_clear(&cpu_topo->core_sibling);
 		cpumask_clear(&cpu_topo->thread_sibling);
-
-		set_capacity_scale(cpu, SCHED_CAPACITY_SCALE);
 	}
 	smp_wmb();
 
-- 
2.2.2

* [RFC PATCH v2 2/4] drivers/cpufreq: implement init_cpu_capacity_default()
  2016-01-08 14:09 ` Juri Lelli
@ 2016-01-08 14:09   ` Juri Lelli
  -1 siblings, 0 replies; 49+ messages in thread
From: Juri Lelli @ 2016-01-08 14:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, linux-arm-kernel, peterz, vincent.guittot, robh+dt,
	mark.rutland, linux, sudeep.holla, lorenzo.pieralisi,
	catalin.marinas, will.deacon, morten.rasmussen, dietmar.eggemann,
	juri.lelli, broonie, Rafael J. Wysocki, Viresh Kumar

To get default values for CPU capacity we profile a simple (bogus)
integer benchmark on the CPUs; then we normalize the results to 1024
(the highest capacity in the system).

Architectures that want this during boot have to register a cpufreq
driver callback and call this function from there (as we require
cpufreq to be up and running).
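
The per-trial timings are folded into a running average. A minimal userspace
sketch of that kind of incremental mean follows; struct running_avg and
running_avg_add() are names made up for this illustration. Because the
integer division truncates at every step, the result can end up slightly
below the exact mean of the samples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accumulator: current average and number of samples seen. */
struct running_avg {
	uint64_t avg;
	uint64_t count;
};

/*
 * Fold one more sample into the average:
 *   avg' = (avg * count + sample) / (count + 1)
 * The division truncates, so rounding errors can accumulate downwards.
 */
static void running_avg_add(struct running_avg *r, uint64_t sample)
{
	assert(r != NULL);

	r->avg = (r->avg * r->count + sample) / (r->count + 1);
	r->count++;
}
```

Feeding it the samples 10, 20 and 30 gives averages of 10, 15 and 20 in turn.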

Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 arch/arm/kernel/topology.c         |   2 +-
 arch/arm64/kernel/topology.c       |  12 +++
 drivers/cpufreq/Makefile           |   2 +-
 drivers/cpufreq/cpufreq.c          |   1 +
 drivers/cpufreq/cpufreq_capacity.c | 161 +++++++++++++++++++++++++++++++++++++
 include/linux/cpufreq.h            |   2 +
 6 files changed, 178 insertions(+), 2 deletions(-)
 create mode 100644 drivers/cpufreq/cpufreq_capacity.c

diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index ec279d1..c9c87a5 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -47,7 +47,7 @@ unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
 	return per_cpu(cpu_scale, cpu);
 }
 
-static void set_capacity_scale(unsigned int cpu, unsigned long capacity)
+void set_capacity_scale(unsigned int cpu, unsigned long capacity)
 {
 	per_cpu(cpu_scale, cpu) = capacity;
 }
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 694f6de..3b75d63 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -23,6 +23,18 @@
 #include <asm/cputype.h>
 #include <asm/topology.h>
 
+static DEFINE_PER_CPU(unsigned long, cpu_scale) = SCHED_CAPACITY_SCALE;
+
+unsigned long arm_arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
+{
+	return per_cpu(cpu_scale, cpu);
+}
+
+void set_capacity_scale(unsigned int cpu, unsigned long capacity)
+{
+	per_cpu(cpu_scale, cpu) = capacity;
+}
+
 static int __init get_cpu_for_node(struct device_node *node)
 {
 	struct device_node *cpu_node;
diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
index c0af1a1..ca47aea 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
@@ -1,5 +1,5 @@
 # CPUfreq core
-obj-$(CONFIG_CPU_FREQ)			+= cpufreq.o freq_table.o
+obj-$(CONFIG_CPU_FREQ)			+= cpufreq.o freq_table.o cpufreq_capacity.o
 
 # CPUfreq stats
 obj-$(CONFIG_CPU_FREQ_STAT)             += cpufreq_stats.o
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 8412ce5..8720228 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2452,6 +2452,7 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 	}
 
 	register_hotcpu_notifier(&cpufreq_cpu_notifier);
+	cpufreq_init_cpu_capacity();
 	pr_debug("driver %s up and running\n", driver_data->name);
 
 out:
diff --git a/drivers/cpufreq/cpufreq_capacity.c b/drivers/cpufreq/cpufreq_capacity.c
new file mode 100644
index 0000000..2fd5248
--- /dev/null
+++ b/drivers/cpufreq/cpufreq_capacity.c
@@ -0,0 +1,161 @@
+/*
+ * Default CPU capacity calculation for u-arch invariance
+ *
+ * Copyright (C) 2015 ARM Ltd.
+ * Juri Lelli <juri.lelli@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+#include <linux/cpufreq.h>
+#include <linux/sched.h>
+
+static unsigned long long elapsed[NR_CPUS];
+
+/*
+ * Don't let the compiler optimize the following two functions;
+ * otherwise we might lose u-arch differences.
+ * Also, my_int_sqrt is cut-and-paste from lib/int_sqrt.c.
+ */
+static unsigned long __attribute__((optimize("O0")))
+my_int_sqrt(unsigned long x)
+{
+	unsigned long b, m, y = 0;
+
+	if (x <= 1)
+		return x;
+
+	m = 1UL << (BITS_PER_LONG - 2);
+	while (m != 0) {
+		b = y + m;
+		y >>= 1;
+
+		if (x >= b) {
+			x -= b;
+			y += m;
+		}
+		m >>= 2;
+	}
+
+	return y;
+}
+
+static unsigned long __attribute__((optimize("O0")))
+bogus_bench(void)
+{
+	unsigned long i, res;
+
+	for (i = 0; i < 100000; i++)
+		res = my_int_sqrt(i);
+
+	return res;
+}
+
+static int run_bogus_benchmark(int cpu)
+{
+	int ret, trials = 25;
+	u64 begin, end, diff, diff_avg = 0, count = 0;
+	unsigned long res;
+
+	ret = set_cpus_allowed_ptr(current, cpumask_of(cpu));
+	if (ret) {
+		pr_warn("%s: failed to set allowed ptr\n", __func__);
+		return -EINVAL;
+	}
+
+	while (trials--) {
+		begin = local_clock();
+		res = bogus_bench();
+		end = local_clock();
+		diff = end - begin;
+		diff_avg = diff_avg * count + diff;
+		diff_avg = div64_u64(diff_avg, ++count);
+		pr_debug("%s: cpu=%d begin=%llu end=%llu"
+			 " diff=%llu diff_avg=%llu count=%llu res=%lu\n",
+			__func__, cpu, begin, end, diff,
+			diff_avg, count, res);
+	}
+	elapsed[cpu] = diff_avg;
+
+	ret = set_cpus_allowed_ptr(current, cpu_active_mask);
+	if (ret) {
+		pr_warn("%s: failed to set allowed ptr\n", __func__);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+bool __weak arch_wants_init_cpu_capacity(void)
+{
+	return false;
+}
+
+void __weak set_capacity_scale(unsigned int cpu, unsigned long capacity) { }
+
+void cpufreq_init_cpu_capacity(void)
+{
+	int cpu, fcpu;
+	unsigned long long elapsed_min = ULLONG_MAX;
+	unsigned int curr_min, curr_max;
+	struct cpufreq_policy *policy;
+
+	if (!arch_wants_init_cpu_capacity())
+		return;
+
+	for_each_possible_cpu(cpu) {
+		policy = cpufreq_cpu_get(cpu);
+		if (IS_ERR_OR_NULL(policy))
+			return;
+
+		/*
+		 * We profile only first CPU of each frequency domain;
+		 * and use that value as capacity of every CPU in the domain.
+		 */
+		fcpu = cpumask_first(policy->related_cpus);
+		if (cpu != fcpu) {
+			elapsed[cpu] = elapsed[fcpu];
+			cpufreq_cpu_put(policy);
+			continue;
+		}
+
+		down_write(&policy->rwsem);
+		curr_min = policy->user_policy.min;
+		curr_max = policy->user_policy.max;
+		policy->user_policy.min = policy->cpuinfo.max_freq;
+		policy->user_policy.max = policy->cpuinfo.max_freq;
+		up_write(&policy->rwsem);
+		cpufreq_cpu_put(policy);
+		cpufreq_update_policy(cpu);
+
+		run_bogus_benchmark(cpu);
+		if (elapsed[cpu] < elapsed_min)
+			elapsed_min = elapsed[cpu];
+		pr_debug("%s: cpu=%d elapsed=%llu (min=%llu)\n",
+				__func__, cpu, elapsed[cpu], elapsed_min);
+
+		policy = cpufreq_cpu_get(cpu);
+		down_write(&policy->rwsem);
+		policy->user_policy.min = curr_min;
+		policy->user_policy.max = curr_max;
+		up_write(&policy->rwsem);
+		cpufreq_cpu_put(policy);
+		cpufreq_update_policy(cpu);
+	}
+
+	for_each_possible_cpu(cpu) {
+		unsigned long capacity;
+
+		capacity = div64_u64((elapsed_min << 10), elapsed[cpu]);
+		pr_debug("%s: CPU%d capacity=%lu\n", __func__, cpu, capacity);
+		set_capacity_scale(cpu, capacity);
+	}
+
+	pr_info("dynamic CPUs capacity installed\n");
+}
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 177c768..968be47 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -420,6 +420,8 @@ static inline unsigned long cpufreq_scale(unsigned long old, u_int div,
 #endif
 }
 
+void cpufreq_init_cpu_capacity(void);
+
 /*********************************************************************
  *                          CPUFREQ GOVERNORS                        *
  *********************************************************************/
-- 
2.2.2

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [RFC PATCH v2 2/4] drivers/cpufreq: implement init_cpu_capacity_default()
@ 2016-01-08 14:09   ` Juri Lelli
  0 siblings, 0 replies; 49+ messages in thread
From: Juri Lelli @ 2016-01-08 14:09 UTC (permalink / raw)
  To: linux-arm-kernel

To get default values for CPUs capacity we profile a simple (bogus)
integer benchmark on such CPUs; then we normalize results to 1024
(highest capacity in the system).

Architectures that want this during boot have to register a cpufreq
driver callback and call this function from there (as we require
cpufreq to be up and running).

Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 arch/arm/kernel/topology.c         |   2 +-
 arch/arm64/kernel/topology.c       |  12 +++
 drivers/cpufreq/Makefile           |   2 +-
 drivers/cpufreq/cpufreq.c          |   1 +
 drivers/cpufreq/cpufreq_capacity.c | 161 +++++++++++++++++++++++++++++++++++++
 include/linux/cpufreq.h            |   2 +
 6 files changed, 178 insertions(+), 2 deletions(-)
 create mode 100644 drivers/cpufreq/cpufreq_capacity.c

diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index ec279d1..c9c87a5 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -47,7 +47,7 @@ unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
 	return per_cpu(cpu_scale, cpu);
 }
 
-static void set_capacity_scale(unsigned int cpu, unsigned long capacity)
+void set_capacity_scale(unsigned int cpu, unsigned long capacity)
 {
 	per_cpu(cpu_scale, cpu) = capacity;
 }
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 694f6de..3b75d63 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -23,6 +23,18 @@
 #include <asm/cputype.h>
 #include <asm/topology.h>
 
+static DEFINE_PER_CPU(unsigned long, cpu_scale) = SCHED_CAPACITY_SCALE;
+
+unsigned long arm_arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
+{
+	return per_cpu(cpu_scale, cpu);
+}
+
+void set_capacity_scale(unsigned int cpu, unsigned long capacity)
+{
+	per_cpu(cpu_scale, cpu) = capacity;
+}
+
 static int __init get_cpu_for_node(struct device_node *node)
 {
 	struct device_node *cpu_node;
diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
index c0af1a1..ca47aea 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
@@ -1,5 +1,5 @@
 # CPUfreq core
-obj-$(CONFIG_CPU_FREQ)			+= cpufreq.o freq_table.o
+obj-$(CONFIG_CPU_FREQ)			+= cpufreq.o freq_table.o cpufreq_capacity.o
 
 # CPUfreq stats
 obj-$(CONFIG_CPU_FREQ_STAT)             += cpufreq_stats.o
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 8412ce5..8720228 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2452,6 +2452,7 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 	}
 
 	register_hotcpu_notifier(&cpufreq_cpu_notifier);
+	cpufreq_init_cpu_capacity();
 	pr_debug("driver %s up and running\n", driver_data->name);
 
 out:
diff --git a/drivers/cpufreq/cpufreq_capacity.c b/drivers/cpufreq/cpufreq_capacity.c
new file mode 100644
index 0000000..2fd5248
--- /dev/null
+++ b/drivers/cpufreq/cpufreq_capacity.c
@@ -0,0 +1,161 @@
+/*
+ * Default CPU capacity calculation for u-arch invariance
+ *
+ * Copyright (C) 2015 ARM Ltd.
+ * Juri Lelli <juri.lelli@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+#include <linux/cpufreq.h>
+#include <linux/sched.h>
+
+static unsigned long long elapsed[NR_CPUS];
+
+/*
+ * Don't let compiler optimize following two functions;
+ * otherwise we might loose u-arch differences.
+ * Also, my_int_sqrt is cut-and-paste from lib/int_sqrt.c.
+ */
+static unsigned long __attribute__((optimize("O0")))
+my_int_sqrt(unsigned long x)
+{
+	unsigned long b, m, y = 0;
+
+	if (x <= 1)
+		return x;
+
+	m = 1UL << (BITS_PER_LONG - 2);
+	while (m != 0) {
+		b = y + m;
+		y >>= 1;
+
+		if (x >= b) {
+			x -= b;
+			y += m;
+		}
+		m >>= 2;
+	}
+
+	return y;
+}
+
+static unsigned long __attribute__((optimize("O0")))
+bogus_bench(void)
+{
+	unsigned long i, res;
+
+	for (i = 0; i < 100000; i++)
+		res = my_int_sqrt(i);
+
+	return res;
+}
+
+static int run_bogus_benchmark(int cpu)
+{
+	int ret, trials = 25;
+	u64 begin, end, diff, diff_avg = 0, count = 0;
+	unsigned long res;
+
+	ret = set_cpus_allowed_ptr(current, cpumask_of(cpu));
+	if (ret) {
+		pr_warn("%s: failed to set allowed ptr\n", __func__);
+		return -EINVAL;
+	}
+
+	while (trials--) {
+		begin = local_clock();
+		res = bogus_bench();
+		end = local_clock();
+		diff = end - begin;
+		diff_avg = diff_avg * count + diff;
+		diff_avg = div64_u64(diff_avg, ++count);
+		pr_debug("%s: cpu=%d begin=%llu end=%llu"
+			 " diff=%llu diff_avg=%llu count=%llu res=%lu\n",
+			__func__, cpu, begin, end, diff,
+			diff_avg, count, res);
+	}
+	elapsed[cpu] = diff_avg;
+
+	ret = set_cpus_allowed_ptr(current, cpu_active_mask);
+	if (ret) {
+		pr_warn("%s: failed to set allowed ptr\n", __func__);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+bool __weak arch_wants_init_cpu_capacity(void)
+{
+	return false;
+}
+
+void __weak set_capacity_scale(int cpu, unsigned long capacity) { }
+
+void cpufreq_init_cpu_capacity(void)
+{
+	int cpu, fcpu;
+	unsigned long long elapsed_min = ULLONG_MAX;
+	unsigned int curr_min, curr_max;
+	struct cpufreq_policy *policy;
+
+	if (!arch_wants_init_cpu_capacity())
+		return;
+
+	for_each_possible_cpu(cpu) {
+		policy = cpufreq_cpu_get(cpu);
+		if (IS_ERR_OR_NULL(policy))
+			return;
+
+		/*
+		 * We profile only first CPU of each frequency domain;
+		 * and use that value as capacity of every CPU in the domain.
+		 */
+		fcpu = cpumask_first(policy->related_cpus);
+		if (cpu != fcpu) {
+			elapsed[cpu] = elapsed[fcpu];
+			cpufreq_cpu_put(policy);
+			continue;
+		}
+
+		down_write(&policy->rwsem);
+		curr_min = policy->user_policy.min;
+		curr_max = policy->user_policy.max;
+		policy->user_policy.min = policy->cpuinfo.max_freq;
+		policy->user_policy.max = policy->cpuinfo.max_freq;
+		up_write(&policy->rwsem);
+		cpufreq_cpu_put(policy);
+		cpufreq_update_policy(cpu);
+
+		run_bogus_benchmark(cpu);
+		if (elapsed[cpu] < elapsed_min)
+			elapsed_min = elapsed[cpu];
+		pr_debug("%s: cpu=%d elapsed=%llu (min=%llu)\n",
+				__func__, cpu, elapsed[cpu], elapsed_min);
+
+		policy = cpufreq_cpu_get(cpu);
+		down_write(&policy->rwsem);
+		policy->user_policy.min = curr_min;
+		policy->user_policy.max = curr_max;
+		up_write(&policy->rwsem);
+		cpufreq_cpu_put(policy);
+		cpufreq_update_policy(cpu);
+	}
+
+	for_each_possible_cpu(cpu) {
+		unsigned long capacity;
+
+		capacity = div64_u64((elapsed_min << 10), elapsed[cpu]);
+		pr_debug("%s: CPU%d capacity=%lu\n", __func__, cpu, capacity);
+		set_capacity_scale(cpu, capacity);
+	}
+
+	pr_info("dynamic CPUs capacity installed\n");
+}
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 177c768..968be47 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -420,6 +420,8 @@ static inline unsigned long cpufreq_scale(unsigned long old, u_int div,
 #endif
 }
 
+void cpufreq_init_cpu_capacity(void);
+
 /*********************************************************************
  *                          CPUFREQ GOVERNORS                        *
  *********************************************************************/
-- 
2.2.2
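
For readers following the averaging step in run_bogus_benchmark() above, here is a minimal userspace sketch of the same incremental mean. The function names running_mean_update() and running_mean() are made up for illustration and are not part of the patch. The point is only to show that the update avg = (avg * count + sample) / (count + 1) uses integer division, so the stored average can drift slightly below the exact mean.

```c
#include <stdint.h>

/*
 * Userspace sketch (hypothetical names) of the incremental mean kept by
 * the benchmark loop:
 *
 *     avg_new = (avg_old * count + sample) / (count + 1)
 *
 * Integer division truncates, so the result can sit slightly below the
 * exact arithmetic mean of the samples.
 */
uint64_t running_mean_update(uint64_t avg, uint64_t *count, uint64_t sample)
{
	avg = avg * (*count) + sample;
	*count += 1;
	return avg / *count;
}

/* Fold n samples, one at a time, into a single truncated running mean. */
uint64_t running_mean(const uint64_t *samples, uint64_t n)
{
	uint64_t avg = 0, count = 0, i;

	for (i = 0; i < n; i++)
		avg = running_mean_update(avg, &count, samples[i]);

	return avg;
}
```

With samples {10, 20, 30} the running mean is exact (20). With {1, 2, 2} it truncates to 1 although the exact mean is about 1.67, which is harmless at nanosecond resolution but worth keeping in mind.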

* [RFC PATCH v2 3/4] arm: Enable dynamic CPU capacity initialization
  2016-01-08 14:09 ` Juri Lelli
@ 2016-01-08 14:09   ` Juri Lelli
  -1 siblings, 0 replies; 49+ messages in thread
From: Juri Lelli @ 2016-01-08 14:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, linux-arm-kernel, peterz, vincent.guittot, robh+dt,
	mark.rutland, linux, sudeep.holla, lorenzo.pieralisi,
	catalin.marinas, will.deacon, morten.rasmussen, dietmar.eggemann,
	juri.lelli, broonie

Define arch_wants_init_cpu_capacity() to return true, so that
cpufreq_init_cpu_capacity() can go ahead and profile CPU capacities
at boot time.

Cc: Russell King <linux@arm.linux.org.uk>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 arch/arm/kernel/topology.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index c9c87a5..7d7fc2c 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -52,6 +52,11 @@ void set_capacity_scale(unsigned int cpu, unsigned long capacity)
 	per_cpu(cpu_scale, cpu) = capacity;
 }
 
+bool arch_wants_init_cpu_capacity(void)
+{
+	return true;
+}
+
 #ifdef CONFIG_OF
 struct cpu_efficiency {
 	const char *compatible;
-- 
2.2.2

* [RFC PATCH v2 4/4] arm64: Enable dynamic CPU capacity initialization
  2016-01-08 14:09 ` Juri Lelli
@ 2016-01-08 14:09   ` Juri Lelli
  -1 siblings, 0 replies; 49+ messages in thread
From: Juri Lelli @ 2016-01-08 14:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, linux-arm-kernel, peterz, vincent.guittot, robh+dt,
	mark.rutland, linux, sudeep.holla, lorenzo.pieralisi,
	catalin.marinas, will.deacon, morten.rasmussen, dietmar.eggemann,
	juri.lelli, broonie

Define arch_wants_init_cpu_capacity() to return true, so that
cpufreq_init_cpu_capacity() can go ahead and profile CPU capacities
at boot time.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 arch/arm64/kernel/topology.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 3b75d63..f2513a6 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -35,6 +35,11 @@ void set_capacity_scale(unsigned int cpu, unsigned long capacity)
 	per_cpu(cpu_scale, cpu) = capacity;
 }
 
+bool arch_wants_init_cpu_capacity(void)
+{
+	return true;
+}
+
 static int __init get_cpu_for_node(struct device_node *node)
 {
 	struct device_node *cpu_node;
-- 
2.2.2

* Re: [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems
  2016-01-08 14:09 ` Juri Lelli
@ 2016-01-15 18:01   ` Mark Brown
  -1 siblings, 0 replies; 49+ messages in thread
From: Mark Brown @ 2016-01-15 18:01 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, linux-arm-kernel, peterz,
	vincent.guittot, robh+dt, mark.rutland, linux, sudeep.holla,
	lorenzo.pieralisi, catalin.marinas, will.deacon,
	morten.rasmussen, dietmar.eggemann

On Fri, Jan 08, 2016 at 02:09:28PM +0000, Juri Lelli wrote:

> Second version of this RFC proposes an alternative solution (w.r.t. v1) to the
> problem of how do we init CPUs original capacity: we run a bogus benchmark (for
> this RFC I simple stole int_sqrt from lib/ and I run that in a loop to perform
> some integer computation, I'm sure there are better benchmarks around) on the
> first cpu of each frequency domain (assuming no u-arch differences inside
> domains), measure time to complete a fixed number of iterations and then
> normalize results to SCHED_CAPACITY_SCALE (1024). I didn't spend much time in
> polishing this up or thinking about a better benchmark, as this is an RFC and
> I'd like discussion happening before we make this solution better
> working/looking. However, surprisingly, results are not that bad already:

This approach looks good to me - certainly vastly preferable to putting
the numbers into DT.

>  2. Dynamic profiling at boot (v2)
> 
>     pros: - does not require a standardized definition of capacity
>           - cannot be incorrectly tuned (once benchmark is fixed)
>           - does not require user/integrator work

>     cons: - not easy to come up with a clean solution, as it seems interaction
>             with several subsystems (e.g., cpufreq) is required

This actually seems to be pretty clean.

>           - not easy to agree upon a single benchmark (that has to be both
>             representative and simple enough to run at boot)
>           - numbers might (and do) vary from boot to boot

This does come back to the question of how accurate the numbers need to
be - is "good enough" fine?
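
To make the "how accurate" question concrete, the normalization step described in the quoted cover letter can be sketched in userspace C roughly as follows. This is an illustration under assumptions, not the patch itself: capacities_from_elapsed() and CAPACITY_SHIFT are hypothetical names, and the only thing taken from the series is the arithmetic (fastest CPU gets 1024, every other CPU gets elapsed_min * 1024 / elapsed).

```c
#include <stddef.h>
#include <stdint.h>

#define CAPACITY_SHIFT 10	/* 1 << 10 == 1024 == SCHED_CAPACITY_SCALE */

/*
 * elapsed[i] is the averaged benchmark time measured for CPU i; the unit
 * does not matter as long as it is the same for every CPU. The fastest
 * CPU is assigned 1024 and the others are scaled down in proportion to
 * how much longer they took.
 *
 * Returns 0 on success, -1 if the input is unusable (no CPUs, or a zero
 * elapsed time that would cause a division by zero).
 */
int capacities_from_elapsed(const uint64_t *elapsed, unsigned long *capacity,
			    size_t ncpus)
{
	uint64_t elapsed_min = UINT64_MAX;
	size_t i;

	if (ncpus == 0)
		return -1;

	for (i = 0; i < ncpus; i++) {
		if (elapsed[i] == 0)
			return -1;
		if (elapsed[i] < elapsed_min)
			elapsed_min = elapsed[i];
	}

	for (i = 0; i < ncpus; i++)
		capacity[i] = (unsigned long)
			((elapsed_min << CAPACITY_SHIFT) / elapsed[i]);

	return 0;
}
```

As a worked example, elapsed times of {1000, 2000} give capacities {1024, 512}. If the slower CPU measures 2100 on another boot (5% noise), its capacity becomes 487, so a few percent of timing jitter translates into a few percent of capacity.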

* Re: [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems
  2016-01-08 14:09 ` Juri Lelli
@ 2016-01-15 19:50   ` Steve Muckle
  -1 siblings, 0 replies; 49+ messages in thread
From: Steve Muckle @ 2016-01-15 19:50 UTC (permalink / raw)
  To: Juri Lelli, linux-kernel
  Cc: linux-pm, linux-arm-kernel, peterz, vincent.guittot, robh+dt,
	mark.rutland, linux, sudeep.holla, lorenzo.pieralisi,
	catalin.marinas, will.deacon, morten.rasmussen, dietmar.eggemann,
	broonie

On 01/08/2016 06:09 AM, Juri Lelli wrote:
>  2. Dynamic profiling at boot (v2)
> 
>     pros: - does not require a standardized definition of capacity
>           - cannot be incorrectly tuned (once benchmark is fixed)
>           - does not require user/integrator work
> 
>     cons: - not easy to come up with a clean solution, as it seems interaction
>             with several subsystems (e.g., cpufreq) is required
>           - not easy to agree upon a single benchmark (that has to be both
>             representative and simple enough to run at boot)
>           - numbers might (and do) vary from boot to boot

An important additional con that was mentioned earlier IIRC was the
additional boot time required for the benchmark. Perhaps there could be
a kernel command line argument to bypass the benchmark if it is known
that predetermined values will be provided via sysfs later?

Though there may be another issue with that as mentioned below.

>  3. sysfs (v1)
> 
>     pros: - clean and super easy to implement
>           - values don't require to be physical properties, defining them is
>             probably easier
> 
>     cons: - CPUs capacity have to be provided after boot (by some init script?)
>           - API is modified, still some discussion/review is needed
>           - values can still be incorrectly used for runtime tuning purposes

Initializing the values via userspace init will cause more of the boot
process to run with incorrect CPU capacity values. Boot times may be
increased with tasks running on suboptimal CPUs. Such increases may also
not be deterministic.

Extending the kernel command line idea above, perhaps capacity values
could be provided there as well, similar to the lpj parameter? That has
scalability issues though if there's a huge highly heterogeneous platform...

DT solves these issues and would be the perfect place for this - we are
defining the compute capacity of a CPU which is a property of the
hardware. However there are a couple things forcing us to compromise.
One is that the amount and detail of information required to adequately
capture the computational abilities of a CPU across all possible
workloads seem onerous to collect and enumerate. The second is that even
if we were willing to undertake that, CPU vendors probably won't be
forthcoming with that information.

Despite this DT still seems to me like the best way to go. At their
heart these are properties of the hardware, even if we can't specify
them as such per se because of the problems above. The capacity would
have to be defined as a relative value among CPUs. And while it's true
it may be abused for tuning purposes, that's true of any strategy.
That is certainly true of the sysfs strategy, and even if only a dynamic
option is provided, it is guaranteed to be hacked by platform vendors.

thanks,
Steve
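
As a rough illustration of the command line idea above, parsing a per-CPU capacity list could look something like the userspace sketch below. Everything here is an assumption made for the example: no "cpu_capacity=" parameter exists, the name parse_capacity_list() is made up, and the accepted range of 1..1024 is just one possible choice. In the kernel this would presumably be hooked up through early_param() or __setup(), the same way lpj= is handled.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Parse a comma-separated list such as "1024,512,512" (the value of a
 * hypothetical cpu_capacity= parameter) into out[].
 *
 * Returns the number of values parsed, or -1 if the string is malformed,
 * a value falls outside 1..1024, or more than max values are given.
 */
int parse_capacity_list(const char *arg, unsigned long *out, size_t max)
{
	size_t n = 0;
	char *end;

	if (!arg || !*arg)
		return -1;

	for (;;) {
		unsigned long val;

		if (n == max)
			return -1;
		/* Reject empty fields, signs and leading whitespace. */
		if (*arg < '0' || *arg > '9')
			return -1;

		errno = 0;
		val = strtoul(arg, &end, 10);
		if (errno || end == arg || val == 0 || val > 1024)
			return -1;

		out[n++] = val;

		if (*end == '\0')
			break;
		if (*end != ',')
			return -1;
		arg = end + 1;
	}

	return (int)n;
}
```

The scalability concern is visible even in this sketch: one value per CPU means the string grows linearly with the CPU count, and the order has to match the kernel's CPU numbering.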

* Re: [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems
  2016-01-15 18:01   ` Mark Brown
@ 2016-01-18 15:01     ` Juri Lelli
  -1 siblings, 0 replies; 49+ messages in thread
From: Juri Lelli @ 2016-01-18 15:01 UTC (permalink / raw)
  To: Mark Brown
  Cc: linux-kernel, linux-pm, linux-arm-kernel, peterz,
	vincent.guittot, robh+dt, mark.rutland, linux, sudeep.holla,
	lorenzo.pieralisi, catalin.marinas, will.deacon,
	morten.rasmussen, dietmar.eggemann

Hi Mark,

On 15/01/16 18:01, Mark Brown wrote:
> On Fri, Jan 08, 2016 at 02:09:28PM +0000, Juri Lelli wrote:
> 
> > Second version of this RFC proposes an alternative solution (w.r.t. v1) to the
> > problem of how do we init CPUs original capacity: we run a bogus benchmark (for
> > this RFC I simple stole int_sqrt from lib/ and I run that in a loop to perform
> > some integer computation, I'm sure there are better benchmarks around) on the
> > first cpu of each frequency domain (assuming no u-arch differences inside
> > domains), measure time to complete a fixed number of iterations and then
> > normalize results to SCHED_CAPACITY_SCALE (1024). I didn't spend much time in
> > polishing this up or thinking about a better benchmark, as this is an RFC and
> > I'd like discussion happening before we make this solution better
> > working/looking. However, surprisingly, results are not that bad already:
> 
> This approach looks good to me - certainly vastly preferable to putting
> the numbers into DT.
> 
> >  2. Dynamic profiling at boot (v2)
> > 
> >     pros: - does not require a standardized definition of capacity
> >           - cannot be incorrectly tuned (once benchmark is fixed)
> >           - does not require user/integrator work
> 
> >     cons: - not easy to come up with a clean solution, as it seems interaction
> >             with several subsystems (e.g., cpufreq) is required
> 
> This actually seems to be pretty clean.
> 

Oh, maybe I was overly critical of my code then. :)

> >           - not easy to agree upon a single benchmark (that has to be both
> >             representative and simple enough to run at boot)
> >           - numbers might (and do) vary from boot to boot
> 
> This does come back to the question of how accurate the numbers need to
> be - is "good enough" fine?

Here we probably also have to account for the fact that numbers might
actually change from boot to boot. They might be good enough on average,
but you might have cases in which, for some reason, they turn out to be
far from usual values. Not that I see that happening much on my boxes,
but I guess it can happen.

Best,

- Juri

* Re: [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems
  2016-01-15 19:50   ` Steve Muckle
@ 2016-01-18 15:13     ` Juri Lelli
  -1 siblings, 0 replies; 49+ messages in thread
From: Juri Lelli @ 2016-01-18 15:13 UTC (permalink / raw)
  To: Steve Muckle
  Cc: linux-kernel, linux-pm, linux-arm-kernel, peterz,
	vincent.guittot, robh+dt, mark.rutland, linux, sudeep.holla,
	lorenzo.pieralisi, catalin.marinas, will.deacon,
	morten.rasmussen, dietmar.eggemann, broonie

Hi Steve,

On 15/01/16 11:50, Steve Muckle wrote:
> On 01/08/2016 06:09 AM, Juri Lelli wrote:
> >  2. Dynamic profiling at boot (v2)
> > 
> >     pros: - does not require a standardized definition of capacity
> >           - cannot be incorrectly tuned (once benchmark is fixed)
> >           - does not require user/integrator work
> > 
> >     cons: - not easy to come up with a clean solution, as it seems interaction
> >             with several subsystems (e.g., cpufreq) is required
> >           - not easy to agree upon a single benchmark (that has to be both
> >             representative and simple enough to run at boot)
> >           - numbers might (and do) vary from boot to boot
> 
> An important additional con that was mentioned earlier IIRC was the
> additional boot time required for the benchmark.

Right. I forgot about that.

> Perhaps there could be
> a kernel command line argument to bypass the benchmark if it is known
> that predetermined values will be provided via sysfs later?
> 

This might work, yes.

> Though there may be another issue with that as mentioned below.
> 
> >  3. sysfs (v1)
> > 
> >     pros: - clean and super easy to implement
> >           - values don't require to be physical properties, defining them is
> >             probably easier
> > 
> >     cons: - CPUs capacity have to be provided after boot (by some init script?)
> >           - API is modified, still some discussion/review is needed
> >           - values can still be incorrectly used for runtime tuning purposes
> 
> Initializing the values via userspace init will cause more of the boot
> process to run with incorrect CPU capacity values. Boot times may be
> increased with tasks running on suboptimal CPUs. Such increases may also
> not be deterministic.
> 
> Extending the kernel command line idea above, perhaps capacity values
> could be provided there as well, similar to the lpj parameter? That has
> scalability issues though if there's a huge highly heterogeneous platform...
> 

Yeah, adding such an option is not difficult, but I'm also a bit concerned
about the scalability of such a thing.

> DT solves these issues and would be the perfect place for this - we are
> defining the compute capacity of a CPU which is a property of the
> hardware. However there are a couple things forcing us to compromise.
> One is that the amount and detail of information required to adequately
> capture the computational abilities of a CPU across all possible
> workloads seem onerous to collect and enumerate. The second is that even
> if we were willing to undertake that, CPU vendors probably won't be
> forthcoming with that information.
> 

You mean because they won't publish performance data of their hw?

But we already use per-platform normalized values (as you are proposing
below), so a platform-to-platform comparison doesn't make sense.

> Despite this DT still seems to me like the best way to go. At their
> heart these are properties of the hardware, even if we can't specify
> them as such per se because of the problems above. The capacity would
> have to be defined as a relative value among CPUs. And while it's true
> it may be abused for tuning purposes, that's true of any strategy.
> Certainly the sysfs strategy and even if only a dynamic option is
> provided, it is guaranteed to be hacked by platform vendors.

I also like the DT approach and consider the sysfs option as something
that can go together with any solution we want to adopt.

Best,

- Juri

* Re: [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems
  2016-01-18 15:13     ` Juri Lelli
@ 2016-01-18 16:13       ` Vincent Guittot
  -1 siblings, 0 replies; 49+ messages in thread
From: Vincent Guittot @ 2016-01-18 16:13 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Steve Muckle, linux-kernel, linux-pm, LAK, Peter Zijlstra,
	Rob Herring, Mark Rutland, Russell King - ARM Linux,
	Sudeep Holla, Lorenzo Pieralisi, Catalin Marinas, Will Deacon,
	Morten Rasmussen, Dietmar Eggemann, Mark Brown

On 18 January 2016 at 16:13, Juri Lelli <juri.lelli@arm.com> wrote:
> Hi Steve,
>
> On 15/01/16 11:50, Steve Muckle wrote:
>> On 01/08/2016 06:09 AM, Juri Lelli wrote:
>> >  2. Dynamic profiling at boot (v2)
>> >
>> >     pros: - does not require a standardized definition of capacity
>> >           - cannot be incorrectly tuned (once benchmark is fixed)
>> >           - does not require user/integrator work
>> >
>> >     cons: - not easy to come up with a clean solution, as it seems interaction
>> >             with several subsystems (e.g., cpufreq) is required
>> >           - not easy to agree upon a single benchmark (that has to be both
>> >             representative and simple enough to run at boot)
>> >           - numbers might (and do) vary from boot to boot
>>
>> An important additional con that was mentioned earlier IIRC was the
>> additional boot time required for the benchmark.
>
> Right. I forgot about that.
>
>> Perhaps there could be
>> a kernel command line argument to bypass the benchmark if it is known
>> that predetermined values will be provided via sysfs later?
>>
>
> This might work, yes.

Instead of command line, I prefer to use DT.

Can't we use something similar to what is currently done in arm arch
for the early stage of the boot? We don't have to provide a performance
value for which it's difficult to find a consensus on how to define it
and which benchmark should be used. We use the micro arch and the
frequency of the core to define a relative capacity. This gives us a
relatively good idea of the capacity of each core.
Then, the dynamic profiling can update it with a more accurate value
during the boot.
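
A minimal sketch of that idea, assuming capacity is approximated as a per-micro-architecture weight multiplied by the maximum frequency and then scaled so the most capable CPU gets 1024. The function name, the weights, and the frequencies are all hypothetical, and the arithmetic is deliberately simpler than what the arm topology code does.

```python
SCHED_CAPACITY_SCALE = 1024  # scheduler's value for a full-capacity CPU


def relative_capacities(cpus):
    """Estimate per-CPU capacity from micro-architecture and frequency.

    `cpus` is a list of (efficiency, max_freq_khz) pairs. The efficiency
    figures are hypothetical per-micro-architecture weights, not the values
    hardcoded in the arm topology code.
    """
    raw = [efficiency * max_freq_khz for efficiency, max_freq_khz in cpus]
    top = max(raw)
    # Integer scaling so the most capable CPU ends up at exactly 1024.
    return [r * SCHED_CAPACITY_SCALE // top for r in raw]


# Made-up big.LITTLE system: the big core is twice as efficient per cycle
# and clocked twice as high as the little core.
capacities = relative_capacities([(2000, 2_000_000), (1000, 1_000_000)])
```

With these made-up inputs the little core ends up at a quarter of the big core's capacity, which is the kind of first-order estimate that dynamic profiling could later refine.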

>
>> Though there may be another issue with that as mentioned below.
>>
>> >  3. sysfs (v1)
>> >
>> >     pros: - clean and super easy to implement
>> >           - values don't require to be physical properties, defining them is
>> >             probably easier
>> >
>> >     cons: - CPUs capacity have to be provided after boot (by some init script?)
>> >           - API is modified, still some discussion/review is needed
>> >           - values can still be incorrectly used for runtime tuning purposes
>>
>> Initializing the values via userspace init will cause more of the boot
>> process to run with incorrect CPU capacity values. Boot times may be
>> increased with tasks running on suboptimal CPUs. Such increases may also
>> not be deterministic.
>>
>> Extending the kernel command line idea above, perhaps capacity values
>> could be provided there as well, similar to the lpj parameter? That has
>> scalability issues though if there's a huge highly heterogeneous platform...
>>
>
> Yeah, adding such option is not difficult, but I'm also a bit concerned
> about the scalability of such a thing.
>
>> DT solves these issues and would be the perfect place for this - we are
>> defining the compute capacity of a CPU which is a property of the
>> hardware. However there are a couple things forcing us to compromise.
>> One is that the amount and detail of information required to adequately
>> capture the computational abilities of a CPU across all possible
>> workloads seem onerous to collect and enumerate. The second is that even
>> if we were willing to undertake that, CPU vendors probably won't be
>> forthcoming with that information.
>>
>
> You mean because they won't publish performance data of their hw?
>
> But we already use per-platform normalized values (as you are proposing
> below), so a platform-to-platform comparison doesn't make sense.
>
>> Despite this DT still seems to me like the best way to go. At their
>> heart these are properties of the hardware, even if we can't specify
>> them as such per se because of the problems above. The capacity would
>> have to be defined as a relative value among CPUs. And while it's true
>> it may be abused for tuning purposes, that's true of any strategy.
>> Certainly the sysfs strategy and even if only a dynamic option is
>> provided, it is guaranteed to be hacked by platform vendors.
>
> I also like the DT approach and consider the sysfs option as something
> that can go together with any solution we want to adopt.

I'm not sure that we should consider sysfs as an option because of all
the concerns that have already been put forward.
I would prefer debugfs if we have to play with these capacity values in
order to test their accuracy.

Regards,
Vincent
>
> Best,
>
> - Juri

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems
  2016-01-18 16:13       ` Vincent Guittot
@ 2016-01-18 16:30         ` Juri Lelli
  -1 siblings, 0 replies; 49+ messages in thread
From: Juri Lelli @ 2016-01-18 16:30 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Steve Muckle, linux-kernel, linux-pm, LAK, Peter Zijlstra,
	Rob Herring, Mark Rutland, Russell King - ARM Linux,
	Sudeep Holla, Lorenzo Pieralisi, Catalin Marinas, Will Deacon,
	Morten Rasmussen, Dietmar Eggemann, Mark Brown

Hi Vincent,

On 18/01/16 17:13, Vincent Guittot wrote:
> On 18 January 2016 at 16:13, Juri Lelli <juri.lelli@arm.com> wrote:
> > Hi Steve,
> >
> > On 15/01/16 11:50, Steve Muckle wrote:
> >> On 01/08/2016 06:09 AM, Juri Lelli wrote:
> >> >  2. Dynamic profiling at boot (v2)
> >> >
> >> >     pros: - does not require a standardized definition of capacity
> >> >           - cannot be incorrectly tuned (once benchmark is fixed)
> >> >           - does not require user/integrator work
> >> >
> >> >     cons: - not easy to come up with a clean solution, as it seems interaction
> >> >             with several subsystems (e.g., cpufreq) is required
> >> >           - not easy to agree upon a single benchmark (that has to be both
> >> >             representative and simple enough to run at boot)
> >> >           - numbers might (and do) vary from boot to boot
> >>
> >> An important additional con that was mentioned earlier IIRC was the
> >> additional boot time required for the benchmark.
> >
> > Right. I forgot about that.
> >
> >> Perhaps there could be
> >> a kernel command line argument to bypass the benchmark if it is known
> >> that predetermined values will be provided via sysfs later?
> >>
> >
> > This might work, yes.
> 
> Instead of command line, I prefer to use DT.
> 
> Can't we use something similar to what is currently done in arm arch
> for the early stage of the boot? We don't have to provide a performance
> value for which it's difficult to find a consensus on how to define it
> and which benchmark should be used. We use the micro arch and the
> frequency of the core to define a relative capacity. This gives us a
> relatively good idea of the capacity of each core.

I'm not sure I understand what you are proposing. arm arch is currently
based on having static hardcoded data (efficiency values). But this has
already been NACKed for arm64 during the last review of this RFC.

Are you proposing something different?

Thanks,

- Juri

> Then, the dynamic profiling can update it with a more accurate value
> during the boot.
> 
> >
> >> Though there may be another issue with that as mentioned below.
> >>
> >> >  3. sysfs (v1)
> >> >
> >> >     pros: - clean and super easy to implement
> >> >           - values don't require to be physical properties, defining them is
> >> >             probably easier
> >> >
> >> >     cons: - CPUs capacity have to be provided after boot (by some init script?)
> >> >           - API is modified, still some discussion/review is needed
> >> >           - values can still be incorrectly used for runtime tuning purposes
> >>
> >> Initializing the values via userspace init will cause more of the boot
> >> process to run with incorrect CPU capacity values. Boot times may be
> >> increased with tasks running on suboptimal CPUs. Such increases may also
> >> not be deterministic.
> >>
> >> Extending the kernel command line idea above, perhaps capacity values
> >> could be provided there as well, similar to the lpj parameter? That has
> >> scalability issues though if there's a huge highly heterogeneous platform...
> >>
> >
> > Yeah, adding such option is not difficult, but I'm also a bit concerned
> > about the scalability of such a thing.
> >
> >> DT solves these issues and would be the perfect place for this - we are
> >> defining the compute capacity of a CPU which is a property of the
> >> hardware. However there are a couple things forcing us to compromise.
> >> One is that the amount and detail of information required to adequately
> >> capture the computational abilities of a CPU across all possible
> >> workloads seem onerous to collect and enumerate. The second is that even
> >> if we were willing to undertake that, CPU vendors probably won't be
> >> forthcoming with that information.
> >>
> >
> > You mean because they won't publish performance data of their hw?
> >
> > But we already use per-platform normalized values (as you are proposing
> > below), so a platform-to-platform comparison doesn't make sense.
> >
> >> Despite this DT still seems to me like the best way to go. At their
> >> heart these are properties of the hardware, even if we can't specify
> >> them as such per se because of the problems above. The capacity would
> >> have to be defined as a relative value among CPUs. And while it's true
> >> it may be abused for tuning purposes, that's true of any strategy.
> >> Certainly the sysfs strategy and even if only a dynamic option is
> >> provided, it is guaranteed to be hacked by platform vendors.
> >
> > I also like the DT approach and consider the sysfs option as something
> > that can go together with any solution we want to adopt.
> 
> I'm not sure that we should consider sysfs as an option because of all
> the concerns that have already been put forward.
> I would prefer debugfs if we have to play with these capacity values in
> order to test their accuracy.
> 
> Regards,
> Vincent
> >
> > Best,
> >
> > - Juri
> 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems
  2016-01-18 16:30         ` Juri Lelli
@ 2016-01-18 16:42           ` Vincent Guittot
  -1 siblings, 0 replies; 49+ messages in thread
From: Vincent Guittot @ 2016-01-18 16:42 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Steve Muckle, linux-kernel, linux-pm, LAK, Peter Zijlstra,
	Rob Herring, Mark Rutland, Russell King - ARM Linux,
	Sudeep Holla, Lorenzo Pieralisi, Catalin Marinas, Will Deacon,
	Morten Rasmussen, Dietmar Eggemann, Mark Brown

On 18 January 2016 at 17:30, Juri Lelli <juri.lelli@arm.com> wrote:
> Hi Vincent,
>
> On 18/01/16 17:13, Vincent Guittot wrote:
>> On 18 January 2016 at 16:13, Juri Lelli <juri.lelli@arm.com> wrote:
>> > Hi Steve,
>> >
>> > On 15/01/16 11:50, Steve Muckle wrote:
>> >> On 01/08/2016 06:09 AM, Juri Lelli wrote:
>> >> >  2. Dynamic profiling at boot (v2)
>> >> >
>> >> >     pros: - does not require a standardized definition of capacity
>> >> >           - cannot be incorrectly tuned (once benchmark is fixed)
>> >> >           - does not require user/integrator work
>> >> >
>> >> >     cons: - not easy to come up with a clean solution, as it seems interaction
>> >> >             with several subsystems (e.g., cpufreq) is required
>> >> >           - not easy to agree upon a single benchmark (that has to be both
>> >> >             representative and simple enough to run at boot)
>> >> >           - numbers might (and do) vary from boot to boot
>> >>
>> >> An important additional con that was mentioned earlier IIRC was the
>> >> additional boot time required for the benchmark.
>> >
>> > Right. I forgot about that.
>> >
>> >> Perhaps there could be
>> >> a kernel command line argument to bypass the benchmark if it is known
>> >> that predetermined values will be provided via sysfs later?
>> >>
>> >
>> > This might work, yes.
>>
>> Instead of command line, I prefer to use DT.
>>
>> Can't we use something similar to what is currently done in arm arch
>> for the early stage of the boot? We don't have to provide a performance
>> value for which it's difficult to find a consensus on how to define it
>> and which benchmark should be used. We use the micro arch and the
>> frequency of the core to define a relative capacity. This gives us a
>> relatively good idea of the capacity of each core.
>
> I'm not sure I understand what you are proposing. arm arch is currently
> based on having static hardcoded data (efficiency values). But this has
> already been NACKed for arm64 during the last review of this RFC.
>
> Are you proposing something different?

No, I'm proposing to use it at boot time until the dynamic profiling
gives a better value.
We don't have to set any new properties.
IIRC, it was NACKed because it was a static hardcoded value that did
not always reflect the most accurate capacity of a system. IMHO,
it's not that far from reality, so can't this be used as an
intermediate step while waiting for dynamic profiling?
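
The intermediate-step idea can be sketched as a value with a static fallback, shown here in Python purely for illustration. The class name CpuCapacity, its methods, and the numbers 430 and 447 are hypothetical and do not correspond to any existing kernel interface.

```python
class CpuCapacity:
    """Capacity of one CPU: a static boot-time guess, later refined."""

    def __init__(self, static_estimate):
        self.static_estimate = static_estimate  # e.g. from efficiency tables
        self.profiled = None                    # filled in by dynamic profiling

    def update_from_profiling(self, measured):
        self.profiled = measured

    @property
    def value(self):
        # Prefer the profiled number once it exists; fall back otherwise.
        if self.profiled is not None:
            return self.profiled
        return self.static_estimate


little = CpuCapacity(static_estimate=430)   # hypothetical early-boot guess
early = little.value                        # used while profiling has not run
little.update_from_profiling(447)           # hypothetical benchmark result
late = little.value
```

The scheduler would see the static guess during early boot and the profiled value afterwards, without any new DT property being required.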

Vincent

>
> Thanks,
>
> - Juri
>
>> Then, the dynamic profiling can update it with a more accurate value
>> during the boot.
>>
>> >
>> >> Though there may be another issue with that as mentioned below.
>> >>
>> >> >  3. sysfs (v1)
>> >> >
>> >> >     pros: - clean and super easy to implement
>> >> >           - values don't require to be physical properties, defining them is
>> >> >             probably easier
>> >> >
>> >> >     cons: - CPUs capacity have to be provided after boot (by some init script?)
>> >> >           - API is modified, still some discussion/review is needed
>> >> >           - values can still be incorrectly used for runtime tuning purposes
>> >>
>> >> Initializing the values via userspace init will cause more of the boot
>> >> process to run with incorrect CPU capacity values. Boot times may be
>> >> increased with tasks running on suboptimal CPUs. Such increases may also
>> >> not be deterministic.
>> >>
>> >> Extending the kernel command line idea above, perhaps capacity values
>> >> could be provided there as well, similar to the lpj parameter? That has
>> >> scalability issues though if there's a huge highly heterogeneous platform...
>> >>
>> >
>> > Yeah, adding such option is not difficult, but I'm also a bit concerned
>> > about the scalability of such a thing.
>> >
>> >> DT solves these issues and would be the perfect place for this - we are
>> >> defining the compute capacity of a CPU which is a property of the
>> >> hardware. However there are a couple things forcing us to compromise.
>> >> One is that the amount and detail of information required to adequately
>> >> capture the computational abilities of a CPU across all possible
>> >> workloads seem onerous to collect and enumerate. The second is that even
>> >> if we were willing to undertake that, CPU vendors probably won't be
>> >> forthcoming with that information.
>> >>
>> >
>> > You mean because they won't publish performance data of their hw?
>> >
>> > But we already use per-platform normalized values (as you are proposing
>> > below), so a platform-to-platform comparison doesn't make sense.
>> >
>> >> Despite this DT still seems to me like the best way to go. At their
>> >> heart these are properties of the hardware, even if we can't specify
>> >> them as such per se because of the problems above. The capacity would
>> >> have to be defined as a relative value among CPUs. And while it's true
>> >> it may be abused for tuning purposes, that's true of any strategy.
>> >> Certainly the sysfs strategy and even if only a dynamic option is
>> >> provided, it is guaranteed to be hacked by platform vendors.
>> >
>> > I also like the DT approach and consider the sysfs option as something
>> > that can go together with any solution we want to adopt.
>>
>> I'm not sure that we should consider sysfs as an option because of all
>> the concerns that have already been put forward.
>> I would prefer debugfs if we have to play with these capacity values in
>> order to test their accuracy.
>>
>> Regards,
>> Vincent
>> >
>> > Best,
>> >
>> > - Juri
>>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems
  2016-01-18 16:42           ` Vincent Guittot
@ 2016-01-18 17:08             ` Juri Lelli
  -1 siblings, 0 replies; 49+ messages in thread
From: Juri Lelli @ 2016-01-18 17:08 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Steve Muckle, linux-kernel, linux-pm, LAK, Peter Zijlstra,
	Rob Herring, Mark Rutland, Russell King - ARM Linux,
	Sudeep Holla, Lorenzo Pieralisi, Catalin Marinas, Will Deacon,
	Morten Rasmussen, Dietmar Eggemann, Mark Brown

On 18/01/16 17:42, Vincent Guittot wrote:
> On 18 January 2016 at 17:30, Juri Lelli <juri.lelli@arm.com> wrote:
> > Hi Vincent,
> >
> > On 18/01/16 17:13, Vincent Guittot wrote:
> >> On 18 January 2016 at 16:13, Juri Lelli <juri.lelli@arm.com> wrote:
> >> > Hi Steve,
> >> >
> >> > On 15/01/16 11:50, Steve Muckle wrote:
> >> >> On 01/08/2016 06:09 AM, Juri Lelli wrote:
> >> >> >  2. Dynamic profiling at boot (v2)
> >> >> >
> >> >> >     pros: - does not require a standardized definition of capacity
> >> >> >           - cannot be incorrectly tuned (once benchmark is fixed)
> >> >> >           - does not require user/integrator work
> >> >> >
> >> >> >     cons: - not easy to come up with a clean solution, as it seems interaction
> >> >> >             with several subsystems (e.g., cpufreq) is required
> >> >> >           - not easy to agree upon a single benchmark (that has to be both
> >> >> >             representative and simple enough to run at boot)
> >> >> >           - numbers might (and do) vary from boot to boot
> >> >>
> >> >> An important additional con that was mentioned earlier IIRC was the
> >> >> additional boot time required for the benchmark.
> >> >
> >> > Right. I forgot about that.
> >> >
> >> >> Perhaps there could be
> >> >> a kernel command line argument to bypass the benchmark if it is known
> >> >> that predetermined values will be provided via sysfs later?
> >> >>
> >> >
> >> > This might work, yes.
> >>
> >> Instead of command line, I prefer to use DT.
> >>
> >> Can't we use something similar to what is currently done in arm arch
> >> for the early stage of the boot ? We don't have to provide performance
> >> value for which it's difficult to find a consensus on how to define it
> >> and which benchmark should be used. We use the micro arch and the
> >> frequency of the core to define a relative capacity. This give us a
> >> relatively good idea of the capacity of each core.
> >
> > I'm not sure I understand what you are proposing. arm arch is currently
> > based on having static hardcoded data (efficiency values). But, this has
> > already been NACKed for arm64 during last review of this RFC.
> >
> > Are you proposing something different?
> 
> No, i'm proposing to use it at boot time until the dynamic profiling
> gives better value.
> We don't have to set any new properties.
> IIRC, It was nacked because it was of static hardcoded value that was
> not always reflecting the best accurate capacity of a system. IMHO,
> it's not that far from reality so can't this be used as an
> intermediate step while waiting for dynamic profiling ?

It seems to me that we will only make things more complicated than
needed, without gaining much. Either we will have these values until the
profile happens (do we really think we will speed up pre-profile boot
time much?) or we will have them as defaults, and every concern that
brought this approach to be nacked will apply.

Thanks,

- Juri

> >
> > Thanks,
> >
> > - Juri
> >
> >> Then, the dynamic profiling can update it with a more accurate value
> >> during the boot.
> >>
> >> >
> >> >> Though there may be another issue with that as mentioned below.
> >> >>
> >> >> >  3. sysfs (v1)
> >> >> >
> >> >> >     pros: - clean and super easy to implement
> >> >> >           - values don't require to be physical properties, defining them is
> >> >> >             probably easier
> >> >> >
> >> >> >     cons: - CPUs capacity have to be provided after boot (by some init script?)
> >> >> >           - API is modified, still some discussion/review is needed
> >> >> >           - values can still be incorrectly used for runtime tuning purposes
> >> >>
> >> >> Initializing the values via userspace init will cause more of the boot
> >> >> process to run with incorrect CPU capacity values. Boot times may be
> >> >> increased with tasks running on suboptimal CPUs. Such increases may also
> >> >> not be deterministic.
> >> >>
> >> >> Extending the kernel command line idea above, perhaps capacity values
> >> >> could be provided there as well, similar to the lpj parameter? That has
> >> >> scalability issues though if there's a huge highly heterogeneous platform...
> >> >>
> >> >
> >> > Yeah, adding such option is not difficult, but I'm also a bit concerned
> >> > about the scalability of such a thing.
> >> >
> >> >> DT solves these issues and would be the perfect place for this - we are
> >> >> defining the compute capacity of a CPU which is a property of the
> >> >> hardware. However there are a couple things forcing us to compromise.
> >> >> One is that the amount and detail of information required to adequately
> >> >> capture the computational abilities of a CPU across all possible
> >> >> workloads seem onerous to collect and enumerate. The second is that even
> >> >> if we were willing to undertake that, CPU vendors probably won't be
> >> >> forthcoming with that information.
> >> >>
> >> >
> >> > You mean because they won't publish performance data of their hw?
> >> >
> >> > But we already use per platform normalized values (as you are proposing
> >> > below). So that a platform to platform comparison doesn't make sense.
> >> >
> >> >> Despite this DT still seems to me like the best way to go. At their
> >> >> heart these are properties of the hardware, even if we can't specify
> >> >> them as such per se because of the problems above. The capacity would
> >> >> have to be defined as a relative value among CPUs. And while it's true
> >> >> it may be abused for tuning purposes, that's true of any strategy.
> >> >> Certainly the sysfs strategy and even if only a dynamic option is
> >> >> provided, it is guaranteed to be hacked by platform vendors.
> >> >
> >> > I also like the DT approach and consider the sysfs option as something
> >> > that can go together with any solution we want to adopt.
> >>
> >> I'm not sure that we should consider sysfs as an option because of all
> >> the concerned that as already been put forward.
> >> I would prefer a debugfs if we have play with these capacity values in
> >> order to test their accuracy.
> >>
> >> Regards,
> >> Vincent
> >> >
> >> > Best,
> >> >
> >> > - Juri
> >>
> 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems
  2016-01-18 17:08             ` Juri Lelli
@ 2016-01-18 17:23               ` Vincent Guittot
  -1 siblings, 0 replies; 49+ messages in thread
From: Vincent Guittot @ 2016-01-18 17:23 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Steve Muckle, linux-kernel, linux-pm, LAK, Peter Zijlstra,
	Rob Herring, Mark Rutland, Russell King - ARM Linux,
	Sudeep Holla, Lorenzo Pieralisi, Catalin Marinas, Will Deacon,
	Morten Rasmussen, Dietmar Eggemann, Mark Brown

On 18 January 2016 at 18:08, Juri Lelli <juri.lelli@arm.com> wrote:
> On 18/01/16 17:42, Vincent Guittot wrote:
>> On 18 January 2016 at 17:30, Juri Lelli <juri.lelli@arm.com> wrote:
>> > Hi Vincent,
>> >
>> > On 18/01/16 17:13, Vincent Guittot wrote:
>> >> On 18 January 2016 at 16:13, Juri Lelli <juri.lelli@arm.com> wrote:
>> >> > Hi Steve,
>> >> >
>> >> > On 15/01/16 11:50, Steve Muckle wrote:
>> >> >> On 01/08/2016 06:09 AM, Juri Lelli wrote:
>> >> >> >  2. Dynamic profiling at boot (v2)
>> >> >> >
>> >> >> >     pros: - does not require a standardized definition of capacity
>> >> >> >           - cannot be incorrectly tuned (once benchmark is fixed)
>> >> >> >           - does not require user/integrator work
>> >> >> >
>> >> >> >     cons: - not easy to come up with a clean solution, as it seems interaction
>> >> >> >             with several subsystems (e.g., cpufreq) is required
>> >> >> >           - not easy to agree upon a single benchmark (that has to be both
>> >> >> >             representative and simple enough to run at boot)
>> >> >> >           - numbers might (and do) vary from boot to boot
>> >> >>
>> >> >> An important additional con that was mentioned earlier IIRC was the
>> >> >> additional boot time required for the benchmark.
>> >> >
>> >> > Right. I forgot about that.
>> >> >
>> >> >> Perhaps there could be
>> >> >> a kernel command line argument to bypass the benchmark if it is known
>> >> >> that predetermined values will be provided via sysfs later?
>> >> >>
>> >> >
>> >> > This might work, yes.
>> >>
>> >> Instead of command line, I prefer to use DT.
>> >>
>> >> Can't we use something similar to what is currently done in arm arch
>> >> for the early stage of the boot ? We don't have to provide performance
>> >> value for which it's difficult to find a consensus on how to define it
>> >> and which benchmark should be used. We use the micro arch and the
>> >> frequency of the core to define a relative capacity. This give us a
>> >> relatively good idea of the capacity of each core.
>> >
>> > I'm not sure I understand what you are proposing. arm arch is currently
>> > based on having static hardcoded data (efficiency values). But, this has
>> > already been NACKed for arm64 during last review of this RFC.
>> >
>> > Are you proposing something different?
>>
>> No, i'm proposing to use it at boot time until the dynamic profiling
>> gives better value.
>> We don't have to set any new properties.
>> IIRC, It was nacked because it was of static hardcoded value that was
>> not always reflecting the best accurate capacity of a system. IMHO,
>> it's not that far from reality so can't this be used as an
>> intermediate step while waiting for dynamic profiling ?
>
> It seems to me that we will only make things more complicated than
> needed, without gaining much. Either we will have these values until the

Not sure that this will complicate things, as it doesn't need any
specific inputs from DT: it uses properties that are already available.
Now, if we consider that the potential impact on boot time of using
default values is not an issue, I agree that it's not worth adding this
step.
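
For illustration, the "micro arch and frequency" idea can be sketched in C
as below. This is only a minimal sketch under stated assumptions: the struct,
the function name and the efficiency numbers are hypothetical, and scaling
to the largest CPU is a choice made for the example, not necessarily how the
arm arch code normalises its values. Only the 1024 scale matches the
scheduler's SCHED_CAPACITY_SCALE.

```c
#include <stddef.h>
#include <stdint.h>

#define CAPACITY_SCALE 1024u /* same value as SCHED_CAPACITY_SCALE */

/* Hypothetical per-CPU description; field names are illustrative only. */
struct cpu_desc {
	uint32_t efficiency;   /* per-micro-architecture factor (made-up values) */
	uint32_t max_freq_mhz; /* maximum clock frequency, e.g. taken from DT */
};

/*
 * Fill capacity[i] with efficiency * max frequency, rescaled so that the
 * most capable CPU gets CAPACITY_SCALE. Returns 0 on success, or -1 if no
 * CPU has a non-zero raw value.
 */
static int relative_capacities(const struct cpu_desc *cpus, size_t n,
			       uint32_t *capacity)
{
	uint64_t max_raw = 0;
	size_t i;

	/* First pass: find the largest raw (unscaled) value. */
	for (i = 0; i < n; i++) {
		uint64_t raw = (uint64_t)cpus[i].efficiency * cpus[i].max_freq_mhz;

		if (raw > max_raw)
			max_raw = raw;
	}
	if (max_raw == 0)
		return -1;

	/* Second pass: express every CPU relative to the largest one. */
	for (i = 0; i < n; i++) {
		uint64_t raw = (uint64_t)cpus[i].efficiency * cpus[i].max_freq_mhz;

		capacity[i] = (uint32_t)(raw * CAPACITY_SCALE / max_raw);
	}
	return 0;
}
```

With a hypothetical big CPU (efficiency 2, 2000 MHz) and a little CPU
(efficiency 1, 1000 MHz), the raw values are 4000 and 1000, so the sketch
yields 1024 and 256. The values are only meaningful relative to each other
within one platform.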

Vincent

> profile happens (do we really think we will speed up pre-profile boot
> time much?) or we will have them as defaults, and every concern that
> brought this approach to be nacked will apply.
>
> Thanks,
>
> - Juri
>
>> >
>> > Thanks,
>> >
>> > - Juri
>> >
>> >> Then, the dynamic profiling can update it with a more accurate value
>> >> during the boot.
>> >>
>> >> >
>> >> >> Though there may be another issue with that as mentioned below.
>> >> >>
>> >> >> >  3. sysfs (v1)
>> >> >> >
>> >> >> >     pros: - clean and super easy to implement
>> >> >> >           - values don't require to be physical properties, defining them is
>> >> >> >             probably easier
>> >> >> >
>> >> >> >     cons: - CPUs capacity have to be provided after boot (by some init script?)
>> >> >> >           - API is modified, still some discussion/review is needed
>> >> >> >           - values can still be incorrectly used for runtime tuning purposes
>> >> >>
>> >> >> Initializing the values via userspace init will cause more of the boot
>> >> >> process to run with incorrect CPU capacity values. Boot times may be
>> >> >> increased with tasks running on suboptimal CPUs. Such increases may also
>> >> >> not be deterministic.
>> >> >>
>> >> >> Extending the kernel command line idea above, perhaps capacity values
>> >> >> could be provided there as well, similar to the lpj parameter? That has
>> >> >> scalability issues though if there's a huge highly heterogeneous platform...
>> >> >>
>> >> >
>> >> > Yeah, adding such option is not difficult, but I'm also a bit concerned
>> >> > about the scalability of such a thing.
>> >> >
>> >> >> DT solves these issues and would be the perfect place for this - we are
>> >> >> defining the compute capacity of a CPU which is a property of the
>> >> >> hardware. However there are a couple things forcing us to compromise.
>> >> >> One is that the amount and detail of information required to adequately
>> >> >> capture the computational abilities of a CPU across all possible
>> >> >> workloads seem onerous to collect and enumerate. The second is that even
>> >> >> if we were willing to undertake that, CPU vendors probably won't be
>> >> >> forthcoming with that information.
>> >> >>
>> >> >
>> >> > You mean because they won't publish performance data of their hw?
>> >> >
>> >> > But we already use per platform normalized values (as you are proposing
>> >> > below). So that a platform to platform comparison doesn't make sense.
>> >> >
>> >> >> Despite this DT still seems to me like the best way to go. At their
>> >> >> heart these are properties of the hardware, even if we can't specify
>> >> >> them as such per se because of the problems above. The capacity would
>> >> >> have to be defined as a relative value among CPUs. And while it's true
>> >> >> it may be abused for tuning purposes, that's true of any strategy.
>> >> >> Certainly the sysfs strategy and even if only a dynamic option is
>> >> >> provided, it is guaranteed to be hacked by platform vendors.
>> >> >
>> >> > I also like the DT approach and consider the sysfs option as something
>> >> > that can go together with any solution we want to adopt.
>> >>
>> >> I'm not sure that we should consider sysfs as an option because of all
>> >> the concerned that as already been put forward.
>> >> I would prefer a debugfs if we have play with these capacity values in
>> >> order to test their accuracy.
>> >>
>> >> Regards,
>> >> Vincent
>> >> >
>> >> > Best,
>> >> >
>> >> > - Juri
>> >>
>>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems
  2016-01-18 15:13     ` Juri Lelli
@ 2016-01-18 19:25       ` Steve Muckle
  -1 siblings, 0 replies; 49+ messages in thread
From: Steve Muckle @ 2016-01-18 19:25 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, linux-arm-kernel, peterz,
	vincent.guittot, robh+dt, mark.rutland, linux, sudeep.holla,
	lorenzo.pieralisi, catalin.marinas, will.deacon,
	morten.rasmussen, dietmar.eggemann, broonie

On 01/18/2016 07:13 AM, Juri Lelli wrote:
>> DT solves these issues and would be the perfect place for this - we are
>> defining the compute capacity of a CPU which is a property of the
>> hardware. However there are a couple things forcing us to compromise.
>> One is that the amount and detail of information required to adequately
>> capture the computational abilities of a CPU across all possible
>> workloads seem onerous to collect and enumerate. The second is that even
>> if we were willing to undertake that, CPU vendors probably won't be
>> forthcoming with that information.
>>
>
> You mean because they won't publish performance data of their hw?

More specific things like IPC and other architectural details that could
comprise a precise physical definition of a CPU that would meet the
ideal goals of a device tree definition.

> But we already use per platform normalized values (as you are proposing
> below). So that a platform to platform comparison doesn't make sense.

Yeah I'm just advocating for that strategy here.

cheers,
Steve

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems
  2016-01-18 16:42           ` Vincent Guittot
@ 2016-01-19 10:59             ` Catalin Marinas
  -1 siblings, 0 replies; 49+ messages in thread
From: Catalin Marinas @ 2016-01-19 10:59 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Juri Lelli, Mark Rutland, Lorenzo Pieralisi,
	Russell King - ARM Linux, linux-pm, Peter Zijlstra, Mark Brown,
	Will Deacon, linux-kernel, Dietmar Eggemann, Rob Herring,
	Steve Muckle, Sudeep Holla, Morten Rasmussen, LAK

On Mon, Jan 18, 2016 at 05:42:58PM +0100, Vincent Guittot wrote:
> On 18 January 2016 at 17:30, Juri Lelli <juri.lelli@arm.com> wrote:
> > On 18/01/16 17:13, Vincent Guittot wrote:
> >> On 18 January 2016 at 16:13, Juri Lelli <juri.lelli@arm.com> wrote:
> >> > On 15/01/16 11:50, Steve Muckle wrote:
> >> >> On 01/08/2016 06:09 AM, Juri Lelli wrote:
> >> >> >  2. Dynamic profiling at boot (v2)
> >> >> >
> >> >> >     pros: - does not require a standardized definition of capacity
> >> >> >           - cannot be incorrectly tuned (once benchmark is fixed)
> >> >> >           - does not require user/integrator work
> >> >> >
> >> >> >     cons: - not easy to come up with a clean solution, as it seems interaction
> >> >> >             with several subsystems (e.g., cpufreq) is required
> >> >> >           - not easy to agree upon a single benchmark (that has to be both
> >> >> >             representative and simple enough to run at boot)
> >> >> >           - numbers might (and do) vary from boot to boot
> >> >>
> >> >> An important additional con that was mentioned earlier IIRC was the
> >> >> additional boot time required for the benchmark.
> >> >
> >> > Right. I forgot about that.
> >> >
> >> >> Perhaps there could be
> >> >> a kernel command line argument to bypass the benchmark if it is known
> >> >> that predetermined values will be provided via sysfs later?
> >> >>
> >> >
> >> > This might work, yes.
> >>
> >> Instead of command line, I prefer to use DT.

I fully agree. The command line doesn't scale with multiple CPUs; at most
it could carry an option to bypass the benchmark (though we could just
skip the benchmark when the DT values are present).

> >> Can't we use something similar to what is currently done in arm arch
> >> for the early stage of the boot ? We don't have to provide performance
> >> value for which it's difficult to find a consensus on how to define it
> >> and which benchmark should be used. We use the micro arch and the
> >> frequency of the core to define a relative capacity. This gives us a
> >> relatively good idea of the capacity of each core.
> >
> > I'm not sure I understand what you are proposing. arm arch is currently
> > based on having static hardcoded data (efficiency values). But, this has
> > already been NACKed for arm64 during last review of this RFC.
> >
> > Are you proposing something different?
> 
> No, i'm proposing to use it at boot time until the dynamic profiling
> gives better value.
> We don't have to set any new properties.
> IIRC, It was nacked because it was of static hardcoded value that was
> not always reflecting the best accurate capacity of a system. IMHO,
> it's not that far from reality so can't this be used as an
> intermediate step while waiting for dynamic profiling ?

My nack for hard-coded values still stands, since this is not just about
the microarchitecture (MIDR) but also about how the CPUs are integrated
into the SoC: additional caches, memory latency, maximum clock frequency
(or you rely on DT again to get this information and scale the initial
CPU capacity/efficiency accordingly). MIDR does not capture SoC details.

Two questions:

1. How is the boot time affected by the benchmark?
2. How is the boot time affected by considering all the CPUs the same?

My preference is for DT and sysfs (especially useful for
development/tuning) but I'm not opposed to a boot-time benchmark if
people insist on it. If the answer to point 2 is "insignificant", we
could as well defer the capacity setting to user space (sysfs).

-- 
Catalin

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems
  2016-01-19 10:59             ` Catalin Marinas
@ 2016-01-19 11:23               ` Juri Lelli
  -1 siblings, 0 replies; 49+ messages in thread
From: Juri Lelli @ 2016-01-19 11:23 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Vincent Guittot, Mark Rutland, Lorenzo Pieralisi,
	Russell King - ARM Linux, linux-pm, Peter Zijlstra, Mark Brown,
	Will Deacon, linux-kernel, Dietmar Eggemann, Rob Herring,
	Steve Muckle, Sudeep Holla, Morten Rasmussen, LAK

Hi Catalin,

On 19/01/16 10:59, Catalin Marinas wrote:
> On Mon, Jan 18, 2016 at 05:42:58PM +0100, Vincent Guittot wrote:
> > On 18 January 2016 at 17:30, Juri Lelli <juri.lelli@arm.com> wrote:
> > > On 18/01/16 17:13, Vincent Guittot wrote:
> > >> On 18 January 2016 at 16:13, Juri Lelli <juri.lelli@arm.com> wrote:
> > >> > On 15/01/16 11:50, Steve Muckle wrote:
> > >> >> On 01/08/2016 06:09 AM, Juri Lelli wrote:
> > >> >> >  2. Dynamic profiling at boot (v2)
> > >> >> >
> > >> >> >     pros: - does not require a standardized definition of capacity
> > >> >> >           - cannot be incorrectly tuned (once benchmark is fixed)
> > >> >> >           - does not require user/integrator work
> > >> >> >
> > >> >> >     cons: - not easy to come up with a clean solution, as it seems interaction
> > >> >> >             with several subsystems (e.g., cpufreq) is required
> > >> >> >           - not easy to agree upon a single benchmark (that has to be both
> > >> >> >             representative and simple enough to run at boot)
> > >> >> >           - numbers might (and do) vary from boot to boot
> > >> >>
> > >> >> An important additional con that was mentioned earlier IIRC was the
> > >> >> additional boot time required for the benchmark.
> > >> >
> > >> > Right. I forgot about that.
> > >> >
> > >> >> Perhaps there could be
> > >> >> a kernel command line argument to bypass the benchmark if it is known
> > >> >> that predetermined values will be provided via sysfs later?
> > >> >>
> > >> >
> > >> > This might work, yes.
> > >>
> > >> Instead of command line, I prefer to use DT.
> 
> I fully agree. Command line doesn't scale with multiple CPUs, at most an
> option to bypass the benchmark (though we could just skip it when the DT
> values are present).
> 
> > >> Can't we use something similar to what is currently done in arm arch
> > >> for the early stage of the boot ? We don't have to provide performance
> > >> value for which it's difficult to find a consensus on how to define it
> > >> and which benchmark should be used. We use the micro arch and the
> > >> frequency of the core to define a relative capacity. This gives us a
> > >> relatively good idea of the capacity of each core.
> > >
> > > I'm not sure I understand what you are proposing. arm arch is currently
> > > based on having static hardcoded data (efficiency values). But, this has
> > > already been NACKed for arm64 during last review of this RFC.
> > >
> > > Are you proposing something different?
> > 
> > No, i'm proposing to use it at boot time until the dynamic profiling
> > gives better value.
> > We don't have to set any new properties.
> > IIRC, It was nacked because it was of static hardcoded value that was
> > not always reflecting the best accurate capacity of a system. IMHO,
> > it's not that far from reality so can't this be used as an
> > intermediate step while waiting for dynamic profiling ?
> 
> My nack for hard-coded values still stands since this is not just about
> the microarchitecture (MIDR) but how the CPUs are integrated with the
> SoC, additional caches, memory latency, maximum clock frequency (or you
> rely on DT again to get this information and scale the initial CPU
> capacity/efficiency accordingly). MIDR does not capture SoC details.
> 
> Two questions:
> 
> 1. How is the boot time affected by the benchmark?
> 2. How is the boot time affected by considering all the CPUs the same?
> 
> My preference is for DT and sysfs (especially useful for
> development/tuning) but I'm not opposed to a boot-time benchmark if
> people insist on it. If the answer to point 2 is "insignificant", we
> could as well defer the capacity setting to user space (sysfs).
> 

Given that we are not targeting boot time with this, but rather better
performance afterwards, I don't expect significant differences; but,
I'll get numbers :).

Thanks,

- Juri

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems
  2016-01-19 11:23               ` Juri Lelli
@ 2016-01-19 14:29                 ` Juri Lelli
  -1 siblings, 0 replies; 49+ messages in thread
From: Juri Lelli @ 2016-01-19 14:29 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Vincent Guittot, Mark Rutland, Lorenzo Pieralisi,
	Russell King - ARM Linux, linux-pm, Peter Zijlstra, Mark Brown,
	Will Deacon, linux-kernel, Dietmar Eggemann, Rob Herring,
	Steve Muckle, Sudeep Holla, Morten Rasmussen, LAK

On 19/01/16 11:23, Juri Lelli wrote:
> Hi Catalin,
> 
> On 19/01/16 10:59, Catalin Marinas wrote:
> > On Mon, Jan 18, 2016 at 05:42:58PM +0100, Vincent Guittot wrote:
> > > On 18 January 2016 at 17:30, Juri Lelli <juri.lelli@arm.com> wrote:
> > > > On 18/01/16 17:13, Vincent Guittot wrote:
> > > >> On 18 January 2016 at 16:13, Juri Lelli <juri.lelli@arm.com> wrote:
> > > >> > On 15/01/16 11:50, Steve Muckle wrote:
> > > >> >> On 01/08/2016 06:09 AM, Juri Lelli wrote:

[...]

> > 
> > Two questions:
> > 
> > 1. How is the boot time affected by the benchmark?
> > 2. How is the boot time affected by considering all the CPUs the same?
> > 
> > My preference is for DT and sysfs (especially useful for
> > development/tuning) but I'm not opposed to a boot-time benchmark if
> > people insist on it. If the answer to point 2 is "insignificant", we
> > could as well defer the capacity setting to user space (sysfs).
> > 
> 
> Given that we are not targeting boot time with this, but rather better
> performance afterwards, I don't expect significant differences; but,
> I'll get numbers :).
> 

I've got some boot time numbers on TC2 and Juno, based on timestamps.
They are of course not accurate and maybe not so representative of
products, but I guess they are still in the right ballpark.

I'm generally seeing a ~1 sec increase in boot time for question 1 (with
the benchmark) and practically no difference for question 2 (all CPUs
considered the same), even after having added patches that provide
runtime performance improvements.

Best,

- Juri

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems
  2016-01-08 14:09 ` Juri Lelli
@ 2016-01-19 15:05   ` Peter Zijlstra
  -1 siblings, 0 replies; 49+ messages in thread
From: Peter Zijlstra @ 2016-01-19 15:05 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, linux-arm-kernel, vincent.guittot,
	robh+dt, mark.rutland, linux, sudeep.holla, lorenzo.pieralisi,
	catalin.marinas, will.deacon, morten.rasmussen, dietmar.eggemann,
	broonie

On Fri, Jan 08, 2016 at 02:09:28PM +0000, Juri Lelli wrote:
> Second version of this RFC proposes an alternative solution (w.r.t. v1) to the
> problem of how do we init CPUs original capacity: we run a bogus benchmark (for
> this RFC I simply stole int_sqrt from lib/ and I run that in a loop to perform
> some integer computation, I'm sure there are better benchmarks around) on the
> first cpu of each frequency domain (assuming no u-arch differences inside
> domains), measure time to complete a fixed number of iterations and then
> normalize results to SCHED_CAPACITY_SCALE (1024). I didn't spend much time in
> polishing this up or thinking about a better benchmark, as this is an RFC and
> I'd like discussion happening before we make this solution better
> working/looking. However, surprisingly, results are not that bad already:

>  2. Dynamic profiling at boot (v2)
> 
>     pros: - does not require a standardized definition of capacity
>           - cannot be incorrectly tuned (once benchmark is fixed)
>           - does not require user/integrator work
> 
>     cons: - not easy to come up with a clean solution, as it seems interaction
>             with several subsystems (e.g., cpufreq) is required
>           - not easy to agree upon a single benchmark (that has to be both
>             representative and simple enough to run at boot)
>           - numbers might (and do) vary from boot to boot

This last point is a total pain for benchmarking; it means nothing is
ever reproducible.

Therefore, I would always augment the above (2) with the below (3), such
that you can overwrite the results with a known stable set of numbers:

>  3. sysfs (v1)
> 
>     pros: - clean and super easy to implement
>           - values don't require to be physical properties, defining them is
>             probably easier
> 
>     cons: - CPUs capacity have to be provided after boot (by some init script?)
>           - API is modified, still some discussion/review is needed
>           - values can still be incorrectly used for runtime tuning purposes

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems
  2016-01-19 15:05   ` Peter Zijlstra
@ 2016-01-19 17:50     ` Mark Brown
  -1 siblings, 0 replies; 49+ messages in thread
From: Mark Brown @ 2016-01-19 17:50 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Juri Lelli, linux-kernel, linux-pm, linux-arm-kernel,
	vincent.guittot, robh+dt, mark.rutland, linux, sudeep.holla,
	lorenzo.pieralisi, catalin.marinas, will.deacon,
	morten.rasmussen, dietmar.eggemann

On Tue, Jan 19, 2016 at 04:05:51PM +0100, Peter Zijlstra wrote:
> On Fri, Jan 08, 2016 at 02:09:28PM +0000, Juri Lelli wrote:

> >     cons: - not easy to come up with a clean solution, as it seems interaction
> >             with several subsystems (e.g., cpufreq) is required
> >           - not easy to agree upon a single benchmark (that has to be both
> >             representative and simple enough to run at boot)
> >           - numbers might (and do) vary from boot to boot

> This last point is a total pain for benchmarking; it means nothing is
> ever reproducible.

> Therefore, I would always augment the above (2) with the below (3), such
> that you can overwrite the results with a known stable set of numbers:

The suggestion when the previous version was being discussed was that
there are supposed to be some other knobs one uses for tuning and one
was never supposed to use these numbers.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems
  2016-01-19 14:29                 ` Juri Lelli
@ 2016-01-19 19:48                   ` Steve Muckle
  -1 siblings, 0 replies; 49+ messages in thread
From: Steve Muckle @ 2016-01-19 19:48 UTC (permalink / raw)
  To: Juri Lelli, Catalin Marinas
  Cc: Vincent Guittot, Mark Rutland, Lorenzo Pieralisi,
	Russell King - ARM Linux, linux-pm, Peter Zijlstra, Mark Brown,
	Will Deacon, linux-kernel, Dietmar Eggemann, Rob Herring,
	Sudeep Holla, Morten Rasmussen, LAK

On 01/19/2016 06:29 AM, Juri Lelli wrote:
> > > Two questions:
> > >
> > > 1. How is the boot time affected by the benchmark?
> > > 2. How is the boot time affected by considering all the CPUs the same?
> > >
> > > My preference is for DT and sysfs (especially useful for
> > > development/tuning) but I'm not opposed to a boot-time benchmark if
> > > people insist on it. If the answer to point 2 is "insignificant", we
> > > could as well defer the capacity setting to user space (sysfs).
> > >
> >
> > Given that we are not targeting boot time with this, but rather better
> > performance afterwards, I don't expect significant differences; but,
> > I'll get numbers :).
> >
> I've got some boot time numbers on TC2 and Juno based on timestamps.
> They are of course not accurate and maybe not so representative of
> products, but I guess still ballpark right.
> 
> I'm generally seeing ~1sec increase in boot time for 1 and practically
> no difference for 2 (even after having added patches that provide
> runtime performance improvements).

One second is considerable IMO. Aside from the general desire to have
shorter boot times on any platform there are environments like
automotive where boot time is critical.

How are the CPUs numbered on TC2 and Juno? When all CPUs are considered
the same, is work running on the big CPUs because of the way they are
numbered?

thanks,
Steve

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems
  2016-01-19 19:48                   ` Steve Muckle
@ 2016-01-19 21:10                     ` Mark Brown
  -1 siblings, 0 replies; 49+ messages in thread
From: Mark Brown @ 2016-01-19 21:10 UTC (permalink / raw)
  To: Steve Muckle
  Cc: Juri Lelli, Catalin Marinas, Vincent Guittot, Mark Rutland,
	Lorenzo Pieralisi, Russell King - ARM Linux, linux-pm,
	Peter Zijlstra, Will Deacon, linux-kernel, Dietmar Eggemann,
	Rob Herring, Sudeep Holla, Morten Rasmussen, LAK

On Tue, Jan 19, 2016 at 11:48:15AM -0800, Steve Muckle wrote:
> On 01/19/2016 06:29 AM, Juri Lelli wrote:

> > I'm generally seeing ~1sec increase in boot time for 1 and practically
> > no difference for 2 (even after having added patches that provide
> > runtime performance improvements).

> One second is considerable IMO. Aside from the general desire to have
> shorter boot times on any platform there are environments like
> automotive where boot time is critical.

Yeah, definitely.  Is this actually blocking boot and, if so, can we
arrange to do this in parallel with other activity (with likely knock-on
effects on reproducibility...)?

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems
  2016-01-19 21:10                     ` Mark Brown
@ 2016-01-20 10:22                       ` Juri Lelli
  -1 siblings, 0 replies; 49+ messages in thread
From: Juri Lelli @ 2016-01-20 10:22 UTC (permalink / raw)
  To: Mark Brown
  Cc: Steve Muckle, Catalin Marinas, Vincent Guittot, Mark Rutland,
	Lorenzo Pieralisi, Russell King - ARM Linux, linux-pm,
	Peter Zijlstra, Will Deacon, linux-kernel, Dietmar Eggemann,
	Rob Herring, Sudeep Holla, Morten Rasmussen, LAK

On 19/01/16 21:10, Mark Brown wrote:
> On Tue, Jan 19, 2016 at 11:48:15AM -0800, Steve Muckle wrote:
> > On 01/19/2016 06:29 AM, Juri Lelli wrote:
> 
> > > I'm generally seeing ~1sec increase in boot time for 1 and practically
> > > no difference for 2 (even after having added patches that provide
> > > runtime performance improvements).
> 
> > One second is considerable IMO. Aside from the general desire to have
> > shorter boot times on any platform there are environments like
> > automotive where boot time is critical.
> 
> Yeah, definitely.  Is this actually blocking boot and if so can we
> arrange to do this in parallel with other activity (with likely knock on
> effects on reproducibility...)?

No, this goes in parallel. That's also shown by the fact that the
benchmarking thing itself usually takes more than 1 sec, but it seems to
impact boot time by that amount only.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems
  2016-01-19 17:50     ` Mark Brown
@ 2016-01-20 10:25       ` Juri Lelli
  -1 siblings, 0 replies; 49+ messages in thread
From: Juri Lelli @ 2016-01-20 10:25 UTC (permalink / raw)
  To: Mark Brown
  Cc: Peter Zijlstra, linux-kernel, linux-pm, linux-arm-kernel,
	vincent.guittot, robh+dt, mark.rutland, linux, sudeep.holla,
	lorenzo.pieralisi, catalin.marinas, will.deacon,
	morten.rasmussen, dietmar.eggemann

On 19/01/16 17:50, Mark Brown wrote:
> On Tue, Jan 19, 2016 at 04:05:51PM +0100, Peter Zijlstra wrote:
> > On Fri, Jan 08, 2016 at 02:09:28PM +0000, Juri Lelli wrote:
> 
> > >     cons: - not easy to come up with a clean solution, as it seems interaction
> > >             with several subsystems (e.g., cpufreq) is required
> > >           - not easy to agree upon a single benchmark (that has to be both
> > >             representative and simple enough to run at boot)
> > >           - numbers might (and do) vary from boot to boot
> 
> > This last point is a total pain for benchmarking, it means nothing is
> > ever reproducible.
> 
> > Therefore, I would always augment the above (2) with the below (3), such
> > that you can overwrite the results with a known stable set of numbers:
> 
> The suggestion when the previous version was being discussed was that
> there are supposed to be some other knobs one uses for tuning and one
> was never supposed to use these numbers.

Right, the DT solution might live without a sysfs interface, as you want
to use those other knobs for runtime tuning. The dynamic solution
instead, and I think this is what Peter was also pointing out, will most
probably require a sysfs interface for cases in which variation of the
default values from boot to boot is not acceptable.
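
To illustrate the point about boot-to-boot variation, here is a minimal
sketch (not code from this series) of how benchmark-derived capacities
could be normalized and then overridden with a known stable set of
numbers. It assumes the usual scheduler convention that the fastest CPU
gets SCHED_CAPACITY_SCALE (1024). The function name
capacities_from_scores, the overrides argument and the sample scores are
all hypothetical, and how such overrides would actually reach the kernel
(e.g. through sysfs) is deliberately left out.

```python
SCHED_CAPACITY_SCALE = 1024  # capacity assigned to the fastest CPU


def capacities_from_scores(scores, overrides=None):
    """Map per-CPU benchmark scores to capacities in 1..1024.

    scores: dict of cpu id -> benchmark score (higher is faster).
    overrides: optional dict of cpu id -> fixed capacity; these replace
    the measured values so results stay the same across boots.
    Both arguments are hypothetical stand-ins for illustration.
    """
    if not scores:
        return {}
    if min(scores.values()) <= 0:
        raise ValueError("benchmark scores must be positive")
    top = max(scores.values())
    # Integer scaling relative to the fastest CPU.
    caps = {cpu: score * SCHED_CAPACITY_SCALE // top
            for cpu, score in scores.items()}
    for cpu, cap in (overrides or {}).items():
        if cpu not in caps:
            raise KeyError(f"no such cpu: {cpu}")
        if not 0 < cap <= SCHED_CAPACITY_SCALE:
            raise ValueError(f"capacity out of range for cpu {cpu}: {cap}")
        caps[cpu] = cap
    return caps


# Two boots with slightly different (made-up) benchmark scores.
boot_a = capacities_from_scores({0: 1000, 1: 447})
boot_b = capacities_from_scores({0: 1000, 1: 452})

# The same two boots with a fixed override for the slower CPU.
stable = {1: 460}
fixed_a = capacities_from_scores({0: 1000, 1: 447}, stable)
fixed_b = capacities_from_scores({0: 1000, 1: 452}, stable)
```

Without the override the slower CPU's capacity differs between the two
boots; with it both boots end up with the same table, which is the
property a reproducible benchmark run needs.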

^ permalink raw reply	[flat|nested] 49+ messages in thread

end of thread, other threads:[~2016-01-20 10:25 UTC | newest]

Thread overview: 49+ messages
2016-01-08 14:09 [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems Juri Lelli
2016-01-08 14:09 ` Juri Lelli
2016-01-08 14:09 ` [RFC PATCH v2 1/4] ARM: initialize cpu_scale to its default Juri Lelli
2016-01-08 14:09   ` Juri Lelli
2016-01-08 14:09   ` Juri Lelli
2016-01-08 14:09 ` [RFC PATCH v2 2/4] drivers/cpufreq: implement init_cpu_capacity_default() Juri Lelli
2016-01-08 14:09   ` Juri Lelli
2016-01-08 14:09 ` [RFC PATCH v2 3/4] arm: Enable dynamic CPU capacity initialization Juri Lelli
2016-01-08 14:09   ` Juri Lelli
2016-01-08 14:09 ` [RFC PATCH v2 4/4] arm64: " Juri Lelli
2016-01-08 14:09   ` Juri Lelli
2016-01-15 18:01 ` [RFC PATCH v2 0/4] CPUs capacity information for heterogeneous systems Mark Brown
2016-01-15 18:01   ` Mark Brown
2016-01-18 15:01   ` Juri Lelli
2016-01-18 15:01     ` Juri Lelli
2016-01-15 19:50 ` Steve Muckle
2016-01-15 19:50   ` Steve Muckle
2016-01-18 15:13   ` Juri Lelli
2016-01-18 15:13     ` Juri Lelli
2016-01-18 16:13     ` Vincent Guittot
2016-01-18 16:13       ` Vincent Guittot
2016-01-18 16:30       ` Juri Lelli
2016-01-18 16:30         ` Juri Lelli
2016-01-18 16:42         ` Vincent Guittot
2016-01-18 16:42           ` Vincent Guittot
2016-01-18 17:08           ` Juri Lelli
2016-01-18 17:08             ` Juri Lelli
2016-01-18 17:23             ` Vincent Guittot
2016-01-18 17:23               ` Vincent Guittot
2016-01-19 10:59           ` Catalin Marinas
2016-01-19 10:59             ` Catalin Marinas
2016-01-19 11:23             ` Juri Lelli
2016-01-19 11:23               ` Juri Lelli
2016-01-19 14:29               ` Juri Lelli
2016-01-19 14:29                 ` Juri Lelli
2016-01-19 19:48                 ` Steve Muckle
2016-01-19 19:48                   ` Steve Muckle
2016-01-19 21:10                   ` Mark Brown
2016-01-19 21:10                     ` Mark Brown
2016-01-20 10:22                     ` Juri Lelli
2016-01-20 10:22                       ` Juri Lelli
2016-01-18 19:25     ` Steve Muckle
2016-01-18 19:25       ` Steve Muckle
2016-01-19 15:05 ` Peter Zijlstra
2016-01-19 15:05   ` Peter Zijlstra
2016-01-19 17:50   ` Mark Brown
2016-01-19 17:50     ` Mark Brown
2016-01-20 10:25     ` Juri Lelli
2016-01-20 10:25       ` Juri Lelli
