All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lukasz Luba <lukasz.luba@arm.com>
To: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	rafael@kernel.org
Cc: lukasz.luba@arm.com, dietmar.eggemann@arm.com,
	rui.zhang@intel.com, amit.kucheria@verdurent.com,
	amit.kachhap@gmail.com, daniel.lezcano@linaro.org,
	viresh.kumar@linaro.org, len.brown@intel.com, pavel@ucw.cz,
	mhiramat@kernel.org, qyousef@layalina.io, wvw@google.com,
	xuewen.yan94@gmail.com
Subject: [PATCH v8 13/23] PM: EM: Add performance field to struct em_perf_state and optimize
Date: Thu,  8 Feb 2024 11:55:47 +0000	[thread overview]
Message-ID: <20240208115557.1273962-14-lukasz.luba@arm.com> (raw)
In-Reply-To: <20240208115557.1273962-1-lukasz.luba@arm.com>

The performance doesn't scale linearly with the frequency. Also, it may
be different in different workloads. Some CPUs are designed to be
particularly good at some applications e.g. images or video processing
and other CPUs in different. When those different types of CPUs are
combined in one SoC they should be properly modeled to get max of the HW
in Energy Aware Scheduler (EAS). The Energy Model (EM) provides the
power vs. performance curves to the EAS, but assumes the CPUs capacity
is fixed and scales linearly with the frequency. This patch allows to
adjust the curve on the 'performance' axis as well.

Code speed optimization:
Removing map_util_freq() allows to avoid one division and one
multiplication operations from the EAS hot code path.

Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
---
 include/linux/energy_model.h | 24 ++++++++++++------------
 kernel/power/energy_model.c  | 27 +++++++++++++++++++++++++++
 2 files changed, 39 insertions(+), 12 deletions(-)

diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
index 158dad6ea313..ce24ea3fe41c 100644
--- a/include/linux/energy_model.h
+++ b/include/linux/energy_model.h
@@ -13,6 +13,7 @@
 
 /**
  * struct em_perf_state - Performance state of a performance domain
+ * @performance:	CPU performance (capacity) at a given frequency
  * @frequency:	The frequency in KHz, for consistency with CPUFreq
  * @power:	The power consumed at this level (by 1 CPU or by a registered
  *		device). It can be a total power: static and dynamic.
@@ -21,6 +22,7 @@
  * @flags:	see "em_perf_state flags" description below.
  */
 struct em_perf_state {
+	unsigned long performance;
 	unsigned long frequency;
 	unsigned long power;
 	unsigned long cost;
@@ -196,25 +198,25 @@ void em_table_free(struct em_perf_table __rcu *table);
  * em_pd_get_efficient_state() - Get an efficient performance state from the EM
  * @table:		List of performance states, in ascending order
  * @nr_perf_states:	Number of performance states
- * @freq:		Frequency to map with the EM
+ * @max_util:		Max utilization to map with the EM
  * @pd_flags:		Performance Domain flags
  *
  * It is called from the scheduler code quite frequently and as a consequence
  * doesn't implement any check.
  *
- * Return: An efficient performance state id, high enough to meet @freq
+ * Return: An efficient performance state id, high enough to meet @max_util
  * requirement.
  */
 static inline int
 em_pd_get_efficient_state(struct em_perf_state *table, int nr_perf_states,
-			  unsigned long freq, unsigned long pd_flags)
+			  unsigned long max_util, unsigned long pd_flags)
 {
 	struct em_perf_state *ps;
 	int i;
 
 	for (i = 0; i < nr_perf_states; i++) {
 		ps = &table[i];
-		if (ps->frequency >= freq) {
+		if (ps->performance >= max_util) {
 			if (pd_flags & EM_PERF_DOMAIN_SKIP_INEFFICIENCIES &&
 			    ps->flags & EM_PERF_STATE_INEFFICIENT)
 				continue;
@@ -245,9 +247,9 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
 				unsigned long max_util, unsigned long sum_util,
 				unsigned long allowed_cpu_cap)
 {
-	unsigned long freq, ref_freq, scale_cpu;
 	struct em_perf_table *em_table;
 	struct em_perf_state *ps;
+	unsigned long scale_cpu;
 	int cpu, i;
 
 #ifdef CONFIG_SCHED_DEBUG
@@ -260,25 +262,23 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
 	/*
 	 * In order to predict the performance state, map the utilization of
 	 * the most utilized CPU of the performance domain to a requested
-	 * frequency, like schedutil. Take also into account that the real
-	 * frequency might be set lower (due to thermal capping). Thus, clamp
+	 * performance, like schedutil. Take also into account that the real
+	 * performance might be set lower (due to thermal capping). Thus, clamp
 	 * max utilization to the allowed CPU capacity before calculating
-	 * effective frequency.
+	 * effective performance.
 	 */
 	cpu = cpumask_first(to_cpumask(pd->cpus));
 	scale_cpu = arch_scale_cpu_capacity(cpu);
-	ref_freq = arch_scale_freq_ref(cpu);
 
 	max_util = min(max_util, allowed_cpu_cap);
-	freq = map_util_freq(max_util, ref_freq, scale_cpu);
 
 	/*
 	 * Find the lowest performance state of the Energy Model above the
-	 * requested frequency.
+	 * requested performance.
 	 */
 	em_table = rcu_dereference(pd->em_table);
 	i = em_pd_get_efficient_state(em_table->state, pd->nr_perf_states,
-				      freq, pd->flags);
+				      max_util, pd->flags);
 	ps = &em_table->state[i];
 
 	/*
diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
index 667619b70be7..41418aa6daa6 100644
--- a/kernel/power/energy_model.c
+++ b/kernel/power/energy_model.c
@@ -46,6 +46,7 @@ static void em_debug_create_ps(struct em_perf_state *ps, struct dentry *pd)
 	debugfs_create_ulong("frequency", 0444, d, &ps->frequency);
 	debugfs_create_ulong("power", 0444, d, &ps->power);
 	debugfs_create_ulong("cost", 0444, d, &ps->cost);
+	debugfs_create_ulong("performance", 0444, d, &ps->performance);
 	debugfs_create_ulong("inefficient", 0444, d, &ps->flags);
 }
 
@@ -159,6 +160,30 @@ struct em_perf_table __rcu *em_table_alloc(struct em_perf_domain *pd)
 	return table;
 }
 
+static void em_init_performance(struct device *dev, struct em_perf_domain *pd,
+				struct em_perf_state *table, int nr_states)
+{
+	u64 fmax, max_cap;
+	int i, cpu;
+
+	/* This is needed only for CPUs and EAS skip other devices */
+	if (!_is_cpu_device(dev))
+		return;
+
+	cpu = cpumask_first(em_span_cpus(pd));
+
+	/*
+	 * Calculate the performance value for each frequency with
+	 * linear relationship. The final CPU capacity might not be ready at
+	 * boot time, but the EM will be updated a bit later with correct one.
+	 */
+	fmax = (u64) table[nr_states - 1].frequency;
+	max_cap = (u64) arch_scale_cpu_capacity(cpu);
+	for (i = 0; i < nr_states; i++)
+		table[i].performance = div64_u64(max_cap * table[i].frequency,
+						 fmax);
+}
+
 static int em_compute_costs(struct device *dev, struct em_perf_state *table,
 			    struct em_data_callback *cb, int nr_states,
 			    unsigned long flags)
@@ -318,6 +343,8 @@ static int em_create_perf_table(struct device *dev, struct em_perf_domain *pd,
 		table[i].frequency = prev_freq = freq;
 	}
 
+	em_init_performance(dev, pd, table, nr_states);
+
 	ret = em_compute_costs(dev, table, cb, nr_states, flags);
 	if (ret)
 		return -EINVAL;
-- 
2.25.1


  parent reply	other threads:[~2024-02-08 11:56 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-08 11:55 [PATCH v8 00/23] Introduce runtime modifiable Energy Model Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 01/23] PM: EM: Add missing newline for the message log Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 02/23] PM: EM: Extend em_cpufreq_update_efficiencies() argument list Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 03/23] PM: EM: Find first CPU active while updating OPP efficiency Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 04/23] PM: EM: Refactor em_pd_get_efficient_state() to be more flexible Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 05/23] PM: EM: Introduce em_compute_costs() Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 06/23] PM: EM: Check if the get_cost() callback is present in em_compute_costs() Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 07/23] PM: EM: Split the allocation and initialization of the EM table Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 08/23] PM: EM: Introduce runtime modifiable table Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 09/23] PM: EM: Use runtime modified EM for CPUs energy estimation in EAS Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 10/23] PM: EM: Add functions for memory allocations for new EM tables Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 11/23] PM: EM: Introduce em_dev_update_perf_domain() for EM updates Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 12/23] PM: EM: Add em_perf_state_from_pd() to get performance states table Lukasz Luba
2024-02-08 11:55 ` Lukasz Luba [this message]
2024-02-08 11:55 ` [PATCH v8 14/23] PM: EM: Support late CPUs booting and capacity adjustment Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 15/23] PM: EM: Optimize em_cpu_energy() and remove division Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 16/23] powercap/dtpm_cpu: Use new Energy Model interface to get table Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 17/23] powercap/dtpm_devfreq: " Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 18/23] drivers/thermal/cpufreq_cooling: Use new Energy Model interface Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 19/23] drivers/thermal/devfreq_cooling: " Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 20/23] PM: EM: Change debugfs configuration to use runtime EM table data Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 21/23] PM: EM: Remove old table Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 22/23] PM: EM: Add em_dev_compute_costs() Lukasz Luba
2024-02-08 11:55 ` [PATCH v8 23/23] Documentation: EM: Update with runtime modification design Lukasz Luba
2024-02-08 14:01 ` [PATCH v8 00/23] Introduce runtime modifiable Energy Model Rafael J. Wysocki
2024-02-08 15:00   ` Lukasz Luba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240208115557.1273962-14-lukasz.luba@arm.com \
    --to=lukasz.luba@arm.com \
    --cc=amit.kachhap@gmail.com \
    --cc=amit.kucheria@verdurent.com \
    --cc=daniel.lezcano@linaro.org \
    --cc=dietmar.eggemann@arm.com \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=pavel@ucw.cz \
    --cc=qyousef@layalina.io \
    --cc=rafael@kernel.org \
    --cc=rui.zhang@intel.com \
    --cc=viresh.kumar@linaro.org \
    --cc=wvw@google.com \
    --cc=xuewen.yan94@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.