linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V3 1/3] arm64: topology: Avoid the have_policy check
@ 2020-12-15  5:34 Viresh Kumar
  2020-12-15  5:34 ` [PATCH V3 2/3] arm64: topology: Reorder init_amu_fie() a bit Viresh Kumar
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Viresh Kumar @ 2020-12-15  5:34 UTC (permalink / raw)
  To: Ionela Voinescu, Catalin Marinas, Will Deacon
  Cc: Viresh Kumar, Vincent Guittot, linux-arm-kernel, linux-kernel

Every time I have stumbled upon this routine, I get confused with the
way 'have_policy' is used and I have to dig in to understand why is it
so. Here is an attempt to make it easier to understand, and hopefully it
is an improvement.

The 'have_policy' check was just an optimization to avoid writing
to amu_fie_cpus in case we don't have to, but that optimization itself
is creating more confusion than the real work. Lets just do that if all
the CPUs support AMUs. It is much cleaner that way.

Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
V3:
- Added Reviewed by tag.

 arch/arm64/kernel/topology.c | 20 ++++++--------------
 1 file changed, 6 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index f6faa697e83e..ebadc73449f9 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -199,14 +199,14 @@ static int freq_inv_set_max_ratio(int cpu, u64 max_rate, u64 ref_rate)
 	return 0;
 }
 
-static inline bool
+static inline void
 enable_policy_freq_counters(int cpu, cpumask_var_t valid_cpus)
 {
 	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
 
 	if (!policy) {
 		pr_debug("CPU%d: No cpufreq policy found.\n", cpu);
-		return false;
+		return;
 	}
 
 	if (cpumask_subset(policy->related_cpus, valid_cpus))
@@ -214,8 +214,6 @@ enable_policy_freq_counters(int cpu, cpumask_var_t valid_cpus)
 			   amu_fie_cpus);
 
 	cpufreq_cpu_put(policy);
-
-	return true;
 }
 
 static DEFINE_STATIC_KEY_FALSE(amu_fie_key);
@@ -225,7 +223,6 @@ static int __init init_amu_fie(void)
 {
 	bool invariance_status = topology_scale_freq_invariant();
 	cpumask_var_t valid_cpus;
-	bool have_policy = false;
 	int ret = 0;
 	int cpu;
 
@@ -245,17 +242,12 @@ static int __init init_amu_fie(void)
 			continue;
 
 		cpumask_set_cpu(cpu, valid_cpus);
-		have_policy |= enable_policy_freq_counters(cpu, valid_cpus);
+		enable_policy_freq_counters(cpu, valid_cpus);
 	}
 
-	/*
-	 * If we are not restricted by cpufreq policies, we only enable
-	 * the use of the AMU feature for FIE if all CPUs support AMU.
-	 * Otherwise, enable_policy_freq_counters has already enabled
-	 * policy cpus.
-	 */
-	if (!have_policy && cpumask_equal(valid_cpus, cpu_present_mask))
-		cpumask_or(amu_fie_cpus, amu_fie_cpus, valid_cpus);
+	/* Overwrite amu_fie_cpus if all CPUs support AMU */
+	if (cpumask_equal(valid_cpus, cpu_present_mask))
+		cpumask_copy(amu_fie_cpus, cpu_present_mask);
 
 	if (!cpumask_empty(amu_fie_cpus)) {
 		pr_info("CPUs[%*pbl]: counters will be used for FIE.",
-- 
2.25.0.rc1.19.g042ed3e048af


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH V3 2/3] arm64: topology: Reorder init_amu_fie() a bit
  2020-12-15  5:34 [PATCH V3 1/3] arm64: topology: Avoid the have_policy check Viresh Kumar
@ 2020-12-15  5:34 ` Viresh Kumar
  2020-12-15 11:53   ` Ionela Voinescu
  2020-12-15  5:34 ` [PATCH V3 3/3] arm64: topology: Make AMUs work with modular cpufreq drivers Viresh Kumar
  2020-12-17  7:57 ` [PATCH V3 1/3] arm64: topology: Avoid the have_policy check Viresh Kumar
  2 siblings, 1 reply; 17+ messages in thread
From: Viresh Kumar @ 2020-12-15  5:34 UTC (permalink / raw)
  To: Ionela Voinescu, Catalin Marinas, Will Deacon
  Cc: Viresh Kumar, Vincent Guittot, linux-arm-kernel, linux-kernel

This patch does a couple of optimizations in init_amu_fie(), like early
exits from paths where we don't need to continue any further, avoid the
enable/disable dance, moving the calls to
topology_scale_freq_invariant() just when we need them, instead of at
the top of the routine, and avoiding calling it for the third time.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
V3:
- Skipped the enable/disable dance.
- No need to call topology_scale_freq_invariant() multiple times.

 arch/arm64/kernel/topology.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index ebadc73449f9..57267d694495 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -221,8 +221,8 @@ static DEFINE_STATIC_KEY_FALSE(amu_fie_key);
 
 static int __init init_amu_fie(void)
 {
-	bool invariance_status = topology_scale_freq_invariant();
 	cpumask_var_t valid_cpus;
+	bool invariant;
 	int ret = 0;
 	int cpu;
 
@@ -249,18 +249,19 @@ static int __init init_amu_fie(void)
 	if (cpumask_equal(valid_cpus, cpu_present_mask))
 		cpumask_copy(amu_fie_cpus, cpu_present_mask);
 
-	if (!cpumask_empty(amu_fie_cpus)) {
-		pr_info("CPUs[%*pbl]: counters will be used for FIE.",
-			cpumask_pr_args(amu_fie_cpus));
-		static_branch_enable(&amu_fie_key);
-	}
+	if (cpumask_empty(amu_fie_cpus))
+		goto free_valid_mask;
 
-	/*
-	 * If the system is not fully invariant after AMU init, disable
-	 * partial use of counters for frequency invariance.
-	 */
-	if (!topology_scale_freq_invariant())
-		static_branch_disable(&amu_fie_key);
+	invariant = topology_scale_freq_invariant();
+
+	/* We aren't fully invariant yet */
+	if (!invariant && !cpumask_equal(amu_fie_cpus, cpu_present_mask))
+		goto free_valid_mask;
+
+	static_branch_enable(&amu_fie_key);
+
+	pr_info("CPUs[%*pbl]: counters will be used for FIE.",
+		cpumask_pr_args(amu_fie_cpus));
 
 	/*
 	 * Task scheduler behavior depends on frequency invariance support,
@@ -268,7 +269,7 @@ static int __init init_amu_fie(void)
 	 * a result of counter initialisation and use, retrigger the build of
 	 * scheduling domains to ensure the information is propagated properly.
 	 */
-	if (invariance_status != topology_scale_freq_invariant())
+	if (!invariant)
 		rebuild_sched_domains_energy();
 
 free_valid_mask:
-- 
2.25.0.rc1.19.g042ed3e048af


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH V3 3/3] arm64: topology: Make AMUs work with modular cpufreq drivers
  2020-12-15  5:34 [PATCH V3 1/3] arm64: topology: Avoid the have_policy check Viresh Kumar
  2020-12-15  5:34 ` [PATCH V3 2/3] arm64: topology: Reorder init_amu_fie() a bit Viresh Kumar
@ 2020-12-15  5:34 ` Viresh Kumar
  2020-12-15 11:56   ` Ionela Voinescu
  2020-12-16  0:03   ` Ionela Voinescu
  2020-12-17  7:57 ` [PATCH V3 1/3] arm64: topology: Avoid the have_policy check Viresh Kumar
  2 siblings, 2 replies; 17+ messages in thread
From: Viresh Kumar @ 2020-12-15  5:34 UTC (permalink / raw)
  To: Ionela Voinescu, Catalin Marinas, Will Deacon
  Cc: Viresh Kumar, Vincent Guittot, linux-arm-kernel, linux-kernel

The AMU counters won't get used today if the cpufreq driver is built as
a module as the amu core requires everything to be ready by late init.

Fix that properly by registering for cpufreq policy notifier. Note that
the amu core don't have any cpufreq dependency after the first time
CPUFREQ_CREATE_POLICY notifier is called for all the CPUs. And so we
don't need to do anything on the CPUFREQ_REMOVE_POLICY notifier. And for
the same reason we check if the CPUs are already parsed in the beginning
of amu_fie_setup() and skip if that is true. Alternatively we can shoot
a work from there to unregister the notifier instead, but that seemed
too much instead of this simple check.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
V3:
- This is a new patch.

Ionela,

I don't have a way to test the AMU stuff, will it be possible for you to
give it a try ?

 arch/arm64/kernel/topology.c | 88 +++++++++++++++++++-----------------
 1 file changed, 46 insertions(+), 42 deletions(-)

diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 57267d694495..0fc2b32ddb45 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -199,64 +199,33 @@ static int freq_inv_set_max_ratio(int cpu, u64 max_rate, u64 ref_rate)
 	return 0;
 }
 
-static inline void
-enable_policy_freq_counters(int cpu, cpumask_var_t valid_cpus)
-{
-	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
-
-	if (!policy) {
-		pr_debug("CPU%d: No cpufreq policy found.\n", cpu);
-		return;
-	}
-
-	if (cpumask_subset(policy->related_cpus, valid_cpus))
-		cpumask_or(amu_fie_cpus, policy->related_cpus,
-			   amu_fie_cpus);
-
-	cpufreq_cpu_put(policy);
-}
-
 static DEFINE_STATIC_KEY_FALSE(amu_fie_key);
 #define amu_freq_invariant() static_branch_unlikely(&amu_fie_key)
 
-static int __init init_amu_fie(void)
+static void amu_fie_setup(const struct cpumask *cpus)
 {
-	cpumask_var_t valid_cpus;
 	bool invariant;
-	int ret = 0;
 	int cpu;
 
-	if (!zalloc_cpumask_var(&valid_cpus, GFP_KERNEL))
-		return -ENOMEM;
-
-	if (!zalloc_cpumask_var(&amu_fie_cpus, GFP_KERNEL)) {
-		ret = -ENOMEM;
-		goto free_valid_mask;
-	}
+	/* We are already set since the last insmod of cpufreq driver */
+	if (unlikely(cpumask_subset(cpus, amu_fie_cpus)))
+		return;
 
-	for_each_present_cpu(cpu) {
+	for_each_cpu(cpu, cpus) {
 		if (!freq_counters_valid(cpu) ||
 		    freq_inv_set_max_ratio(cpu,
 					   cpufreq_get_hw_max_freq(cpu) * 1000,
 					   arch_timer_get_rate()))
-			continue;
-
-		cpumask_set_cpu(cpu, valid_cpus);
-		enable_policy_freq_counters(cpu, valid_cpus);
+			return;
 	}
 
-	/* Overwrite amu_fie_cpus if all CPUs support AMU */
-	if (cpumask_equal(valid_cpus, cpu_present_mask))
-		cpumask_copy(amu_fie_cpus, cpu_present_mask);
-
-	if (cpumask_empty(amu_fie_cpus))
-		goto free_valid_mask;
+	cpumask_or(amu_fie_cpus, amu_fie_cpus, cpus);
 
 	invariant = topology_scale_freq_invariant();
 
 	/* We aren't fully invariant yet */
 	if (!invariant && !cpumask_equal(amu_fie_cpus, cpu_present_mask))
-		goto free_valid_mask;
+		return;
 
 	static_branch_enable(&amu_fie_key);
 
@@ -271,13 +240,48 @@ static int __init init_amu_fie(void)
 	 */
 	if (!invariant)
 		rebuild_sched_domains_energy();
+}
+
+static int init_amu_fie_callback(struct notifier_block *nb, unsigned long val,
+				 void *data)
+{
+	struct cpufreq_policy *policy = data;
+
+	if (val == CPUFREQ_CREATE_POLICY)
+		amu_fie_setup(policy->related_cpus);
+
+	/*
+	 * We don't need to handle CPUFREQ_REMOVE_POLICY event as the AMU
+	 * counters don't have any dependency on cpufreq driver once we have
+	 * initialized AMU support and enabled invariance. The AMU counters will
+	 * keep on working just fine in the absence of the cpufreq driver, and
+	 * for the CPUs for which there are no counters availalbe, the last set
+	 * value of freq_scale will remain valid as that is the frequency those
+	 * CPUs are running at.
+	 */
+
+	return 0;
+}
+
+static struct notifier_block init_amu_fie_notifier = {
+	.notifier_call = init_amu_fie_callback,
+};
+
+static int __init init_amu_fie(void)
+{
+	int ret;
+
+	if (!zalloc_cpumask_var(&amu_fie_cpus, GFP_KERNEL))
+		return -ENOMEM;
 
-free_valid_mask:
-	free_cpumask_var(valid_cpus);
+	ret = cpufreq_register_notifier(&init_amu_fie_notifier,
+					CPUFREQ_POLICY_NOTIFIER);
+	if (ret)
+		free_cpumask_var(amu_fie_cpus);
 
 	return ret;
 }
-late_initcall_sync(init_amu_fie);
+core_initcall(init_amu_fie);
 
 bool arch_freq_counters_available(const struct cpumask *cpus)
 {
-- 
2.25.0.rc1.19.g042ed3e048af


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 2/3] arm64: topology: Reorder init_amu_fie() a bit
  2020-12-15  5:34 ` [PATCH V3 2/3] arm64: topology: Reorder init_amu_fie() a bit Viresh Kumar
@ 2020-12-15 11:53   ` Ionela Voinescu
  0 siblings, 0 replies; 17+ messages in thread
From: Ionela Voinescu @ 2020-12-15 11:53 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Catalin Marinas, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

On Tuesday 15 Dec 2020 at 11:04:15 (+0530), Viresh Kumar wrote:
> This patch does a couple of optimizations in init_amu_fie(), like early
> exits from paths where we don't need to continue any further, avoid the
> enable/disable dance, moving the calls to
> topology_scale_freq_invariant() just when we need them, instead of at
> the top of the routine, and avoiding calling it for the third time.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
> V3:
> - Skipped the enable/disable dance.
> - No need to call topology_scale_freq_invariant() multiple times.
> 
>  arch/arm64/kernel/topology.c | 27 ++++++++++++++-------------
>  1 file changed, 14 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> index ebadc73449f9..57267d694495 100644
> --- a/arch/arm64/kernel/topology.c
> +++ b/arch/arm64/kernel/topology.c
> @@ -221,8 +221,8 @@ static DEFINE_STATIC_KEY_FALSE(amu_fie_key);
>  
>  static int __init init_amu_fie(void)
>  {
> -	bool invariance_status = topology_scale_freq_invariant();
>  	cpumask_var_t valid_cpus;
> +	bool invariant;
>  	int ret = 0;
>  	int cpu;
>  
> @@ -249,18 +249,19 @@ static int __init init_amu_fie(void)
>  	if (cpumask_equal(valid_cpus, cpu_present_mask))
>  		cpumask_copy(amu_fie_cpus, cpu_present_mask);
>  
> -	if (!cpumask_empty(amu_fie_cpus)) {
> -		pr_info("CPUs[%*pbl]: counters will be used for FIE.",
> -			cpumask_pr_args(amu_fie_cpus));
> -		static_branch_enable(&amu_fie_key);
> -	}
> +	if (cpumask_empty(amu_fie_cpus))
> +		goto free_valid_mask;
>  
> -	/*
> -	 * If the system is not fully invariant after AMU init, disable
> -	 * partial use of counters for frequency invariance.
> -	 */
> -	if (!topology_scale_freq_invariant())
> -		static_branch_disable(&amu_fie_key);
> +	invariant = topology_scale_freq_invariant();
> +
> +	/* We aren't fully invariant yet */
> +	if (!invariant && !cpumask_equal(amu_fie_cpus, cpu_present_mask))
> +		goto free_valid_mask;
> +
> +	static_branch_enable(&amu_fie_key);
> +
> +	pr_info("CPUs[%*pbl]: counters will be used for FIE.",
> +		cpumask_pr_args(amu_fie_cpus));
>  
>  	/*
>  	 * Task scheduler behavior depends on frequency invariance support,
> @@ -268,7 +269,7 @@ static int __init init_amu_fie(void)
>  	 * a result of counter initialisation and use, retrigger the build of
>  	 * scheduling domains to ensure the information is propagated properly.
>  	 */
> -	if (invariance_status != topology_scale_freq_invariant())
> +	if (!invariant)
>  		rebuild_sched_domains_energy();
>  
>  free_valid_mask:
> -- 
> 2.25.0.rc1.19.g042ed3e048af
> 

Looks good!

Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 3/3] arm64: topology: Make AMUs work with modular cpufreq drivers
  2020-12-15  5:34 ` [PATCH V3 3/3] arm64: topology: Make AMUs work with modular cpufreq drivers Viresh Kumar
@ 2020-12-15 11:56   ` Ionela Voinescu
  2020-12-16  0:03   ` Ionela Voinescu
  1 sibling, 0 replies; 17+ messages in thread
From: Ionela Voinescu @ 2020-12-15 11:56 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Catalin Marinas, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

Hi,

On Tuesday 15 Dec 2020 at 11:04:16 (+0530), Viresh Kumar wrote:
> The AMU counters won't get used today if the cpufreq driver is built as
> a module as the amu core requires everything to be ready by late init.
> 
> Fix that properly by registering for cpufreq policy notifier. Note that
> the amu core don't have any cpufreq dependency after the first time
> CPUFREQ_CREATE_POLICY notifier is called for all the CPUs. And so we
> don't need to do anything on the CPUFREQ_REMOVE_POLICY notifier. And for
> the same reason we check if the CPUs are already parsed in the beginning
> of amu_fie_setup() and skip if that is true. Alternatively we can shoot
> a work from there to unregister the notifier instead, but that seemed
> too much instead of this simple check.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
> V3:
> - This is a new patch.
> 
> Ionela,
> 
> I don't have a way to test the AMU stuff, will it be possible for you to
> give it a try ?
> 

I'll review this new patch and then I'll give the full set a try.

Thanks for the interest in improving this,
Ionela.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 3/3] arm64: topology: Make AMUs work with modular cpufreq drivers
  2020-12-15  5:34 ` [PATCH V3 3/3] arm64: topology: Make AMUs work with modular cpufreq drivers Viresh Kumar
  2020-12-15 11:56   ` Ionela Voinescu
@ 2020-12-16  0:03   ` Ionela Voinescu
  2020-12-16  4:38     ` Viresh Kumar
  1 sibling, 1 reply; 17+ messages in thread
From: Ionela Voinescu @ 2020-12-16  0:03 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Catalin Marinas, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

Hi,

On Tuesday 15 Dec 2020 at 11:04:16 (+0530), Viresh Kumar wrote:
> The AMU counters won't get used today if the cpufreq driver is built as
> a module as the amu core requires everything to be ready by late init.
> 
> Fix that properly by registering for cpufreq policy notifier. Note that
> the amu core don't have any cpufreq dependency after the first time
> CPUFREQ_CREATE_POLICY notifier is called for all the CPUs. And so we
> don't need to do anything on the CPUFREQ_REMOVE_POLICY notifier. And for
> the same reason we check if the CPUs are already parsed in the beginning
> of amu_fie_setup() and skip if that is true. Alternatively we can shoot
> a work from there to unregister the notifier instead, but that seemed
> too much instead of this simple check.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
> V3:
> - This is a new patch.
> 
> Ionela,
> 
> I don't have a way to test the AMU stuff, will it be possible for you to
> give it a try ?
> 

My best AMU test setup is a hacked Juno :). A few runs with different
"AMU settings" showed everything works nicely. I'll continue reviewing
and testing tomorrow as I want to test with CPPC and on some models as
well.

>  arch/arm64/kernel/topology.c | 88 +++++++++++++++++++-----------------
>  1 file changed, 46 insertions(+), 42 deletions(-)
> 
> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> index 57267d694495..0fc2b32ddb45 100644
> --- a/arch/arm64/kernel/topology.c
> +++ b/arch/arm64/kernel/topology.c
> @@ -199,64 +199,33 @@ static int freq_inv_set_max_ratio(int cpu, u64 max_rate, u64 ref_rate)
>  	return 0;
>  }
>  
> -static inline void
> -enable_policy_freq_counters(int cpu, cpumask_var_t valid_cpus)
> -{
> -	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
> -
> -	if (!policy) {
> -		pr_debug("CPU%d: No cpufreq policy found.\n", cpu);
> -		return;
> -	}
> -
> -	if (cpumask_subset(policy->related_cpus, valid_cpus))
> -		cpumask_or(amu_fie_cpus, policy->related_cpus,
> -			   amu_fie_cpus);
> -
> -	cpufreq_cpu_put(policy);
> -}
> -
>  static DEFINE_STATIC_KEY_FALSE(amu_fie_key);
>  #define amu_freq_invariant() static_branch_unlikely(&amu_fie_key)
>  
> -static int __init init_amu_fie(void)
> +static void amu_fie_setup(const struct cpumask *cpus)
>  {
> -	cpumask_var_t valid_cpus;
>  	bool invariant;
> -	int ret = 0;
>  	int cpu;
>  
> -	if (!zalloc_cpumask_var(&valid_cpus, GFP_KERNEL))
> -		return -ENOMEM;
> -
> -	if (!zalloc_cpumask_var(&amu_fie_cpus, GFP_KERNEL)) {
> -		ret = -ENOMEM;
> -		goto free_valid_mask;
> -	}
> +	/* We are already set since the last insmod of cpufreq driver */
> +	if (unlikely(cpumask_subset(cpus, amu_fie_cpus)))
> +		return;
>  
> -	for_each_present_cpu(cpu) {
> +	for_each_cpu(cpu, cpus) {
>  		if (!freq_counters_valid(cpu) ||
>  		    freq_inv_set_max_ratio(cpu,
>  					   cpufreq_get_hw_max_freq(cpu) * 1000,
>  					   arch_timer_get_rate()))
> -			continue;
> -
> -		cpumask_set_cpu(cpu, valid_cpus);
> -		enable_policy_freq_counters(cpu, valid_cpus);
> +			return;
>  	}
>  
> -	/* Overwrite amu_fie_cpus if all CPUs support AMU */
> -	if (cpumask_equal(valid_cpus, cpu_present_mask))
> -		cpumask_copy(amu_fie_cpus, cpu_present_mask);
> -
> -	if (cpumask_empty(amu_fie_cpus))
> -		goto free_valid_mask;
> +	cpumask_or(amu_fie_cpus, amu_fie_cpus, cpus);
>  
>  	invariant = topology_scale_freq_invariant();
>  
>  	/* We aren't fully invariant yet */
>  	if (!invariant && !cpumask_equal(amu_fie_cpus, cpu_present_mask))
> -		goto free_valid_mask;
> +		return;
>  
>  	static_branch_enable(&amu_fie_key);
>  

If we are cpufreq invariant, we'll reach this point and the following
pr_info, even if not all CPUs have been checked, which (at this late
hour) I think it's functionally fine.

But we get prints like:

[    2.665918] AMU: CPUs[0-3]: counters will be used for FIE.
[    2.666293] AMU: CPUs[0-5]: counters will be used for FIE.

For two policies this is fine (although confusing) but if we had more
CPUs and more policies, it would be too many lines.

I'm not sure if there's a better way of fixing this other than keeping
track of all visited CPUs and printing this line when all online CPUs
have been visited.

> @@ -271,13 +240,48 @@ static int __init init_amu_fie(void)
>  	 */
>  	if (!invariant)
>  		rebuild_sched_domains_energy();
> +}
> +
> +static int init_amu_fie_callback(struct notifier_block *nb, unsigned long val,
> +				 void *data)
> +{
> +	struct cpufreq_policy *policy = data;
> +
> +	if (val == CPUFREQ_CREATE_POLICY)
> +		amu_fie_setup(policy->related_cpus);
> +

Is is guaranteed that cpuinfo.max_freq is always set at this point?

I have a vague recollection of a scenario where cpuinfo.max_freq could
be populated later in case the driver needs to talk to firmware to
obtain this value.

The setup above will fail if the CPU's max frequency cannot be obtained.

Thanks,
Ionela.

> +	/*
> +	 * We don't need to handle CPUFREQ_REMOVE_POLICY event as the AMU
> +	 * counters don't have any dependency on cpufreq driver once we have
> +	 * initialized AMU support and enabled invariance. The AMU counters will
> +	 * keep on working just fine in the absence of the cpufreq driver, and
> +	 * for the CPUs for which there are no counters availalbe, the last set
> +	 * value of freq_scale will remain valid as that is the frequency those
> +	 * CPUs are running at.
> +	 */
> +
> +	return 0;
> +}
> +
> +static struct notifier_block init_amu_fie_notifier = {
> +	.notifier_call = init_amu_fie_callback,
> +};
> +
> +static int __init init_amu_fie(void)
> +{
> +	int ret;
> +
> +	if (!zalloc_cpumask_var(&amu_fie_cpus, GFP_KERNEL))
> +		return -ENOMEM;
>  
> -free_valid_mask:
> -	free_cpumask_var(valid_cpus);
> +	ret = cpufreq_register_notifier(&init_amu_fie_notifier,
> +					CPUFREQ_POLICY_NOTIFIER);
> +	if (ret)
> +		free_cpumask_var(amu_fie_cpus);
>  
>  	return ret;
>  }
> -late_initcall_sync(init_amu_fie);
> +core_initcall(init_amu_fie);
>  
>  bool arch_freq_counters_available(const struct cpumask *cpus)
>  {
> -- 
> 2.25.0.rc1.19.g042ed3e048af
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 3/3] arm64: topology: Make AMUs work with modular cpufreq drivers
  2020-12-16  0:03   ` Ionela Voinescu
@ 2020-12-16  4:38     ` Viresh Kumar
  2020-12-16 19:37       ` Ionela Voinescu
  0 siblings, 1 reply; 17+ messages in thread
From: Viresh Kumar @ 2020-12-16  4:38 UTC (permalink / raw)
  To: Ionela Voinescu
  Cc: Catalin Marinas, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

On 16-12-20, 00:03, Ionela Voinescu wrote:
> Hi,
> 
> On Tuesday 15 Dec 2020 at 11:04:16 (+0530), Viresh Kumar wrote:
> > The AMU counters won't get used today if the cpufreq driver is built as
> > a module as the amu core requires everything to be ready by late init.
> > 
> > Fix that properly by registering for cpufreq policy notifier. Note that
> > the amu core don't have any cpufreq dependency after the first time
> > CPUFREQ_CREATE_POLICY notifier is called for all the CPUs. And so we
> > don't need to do anything on the CPUFREQ_REMOVE_POLICY notifier. And for
> > the same reason we check if the CPUs are already parsed in the beginning
> > of amu_fie_setup() and skip if that is true. Alternatively we can shoot
> > a work from there to unregister the notifier instead, but that seemed
> > too much instead of this simple check.
> > 
> > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> > ---
> > V3:
> > - This is a new patch.
> > 
> > Ionela,
> > 
> > I don't have a way to test the AMU stuff, will it be possible for you to
> > give it a try ?
> > 
> 
> My best AMU test setup is a hacked Juno :).

Everyone is hacking around :)

> A few runs with different
> "AMU settings" showed everything works nicely. I'll continue reviewing
> and testing tomorrow as I want to test with CPPC and on some models as
> well.
> 
> >  arch/arm64/kernel/topology.c | 88 +++++++++++++++++++-----------------
> >  1 file changed, 46 insertions(+), 42 deletions(-)
> > 
> > diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> > index 57267d694495..0fc2b32ddb45 100644
> > --- a/arch/arm64/kernel/topology.c
> > +++ b/arch/arm64/kernel/topology.c
> > @@ -199,64 +199,33 @@ static int freq_inv_set_max_ratio(int cpu, u64 max_rate, u64 ref_rate)
> >  	return 0;
> >  }
> >  
> > -static inline void
> > -enable_policy_freq_counters(int cpu, cpumask_var_t valid_cpus)
> > -{
> > -	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
> > -
> > -	if (!policy) {
> > -		pr_debug("CPU%d: No cpufreq policy found.\n", cpu);
> > -		return;
> > -	}
> > -
> > -	if (cpumask_subset(policy->related_cpus, valid_cpus))
> > -		cpumask_or(amu_fie_cpus, policy->related_cpus,
> > -			   amu_fie_cpus);
> > -
> > -	cpufreq_cpu_put(policy);
> > -}
> > -
> >  static DEFINE_STATIC_KEY_FALSE(amu_fie_key);
> >  #define amu_freq_invariant() static_branch_unlikely(&amu_fie_key)
> >  
> > -static int __init init_amu_fie(void)
> > +static void amu_fie_setup(const struct cpumask *cpus)
> >  {
> > -	cpumask_var_t valid_cpus;
> >  	bool invariant;
> > -	int ret = 0;
> >  	int cpu;
> >  
> > -	if (!zalloc_cpumask_var(&valid_cpus, GFP_KERNEL))
> > -		return -ENOMEM;
> > -
> > -	if (!zalloc_cpumask_var(&amu_fie_cpus, GFP_KERNEL)) {
> > -		ret = -ENOMEM;
> > -		goto free_valid_mask;
> > -	}
> > +	/* We are already set since the last insmod of cpufreq driver */
> > +	if (unlikely(cpumask_subset(cpus, amu_fie_cpus)))
> > +		return;
> >  
> > -	for_each_present_cpu(cpu) {
> > +	for_each_cpu(cpu, cpus) {
> >  		if (!freq_counters_valid(cpu) ||
> >  		    freq_inv_set_max_ratio(cpu,
> >  					   cpufreq_get_hw_max_freq(cpu) * 1000,
> >  					   arch_timer_get_rate()))
> > -			continue;
> > -
> > -		cpumask_set_cpu(cpu, valid_cpus);
> > -		enable_policy_freq_counters(cpu, valid_cpus);
> > +			return;
> >  	}
> >  
> > -	/* Overwrite amu_fie_cpus if all CPUs support AMU */
> > -	if (cpumask_equal(valid_cpus, cpu_present_mask))
> > -		cpumask_copy(amu_fie_cpus, cpu_present_mask);
> > -
> > -	if (cpumask_empty(amu_fie_cpus))
> > -		goto free_valid_mask;
> > +	cpumask_or(amu_fie_cpus, amu_fie_cpus, cpus);
> >  
> >  	invariant = topology_scale_freq_invariant();
> >  
> >  	/* We aren't fully invariant yet */
> >  	if (!invariant && !cpumask_equal(amu_fie_cpus, cpu_present_mask))
> > -		goto free_valid_mask;
> > +		return;
> >  
> >  	static_branch_enable(&amu_fie_key);
> >  
> 
> If we are cpufreq invariant, we'll reach this point and the following
> pr_info, even if not all CPUs have been checked, which (at this late
> hour) I think it's functionally fine.

Even then I think we should print cpus here instead of amu_fie_cpus,
just to not repeat things..

> But we get prints like:
> 
> [    2.665918] AMU: CPUs[0-3]: counters will be used for FIE.
> [    2.666293] AMU: CPUs[0-5]: counters will be used for FIE.
> 
> For two policies this is fine (although confusing) but if we had more
> CPUs and more policies, it would be too many lines.
> 
> I'm not sure if there's a better way of fixing this other than keeping
> track of all visited CPUs and printing this line when all online CPUs
> have been visited.

This is at best a debug message and maybe we should just convert it to
that and then it should be fine ?

And any logic added to print a better message isn't worth it I
believe.

> > @@ -271,13 +240,48 @@ static int __init init_amu_fie(void)
> >  	 */
> >  	if (!invariant)
> >  		rebuild_sched_domains_energy();
> > +}
> > +
> > +static int init_amu_fie_callback(struct notifier_block *nb, unsigned long val,
> > +				 void *data)
> > +{
> > +	struct cpufreq_policy *policy = data;
> > +
> > +	if (val == CPUFREQ_CREATE_POLICY)
> > +		amu_fie_setup(policy->related_cpus);
> > +
> 
> Is is guaranteed that cpuinfo.max_freq is always set at this point?

Yes, we call this after the policy is initialized and all basic setup
is done.

> I have a vague recollection of a scenario where cpuinfo.max_freq could
> be populated later in case the driver needs to talk to firmware to
> obtain this value.

I don't remember anything like that for sure, we can see what to do if
we ever come across such a scenario.

> The setup above will fail if the CPU's max frequency cannot be obtained.

Yeah, I agree. Shouldn't happen though.

-- 
viresh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 3/3] arm64: topology: Make AMUs work with modular cpufreq drivers
  2020-12-16  4:38     ` Viresh Kumar
@ 2020-12-16 19:37       ` Ionela Voinescu
  2020-12-17 10:50         ` Viresh Kumar
  0 siblings, 1 reply; 17+ messages in thread
From: Ionela Voinescu @ 2020-12-16 19:37 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Catalin Marinas, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

Hi,

On Wednesday 16 Dec 2020 at 10:08:05 (+0530), Viresh Kumar wrote:
[..]
> > > +static void amu_fie_setup(const struct cpumask *cpus)
> > >  {
> > > -	cpumask_var_t valid_cpus;
> > >  	bool invariant;
> > > -	int ret = 0;
> > >  	int cpu;
> > >  
> > > -	if (!zalloc_cpumask_var(&valid_cpus, GFP_KERNEL))
> > > -		return -ENOMEM;
> > > -
> > > -	if (!zalloc_cpumask_var(&amu_fie_cpus, GFP_KERNEL)) {
> > > -		ret = -ENOMEM;
> > > -		goto free_valid_mask;
> > > -	}
> > > +	/* We are already set since the last insmod of cpufreq driver */
> > > +	if (unlikely(cpumask_subset(cpus, amu_fie_cpus)))
> > > +		return;
> > >  
> > > -	for_each_present_cpu(cpu) {
> > > +	for_each_cpu(cpu, cpus) {
> > >  		if (!freq_counters_valid(cpu) ||
> > >  		    freq_inv_set_max_ratio(cpu,
> > >  					   cpufreq_get_hw_max_freq(cpu) * 1000,
> > >  					   arch_timer_get_rate()))
> > > -			continue;
> > > -
> > > -		cpumask_set_cpu(cpu, valid_cpus);
> > > -		enable_policy_freq_counters(cpu, valid_cpus);
> > > +			return;
> > >  	}
> > >  
> > > -	/* Overwrite amu_fie_cpus if all CPUs support AMU */
> > > -	if (cpumask_equal(valid_cpus, cpu_present_mask))
> > > -		cpumask_copy(amu_fie_cpus, cpu_present_mask);
> > > -
> > > -	if (cpumask_empty(amu_fie_cpus))
> > > -		goto free_valid_mask;
> > > +	cpumask_or(amu_fie_cpus, amu_fie_cpus, cpus);
> > >  
> > >  	invariant = topology_scale_freq_invariant();
> > >  
> > >  	/* We aren't fully invariant yet */
> > >  	if (!invariant && !cpumask_equal(amu_fie_cpus, cpu_present_mask))
> > > -		goto free_valid_mask;
> > > +		return;
> > >  
> > >  	static_branch_enable(&amu_fie_key);
> > >  
> > 
> > If we are cpufreq invariant, we'll reach this point and the following
> > pr_info, even if not all CPUs have been checked, which (at this late
> > hour) I think it's functionally fine.
> 
> Even then I think we should print cpus here instead of amu_fie_cpus,
> just to not repeat things..
> 

Agreed!

> > But we get prints like:
> > 
> > [    2.665918] AMU: CPUs[0-3]: counters will be used for FIE.
> > [    2.666293] AMU: CPUs[0-5]: counters will be used for FIE.
> > 
> > For two policies this is fine (although confusing) but if we had more
> > CPUs and more policies, it would be too many lines.
> > 
> > I'm not sure if there's a better way of fixing this other than keeping
> > track of all visited CPUs and printing this line when all online CPUs
> > have been visited.
> 
> This is at best a debug message and maybe we should just convert it to
> that and then it should be fine ?
> 
> And any logic added to print a better message isn't worth it I
> believe.
> 

That's true. The reason I would have wanted a single line to mention the
use of counters for FIE is because it's difficult to notice frequency
invariance issues. Everything might seems to work fine but without
or with broken frequency invariance the scheduler would not behave
optimally.

But we did the same compromise for EAS so making this a debug message is
fine I think.

> > > @@ -271,13 +240,48 @@ static int __init init_amu_fie(void)
> > >  	 */
> > >  	if (!invariant)
> > >  		rebuild_sched_domains_energy();
> > > +}
> > > +
> > > +static int init_amu_fie_callback(struct notifier_block *nb, unsigned long val,
> > > +				 void *data)
> > > +{
> > > +	struct cpufreq_policy *policy = data;
> > > +
> > > +	if (val == CPUFREQ_CREATE_POLICY)
> > > +		amu_fie_setup(policy->related_cpus);
> > > +
> > 
> > Is is guaranteed that cpuinfo.max_freq is always set at this point?
> 
> Yes, we call this after the policy is initialized and all basic setup
> is done.
> 
> > I have a vague recollection of a scenario where cpuinfo.max_freq could
> > be populated later in case the driver needs to talk to firmware to
> > obtain this value.
> 
> I don't remember anything like that for sure, we can see what to do if
> we ever come across such a scenario.
> 

Makes sense!

> > The setup above will fail if the CPU's max frequency cannot be obtained.
> 
> Yeah, I agree. Shouldn't happen though.
> 

> > > +	/*
> > > +	 * We don't need to handle CPUFREQ_REMOVE_POLICY event as the AMU
> > > +	 * counters don't have any dependency on cpufreq driver once we have
> > > +	 * initialized AMU support and enabled invariance. The AMU counters will
> > > +	 * keep on working just fine in the absence of the cpufreq driver, and
> > > +	 * for the CPUs for which there are no counters availalbe, the last set
                                                        ^^^^^^^^^
							available

> > > +	 * value of freq_scale will remain valid as that is the frequency those
> > > +	 * CPUs are running at.
> +	 */

I did not yet test this, but reading this comment made me wonder..

arch_scale_freq_invariant() (or topology_scale_freq_invariant()) is also
called from schedutil when obtaining the next frequency.

So if we had a system that only partly supports AMUs but had at some
point a cpufreq driver that provided FIE for the other CPUs, when we
unregister the driver, the cpufreq_freq_invariance static key is
disabled. Therefore, none of the conditions for system invariance is
now accomplished and arch_scale_freq_invariant() will return false.
This will be broken as utilization is still scaled, but the algorithm
for computing the next frequency in schedutil will not take this into
account.

[..]
> > > +	ret = cpufreq_register_notifier(&init_amu_fie_notifier,
> > > +					CPUFREQ_POLICY_NOTIFIER);

The above makes the use of AMUs for FIE tightly coupled with cpufreq.

Initially I made cpufreq_get_hw_max_freq(cpu) a weak function for the
possible platforms that might not use a cpufreq driver and might want to
provide this function to still benefit from the use of counters for
frequency invariance.

But I'm starting to believe that supporting all these corner-cases in
advance just introduces messiness.

So feel free to remove the 'weak' state of cpufreq_get_hw_max_freq() as
well, so we don't keep wondering why we had that in the first place.
It would not make any sense keeping that in light of these changes.

P.S. I will be on holiday starting tomorrow until beginning of January.
Were you intending this for 5.11, or can I take more time to review
future versions and continue testing?

Thanks,
Ionela.

> -- 
> viresh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 1/3] arm64: topology: Avoid the have_policy check
  2020-12-15  5:34 [PATCH V3 1/3] arm64: topology: Avoid the have_policy check Viresh Kumar
  2020-12-15  5:34 ` [PATCH V3 2/3] arm64: topology: Reorder init_amu_fie() a bit Viresh Kumar
  2020-12-15  5:34 ` [PATCH V3 3/3] arm64: topology: Make AMUs work with modular cpufreq drivers Viresh Kumar
@ 2020-12-17  7:57 ` Viresh Kumar
  2020-12-17 10:55   ` Catalin Marinas
  2 siblings, 1 reply; 17+ messages in thread
From: Viresh Kumar @ 2020-12-17  7:57 UTC (permalink / raw)
  To: Ionela Voinescu, Catalin Marinas, Will Deacon
  Cc: Vincent Guittot, linux-arm-kernel, linux-kernel

On 15-12-20, 11:04, Viresh Kumar wrote:
> Every time I have stumbled upon this routine, I get confused with the
> way 'have_policy' is used and I have to dig in to understand why is it
> so. Here is an attempt to make it easier to understand, and hopefully it
> is an improvement.
> 
> The 'have_policy' check was just an optimization to avoid writing
> to amu_fie_cpus in case we don't have to, but that optimization itself
> is creating more confusion than the real work. Lets just do that if all
> the CPUs support AMUs. It is much cleaner that way.
> 
> Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
> V3:
> - Added Reviewed by tag.

Catalin, please pick the first two patches for 5.11. I will send the
last one separately later on.

-- 
viresh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 3/3] arm64: topology: Make AMUs work with modular cpufreq drivers
  2020-12-16 19:37       ` Ionela Voinescu
@ 2020-12-17 10:50         ` Viresh Kumar
  2021-01-08  9:44           ` Ionela Voinescu
  0 siblings, 1 reply; 17+ messages in thread
From: Viresh Kumar @ 2020-12-17 10:50 UTC (permalink / raw)
  To: Ionela Voinescu
  Cc: Catalin Marinas, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

On 16-12-20, 19:37, Ionela Voinescu wrote:
> I did not yet test this, but reading this comment made me wonder..
> 
> arch_scale_freq_invariant() (or topology_scale_freq_invariant()) is also
> called from schedutil when obtaining the next frequency.
> 
> So if we had a system that only partly supports AMUs but had at some
> point a cpufreq driver that provided FIE for the other CPUs, when we
> unregister the driver, the cpufreq_freq_invariance static key is
> disabled. Therefore, none of the conditions for system invariance is
> now accomplished and arch_scale_freq_invariant() will return false.
> This will be broken as utilization is still scaled, but the algorithm
> for computing the next frequency in schedutil will not take this into
> account.

I think the best and the easiest solution for this is:

bool arch_freq_counters_available(const struct cpumask *cpus)
{
        return amu_freq_invariant();
}

But we probably need to rename it to something like arch_is_fie().

> 
> [..]
> > > > +	ret = cpufreq_register_notifier(&init_amu_fie_notifier,
> > > > +					CPUFREQ_POLICY_NOTIFIER);
> 
> The above makes the use of AMUs for FIE tightly coupled with cpufreq.
> 
> Initially I made cpufreq_get_hw_max_freq(cpu) a weak function for the
> possible platforms that might not use a cpufreq driver and might want to
> provide this function to still benefit from the use of counters for
> frequency invariance.
> 
> But I'm starting to believe that supporting all these corner-cases in
> advance just introduces messiness.
> 
> So feel free to remove the 'weak' state of cpufreq_get_hw_max_freq() as
> well, so we don't keep wondering why we had that in the first place.
> It would not make any sense keeping that in light of these changes.

Will do it in a separate patch then.

> P.S. I will be on holiday starting tomorrow until beginning of January.
> Were you intending this for 5.11, or can I take more time to review
> future versions and continue testing?

I wanted to  :)

-- 
viresh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 1/3] arm64: topology: Avoid the have_policy check
  2020-12-17  7:57 ` [PATCH V3 1/3] arm64: topology: Avoid the have_policy check Viresh Kumar
@ 2020-12-17 10:55   ` Catalin Marinas
  2020-12-18  4:26     ` Viresh Kumar
  0 siblings, 1 reply; 17+ messages in thread
From: Catalin Marinas @ 2020-12-17 10:55 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Ionela Voinescu, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

Hi Viresh,

On Thu, Dec 17, 2020 at 01:27:32PM +0530, Viresh Kumar wrote:
> On 15-12-20, 11:04, Viresh Kumar wrote:
> > Every time I have stumbled upon this routine, I get confused with the
> > way 'have_policy' is used and I have to dig in to understand why is it
> > so. Here is an attempt to make it easier to understand, and hopefully it
> > is an improvement.
> > 
> > The 'have_policy' check was just an optimization to avoid writing
> > to amu_fie_cpus in case we don't have to, but that optimization itself
> > is creating more confusion than the real work. Lets just do that if all
> > the CPUs support AMUs. It is much cleaner that way.
> > 
> > Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
> > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> > ---
> > V3:
> > - Added Reviewed by tag.
> 
> Catalin, please pick the first two patches for 5.11. I will send the
> last one separately later on.

I haven't figured out whether these are fixes (a cover letter would
help ;)). They look like generic improvements to me and given that we
are already in the 5.11 merging window, they would probably need to wait
until 5.12.

-- 
Catalin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 1/3] arm64: topology: Avoid the have_policy check
  2020-12-17 10:55   ` Catalin Marinas
@ 2020-12-18  4:26     ` Viresh Kumar
  2020-12-18 11:01       ` Catalin Marinas
  0 siblings, 1 reply; 17+ messages in thread
From: Viresh Kumar @ 2020-12-18  4:26 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Ionela Voinescu, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

On 17-12-20, 10:55, Catalin Marinas wrote:
> Hi Viresh,
> 
> On Thu, Dec 17, 2020 at 01:27:32PM +0530, Viresh Kumar wrote:
> > On 15-12-20, 11:04, Viresh Kumar wrote:
> > > Every time I have stumbled upon this routine, I get confused with the
> > > way 'have_policy' is used and I have to dig in to understand why is it
> > > so. Here is an attempt to make it easier to understand, and hopefully it
> > > is an improvement.
> > > 
> > > The 'have_policy' check was just an optimization to avoid writing
> > > to amu_fie_cpus in case we don't have to, but that optimization itself
> > > is creating more confusion than the real work. Lets just do that if all
> > > the CPUs support AMUs. It is much cleaner that way.
> > > 
> > > Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
> > > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> > > ---
> > > V3:
> > > - Added Reviewed by tag.
> > 
> > Catalin, please pick the first two patches for 5.11. I will send the
> > last one separately later on.
> 
> I haven't figured out whether these are fixes (a cover letter would
> help ;)). They look like generic improvements to me

Right they are and since the merge window just opened I thought these
don't really need to wait for another full cycle to get in.

> and given that we
> are already in the 5.11 merging window, they would probably need to wait
> until 5.12.

Whatever you decide :)

-- 
viresh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 1/3] arm64: topology: Avoid the have_policy check
  2020-12-18  4:26     ` Viresh Kumar
@ 2020-12-18 11:01       ` Catalin Marinas
  2020-12-18 11:04         ` Viresh Kumar
  0 siblings, 1 reply; 17+ messages in thread
From: Catalin Marinas @ 2020-12-18 11:01 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Ionela Voinescu, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

On Fri, Dec 18, 2020 at 09:56:02AM +0530, Viresh Kumar wrote:
> On 17-12-20, 10:55, Catalin Marinas wrote:
> > Hi Viresh,
> > 
> > On Thu, Dec 17, 2020 at 01:27:32PM +0530, Viresh Kumar wrote:
> > > On 15-12-20, 11:04, Viresh Kumar wrote:
> > > > Every time I have stumbled upon this routine, I get confused with the
> > > > way 'have_policy' is used and I have to dig in to understand why is it
> > > > so. Here is an attempt to make it easier to understand, and hopefully it
> > > > is an improvement.
> > > > 
> > > > The 'have_policy' check was just an optimization to avoid writing
> > > > to amu_fie_cpus in case we don't have to, but that optimization itself
> > > > is creating more confusion than the real work. Lets just do that if all
> > > > the CPUs support AMUs. It is much cleaner that way.
> > > > 
> > > > Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
> > > > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> > > > ---
> > > > V3:
> > > > - Added Reviewed by tag.
> > > 
> > > Catalin, please pick the first two patches for 5.11. I will send the
> > > last one separately later on.
> > 
> > I haven't figured out whether these are fixes (a cover letter would
> > help ;)). They look like generic improvements to me
> 
> Right they are and since the merge window just opened I thought these
> don't really need to wait for another full cycle to get in.

Normally we freeze the arm64 tree around the -rc6 prior to the merging
window to give the patches a bit of time in linux-next. This time
around, given the holidays, Linus even stated that if not already in
-next at 5.10, it won't be pulled: https://lkml.org/lkml/2020/12/13/290.

So please re-post at -rc1 with the acks in place.

-- 
Catalin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 1/3] arm64: topology: Avoid the have_policy check
  2020-12-18 11:01       ` Catalin Marinas
@ 2020-12-18 11:04         ` Viresh Kumar
  0 siblings, 0 replies; 17+ messages in thread
From: Viresh Kumar @ 2020-12-18 11:04 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Ionela Voinescu, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

On 18-12-20, 11:01, Catalin Marinas wrote:
> On Fri, Dec 18, 2020 at 09:56:02AM +0530, Viresh Kumar wrote:
> > On 17-12-20, 10:55, Catalin Marinas wrote:
> > > Hi Viresh,
> > > 
> > > On Thu, Dec 17, 2020 at 01:27:32PM +0530, Viresh Kumar wrote:
> > > > On 15-12-20, 11:04, Viresh Kumar wrote:
> > > > > Every time I have stumbled upon this routine, I get confused with the
> > > > > way 'have_policy' is used and I have to dig in to understand why is it
> > > > > so. Here is an attempt to make it easier to understand, and hopefully it
> > > > > is an improvement.
> > > > > 
> > > > > The 'have_policy' check was just an optimization to avoid writing
> > > > > to amu_fie_cpus in case we don't have to, but that optimization itself
> > > > > is creating more confusion than the real work. Lets just do that if all
> > > > > the CPUs support AMUs. It is much cleaner that way.
> > > > > 
> > > > > Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
> > > > > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> > > > > ---
> > > > > V3:
> > > > > - Added Reviewed by tag.
> > > > 
> > > > Catalin, please pick the first two patches for 5.11. I will send the
> > > > last one separately later on.
> > > 
> > > I haven't figured out whether these are fixes (a cover letter would
> > > help ;)). They look like generic improvements to me
> > 
> > Right they are and since the merge window just opened I thought these
> > don't really need to wait for another full cycle to get in.
> 
> Normally we freeze the arm64 tree around the -rc6 prior to the merging
> window to give the patches a bit of time in linux-next. This time
> around, given the holidays, Linus even stated that if not already in
> -next at 5.10, it won't be pulled: https://lkml.org/lkml/2020/12/13/290.

Okay, sounds good.

> So please re-post at -rc1 with the acks in place.

Sure.

-- 
viresh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 3/3] arm64: topology: Make AMUs work with modular cpufreq drivers
  2020-12-17 10:50         ` Viresh Kumar
@ 2021-01-08  9:44           ` Ionela Voinescu
  2021-01-08 10:42             ` Ionela Voinescu
  2021-01-08 11:03             ` Viresh Kumar
  0 siblings, 2 replies; 17+ messages in thread
From: Ionela Voinescu @ 2021-01-08  9:44 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Catalin Marinas, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

On Thursday 17 Dec 2020 at 16:20:49 (+0530), Viresh Kumar wrote:
> On 16-12-20, 19:37, Ionela Voinescu wrote:
> > I did not yet test this, but reading this comment made me wonder..
> > 
> > arch_scale_freq_invariant() (or topology_scale_freq_invariant()) is also
> > called from schedutil when obtaining the next frequency.
> > 
> > So if we had a system that only partly supports AMUs but had at some
> > point a cpufreq driver that provided FIE for the other CPUs, when we
> > unregister the driver, the cpufreq_freq_invariance static key is
> > disabled. Therefore, none of the conditions for system invariance is
> > now accomplished and arch_scale_freq_invariant() will return false.
> > This will be broken as utilization is still scaled, but the algorithm
> > for computing the next frequency in schedutil will not take this into
> > account.
> 
> I think the best and the easiest solution for this is:
> 
> bool arch_freq_counters_available(const struct cpumask *cpus)
> {
>         return amu_freq_invariant();
> }
> 
> But we probably need to rename it to something like arch_is_fie().
> 

Now that I think of it again (after spending 30 minutes trying to come
up with a more clear solution) I realised this is not actually a
problem :).

The only location that checks the invariance status is schedutil, but
what a cpufreq governor does becomes irrelevant if you remove the
cpufreq driver. The only potential problem is if one then inmods a
cpufreq driver that's not invariant. But I think that might be on "if"
too many to consider. What do you think?

Thanks,
Ionela.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 3/3] arm64: topology: Make AMUs work with modular cpufreq drivers
  2021-01-08  9:44           ` Ionela Voinescu
@ 2021-01-08 10:42             ` Ionela Voinescu
  2021-01-08 11:03             ` Viresh Kumar
  1 sibling, 0 replies; 17+ messages in thread
From: Ionela Voinescu @ 2021-01-08 10:42 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Catalin Marinas, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

On Friday 08 Jan 2021 at 09:44:16 (+0000), Ionela Voinescu wrote:
> On Thursday 17 Dec 2020 at 16:20:49 (+0530), Viresh Kumar wrote:
> > On 16-12-20, 19:37, Ionela Voinescu wrote:
> > > I did not yet test this, but reading this comment made me wonder..
> > > 
> > > arch_scale_freq_invariant() (or topology_scale_freq_invariant()) is also
> > > called from schedutil when obtaining the next frequency.
> > > 
> > > So if we had a system that only partly supports AMUs but had at some
> > > point a cpufreq driver that provided FIE for the other CPUs, when we
> > > unregister the driver, the cpufreq_freq_invariance static key is
> > > disabled. Therefore, none of the conditions for system invariance is
> > > now accomplished and arch_scale_freq_invariant() will return false.
> > > This will be broken as utilization is still scaled, but the algorithm
> > > for computing the next frequency in schedutil will not take this into
> > > account.
> > 
> > I think the best and the easiest solution for this is:
> > 
> > bool arch_freq_counters_available(const struct cpumask *cpus)
> > {
> >         return amu_freq_invariant();
> > }
> > 
> > But we probably need to rename it to something like arch_is_fie().
> > 

Forgot to answer this one:

arch_freq_counters_available() is also used in arch_set_freq_scale() to
tell us not only if the arch is FI, but also to tell us if the AMUs are
used for FI for some particular CPUs. So we couldn't easily rewrite this
one, or do it in a way that would be worth it.

Ionela.

> 
> Now that I think of it again (after spending 30 minutes trying to come
> up with a more clear solution) I realised this is not actually a
> problem :).
> 
> The only location that checks the invariance status is schedutil, but
> what a cpufreq governor does becomes irrelevant if you remove the
> cpufreq driver. The only potential problem is if one then inmods a
> cpufreq driver that's not invariant. But I think that might be on "if"
> too many to consider. What do you think?
> 
> Thanks,
> Ionela.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 3/3] arm64: topology: Make AMUs work with modular cpufreq drivers
  2021-01-08  9:44           ` Ionela Voinescu
  2021-01-08 10:42             ` Ionela Voinescu
@ 2021-01-08 11:03             ` Viresh Kumar
  1 sibling, 0 replies; 17+ messages in thread
From: Viresh Kumar @ 2021-01-08 11:03 UTC (permalink / raw)
  To: Ionela Voinescu
  Cc: Catalin Marinas, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

On 08-01-21, 09:44, Ionela Voinescu wrote:
> Now that I think of it again (after spending 30 minutes trying to come
> up with a more clear solution) I realised this is not actually a
> problem :).
> 
> The only location that checks the invariance status is schedutil, but
> what a cpufreq governor does becomes irrelevant if you remove the
> cpufreq driver.

Good catch :)

> The only potential problem is if one then inmods a
> cpufreq driver that's not invariant. But I think that might be on "if"
> too many to consider. What do you think?

Yeah, there is no need to worry about this then I think.

I will resend the patches soon.

-- 
viresh

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-01-08 11:04 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-15  5:34 [PATCH V3 1/3] arm64: topology: Avoid the have_policy check Viresh Kumar
2020-12-15  5:34 ` [PATCH V3 2/3] arm64: topology: Reorder init_amu_fie() a bit Viresh Kumar
2020-12-15 11:53   ` Ionela Voinescu
2020-12-15  5:34 ` [PATCH V3 3/3] arm64: topology: Make AMUs work with modular cpufreq drivers Viresh Kumar
2020-12-15 11:56   ` Ionela Voinescu
2020-12-16  0:03   ` Ionela Voinescu
2020-12-16  4:38     ` Viresh Kumar
2020-12-16 19:37       ` Ionela Voinescu
2020-12-17 10:50         ` Viresh Kumar
2021-01-08  9:44           ` Ionela Voinescu
2021-01-08 10:42             ` Ionela Voinescu
2021-01-08 11:03             ` Viresh Kumar
2020-12-17  7:57 ` [PATCH V3 1/3] arm64: topology: Avoid the have_policy check Viresh Kumar
2020-12-17 10:55   ` Catalin Marinas
2020-12-18  4:26     ` Viresh Kumar
2020-12-18 11:01       ` Catalin Marinas
2020-12-18 11:04         ` Viresh Kumar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).