linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] arm64: topology: Cleanup init_amu_fie() a bit
@ 2020-12-10  7:18 Viresh Kumar
  2020-12-10  9:42 ` [PATCH] arm64: topology: Avoid the static_branch_{enable|disable} dance Viresh Kumar
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Viresh Kumar @ 2020-12-10  7:18 UTC (permalink / raw)
  To: Ionela Voinescu, Catalin Marinas, Will Deacon
  Cc: Viresh Kumar, Vincent Guittot, linux-arm-kernel, linux-kernel

Every time I have stumbled upon this routine, I get confused with the
way 'have_policy' is used and I have to dig in to understand why is it
so.

Here is an attempt to make it easier to understand, and hopefully it is
an improvement. This is based on the logic that amu_fie_cpus will be
empty if cpufreq policy wasn't available for any CPU.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---

Ionela, I think it would be even better to do this over this patch

-       /*
-        * If none of the CPUs have cpufreq support, we only enable
-        * the use of the AMU feature for FIE if all CPUs support AMU.
-        * Otherwise, enable_policy_freq_counters has already enabled
-        * policy cpus.
-        */
-       if (cpumask_empty(amu_fie_cpus) &&
-           cpumask_equal(valid_cpus, cpu_present_mask))
+       /* Overwrite amu_fie_cpus if all CPUs support AMU */
+       if (cpumask_equal(valid_cpus, cpu_present_mask))
                cpumask_copy(amu_fie_cpus, cpu_present_mask);

This will also take care of the case where the cpufreq policy isn't
there for a small group of CPUs, which do have AMUs enabled for them.
(This doesn't normally happen though).

---
 arch/arm64/kernel/topology.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index f6faa697e83e..7f7d8de325b6 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -199,14 +199,14 @@ static int freq_inv_set_max_ratio(int cpu, u64 max_rate, u64 ref_rate)
 	return 0;
 }
 
-static inline bool
+static inline void
 enable_policy_freq_counters(int cpu, cpumask_var_t valid_cpus)
 {
 	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
 
 	if (!policy) {
 		pr_debug("CPU%d: No cpufreq policy found.\n", cpu);
-		return false;
+		return;
 	}
 
 	if (cpumask_subset(policy->related_cpus, valid_cpus))
@@ -214,8 +214,6 @@ enable_policy_freq_counters(int cpu, cpumask_var_t valid_cpus)
 			   amu_fie_cpus);
 
 	cpufreq_cpu_put(policy);
-
-	return true;
 }
 
 static DEFINE_STATIC_KEY_FALSE(amu_fie_key);
@@ -225,7 +223,6 @@ static int __init init_amu_fie(void)
 {
 	bool invariance_status = topology_scale_freq_invariant();
 	cpumask_var_t valid_cpus;
-	bool have_policy = false;
 	int ret = 0;
 	int cpu;
 
@@ -245,17 +242,18 @@ static int __init init_amu_fie(void)
 			continue;
 
 		cpumask_set_cpu(cpu, valid_cpus);
-		have_policy |= enable_policy_freq_counters(cpu, valid_cpus);
+		enable_policy_freq_counters(cpu, valid_cpus);
 	}
 
 	/*
-	 * If we are not restricted by cpufreq policies, we only enable
+	 * If none of the CPUs have cpufreq support, we only enable
 	 * the use of the AMU feature for FIE if all CPUs support AMU.
 	 * Otherwise, enable_policy_freq_counters has already enabled
 	 * policy cpus.
 	 */
-	if (!have_policy && cpumask_equal(valid_cpus, cpu_present_mask))
-		cpumask_or(amu_fie_cpus, amu_fie_cpus, valid_cpus);
+	if (cpumask_empty(amu_fie_cpus) &&
+	    cpumask_equal(valid_cpus, cpu_present_mask))
+		cpumask_copy(amu_fie_cpus, cpu_present_mask);
 
 	if (!cpumask_empty(amu_fie_cpus)) {
 		pr_info("CPUs[%*pbl]: counters will be used for FIE.",
-- 
2.25.0.rc1.19.g042ed3e048af


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH] arm64: topology: Avoid the static_branch_{enable|disable} dance
  2020-12-10  7:18 [PATCH] arm64: topology: Cleanup init_amu_fie() a bit Viresh Kumar
@ 2020-12-10  9:42 ` Viresh Kumar
  2020-12-10 11:09   ` Ionela Voinescu
  2020-12-10 10:34 ` [PATCH] arm64: topology: Cleanup init_amu_fie() a bit Viresh Kumar
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Viresh Kumar @ 2020-12-10  9:42 UTC (permalink / raw)
  To: Ionela Voinescu, Catalin Marinas, Will Deacon
  Cc: Viresh Kumar, Vincent Guittot, linux-arm-kernel, linux-kernel

Avoid the static_branch_enable() and static_branch_disable() dance by
redoing the code in a different way. We will be fully invariant here
only if amu_fie_cpus is set with all present CPUs, use that instead of
yet another call to topology_scale_freq_invariant().

This also avoids running rest of the routine if we enabled the static
branch, followed by a disable.

Also make the first call to topology_scale_freq_invariant() just when we
need it, instead of at the top of the routine. This makes it further
clear on why we need it, i.e. just around enabling AMUs use here.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 arch/arm64/kernel/topology.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 7f7d8de325b6..6dedc6ee91cf 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -221,7 +221,7 @@ static DEFINE_STATIC_KEY_FALSE(amu_fie_key);
 
 static int __init init_amu_fie(void)
 {
-	bool invariance_status = topology_scale_freq_invariant();
+	bool invariance_status;
 	cpumask_var_t valid_cpus;
 	int ret = 0;
 	int cpu;
@@ -255,18 +255,15 @@ static int __init init_amu_fie(void)
 	    cpumask_equal(valid_cpus, cpu_present_mask))
 		cpumask_copy(amu_fie_cpus, cpu_present_mask);
 
-	if (!cpumask_empty(amu_fie_cpus)) {
-		pr_info("CPUs[%*pbl]: counters will be used for FIE.",
-			cpumask_pr_args(amu_fie_cpus));
-		static_branch_enable(&amu_fie_key);
-	}
+	/* Disallow partial use of counters for frequency invariance */
+	if (!cpumask_equal(amu_fie_cpus, cpu_present_mask))
+		goto free_valid_mask;
 
-	/*
-	 * If the system is not fully invariant after AMU init, disable
-	 * partial use of counters for frequency invariance.
-	 */
-	if (!topology_scale_freq_invariant())
-		static_branch_disable(&amu_fie_key);
+	pr_info("CPUs[%*pbl]: counters will be used for FIE.",
+		cpumask_pr_args(amu_fie_cpus));
+
+	invariance_status = topology_scale_freq_invariant();
+	static_branch_enable(&amu_fie_key);
 
 	/*
 	 * Task scheduler behavior depends on frequency invariance support,
-- 
2.25.0.rc1.19.g042ed3e048af


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: topology: Cleanup init_amu_fie() a bit
  2020-12-10  7:18 [PATCH] arm64: topology: Cleanup init_amu_fie() a bit Viresh Kumar
  2020-12-10  9:42 ` [PATCH] arm64: topology: Avoid the static_branch_{enable|disable} dance Viresh Kumar
@ 2020-12-10 10:34 ` Viresh Kumar
  2020-12-10 10:38 ` Ionela Voinescu
  2020-12-10 13:22 ` Ionela Voinescu
  3 siblings, 0 replies; 13+ messages in thread
From: Viresh Kumar @ 2020-12-10 10:34 UTC (permalink / raw)
  To: Ionela Voinescu, Catalin Marinas, Will Deacon
  Cc: Vincent Guittot, linux-arm-kernel, linux-kernel

On 10-12-20, 12:48, Viresh Kumar wrote:
> Every time I have stumbled upon this routine, I get confused with the
> way 'have_policy' is used and I have to dig in to understand why is it
> so.
> 
> Here is an attempt to make it easier to understand, and hopefully it is
> an improvement. This is based on the logic that amu_fie_cpus will be
> empty if cpufreq policy wasn't available for any CPU.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
> 
> Ionela, I think it would be even better to do this over this patch
> 
> -       /*
> -        * If none of the CPUs have cpufreq support, we only enable
> -        * the use of the AMU feature for FIE if all CPUs support AMU.
> -        * Otherwise, enable_policy_freq_counters has already enabled
> -        * policy cpus.
> -        */
> -       if (cpumask_empty(amu_fie_cpus) &&
> -           cpumask_equal(valid_cpus, cpu_present_mask))
> +       /* Overwrite amu_fie_cpus if all CPUs support AMU */
> +       if (cpumask_equal(valid_cpus, cpu_present_mask))
>                 cpumask_copy(amu_fie_cpus, cpu_present_mask);
> 
> This will also take care of the case where the cpufreq policy isn't
> there for a small group of CPUs, which do have AMUs enabled for them.
> (This doesn't normally happen though).

And on similar lines, this change as well as amu_fie_cpus must be set
to all the CPUs and this check (and parameter to the routine) aren't
required..

 bool arch_freq_counters_available(const struct cpumask *cpus)
 {
-       return amu_freq_invariant() &&
-              cpumask_subset(cpus, amu_fie_cpus);
+       return amu_freq_invariant();
 }

-- 
viresh

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: topology: Cleanup init_amu_fie() a bit
  2020-12-10  7:18 [PATCH] arm64: topology: Cleanup init_amu_fie() a bit Viresh Kumar
  2020-12-10  9:42 ` [PATCH] arm64: topology: Avoid the static_branch_{enable|disable} dance Viresh Kumar
  2020-12-10 10:34 ` [PATCH] arm64: topology: Cleanup init_amu_fie() a bit Viresh Kumar
@ 2020-12-10 10:38 ` Ionela Voinescu
  2020-12-10 10:55   ` Viresh Kumar
  2020-12-11 11:05   ` Viresh Kumar
  2020-12-10 13:22 ` Ionela Voinescu
  3 siblings, 2 replies; 13+ messages in thread
From: Ionela Voinescu @ 2020-12-10 10:38 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Catalin Marinas, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

Hi,

On Thursday 10 Dec 2020 at 12:48:20 (+0530), Viresh Kumar wrote:
> Every time I have stumbled upon this routine, I get confused with the
> way 'have_policy' is used and I have to dig in to understand why is it
> so.
> 

I'm first of all replying to discuss the need of this policy analysis
that enable_policy_freq_counters() does which results in the setting of
have_policy.

Basically, that's functions purpose is only to make sure that invariance
at the level of the policy is consistent: either all CPUs in a policy
support counters and counters will be used for the scale factor, or
either none or only some support counters and therefore the default
cpufreq implementation will be used (arch_set_freq_scale()) for all CPUs
in a policy.

If we find that cpufreq policies are not present at all, we only enable
counter use if all CPUs support them.

This being said, there is a very important part of your patches in this
set:

+	/* Disallow partial use of counters for frequency invariance */
+	if (!cpumask_equal(amu_fie_cpus, cpu_present_mask))
+		goto free_valid_mask;


If this is agreed upon, there is a lot more that can be removed from the
initialisation: enable_policy_freq_counters() can entirely go away plus
all the checks around it.

I completely agree that all of this will be more clear if we decided to
"Disallow partial use of counters for frequency invariance", but I think
we might have to add it back in again when systems with partial support
for counters show up.

I don't agree to not support these systems from the get go as it's
likely that the first big.LITTLE systems will only have partial support
for AMUs, but it's only an assumption at this point.

I'm happy to hear the opinion of other arm64 folk about this.

Many thanks for looking over the code,
Ionela.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: topology: Cleanup init_amu_fie() a bit
  2020-12-10 10:38 ` Ionela Voinescu
@ 2020-12-10 10:55   ` Viresh Kumar
  2020-12-10 11:29     ` Ionela Voinescu
  2020-12-11 11:05   ` Viresh Kumar
  1 sibling, 1 reply; 13+ messages in thread
From: Viresh Kumar @ 2020-12-10 10:55 UTC (permalink / raw)
  To: Ionela Voinescu
  Cc: Catalin Marinas, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

My intent was to keep the logical working of the code as is (in all
the patches I have sent today), let me see if I broke that assumption
somewhere unintentionally.

On 10-12-20, 10:38, Ionela Voinescu wrote:
> I'm first of all replying to discuss the need of this policy analysis
> that enable_policy_freq_counters() does which results in the setting of
> have_policy.
> 
> Basically, that's functions purpose is only to make sure that invariance
> at the level of the policy is consistent: either all CPUs in a policy
> support counters and counters will be used for the scale factor, or
> either none or only some support counters and therefore the default
> cpufreq implementation will be used (arch_set_freq_scale()) for all CPUs
> in a policy.
> 
> If we find that cpufreq policies are not present at all, we only enable
> counter use if all CPUs support them.

Right, and the same must be true after this patch.

> This being said, there is a very important part of your patches in this
> set:
> 
> +	/* Disallow partial use of counters for frequency invariance */
> +	if (!cpumask_equal(amu_fie_cpus, cpu_present_mask))
> +		goto free_valid_mask;

The current code already has this check and so this isn't something
new.
 
> If this is agreed upon, there is a lot more that can be removed from the
> initialisation: enable_policy_freq_counters() can entirely go away plus
> all the checks around it.
> 
> I completely agree that all of this will be more clear if we decided to
> "Disallow partial use of counters for frequency invariance", but I think
> we might have to add it back in again when systems with partial support
> for counters show up.
> 
> I don't agree to not support these systems from the get go as it's
> likely that the first big.LITTLE systems will only have partial support
> for AMUs, but it's only an assumption at this point.

Here is how things move AFAICT:

- We set valid_cpus 1-by-1 with all CPUs that have counters available.

- Once all CPUs of a policy are part of valid_cpus, we update
  amu_fie_cpus with that policies related_cpus.

- Once we are done with all CPUs, we check if cpufreq policy was there
  for any of the CPUs, if not, we update amu_fie_cpus if all present
  CPUs are part of valid_cpus.

- At this point we call static_branch_enable() if amu_fie_cpus is not
  empty (i.e. even if it is partially set).

- But right after that we call static_branch_disable() if we aren't
  invariant (call to topology_scale_freq_invariant()), and this will
  happen if amu_fie_cpus doesn't have all the CPUs there. Isn't it? So
  partial amu support is already disallowed, without cpufreq.

Where am I wrong ? (And I know there is a high possibility of that
happening here :) )..

-- 
viresh

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: topology: Avoid the static_branch_{enable|disable} dance
  2020-12-10  9:42 ` [PATCH] arm64: topology: Avoid the static_branch_{enable|disable} dance Viresh Kumar
@ 2020-12-10 11:09   ` Ionela Voinescu
  2020-12-10 12:35     ` Viresh Kumar
  0 siblings, 1 reply; 13+ messages in thread
From: Ionela Voinescu @ 2020-12-10 11:09 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Catalin Marinas, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

On Thursday 10 Dec 2020 at 15:12:25 (+0530), Viresh Kumar wrote:
> Avoid the static_branch_enable() and static_branch_disable() dance by
> redoing the code in a different way. We will be fully invariant here
> only if amu_fie_cpus is set with all present CPUs, use that instead of
> yet another call to topology_scale_freq_invariant().
> 
> This also avoids running rest of the routine if we enabled the static
> branch, followed by a disable.
> 
> Also make the first call to topology_scale_freq_invariant() just when we
> need it, instead of at the top of the routine. This makes it further
> clear on why we need it, i.e. just around enabling AMUs use here.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  arch/arm64/kernel/topology.c | 21 +++++++++------------
>  1 file changed, 9 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> index 7f7d8de325b6..6dedc6ee91cf 100644
> --- a/arch/arm64/kernel/topology.c
> +++ b/arch/arm64/kernel/topology.c
> @@ -221,7 +221,7 @@ static DEFINE_STATIC_KEY_FALSE(amu_fie_key);
>  
>  static int __init init_amu_fie(void)
>  {
> -	bool invariance_status = topology_scale_freq_invariant();
> +	bool invariance_status;
>  	cpumask_var_t valid_cpus;
>  	int ret = 0;
>  	int cpu;
> @@ -255,18 +255,15 @@ static int __init init_amu_fie(void)
>  	    cpumask_equal(valid_cpus, cpu_present_mask))
>  		cpumask_copy(amu_fie_cpus, cpu_present_mask);
>  
> -	if (!cpumask_empty(amu_fie_cpus)) {
> -		pr_info("CPUs[%*pbl]: counters will be used for FIE.",
> -			cpumask_pr_args(amu_fie_cpus));
> -		static_branch_enable(&amu_fie_key);
> -	}

This check verifies if there are *any* CPUs for which AMUs can be used for
FIE (!empty mask) -> enable static key.

> +	/* Disallow partial use of counters for frequency invariance */
> +	if (!cpumask_equal(amu_fie_cpus, cpu_present_mask))
> +		goto free_valid_mask;
>  

The replacement verifies if *all* present CPUs support AMUs for FIE and
only then it enables the static key.

Ionela.

> -	/*
> -	 * If the system is not fully invariant after AMU init, disable
> -	 * partial use of counters for frequency invariance.
> -	 */
> -	if (!topology_scale_freq_invariant())
> -		static_branch_disable(&amu_fie_key);
> +	pr_info("CPUs[%*pbl]: counters will be used for FIE.",
> +		cpumask_pr_args(amu_fie_cpus));
> +
> +	invariance_status = topology_scale_freq_invariant();
> +	static_branch_enable(&amu_fie_key);
>  
>  	/*
>  	 * Task scheduler behavior depends on frequency invariance support,
> -- 
> 2.25.0.rc1.19.g042ed3e048af
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: topology: Cleanup init_amu_fie() a bit
  2020-12-10 10:55   ` Viresh Kumar
@ 2020-12-10 11:29     ` Ionela Voinescu
  2020-12-10 12:34       ` Viresh Kumar
  0 siblings, 1 reply; 13+ messages in thread
From: Ionela Voinescu @ 2020-12-10 11:29 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Catalin Marinas, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

On Thursday 10 Dec 2020 at 16:25:06 (+0530), Viresh Kumar wrote:
> My intent was to keep the logical working of the code as is (in all
> the patches I have sent today), let me see if I broke that assumption
> somewhere unintentionally.
> 

Okay, I replied separately in regards to the piece of code that confused
things.

> On 10-12-20, 10:38, Ionela Voinescu wrote:
> > I'm first of all replying to discuss the need of this policy analysis
> > that enable_policy_freq_counters() does which results in the setting of
> > have_policy.
> > 
> > Basically, that's functions purpose is only to make sure that invariance
> > at the level of the policy is consistent: either all CPUs in a policy
> > support counters and counters will be used for the scale factor, or
> > either none or only some support counters and therefore the default
> > cpufreq implementation will be used (arch_set_freq_scale()) for all CPUs
> > in a policy.
> > 
> > If we find that cpufreq policies are not present at all, we only enable
> > counter use if all CPUs support them.
> 
> Right, and the same must be true after this patch.
> 
> > This being said, there is a very important part of your patches in this
> > set:
> > 
> > +	/* Disallow partial use of counters for frequency invariance */
> > +	if (!cpumask_equal(amu_fie_cpus, cpu_present_mask))
> > +		goto free_valid_mask;
> 
> The current code already has this check and so this isn't something
> new.
>  

Replied separately for the patch.

> > If this is agreed upon, there is a lot more that can be removed from the
> > initialisation: enable_policy_freq_counters() can entirely go away plus
> > all the checks around it.
> > 
> > I completely agree that all of this will be more clear if we decided to
> > "Disallow partial use of counters for frequency invariance", but I think
> > we might have to add it back in again when systems with partial support
> > for counters show up.
> > 
> > I don't agree to not support these systems from the get go as it's
> > likely that the first big.LITTLE systems will only have partial support
> > for AMUs, but it's only an assumption at this point.
> 
> Here is how things move AFAICT:
> 
> - We set valid_cpus 1-by-1 with all CPUs that have counters available.
> 

Yes.

> - Once all CPUs of a policy are part of valid_cpus, we update
>   amu_fie_cpus with that policies related_cpus.
> 

Yes.

> - Once we are done with all CPUs, we check if cpufreq policy was there
>   for any of the CPUs, if not, we update amu_fie_cpus if all present
>   CPUs are part of valid_cpus.
> 

Yes.

> - At this point we call static_branch_enable() if amu_fie_cpus is not
>   empty (i.e. even if it is partially set).
> 

Yes.

> - But right after that we call static_branch_disable() if we aren't
>   invariant (call to topology_scale_freq_invariant()), and this will
>   happen if amu_fie_cpus doesn't have all the CPUs there. Isn't it? So
>   partial amu support is already disallowed, without cpufreq.
> 

This is the point that needs clarification:

topology_scale_freq_invariant()) = cpufreq_supports_freq_invariance() ||
                                   arch_freq_counters_available(cpu_online_mask);

This checks if the full system is invariant.

The possible scenarios are:

 - All online CPUs support AMUs - arch_freq_counters_available() will
   return true -> topology_scale_freq_invariant() will return true.

 - None of the CPUs support AMUs, or part of the CPUs support AMUs - the
   system is invariant only if cpufreq is invariant (dependent on
   whether the driver implements the proper callbacks that results in
   calling arch_set_freq_scale() in cpufreq core.

 - Either cpufreq does not support invariance or we don't have AMU
   support on all CPUs -> the system is not invariant so we disable
   the AMU static key that guards the calls to
   topology_scale_freq_tick() - we would not want to set a scale factor
   for only a part of the CPUs.

So whether were are or are not invariant does not depend only on the AMU
presence, but also on the cpufreq support for invariance. We have to
disable invariance altogether (including the AMU guarding static key)
if the system is not invariant (it no all CPUs have means to provide the
scale).

Let me know if it makes sense.

> Where am I wrong ? (And I know there is a high possibility of that
> happening here :) )..
> 

No worries. I myself sometimes wonder if I missed something. I will be
happy in a few years when we can remove this and use counters for all
CPUs.

Thanks,
Ionela.

> -- 
> viresh

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: topology: Cleanup init_amu_fie() a bit
  2020-12-10 11:29     ` Ionela Voinescu
@ 2020-12-10 12:34       ` Viresh Kumar
  2020-12-10 13:26         ` Ionela Voinescu
  0 siblings, 1 reply; 13+ messages in thread
From: Viresh Kumar @ 2020-12-10 12:34 UTC (permalink / raw)
  To: Ionela Voinescu
  Cc: Catalin Marinas, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

On 10-12-20, 11:29, Ionela Voinescu wrote:
> On Thursday 10 Dec 2020 at 16:25:06 (+0530), Viresh Kumar wrote:
> > - But right after that we call static_branch_disable() if we aren't
> >   invariant (call to topology_scale_freq_invariant()), and this will
> >   happen if amu_fie_cpus doesn't have all the CPUs there. Isn't it? So
> >   partial amu support is already disallowed, without cpufreq.
> > 
> 
> This is the point that needs clarification:
> 
> topology_scale_freq_invariant()) = cpufreq_supports_freq_invariance() ||
>                                    arch_freq_counters_available(cpu_online_mask);
> 
> This checks if the full system is invariant.
> 
> The possible scenarios are:
> 
>  - All online CPUs support AMUs - arch_freq_counters_available() will
>    return true -> topology_scale_freq_invariant() will return true.
> 
>  - None of the CPUs support AMUs, or part of the CPUs support AMUs - the
>    system is invariant only if cpufreq is invariant (dependent on
>    whether the driver implements the proper callbacks that results in
>    calling arch_set_freq_scale() in cpufreq core.
> 
>  - Either cpufreq does not support invariance or we don't have AMU
>    support on all CPUs -> the system is not invariant so we disable
>    the AMU static key that guards the calls to
>    topology_scale_freq_tick() - we would not want to set a scale factor
>    for only a part of the CPUs.
> 
> So whether were are or are not invariant does not depend only on the AMU
> presence, but also on the cpufreq support for invariance. We have to
> disable invariance altogether (including the AMU guarding static key)
> if the system is not invariant (it no all CPUs have means to provide the
> scale).

Okay, I think I mis-assumed that amu_fie_cpus will get set by
enable_policy_freq_counters() even for CPUs where AMU support isn't
there, it won't be though.

Having said that, this patch, along with the minor suggestion in the
commit log, still stands fine, right ? The other patch which I sent is
probably incorrect due to the above assumption I had.

-- 
viresh

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: topology: Avoid the static_branch_{enable|disable} dance
  2020-12-10 11:09   ` Ionela Voinescu
@ 2020-12-10 12:35     ` Viresh Kumar
  0 siblings, 0 replies; 13+ messages in thread
From: Viresh Kumar @ 2020-12-10 12:35 UTC (permalink / raw)
  To: Ionela Voinescu
  Cc: Catalin Marinas, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

On 10-12-20, 11:09, Ionela Voinescu wrote:
> On Thursday 10 Dec 2020 at 15:12:25 (+0530), Viresh Kumar wrote:
> > Avoid the static_branch_enable() and static_branch_disable() dance by
> > redoing the code in a different way. We will be fully invariant here
> > only if amu_fie_cpus is set with all present CPUs, use that instead of
> > yet another call to topology_scale_freq_invariant().
> > 
> > This also avoids running rest of the routine if we enabled the static
> > branch, followed by a disable.
> > 
> > Also make the first call to topology_scale_freq_invariant() just when we
> > need it, instead of at the top of the routine. This makes it further
> > clear on why we need it, i.e. just around enabling AMUs use here.
> > 
> > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> > ---
> >  arch/arm64/kernel/topology.c | 21 +++++++++------------
> >  1 file changed, 9 insertions(+), 12 deletions(-)
> > 
> > diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> > index 7f7d8de325b6..6dedc6ee91cf 100644
> > --- a/arch/arm64/kernel/topology.c
> > +++ b/arch/arm64/kernel/topology.c
> > @@ -221,7 +221,7 @@ static DEFINE_STATIC_KEY_FALSE(amu_fie_key);
> >  
> >  static int __init init_amu_fie(void)
> >  {
> > -	bool invariance_status = topology_scale_freq_invariant();
> > +	bool invariance_status;
> >  	cpumask_var_t valid_cpus;
> >  	int ret = 0;
> >  	int cpu;
> > @@ -255,18 +255,15 @@ static int __init init_amu_fie(void)
> >  	    cpumask_equal(valid_cpus, cpu_present_mask))
> >  		cpumask_copy(amu_fie_cpus, cpu_present_mask);
> >  
> > -	if (!cpumask_empty(amu_fie_cpus)) {
> > -		pr_info("CPUs[%*pbl]: counters will be used for FIE.",
> > -			cpumask_pr_args(amu_fie_cpus));
> > -		static_branch_enable(&amu_fie_key);
> > -	}
> 
> This check verifies if there are *any* CPUs for which AMUs can be used for
> FIE (!empty mask) -> enable static key.
> 
> > +	/* Disallow partial use of counters for frequency invariance */
> > +	if (!cpumask_equal(amu_fie_cpus, cpu_present_mask))
> > +		goto free_valid_mask;
> >  
> 
> The replacement verifies if *all* present CPUs support AMUs for FIE and
> only then it enables the static key.
> 
> > -	/*
> > -	 * If the system is not fully invariant after AMU init, disable
> > -	 * partial use of counters for frequency invariance.
> > -	 */
> > -	if (!topology_scale_freq_invariant())

I mis-read something here, as shared in the other thread, so yeah I
need to think again about this patch.

> > -		static_branch_disable(&amu_fie_key);
> > +	pr_info("CPUs[%*pbl]: counters will be used for FIE.",
> > +		cpumask_pr_args(amu_fie_cpus));
> > +
> > +	invariance_status = topology_scale_freq_invariant();
> > +	static_branch_enable(&amu_fie_key);
> >  
> >  	/*
> >  	 * Task scheduler behavior depends on frequency invariance support,
> > -- 
> > 2.25.0.rc1.19.g042ed3e048af
> > 

-- 
viresh

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: topology: Cleanup init_amu_fie() a bit
  2020-12-10  7:18 [PATCH] arm64: topology: Cleanup init_amu_fie() a bit Viresh Kumar
                   ` (2 preceding siblings ...)
  2020-12-10 10:38 ` Ionela Voinescu
@ 2020-12-10 13:22 ` Ionela Voinescu
  3 siblings, 0 replies; 13+ messages in thread
From: Ionela Voinescu @ 2020-12-10 13:22 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Catalin Marinas, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

Hey,

On Thursday 10 Dec 2020 at 12:48:20 (+0530), Viresh Kumar wrote:
> Every time I have stumbled upon this routine, I get confused with the
> way 'have_policy' is used and I have to dig in to understand why is it
> so.
> 
> Here is an attempt to make it easier to understand, and hopefully it is
> an improvement. This is based on the logic that amu_fie_cpus will be
> empty if cpufreq policy wasn't available for any CPU.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
> 
> Ionela, I think it would be even better to do this over this patch
> 
> -       /*
> -        * If none of the CPUs have cpufreq support, we only enable
> -        * the use of the AMU feature for FIE if all CPUs support AMU.
> -        * Otherwise, enable_policy_freq_counters has already enabled
> -        * policy cpus.
> -        */
> -       if (cpumask_empty(amu_fie_cpus) &&
> -           cpumask_equal(valid_cpus, cpu_present_mask))
> +       /* Overwrite amu_fie_cpus if all CPUs support AMU */
> +       if (cpumask_equal(valid_cpus, cpu_present_mask))
>                 cpumask_copy(amu_fie_cpus, cpu_present_mask);
> 

Yes, I was just about to suggest this, reading the patch below.

> This will also take care of the case where the cpufreq policy isn't
> there for a small group of CPUs, which do have AMUs enabled for them.
> (This doesn't normally happen though).
> 
> ---
>  arch/arm64/kernel/topology.c | 16 +++++++---------
>  1 file changed, 7 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> index f6faa697e83e..7f7d8de325b6 100644
> --- a/arch/arm64/kernel/topology.c
> +++ b/arch/arm64/kernel/topology.c
> @@ -199,14 +199,14 @@ static int freq_inv_set_max_ratio(int cpu, u64 max_rate, u64 ref_rate)
>  	return 0;
>  }
>  
> -static inline bool
> +static inline void
>  enable_policy_freq_counters(int cpu, cpumask_var_t valid_cpus)
>  {
>  	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
>  
>  	if (!policy) {
>  		pr_debug("CPU%d: No cpufreq policy found.\n", cpu);
> -		return false;
> +		return;
>  	}
>  
>  	if (cpumask_subset(policy->related_cpus, valid_cpus))
> @@ -214,8 +214,6 @@ enable_policy_freq_counters(int cpu, cpumask_var_t valid_cpus)
>  			   amu_fie_cpus);
>  
>  	cpufreq_cpu_put(policy);
> -
> -	return true;
>  }
>  
>  static DEFINE_STATIC_KEY_FALSE(amu_fie_key);
> @@ -225,7 +223,6 @@ static int __init init_amu_fie(void)
>  {
>  	bool invariance_status = topology_scale_freq_invariant();
>  	cpumask_var_t valid_cpus;
> -	bool have_policy = false;
>  	int ret = 0;
>  	int cpu;
>  
> @@ -245,17 +242,18 @@ static int __init init_amu_fie(void)
>  			continue;
>  
>  		cpumask_set_cpu(cpu, valid_cpus);
> -		have_policy |= enable_policy_freq_counters(cpu, valid_cpus);
> +		enable_policy_freq_counters(cpu, valid_cpus);
>  	}
>  
>  	/*
> -	 * If we are not restricted by cpufreq policies, we only enable
> +	 * If none of the CPUs have cpufreq support, we only enable
>  	 * the use of the AMU feature for FIE if all CPUs support AMU.
>  	 * Otherwise, enable_policy_freq_counters has already enabled
>  	 * policy cpus.
>  	 */
> -	if (!have_policy && cpumask_equal(valid_cpus, cpu_present_mask))
> -		cpumask_or(amu_fie_cpus, amu_fie_cpus, valid_cpus);
> +	if (cpumask_empty(amu_fie_cpus) &&
> +	    cpumask_equal(valid_cpus, cpu_present_mask))
> +		cpumask_copy(amu_fie_cpus, cpu_present_mask);
>  

Yes, if you really don't like the have_policy variable, I would go for
your suggestion in the commit message for this condition and the removal
of the comment. In the form of the comment here it creates more confusion,
but your suggestion in the commit message hides all involvement of
policies in enable_policy_freq_counters().

Thanks,
Ionela.


>  	if (!cpumask_empty(amu_fie_cpus)) {
>  		pr_info("CPUs[%*pbl]: counters will be used for FIE.",
> -- 
> 2.25.0.rc1.19.g042ed3e048af
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: topology: Cleanup init_amu_fie() a bit
  2020-12-10 12:34       ` Viresh Kumar
@ 2020-12-10 13:26         ` Ionela Voinescu
  0 siblings, 0 replies; 13+ messages in thread
From: Ionela Voinescu @ 2020-12-10 13:26 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Catalin Marinas, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

On Thursday 10 Dec 2020 at 18:04:39 (+0530), Viresh Kumar wrote:
> On 10-12-20, 11:29, Ionela Voinescu wrote:
> > On Thursday 10 Dec 2020 at 16:25:06 (+0530), Viresh Kumar wrote:
> > > - But right after that we call static_branch_disable() if we aren't
> > >   invariant (call to topology_scale_freq_invariant()), and this will
> > >   happen if amu_fie_cpus doesn't have all the CPUs there. Isn't it? So
> > >   partial amu support is already disallowed, without cpufreq.
> > > 
> > 
> > This is the point that needs clarification:
> > 
> > topology_scale_freq_invariant()) = cpufreq_supports_freq_invariance() ||
> >                                    arch_freq_counters_available(cpu_online_mask);
> > 
> > This checks if the full system is invariant.
> > 
> > The possible scenarios are:
> > 
> >  - All online CPUs support AMUs - arch_freq_counters_available() will
> >    return true -> topology_scale_freq_invariant() will return true.
> > 
> >  - None of the CPUs support AMUs, or part of the CPUs support AMUs - the
> >    system is invariant only if cpufreq is invariant (dependent on
> >    whether the driver implements the proper callbacks that results in
> >    calling arch_set_freq_scale() in cpufreq core.
> > 
> >  - Either cpufreq does not support invariance or we don't have AMU
> >    support on all CPUs -> the system is not invariant so we disable
> >    the AMU static key that guards the calls to
> >    topology_scale_freq_tick() - we would not want to set a scale factor
> >    for only a part of the CPUs.
> > 
> > So whether were are or are not invariant does not depend only on the AMU
> > presence, but also on the cpufreq support for invariance. We have to
> > disable invariance altogether (including the AMU guarding static key)
> > if the system is not invariant (it no all CPUs have means to provide the
> > scale).
> 
> Okay, I think I mis-assumed that amu_fie_cpus will get set by
> enable_policy_freq_counters() even for CPUs where AMU support isn't
> there, it won't be though.
> 
> Having said that, this patch, along with the minor suggestion in the
> commit log, still stands fine, right ? The other patch which I sent is
> probably incorrect due to the above assumption I had.
> 

Yes, this one is good, although I would vote for the commit message
implementation. I'll wait for v2 for reviewed-by, in case you want to
push something for the second patch in the same series.

Ionela.

> -- 
> viresh

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: topology: Cleanup init_amu_fie() a bit
  2020-12-10 10:38 ` Ionela Voinescu
  2020-12-10 10:55   ` Viresh Kumar
@ 2020-12-11 11:05   ` Viresh Kumar
  2020-12-14 16:14     ` Ionela Voinescu
  1 sibling, 1 reply; 13+ messages in thread
From: Viresh Kumar @ 2020-12-11 11:05 UTC (permalink / raw)
  To: Ionela Voinescu
  Cc: Catalin Marinas, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

On 10-12-20, 10:38, Ionela Voinescu wrote:
> Basically, that's functions purpose is only to make sure that invariance
> at the level of the policy is consistent: either all CPUs in a policy
> support counters and counters will be used for the scale factor, or
> either none or only some support counters and therefore the default
> cpufreq implementation will be used (arch_set_freq_scale()) for all CPUs
> in a policy.

Why is it important to have this thing at policy level ? If we are
okay with only one policy having AMUs and not the other one, then what
about only some CPUs of both the policies having it. Does it break
anything ?

-- 
viresh

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: topology: Cleanup init_amu_fie() a bit
  2020-12-11 11:05   ` Viresh Kumar
@ 2020-12-14 16:14     ` Ionela Voinescu
  0 siblings, 0 replies; 13+ messages in thread
From: Ionela Voinescu @ 2020-12-14 16:14 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Catalin Marinas, Will Deacon, Vincent Guittot, linux-arm-kernel,
	linux-kernel

Hi,

Sorry, I missed this.

On Friday 11 Dec 2020 at 16:35:55 (+0530), Viresh Kumar wrote:
> On 10-12-20, 10:38, Ionela Voinescu wrote:
> > Basically, that's functions purpose is only to make sure that invariance
> > at the level of the policy is consistent: either all CPUs in a policy
> > support counters and counters will be used for the scale factor, or
> > either none or only some support counters and therefore the default
> > cpufreq implementation will be used (arch_set_freq_scale()) for all CPUs
> > in a policy.
> 
> Why is it important to have this thing at policy level ? If we are
> okay with only one policy having AMUs and not the other one, then what
> about only some CPUs of both the policies having it. Does it break
> anything ?
> 

First of all, in order for a single CPU in a policy to use AMUs for FIE,
when the others do not support AMUs, we'd have to modify
arch_set_freq_scale() which otherwise would uniformly set its own
computed scale factor for all related CPUs.

Beyond this, it's very likely that CPUs in the same policy have the same
uarch and therefore either all or none in the policy would support AMUs.
The check here is just to make sure of that and then ensure that
arch_set_freq_scale() and arch_scale_freq_tick() do not clash with each
other.

Beyond this, it's difficult to say how important it is to have the same
scale for all CPUs in a policy. AMUs would give you a scale based on an
average frequency between ticks, and more importantly it would be based
on the actual core frequency, not on what software requested. Cpufreq
would give you the scale obtained using the frequency that the OS
requested. Therefore, the two scale values could end up being quite
different, and therefore can result in quite different utilization
values for CPUs doing the same work.

An alternative would be to apply the AMU computed scale to all CPUs
although not all CPUs in a policy might support AMUs. But given that
this scenario is unlikely, the added hassle in arch_scale_freq_tick()
which would involve getting information on related CPUs from policies
was not worth it, in my opionion.

Thanks,
Ionela.

> -- 
> viresh

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-12-14 16:16 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-10  7:18 [PATCH] arm64: topology: Cleanup init_amu_fie() a bit Viresh Kumar
2020-12-10  9:42 ` [PATCH] arm64: topology: Avoid the static_branch_{enable|disable} dance Viresh Kumar
2020-12-10 11:09   ` Ionela Voinescu
2020-12-10 12:35     ` Viresh Kumar
2020-12-10 10:34 ` [PATCH] arm64: topology: Cleanup init_amu_fie() a bit Viresh Kumar
2020-12-10 10:38 ` Ionela Voinescu
2020-12-10 10:55   ` Viresh Kumar
2020-12-10 11:29     ` Ionela Voinescu
2020-12-10 12:34       ` Viresh Kumar
2020-12-10 13:26         ` Ionela Voinescu
2020-12-11 11:05   ` Viresh Kumar
2020-12-14 16:14     ` Ionela Voinescu
2020-12-10 13:22 ` Ionela Voinescu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).