[v2,4/4] cpufreq: schedutil: Always call drvier if need_freq_update is set

Message ID 1905098.zDJocX6404@kreacher
State New, archived
Series
  • cpufreq: intel_pstate: Avoid missing HWP max limit updates with powersave governor

Commit Message

Rafael J. Wysocki Oct. 23, 2020, 3:36 p.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Because sugov_update_next_freq() may skip a frequency update even if
the need_freq_update flag has been set for the policy at hand, policy
limits updates may not take effect as expected.

For example, if the intel_pstate driver operates in the passive mode
with HWP enabled, it needs to update the HWP min and max limits when
the policy min and max limits change, respectively, but that may not
happen if the target frequency does not change along with the limit
at hand.  In particular, if the policy min is changed first, causing
the target frequency to be adjusted to it, and the policy max limit
is changed later to the same value, the HWP max limit will not be
updated to follow it as expected, because the target frequency is
still equal to the policy min limit and it will not change until
that limit is updated.

To address this issue, modify get_next_freq() to clear
need_freq_update only if the CPUFREQ_NEED_UPDATE_LIMITS flag is
not set for the cpufreq driver in use (and it should be set for all
potentially affected drivers) and make sugov_update_next_freq()
check need_freq_update and continue when it is set regardless of
whether or not the new target frequency is equal to the old one.

Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
Reported-by: Zhang Rui <rui.zhang@intel.com>
Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

New patch in v2.

---
 kernel/sched/cpufreq_schedutil.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Comments

Viresh Kumar Oct. 27, 2020, 4:25 a.m. UTC | #1
Spelling mistake in $subject (driver)

On 23-10-20, 17:36, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Because sugov_update_next_freq() may skip a frequency update even if
> the need_freq_update flag has been set for the policy at hand, policy
> limits updates may not take effect as expected.
> 
> For example, if the intel_pstate driver operates in the passive mode
> with HWP enabled, it needs to update the HWP min and max limits when
> the policy min and max limits change, respectively, but that may not
> happen if the target frequency does not change along with the limit
> at hand.  In particular, if the policy min is changed first, causing
> the target frequency to be adjusted to it, and the policy max limit
> is changed later to the same value, the HWP max limit will not be
> updated to follow it as expected, because the target frequency is
> still equal to the policy min limit and it will not change until
> that limit is updated.
> 
> To address this issue, modify get_next_freq() to clear
> need_freq_update only if the CPUFREQ_NEED_UPDATE_LIMITS flag is
> not set for the cpufreq driver in use (and it should be set for all
> potentially affected drivers) and make sugov_update_next_freq()
> check need_freq_update and continue when it is set regardless of
> whether or not the new target frequency is equal to the old one.
> 
> Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
> Reported-by: Zhang Rui <rui.zhang@intel.com>
> Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> New patch in v2.
> 
> ---
>  kernel/sched/cpufreq_schedutil.c |    8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> Index: linux-pm/kernel/sched/cpufreq_schedutil.c
> ===================================================================
> --- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
> +++ linux-pm/kernel/sched/cpufreq_schedutil.c
> @@ -102,11 +102,12 @@ static bool sugov_should_update_freq(str
>  static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
>  				   unsigned int next_freq)
>  {
> -	if (sg_policy->next_freq == next_freq)
> +	if (sg_policy->next_freq == next_freq && !sg_policy->need_freq_update)
>  		return false;
>  
>  	sg_policy->next_freq = next_freq;
>  	sg_policy->last_freq_update_time = time;
> +	sg_policy->need_freq_update = false;
>  
>  	return true;
>  }
> @@ -164,7 +165,10 @@ static unsigned int get_next_freq(struct
>  	if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
>  		return sg_policy->next_freq;
>  
> -	sg_policy->need_freq_update = false;
> +	if (sg_policy->need_freq_update)
> +		sg_policy->need_freq_update =
> +			cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
> +

The behavior here is a bit different from what we did in cpufreq.c. In cpufreq
core we are _always_ allowing the call to reach the driver's target() routine,
but here we do it only if limits have changed. Wonder if we should have similar
behavior here as well ?

Over that the code here can be rewritten a bit like:

	if (sg_policy->need_freq_update)
		sg_policy->need_freq_update = cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
	else if (freq == sg_policy->cached_raw_freq)
		return sg_policy->next_freq;
Zhang Rui Oct. 27, 2020, 8:47 a.m. UTC | #2
On Fri, 2020-10-23 at 17:36 +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Because sugov_update_next_freq() may skip a frequency update even if
> the need_freq_update flag has been set for the policy at hand, policy
> limits updates may not take effect as expected.
> 
> For example, if the intel_pstate driver operates in the passive mode
> with HWP enabled, it needs to update the HWP min and max limits when
> the policy min and max limits change, respectively, but that may not
> happen if the target frequency does not change along with the limit
> at hand.  In particular, if the policy min is changed first, causing
> the target frequency to be adjusted to it, and the policy max limit
> is changed later to the same value, the HWP max limit will not be
> updated to follow it as expected, because the target frequency is
> still equal to the policy min limit and it will not change until
> that limit is updated.
> 
> To address this issue, modify get_next_freq() to clear
> need_freq_update only if the CPUFREQ_NEED_UPDATE_LIMITS flag is
> not set for the cpufreq driver in use (and it should be set for all
> potentially affected drivers) and make sugov_update_next_freq()
> check need_freq_update and continue when it is set regardless of
> whether or not the new target frequency is equal to the old one.
> 
> Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
> Reported-by: Zhang Rui <rui.zhang@intel.com>
> Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

I have confirmed that the problem is gone with this patch series
applied.

Tested-by: Zhang Rui <rui.zhang@intel.com>

thanks,
rui

> ---
> 
> New patch in v2.
> 
> ---
>  kernel/sched/cpufreq_schedutil.c |    8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> Index: linux-pm/kernel/sched/cpufreq_schedutil.c
> ===================================================================
> --- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
> +++ linux-pm/kernel/sched/cpufreq_schedutil.c
> @@ -102,11 +102,12 @@ static bool sugov_should_update_freq(str
>  static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
>  				   unsigned int next_freq)
>  {
> -	if (sg_policy->next_freq == next_freq)
> +	if (sg_policy->next_freq == next_freq && !sg_policy->need_freq_update)
>  		return false;
>  
>  	sg_policy->next_freq = next_freq;
>  	sg_policy->last_freq_update_time = time;
> +	sg_policy->need_freq_update = false;
>  
>  	return true;
>  }
> @@ -164,7 +165,10 @@ static unsigned int get_next_freq(struct
>  	if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
>  		return sg_policy->next_freq;
>  
> -	sg_policy->need_freq_update = false;
> +	if (sg_policy->need_freq_update)
> +		sg_policy->need_freq_update =
> +			cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
> +
>  	sg_policy->cached_raw_freq = freq;
>  	return cpufreq_driver_resolve_freq(policy, freq);
>  }
Rafael J. Wysocki Oct. 27, 2020, 1:14 p.m. UTC | #3
On Tue, Oct 27, 2020 at 5:26 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> Spelling mistake in $subject (driver)
>
> On 23-10-20, 17:36, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Because sugov_update_next_freq() may skip a frequency update even if
> > the need_freq_update flag has been set for the policy at hand, policy
> > limits updates may not take effect as expected.
> >
> > For example, if the intel_pstate driver operates in the passive mode
> > with HWP enabled, it needs to update the HWP min and max limits when
> > the policy min and max limits change, respectively, but that may not
> > happen if the target frequency does not change along with the limit
> > at hand.  In particular, if the policy min is changed first, causing
> > the target frequency to be adjusted to it, and the policy max limit
> > is changed later to the same value, the HWP max limit will not be
> > updated to follow it as expected, because the target frequency is
> > still equal to the policy min limit and it will not change until
> > that limit is updated.
> >
> > To address this issue, modify get_next_freq() to clear
> > need_freq_update only if the CPUFREQ_NEED_UPDATE_LIMITS flag is
> > not set for the cpufreq driver in use (and it should be set for all
> > potentially affected drivers) and make sugov_update_next_freq()
> > check need_freq_update and continue when it is set regardless of
> > whether or not the new target frequency is equal to the old one.
> >
> > Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
> > Reported-by: Zhang Rui <rui.zhang@intel.com>
> > Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >
> > New patch in v2.
> >
> > ---
> >  kernel/sched/cpufreq_schedutil.c |    8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > Index: linux-pm/kernel/sched/cpufreq_schedutil.c
> > ===================================================================
> > --- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
> > +++ linux-pm/kernel/sched/cpufreq_schedutil.c
> > @@ -102,11 +102,12 @@ static bool sugov_should_update_freq(str
> >  static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
> >                                  unsigned int next_freq)
> >  {
> > -     if (sg_policy->next_freq == next_freq)
> > +     if (sg_policy->next_freq == next_freq && !sg_policy->need_freq_update)
> >               return false;
> >
> >       sg_policy->next_freq = next_freq;
> >       sg_policy->last_freq_update_time = time;
> > +     sg_policy->need_freq_update = false;
> >
> >       return true;
> >  }
> > @@ -164,7 +165,10 @@ static unsigned int get_next_freq(struct
> >       if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
> >               return sg_policy->next_freq;
> >
> > -     sg_policy->need_freq_update = false;
> > +     if (sg_policy->need_freq_update)
> > +             sg_policy->need_freq_update =
> > +                     cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
> > +
>
> The behavior here is a bit different from what we did in cpufreq.c. In cpufreq
> core we are _always_ allowing the call to reach the driver's target() routine,
> but here we do it only if limits have changed. Wonder if we should have similar
> behavior here as well ?

I didn't think about that, but now that you mentioned it, I think that
this is a good idea.

Will send an updated patch with that implemented shortly.

> Over that the code here can be rewritten a bit like:
>
>         if (sg_policy->need_freq_update)
>                 sg_policy->need_freq_update = cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
>         else if (freq == sg_policy->cached_raw_freq)
>                 return sg_policy->next_freq;

Right, but it will be somewhat different anyway. :-)

Patch

Index: linux-pm/kernel/sched/cpufreq_schedutil.c
===================================================================
--- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
+++ linux-pm/kernel/sched/cpufreq_schedutil.c
@@ -102,11 +102,12 @@  static bool sugov_should_update_freq(str
 static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
 				   unsigned int next_freq)
 {
-	if (sg_policy->next_freq == next_freq)
+	if (sg_policy->next_freq == next_freq && !sg_policy->need_freq_update)
 		return false;
 
 	sg_policy->next_freq = next_freq;
 	sg_policy->last_freq_update_time = time;
+	sg_policy->need_freq_update = false;
 
 	return true;
 }
@@ -164,7 +165,10 @@  static unsigned int get_next_freq(struct
 	if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
 		return sg_policy->next_freq;
 
-	sg_policy->need_freq_update = false;
+	if (sg_policy->need_freq_update)
+		sg_policy->need_freq_update =
+			cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
+
 	sg_policy->cached_raw_freq = freq;
 	return cpufreq_driver_resolve_freq(policy, freq);
 }