linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] cpufreq: intel_pstate: Avoid missing HWP max limit updates with powersave governor
@ 2020-10-22 11:55 Rafael J. Wysocki
  2020-10-22 11:57 ` [PATCH 1/2] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode Rafael J. Wysocki
  2020-10-22 11:57 ` [PATCH 2/2] cpufreq: Drop restore_freq from struct cpufreq_policy Rafael J. Wysocki
  0 siblings, 2 replies; 6+ messages in thread
From: Rafael J. Wysocki @ 2020-10-22 11:55 UTC (permalink / raw)
  To: Linux PM; +Cc: LKML, Viresh Kumar, Srinivas Pandruvada, Zhang Rui

Hi All,

There is a problem in intel_pstate that if it works in the passive mode with
HWP enabled and the "powersave" governor is used on top of it, then changing
the policy max frequency doesn't cause the HWP max limit to be updated which
is quite confusing.

That happens because of two checks, one in the cpufreq core and one in the
driver itself, that are there to avoid unnecessary HW/FW updates when the
current frequency doesn't change.  Of course, that is the case when the
policy max limit changes under the "powersave" governor (which sets the
current frequency to the policy min limit), but in that particular case,
the checks turn out to be harmful.  This is dealt with by the first patch.

The second one is an optimization that can be done right away on top of the
first one.

Thanks!




^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/2] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode
  2020-10-22 11:55 [PATCH 0/2] cpufreq: intel_pstate: Avoid missing HWP max limit updates with powersave governor Rafael J. Wysocki
@ 2020-10-22 11:57 ` Rafael J. Wysocki
  2020-10-23  6:10   ` Viresh Kumar
  2020-10-22 11:57 ` [PATCH 2/2] cpufreq: Drop restore_freq from struct cpufreq_policy Rafael J. Wysocki
  1 sibling, 1 reply; 6+ messages in thread
From: Rafael J. Wysocki @ 2020-10-22 11:57 UTC (permalink / raw)
  To: Linux PM; +Cc: LKML, Viresh Kumar, Srinivas Pandruvada, Zhang Rui

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

If the cpufreq policy max limit is changed when intel_pstate operates
in the passive mode with HWP enabled and the "powersave" governor is
used on top of it, the HWP max limit is not updated as appropriate.

Namely, in the "powersave" governor case, the target P-state
is always equal to the policy min limit, so if the latter does
not change, the "target_freq == policy->cur" check in
__cpufreq_driver_target() is "true" and the
"target_pstate != old_pstate" check in intel_cpufreq_update_pstate()
is "false", so intel_cpufreq_adjust_hwp() is not invoked to update
the HWP Request MSR and the HWP max limit is not updated as a result.

To prevent that from occurring, modify __cpufreq_driver_target()
to do the "target_freq == policy->cur" check only in the frequency
table case and change intel_cpufreq_update_pstate() to do the
"target_pstate != old_pstate" check only in the non-HWP case and
let intel_cpufreq_adjust_hwp() always run in the HWP case (it will
update HWP Request only if the cached value of the register is
different from the new one including the limits, so if neither the
target P-state value nor the max limit changes, the register write
will still be avoided).

Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
Reported-by: Zhang Rui <rui.zhang@intel.com>
Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq.c      |    6 +++---
 drivers/cpufreq/intel_pstate.c |   12 +++++-------
 2 files changed, 8 insertions(+), 10 deletions(-)

Index: linux-pm/drivers/cpufreq/intel_pstate.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/intel_pstate.c
+++ linux-pm/drivers/cpufreq/intel_pstate.c
@@ -2550,14 +2550,12 @@ static int intel_cpufreq_update_pstate(s
 	int old_pstate = cpu->pstate.current_pstate;
 
 	target_pstate = intel_pstate_prepare_request(cpu, target_pstate);
-	if (target_pstate != old_pstate) {
+	if (hwp_active) {
+		intel_cpufreq_adjust_hwp(cpu, target_pstate, fast_switch);
+		cpu->pstate.current_pstate = target_pstate;
+	} else if (target_pstate != old_pstate) {
+		intel_cpufreq_adjust_perf_ctl(cpu, target_pstate, fast_switch);
 		cpu->pstate.current_pstate = target_pstate;
-		if (hwp_active)
-			intel_cpufreq_adjust_hwp(cpu, target_pstate,
-						 fast_switch);
-		else
-			intel_cpufreq_adjust_perf_ctl(cpu, target_pstate,
-						      fast_switch);
 	}
 
 	intel_cpufreq_trace(cpu, fast_switch ? INTEL_PSTATE_TRACE_FAST_SWITCH :
Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -2182,6 +2182,9 @@ int __cpufreq_driver_target(struct cpufr
 	pr_debug("target for CPU %u: %u kHz, relation %u, requested %u kHz\n",
 		 policy->cpu, target_freq, relation, old_target_freq);
 
+	if (cpufreq_driver->target)
+		return cpufreq_driver->target(policy, target_freq, relation);
+
 	/*
 	 * This might look like a redundant call as we are checking it again
 	 * after finding index. But it is left intentionally for cases where
@@ -2194,9 +2197,6 @@ int __cpufreq_driver_target(struct cpufr
 	/* Save last value to restore later on errors */
 	policy->restore_freq = policy->cur;
 
-	if (cpufreq_driver->target)
-		return cpufreq_driver->target(policy, target_freq, relation);
-
 	if (!cpufreq_driver->target_index)
 		return -EINVAL;
 




^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 2/2] cpufreq: Drop restore_freq from struct cpufreq_policy
  2020-10-22 11:55 [PATCH 0/2] cpufreq: intel_pstate: Avoid missing HWP max limit updates with powersave governor Rafael J. Wysocki
  2020-10-22 11:57 ` [PATCH 1/2] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode Rafael J. Wysocki
@ 2020-10-22 11:57 ` Rafael J. Wysocki
  2020-10-23  5:39   ` Viresh Kumar
  1 sibling, 1 reply; 6+ messages in thread
From: Rafael J. Wysocki @ 2020-10-22 11:57 UTC (permalink / raw)
  To: Linux PM; +Cc: LKML, Viresh Kumar, Srinivas Pandruvada, Zhang Rui

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The restore_freq field in struct cpufreq_policy is only used by
__target_index() in one place and a local variable in that function
may as well be used instead of it, so drop it and modify
__target_index() accordingly.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq.c |   10 +++++-----
 include/linux/cpufreq.h   |    5 -----
 2 files changed, 5 insertions(+), 10 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -2112,7 +2112,7 @@ static int __target_intermediate(struct
 static int __target_index(struct cpufreq_policy *policy, int index)
 {
 	struct cpufreq_freqs freqs = {.old = policy->cur, .flags = 0};
-	unsigned int intermediate_freq = 0;
+	unsigned int restore_freq, intermediate_freq = 0;
 	unsigned int newfreq = policy->freq_table[index].frequency;
 	int retval = -EINVAL;
 	bool notify;
@@ -2120,6 +2120,9 @@ static int __target_index(struct cpufreq
 	if (newfreq == policy->cur)
 		return 0;
 
+	/* Save last value to restore later on errors */
+	restore_freq = policy->cur;
+
 	notify = !(cpufreq_driver->flags & CPUFREQ_ASYNC_NOTIFICATION);
 	if (notify) {
 		/* Handle switching to intermediate frequency */
@@ -2157,7 +2160,7 @@ static int __target_index(struct cpufreq
 		 */
 		if (unlikely(retval && intermediate_freq)) {
 			freqs.old = intermediate_freq;
-			freqs.new = policy->restore_freq;
+			freqs.new = restore_freq;
 			cpufreq_freq_transition_begin(policy, &freqs);
 			cpufreq_freq_transition_end(policy, &freqs, 0);
 		}
@@ -2194,9 +2197,6 @@ int __cpufreq_driver_target(struct cpufr
 	if (target_freq == policy->cur)
 		return 0;
 
-	/* Save last value to restore later on errors */
-	policy->restore_freq = policy->cur;
-
 	if (!cpufreq_driver->target_index)
 		return -EINVAL;
 
Index: linux-pm/include/linux/cpufreq.h
===================================================================
--- linux-pm.orig/include/linux/cpufreq.h
+++ linux-pm/include/linux/cpufreq.h
@@ -65,7 +65,6 @@ struct cpufreq_policy {
 	unsigned int		max;    /* in kHz */
 	unsigned int		cur;    /* in kHz, only needed if cpufreq
 					 * governors are used */
-	unsigned int		restore_freq; /* = policy->cur before transition */
 	unsigned int		suspend_freq; /* freq to set during suspend */
 
 	unsigned int		policy; /* see above */
@@ -308,10 +307,6 @@ struct cpufreq_driver {
 	/* define one out of two */
 	int		(*setpolicy)(struct cpufreq_policy *policy);
 
-	/*
-	 * On failure, should always restore frequency to policy->restore_freq
-	 * (i.e. old freq).
-	 */
 	int		(*target)(struct cpufreq_policy *policy,
 				  unsigned int target_freq,
 				  unsigned int relation);	/* Deprecated */




^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] cpufreq: Drop restore_freq from struct cpufreq_policy
  2020-10-22 11:57 ` [PATCH 2/2] cpufreq: Drop restore_freq from struct cpufreq_policy Rafael J. Wysocki
@ 2020-10-23  5:39   ` Viresh Kumar
  0 siblings, 0 replies; 6+ messages in thread
From: Viresh Kumar @ 2020-10-23  5:39 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM, LKML, Srinivas Pandruvada, Zhang Rui

On 22-10-20, 13:57, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> The restore_freq field in struct cpufreq_policy is only used by
> __target_index() in one place and a local variable in that function
> may as well be used instead of it, so drop it and modify
> __target_index() accordingly.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpufreq/cpufreq.c |   10 +++++-----
>  include/linux/cpufreq.h   |    5 -----
>  2 files changed, 5 insertions(+), 10 deletions(-)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode
  2020-10-22 11:57 ` [PATCH 1/2] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode Rafael J. Wysocki
@ 2020-10-23  6:10   ` Viresh Kumar
  2020-10-23 11:57     ` Rafael J. Wysocki
  0 siblings, 1 reply; 6+ messages in thread
From: Viresh Kumar @ 2020-10-23  6:10 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM, LKML, Srinivas Pandruvada, Zhang Rui

On 22-10-20, 13:57, Rafael J. Wysocki wrote:
> Index: linux-pm/drivers/cpufreq/cpufreq.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/cpufreq.c
> +++ linux-pm/drivers/cpufreq/cpufreq.c
> @@ -2182,6 +2182,9 @@ int __cpufreq_driver_target(struct cpufr
>  	pr_debug("target for CPU %u: %u kHz, relation %u, requested %u kHz\n",
>  		 policy->cpu, target_freq, relation, old_target_freq);
>  
> +	if (cpufreq_driver->target)
> +		return cpufreq_driver->target(policy, target_freq, relation);
> +
>  	/*
>  	 * This might look like a redundant call as we are checking it again
>  	 * after finding index. But it is left intentionally for cases where
> @@ -2194,9 +2197,6 @@ int __cpufreq_driver_target(struct cpufr
>  	/* Save last value to restore later on errors */
>  	policy->restore_freq = policy->cur;
>  
> -	if (cpufreq_driver->target)
> -		return cpufreq_driver->target(policy, target_freq, relation);
> -
>  	if (!cpufreq_driver->target_index)
>  		return -EINVAL;

From what I understood, you want your driver to get notified about
policy->min/max changes and right now they are making it work from
within the target() callback. Your commit log talks about policy->max
and powersave combination, I think the same will be true in case of
policy->min and performance ? And also with any other governor (like
schedutil) if the target_freq doesn't change for a while.

And IMHO, this change is more like a band-aid which is going to remove
the check of target != cur for all target() type drivers (which aren't
many) and it feels like a penalty on them (which is also there for
intel-cpufreq without hwp), and that we will get into the same problem
for target_index() drivers as well if they want to do something
similar in future, i.e. skip checking for same-freq.

Maybe adding a new flag for the cpufreq-driver for force-updates would
be a better solution ? Which will make this very much driver
dependent.

-- 
viresh

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode
  2020-10-23  6:10   ` Viresh Kumar
@ 2020-10-23 11:57     ` Rafael J. Wysocki
  0 siblings, 0 replies; 6+ messages in thread
From: Rafael J. Wysocki @ 2020-10-23 11:57 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Linux PM, LKML, Srinivas Pandruvada, Zhang Rui

On Fri, Oct 23, 2020 at 8:10 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 22-10-20, 13:57, Rafael J. Wysocki wrote:
> > Index: linux-pm/drivers/cpufreq/cpufreq.c
> > ===================================================================
> > --- linux-pm.orig/drivers/cpufreq/cpufreq.c
> > +++ linux-pm/drivers/cpufreq/cpufreq.c
> > @@ -2182,6 +2182,9 @@ int __cpufreq_driver_target(struct cpufr
> >       pr_debug("target for CPU %u: %u kHz, relation %u, requested %u kHz\n",
> >                policy->cpu, target_freq, relation, old_target_freq);
> >
> > +     if (cpufreq_driver->target)
> > +             return cpufreq_driver->target(policy, target_freq, relation);
> > +
> >       /*
> >        * This might look like a redundant call as we are checking it again
> >        * after finding index. But it is left intentionally for cases where
> > @@ -2194,9 +2197,6 @@ int __cpufreq_driver_target(struct cpufr
> >       /* Save last value to restore later on errors */
> >       policy->restore_freq = policy->cur;
> >
> > -     if (cpufreq_driver->target)
> > -             return cpufreq_driver->target(policy, target_freq, relation);
> > -
> >       if (!cpufreq_driver->target_index)
> >               return -EINVAL;
>
> From what I understood, you want your driver to get notified about
> policy->min/max changes and right now they are making it work from
> within the target() callback.

Not exactly.

The driver needs to update some internal upper and lower boundary
values along with the target freq and skipping the update when target
freq doesn't change prevents it from doing that.

The policy min and max changes are communicated to the driver via the
governor ->limits() callback and that can only call
__cpufreq_driver_target() then or defer the update to the next
->fast_switch() invocation.  Either way, the driver needs to have a
chance to carry out the full update even if the target frequency
doesn't change, but the policy min or max limits may have changed.

> Your commit log talks about policy->max and powersave combination,

Yes, it does, because that's a very clearly visible symptom.

> I think the same will be true in case of policy->min and performance ?

It might in theory, but it is not in practice, because the HWP min is
set to the target freq (and the target freq is already clamped between
the policy min and max).

But generally speaking you are right, this would be a problem for any
driver having to update some internal upper and lower boundary
settings along with the target freq.

> And also with any other governor (like schedutil) if the target_freq doesn't change for a while.

Well, yes.

A change of one of the limits that doesn't cause the target to change
may be missed in general.

> And IMHO, this change is more like a band-aid which is going to remove
> the check of target != cur for all target() type drivers (which aren't
> many) and it feels like a penalty on them (which is also there for
> intel-cpufreq without hwp), and that we will get into the same problem
> for target_index() drivers as well if they want to do something
> similar in future, i.e. skip checking for same-freq.
>
> Maybe adding a new flag for the cpufreq-driver for force-updates would
> be a better solution ? Which will make this very much driver
> dependent.

Fair enough, I'll add a driver flag for that.

Also I split the patch into the core part and the driver-specific part
for clarity.

Thanks!

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-10-23 11:58 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-22 11:55 [PATCH 0/2] cpufreq: intel_pstate: Avoid missing HWP max limit updates with powersave governor Rafael J. Wysocki
2020-10-22 11:57 ` [PATCH 1/2] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode Rafael J. Wysocki
2020-10-23  6:10   ` Viresh Kumar
2020-10-23 11:57     ` Rafael J. Wysocki
2020-10-22 11:57 ` [PATCH 2/2] cpufreq: Drop restore_freq from struct cpufreq_policy Rafael J. Wysocki
2020-10-23  5:39   ` Viresh Kumar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).