[v2,2/4] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode

Message ID 3212190.yEXfVNHMLB@kreacher
State New, archived
Series
  • cpufreq: intel_pstate: Avoid missing HWP max limit updates with powersave governor

Commit Message

Rafael J. Wysocki Oct. 23, 2020, 3:35 p.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

If the cpufreq policy max limit is changed when intel_pstate operates
in the passive mode with HWP enabled and the "powersave" governor is
used on top of it, the HWP max limit is not updated as appropriate.

Namely, in the "powersave" governor case, the target P-state is
always equal to the policy min limit.  If the latter does not change,
the "target_pstate != old_pstate" check in
intel_cpufreq_update_pstate() prevents intel_cpufreq_adjust_hwp()
from being invoked, so the HWP Request MSR, and with it the HWP max
limit, is not updated.
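
For illustration only, the pre-patch control flow can be modelled in
plain userspace C.  This is a sketch, not kernel code: struct
hwp_req_model, struct cpu_model and the model_*() helpers are
hypothetical stand-ins, and the register write is reduced to a counter.

```c
/* Hypothetical stand-in for the HWP Request MSR: only the fields needed here. */
struct hwp_req_model {
	int min_perf;
	int max_perf;
	int writes;		/* number of simulated register writes */
};

/* Hypothetical, heavily reduced stand-in for the per-CPU driver data. */
struct cpu_model {
	int current_pstate;	/* mirrors cpu->pstate.current_pstate */
	int policy_min;
	int policy_max;
	struct hwp_req_model hwp;
};

/* Simulated HWP adjustment: program the target and the current max limit. */
static void model_adjust_hwp(struct cpu_model *cpu, int target_pstate)
{
	cpu->hwp.min_perf = target_pstate;
	cpu->hwp.max_perf = cpu->policy_max;
	cpu->hwp.writes++;
}

/* Pre-patch flow: nothing is done unless the target P-state changes. */
static void model_update_pstate_old(struct cpu_model *cpu, int target_pstate)
{
	if (target_pstate != cpu->current_pstate) {
		cpu->current_pstate = target_pstate;
		model_adjust_hwp(cpu, target_pstate);
	}
}

/* The "powersave" governor always requests the policy min limit. */
static int model_powersave_target(const struct cpu_model *cpu)
{
	return cpu->policy_min;
}
```

In this model, with current_pstate and policy_min both at 8, lowering
policy_max from 40 to 20 and then calling
model_update_pstate_old(&cpu, model_powersave_target(&cpu)) leaves
hwp.max_perf at 40 and hwp.writes at 0, which is the stale limit
described above.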

Also, if the CPUFREQ_NEED_UPDATE_LIMITS flag is not set for the
driver and the target frequency does not change along with the
policy max limit, the "target_freq == policy->cur" check in
__cpufreq_driver_target() prevents the driver's ->target() callback
from being invoked at all, so the HWP max limit is not updated.
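
The effect of the core check can be sketched the same way.  The model
below is a userspace simplification under stated assumptions: struct
policy_model, struct driver_model, MODEL_NEED_UPDATE_LIMITS and
model_driver_target() are hypothetical names, and the driver callback
is represented by a call counter only.

```c
/* Hypothetical flag value standing in for CPUFREQ_NEED_UPDATE_LIMITS. */
#define MODEL_NEED_UPDATE_LIMITS 0x1u

struct policy_model {
	unsigned int cur;	/* current frequency, kHz */
	unsigned int min;	/* policy min limit, kHz */
	unsigned int max;	/* policy max limit, kHz */
};

struct driver_model {
	unsigned int flags;
	int target_calls;	/* how often the ->target() stand-in ran */
};

/* Clamp a frequency to the [lo, hi] range. */
static unsigned int model_clamp(unsigned int f, unsigned int lo, unsigned int hi)
{
	if (f < lo)
		return lo;
	if (f > hi)
		return hi;
	return f;
}

/*
 * Simplified model of the early return discussed above: when the
 * clamped target equals the current frequency, the driver callback is
 * skipped unless the driver asked to be told about limit updates.
 */
static int model_driver_target(struct driver_model *drv,
			       struct policy_model *policy,
			       unsigned int target_freq)
{
	target_freq = model_clamp(target_freq, policy->min, policy->max);

	if (target_freq == policy->cur &&
	    !(drv->flags & MODEL_NEED_UPDATE_LIMITS))
		return 0;

	drv->target_calls++;	/* stand-in for invoking ->target() */
	policy->cur = target_freq;
	return 0;
}
```

With cur == min == 800000 and the max limit lowered to 2000000, a
request for 800000 leaves target_calls at 0 when flags is 0, and
increments it when MODEL_NEED_UPDATE_LIMITS is set.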

To prevent that from occurring, set the CPUFREQ_NEED_UPDATE_LIMITS
flag in the intel_cpufreq driver structure if HWP is enabled, and
modify intel_cpufreq_update_pstate() to do the
"target_pstate != old_pstate" check only in the non-HWP case, letting
intel_cpufreq_adjust_hwp() always run in the HWP case.  The latter
updates HWP Request only if the new value of the register, including
the limits, differs from the cached one, so the register write is
still avoided if neither the target P-state nor the max limit changes.
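
A minimal userspace sketch of the post-patch flow follows.  It assumes
a cached copy of the register with the min field in bits 7:0 and the
max field in bits 15:8; struct cpu_model, the MODEL_HWP_*() macros and
the model_*() helpers are hypothetical, and register writes are
counted rather than performed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical field helpers: min in bits 7:0, max in bits 15:8. */
#define MODEL_HWP_MIN(x) ((uint64_t)((x) & 0xff))
#define MODEL_HWP_MAX(x) ((uint64_t)((x) & 0xff) << 8)

struct cpu_model {
	int current_pstate;
	int max_perf_ratio;		/* max limit derived from the policy */
	uint64_t hwp_req_cached;	/* last value "written" to the register */
	int hwp_writes;			/* simulated HWP Request writes */
	int perf_ctl_writes;		/* simulated non-HWP writes */
};

/* Rebuild the request from target and max; write only if it changed. */
static void model_adjust_hwp(struct cpu_model *cpu, int target_pstate)
{
	uint64_t prev = cpu->hwp_req_cached, value = prev;

	value &= ~(MODEL_HWP_MIN(0xff) | MODEL_HWP_MAX(0xff));
	value |= MODEL_HWP_MIN(target_pstate) |
		 MODEL_HWP_MAX(cpu->max_perf_ratio);

	if (value == prev)
		return;		/* nothing changed, skip the write */

	cpu->hwp_req_cached = value;
	cpu->hwp_writes++;
}

/* Post-patch flow: the P-state comparison guards the non-HWP path only. */
static void model_update_pstate_new(struct cpu_model *cpu, int target_pstate,
				    bool hwp_active)
{
	int old_pstate = cpu->current_pstate;

	if (hwp_active) {
		model_adjust_hwp(cpu, target_pstate);
		cpu->current_pstate = target_pstate;
	} else if (target_pstate != old_pstate) {
		cpu->perf_ctl_writes++;
		cpu->current_pstate = target_pstate;
	}
}
```

Starting from a cached request of min 8 and max 40, lowering
max_perf_ratio to 20 and calling model_update_pstate_new(&cpu, 8, true)
performs one simulated write carrying the new max.  Repeating the call
with nothing changed performs none.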

Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
Reported-by: Zhang Rui <rui.zhang@intel.com>
Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

The v2 contains only the intel_pstate changes (without the core changes)
and sets the new flag.

---
 drivers/cpufreq/intel_pstate.c |   13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

Comments

Viresh Kumar Oct. 27, 2020, 3:06 a.m. UTC | #1
On 23-10-20, 17:35, Rafael J. Wysocki wrote:
> [...]

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

Zhang Rui Oct. 27, 2020, 8:47 a.m. UTC | #2
On Fri, 2020-10-23 at 17:35 +0200, Rafael J. Wysocki wrote:
> [...]

I have confirmed that the problem is gone with this patch series
applied.
With the powersave governor, the HWP register is now updated after
changing the scaling_max_freq sysfs attribute.

Tested-by: Zhang Rui <rui.zhang@intel.com>

thanks,
rui

Patch

Index: linux-pm/drivers/cpufreq/intel_pstate.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/intel_pstate.c
+++ linux-pm/drivers/cpufreq/intel_pstate.c
@@ -2550,14 +2550,12 @@  static int intel_cpufreq_update_pstate(s
 	int old_pstate = cpu->pstate.current_pstate;
 
 	target_pstate = intel_pstate_prepare_request(cpu, target_pstate);
-	if (target_pstate != old_pstate) {
+	if (hwp_active) {
+		intel_cpufreq_adjust_hwp(cpu, target_pstate, fast_switch);
+		cpu->pstate.current_pstate = target_pstate;
+	} else if (target_pstate != old_pstate) {
+		intel_cpufreq_adjust_perf_ctl(cpu, target_pstate, fast_switch);
 		cpu->pstate.current_pstate = target_pstate;
-		if (hwp_active)
-			intel_cpufreq_adjust_hwp(cpu, target_pstate,
-						 fast_switch);
-		else
-			intel_cpufreq_adjust_perf_ctl(cpu, target_pstate,
-						      fast_switch);
 	}
 
 	intel_cpufreq_trace(cpu, fast_switch ? INTEL_PSTATE_TRACE_FAST_SWITCH :
@@ -3014,6 +3012,7 @@  static int __init intel_pstate_init(void
 			hwp_mode_bdw = id->driver_data;
 			intel_pstate.attr = hwp_cpufreq_attrs;
 			intel_cpufreq.attr = hwp_cpufreq_attrs;
+			intel_cpufreq.flags |= CPUFREQ_NEED_UPDATE_LIMITS;
 			if (!default_driver)
 				default_driver = &intel_pstate;