linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/4] cpufreq: intel_pstate: Avoid missing HWP max limit updates with powersave governor
@ 2020-10-23 15:21 Rafael J. Wysocki
  2020-10-23 15:35 ` [PATCH v2 1/4] cpufreq: Introduce CPUFREQ_NEED_UPDATE_LIMITS driver flag Rafael J. Wysocki
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Rafael J. Wysocki @ 2020-10-23 15:21 UTC (permalink / raw)
  To: Linux PM, Viresh Kumar; +Cc: LKML, Srinivas Pandruvada, Zhang Rui

Hi All,

There is a problem in intel_pstate that if it works in the passive mode with
HWP enabled, changing the policy max frequency may not cause the HWP max limit
to be updated accordingly which is quite confusing and may be incorrect.

That happens because of two checks, one in the cpufreq core and one in the
driver itself, that are there to avoid unnecessary HW/FW updates when the
current frequency doesn't change.  Of course, that is the case when the
policy max limit changes under the "powersave" governor (which sets the
current frequency to the policy min limit) and in that particular case,
the checks turn out to be harmful.

There is also an analogous problem in the schedutil governor that prevents
driver callbacks from being invoked if the target frequency doesn't change
and which also affects intel_pstate in the passive mode with HWP enabled
(see the changelog of patch [4/4]).

The v2 addresses some review comments from Viresh and adds patches [3-4/4] to
address the schedutil issue.

The cleanup posted as the [2/2] previously will be applied independently and
it is not included in the v2.

Thanks!




^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 1/4] cpufreq: Introduce CPUFREQ_NEED_UPDATE_LIMITS driver flag
  2020-10-23 15:21 [PATCH v2 0/4] cpufreq: intel_pstate: Avoid missing HWP max limit updates with powersave governor Rafael J. Wysocki
@ 2020-10-23 15:35 ` Rafael J. Wysocki
  2020-10-27  3:06   ` Viresh Kumar
  2020-10-23 15:35 ` [PATCH v2 2/4] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode Rafael J. Wysocki
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: Rafael J. Wysocki @ 2020-10-23 15:35 UTC (permalink / raw)
  To: Linux PM, Viresh Kumar; +Cc: LKML, Srinivas Pandruvada, Zhang Rui

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Generally, a cpufreq driver may need to update some internal upper
and lower frequency boundaries on policy max and min changes,
respectively, but currently this does not work if the target
frequency does not change along with the policy limit.

Namely, if the target frequency does not change along with the
policy min or max, the "target_freq == policy->cur" check in
__cpufreq_driver_target() prevents driver callbacks from being
invoked and they do not even have a chance to update the
corresponding internal boundary.

This particularly affects the "powersave" and "performance"
governors that always set the target frequency to one of the
policy limits and it never changes when the other limit is updated.

To allow cpufreq the drivers needing to update internal frequency
boundaries on policy limits changes to avoid this issue, introduce
a new driver flag, CPUFREQ_NEED_UPDATE_LIMITS, that (when set) will
neutralize the check mentioned above.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

New patch in v2.

---
 drivers/cpufreq/cpufreq.c |    3 ++-
 include/linux/cpufreq.h   |   10 +++++++++-
 2 files changed, 11 insertions(+), 2 deletions(-)

Index: linux-pm/include/linux/cpufreq.h
===================================================================
--- linux-pm.orig/include/linux/cpufreq.h
+++ linux-pm/include/linux/cpufreq.h
@@ -297,7 +297,7 @@ __ATTR(_name, 0644, show_##_name, store_
 
 struct cpufreq_driver {
 	char		name[CPUFREQ_NAME_LEN];
-	u8		flags;
+	u16		flags;
 	void		*driver_data;
 
 	/* needed by all drivers */
@@ -417,6 +417,14 @@ struct cpufreq_driver {
  */
 #define CPUFREQ_IS_COOLING_DEV			BIT(7)
 
+/*
+ * Set by drivers that need to update internale upper and lower boundaries along
+ * with the target frequency and so the core and governors should also invoke
+ * the diver if the target frequency does not change, but the policy min or max
+ * may have changed.
+ */
+#define CPUFREQ_NEED_UPDATE_LIMITS		BIT(8)
+
 int cpufreq_register_driver(struct cpufreq_driver *driver_data);
 int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);
 
Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -2191,7 +2191,8 @@ int __cpufreq_driver_target(struct cpufr
 	 * exactly same freq is called again and so we can save on few function
 	 * calls.
 	 */
-	if (target_freq == policy->cur)
+	if (target_freq == policy->cur &&
+	    !(cpufreq_driver->flags & CPUFREQ_NEED_UPDATE_LIMITS))
 		return 0;
 
 	if (cpufreq_driver->target)




^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 2/4] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode
  2020-10-23 15:21 [PATCH v2 0/4] cpufreq: intel_pstate: Avoid missing HWP max limit updates with powersave governor Rafael J. Wysocki
  2020-10-23 15:35 ` [PATCH v2 1/4] cpufreq: Introduce CPUFREQ_NEED_UPDATE_LIMITS driver flag Rafael J. Wysocki
@ 2020-10-23 15:35 ` Rafael J. Wysocki
  2020-10-27  3:06   ` Viresh Kumar
  2020-10-27  8:47   ` Zhang Rui
  2020-10-23 15:35 ` [PATCH v2 3/4] cpufreq: Introduce cpufreq_driver_test_flags() Rafael J. Wysocki
  2020-10-23 15:36 ` [PATCH v2 4/4] cpufreq: schedutil: Always call drvier if need_freq_update is set Rafael J. Wysocki
  3 siblings, 2 replies; 18+ messages in thread
From: Rafael J. Wysocki @ 2020-10-23 15:35 UTC (permalink / raw)
  To: Linux PM, Viresh Kumar; +Cc: LKML, Srinivas Pandruvada, Zhang Rui

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

If the cpufreq policy max limit is changed when intel_pstate operates
in the passive mode with HWP enabled and the "powersave" governor is
used on top of it, the HWP max limit is not updated as appropriate.

Namely, in the "powersave" governor case, the target P-state
is always equal to the policy min limit, so if the latter does
not change, intel_cpufreq_adjust_hwp() is not invoked to update
the HWP Request MSR due to the "target_pstate != old_pstate" check
in intel_cpufreq_update_pstate(), so the HWP max limit is not
updated as a result.

Also, if the CPUFREQ_NEED_UPDATE_LIMITS flag is not set for the
driver and the target frequency does not change along with the
policy max limit, the "target_freq == policy->cur" check in
__cpufreq_driver_target() prevents the driver's ->target() callback
from being invoked at all, so the HWP max limit is not updated.

To prevent that occurring, set the CPUFREQ_NEED_UPDATE_LIMITS flag
in the intel_cpufreq driver structure if HWP is enabled and modify
intel_cpufreq_update_pstate() to do the "target_pstate != old_pstate"
check only in the non-HWP case and let intel_cpufreq_adjust_hwp()
always run in the HWP case (it will update HWP Request only if the
cached value of the register is different from the new one including
the limits, so if neither the target P-state value nor the max limit
changes, the register write will still be avoided).

Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
Reported-by: Zhang Rui <rui.zhang@intel.com>
Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

The v2 is just the intel_pstate changes (without the core changes) and setting
the new flag.

---
 drivers/cpufreq/intel_pstate.c |   13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

Index: linux-pm/drivers/cpufreq/intel_pstate.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/intel_pstate.c
+++ linux-pm/drivers/cpufreq/intel_pstate.c
@@ -2550,14 +2550,12 @@ static int intel_cpufreq_update_pstate(s
 	int old_pstate = cpu->pstate.current_pstate;
 
 	target_pstate = intel_pstate_prepare_request(cpu, target_pstate);
-	if (target_pstate != old_pstate) {
+	if (hwp_active) {
+		intel_cpufreq_adjust_hwp(cpu, target_pstate, fast_switch);
+		cpu->pstate.current_pstate = target_pstate;
+	} else if (target_pstate != old_pstate) {
+		intel_cpufreq_adjust_perf_ctl(cpu, target_pstate, fast_switch);
 		cpu->pstate.current_pstate = target_pstate;
-		if (hwp_active)
-			intel_cpufreq_adjust_hwp(cpu, target_pstate,
-						 fast_switch);
-		else
-			intel_cpufreq_adjust_perf_ctl(cpu, target_pstate,
-						      fast_switch);
 	}
 
 	intel_cpufreq_trace(cpu, fast_switch ? INTEL_PSTATE_TRACE_FAST_SWITCH :
@@ -3014,6 +3012,7 @@ static int __init intel_pstate_init(void
 			hwp_mode_bdw = id->driver_data;
 			intel_pstate.attr = hwp_cpufreq_attrs;
 			intel_cpufreq.attr = hwp_cpufreq_attrs;
+			intel_cpufreq.flags |= CPUFREQ_NEED_UPDATE_LIMITS;
 			if (!default_driver)
 				default_driver = &intel_pstate;
 




^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 3/4] cpufreq: Introduce cpufreq_driver_test_flags()
  2020-10-23 15:21 [PATCH v2 0/4] cpufreq: intel_pstate: Avoid missing HWP max limit updates with powersave governor Rafael J. Wysocki
  2020-10-23 15:35 ` [PATCH v2 1/4] cpufreq: Introduce CPUFREQ_NEED_UPDATE_LIMITS driver flag Rafael J. Wysocki
  2020-10-23 15:35 ` [PATCH v2 2/4] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode Rafael J. Wysocki
@ 2020-10-23 15:35 ` Rafael J. Wysocki
  2020-10-23 15:36 ` [PATCH v2 4/4] cpufreq: schedutil: Always call drvier if need_freq_update is set Rafael J. Wysocki
  3 siblings, 0 replies; 18+ messages in thread
From: Rafael J. Wysocki @ 2020-10-23 15:35 UTC (permalink / raw)
  To: Linux PM, Viresh Kumar; +Cc: LKML, Srinivas Pandruvada, Zhang Rui

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Add a helper function to test the flags of the cpufreq driver in use
againt a given flags mask.

In particular, this will be needed to test the
CPUFREQ_NEED_UPDATE_LIMITS cpufreq driver flag in the schedutil
governor.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

New patch in v2.

---
 drivers/cpufreq/cpufreq.c |   12 ++++++++++++
 include/linux/cpufreq.h   |    1 +
 2 files changed, 13 insertions(+)

Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -1909,6 +1909,18 @@ void cpufreq_resume(void)
 }
 
 /**
+ * cpufreq_driver_test_flags - Test cpufreq driver's flags against given ones.
+ * @flags: Flags to test against the current cpufreq driver's flags.
+ *
+ * Assumes that the driver is there, so callers must ensure that this is the
+ * case.
+ */
+bool cpufreq_driver_test_flags(u16 flags)
+{
+	return !!(cpufreq_driver->flags & flags);
+}
+
+/**
  *	cpufreq_get_current_driver - return current driver's name
  *
  *	Return the name string of the currently loaded cpufreq driver
Index: linux-pm/include/linux/cpufreq.h
===================================================================
--- linux-pm.orig/include/linux/cpufreq.h
+++ linux-pm/include/linux/cpufreq.h
@@ -428,6 +428,7 @@ struct cpufreq_driver {
 int cpufreq_register_driver(struct cpufreq_driver *driver_data);
 int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);
 
+bool cpufreq_driver_test_flags(u16 flags);
 const char *cpufreq_get_current_driver(void);
 void *cpufreq_get_driver_data(void);
 




^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 4/4] cpufreq: schedutil: Always call drvier if need_freq_update is set
  2020-10-23 15:21 [PATCH v2 0/4] cpufreq: intel_pstate: Avoid missing HWP max limit updates with powersave governor Rafael J. Wysocki
                   ` (2 preceding siblings ...)
  2020-10-23 15:35 ` [PATCH v2 3/4] cpufreq: Introduce cpufreq_driver_test_flags() Rafael J. Wysocki
@ 2020-10-23 15:36 ` Rafael J. Wysocki
  2020-10-27  4:25   ` Viresh Kumar
                     ` (3 more replies)
  3 siblings, 4 replies; 18+ messages in thread
From: Rafael J. Wysocki @ 2020-10-23 15:36 UTC (permalink / raw)
  To: Linux PM, Viresh Kumar; +Cc: LKML, Srinivas Pandruvada, Zhang Rui

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Because sugov_update_next_freq() may skip a frequency update even if
the need_freq_update flag has been set for the policy at hand, policy
limits updates may not take effect as expected.

For example, if the intel_pstate driver operates in the passive mode
with HWP enabled, it needs to update the HWP min and max limits when
the policy min and max limits change, respectively, but that may not
happen if the target frequency does not change along with the limit
at hand.  In particular, if the policy min is changed first, causing
the target frequency to be adjusted to it, and the policy max limit
is changed later to the same value, the HWP max limit will not be
updated to follow it as expected, because the target frequency is
still equal to the policy min limit and it will not change until
that limit is updated.

To address this issue, modify get_next_freq() to clear
need_freq_update only if the CPUFREQ_NEED_UPDATE_LIMITS flag is
not set for the cpufreq driver in use (and it should be set for all
potentially affected drivers) and make sugov_update_next_freq()
check need_freq_update and continue when it is set regardless of
whether or not the new target frequency is equal to the old one.

Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
Reported-by: Zhang Rui <rui.zhang@intel.com>
Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

New patch in v2.

---
 kernel/sched/cpufreq_schedutil.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Index: linux-pm/kernel/sched/cpufreq_schedutil.c
===================================================================
--- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
+++ linux-pm/kernel/sched/cpufreq_schedutil.c
@@ -102,11 +102,12 @@ static bool sugov_should_update_freq(str
 static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
 				   unsigned int next_freq)
 {
-	if (sg_policy->next_freq == next_freq)
+	if (sg_policy->next_freq == next_freq && !sg_policy->need_freq_update)
 		return false;
 
 	sg_policy->next_freq = next_freq;
 	sg_policy->last_freq_update_time = time;
+	sg_policy->need_freq_update = false;
 
 	return true;
 }
@@ -164,7 +165,10 @@ static unsigned int get_next_freq(struct
 	if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
 		return sg_policy->next_freq;
 
-	sg_policy->need_freq_update = false;
+	if (sg_policy->need_freq_update)
+		sg_policy->need_freq_update =
+			cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
+
 	sg_policy->cached_raw_freq = freq;
 	return cpufreq_driver_resolve_freq(policy, freq);
 }




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/4] cpufreq: Introduce CPUFREQ_NEED_UPDATE_LIMITS driver flag
  2020-10-23 15:35 ` [PATCH v2 1/4] cpufreq: Introduce CPUFREQ_NEED_UPDATE_LIMITS driver flag Rafael J. Wysocki
@ 2020-10-27  3:06   ` Viresh Kumar
  0 siblings, 0 replies; 18+ messages in thread
From: Viresh Kumar @ 2020-10-27  3:06 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM, LKML, Srinivas Pandruvada, Zhang Rui

On 23-10-20, 17:35, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Generally, a cpufreq driver may need to update some internal upper
> and lower frequency boundaries on policy max and min changes,
> respectively, but currently this does not work if the target
> frequency does not change along with the policy limit.
> 
> Namely, if the target frequency does not change along with the
> policy min or max, the "target_freq == policy->cur" check in
> __cpufreq_driver_target() prevents driver callbacks from being
> invoked and they do not even have a chance to update the
> corresponding internal boundary.
> 
> This particularly affects the "powersave" and "performance"
> governors that always set the target frequency to one of the
> policy limits and it never changes when the other limit is updated.
> 
> To allow cpufreq the drivers needing to update internal frequency
> boundaries on policy limits changes to avoid this issue, introduce
> a new driver flag, CPUFREQ_NEED_UPDATE_LIMITS, that (when set) will
> neutralize the check mentioned above.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> New patch in v2.
> 
> ---
>  drivers/cpufreq/cpufreq.c |    3 ++-
>  include/linux/cpufreq.h   |   10 +++++++++-
>  2 files changed, 11 insertions(+), 2 deletions(-)
> 
> Index: linux-pm/include/linux/cpufreq.h
> ===================================================================
> --- linux-pm.orig/include/linux/cpufreq.h
> +++ linux-pm/include/linux/cpufreq.h
> @@ -297,7 +297,7 @@ __ATTR(_name, 0644, show_##_name, store_
>  
>  struct cpufreq_driver {
>  	char		name[CPUFREQ_NAME_LEN];
> -	u8		flags;
> +	u16		flags;
>  	void		*driver_data;
>  
>  	/* needed by all drivers */
> @@ -417,6 +417,14 @@ struct cpufreq_driver {
>   */
>  #define CPUFREQ_IS_COOLING_DEV			BIT(7)
>  
> +/*
> + * Set by drivers that need to update internale upper and lower boundaries along
> + * with the target frequency and so the core and governors should also invoke
> + * the diver if the target frequency does not change, but the policy min or max
> + * may have changed.
> + */
> +#define CPUFREQ_NEED_UPDATE_LIMITS		BIT(8)
> +
>  int cpufreq_register_driver(struct cpufreq_driver *driver_data);
>  int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);
>  
> Index: linux-pm/drivers/cpufreq/cpufreq.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/cpufreq.c
> +++ linux-pm/drivers/cpufreq/cpufreq.c
> @@ -2191,7 +2191,8 @@ int __cpufreq_driver_target(struct cpufr
>  	 * exactly same freq is called again and so we can save on few function
>  	 * calls.
>  	 */
> -	if (target_freq == policy->cur)
> +	if (target_freq == policy->cur &&
> +	    !(cpufreq_driver->flags & CPUFREQ_NEED_UPDATE_LIMITS))

I was wondering if the same change should be made in the target_index part as we
do this kind of check again ? But then I thought that since we know there are no
users of that right now, why bother :)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/4] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode
  2020-10-23 15:35 ` [PATCH v2 2/4] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode Rafael J. Wysocki
@ 2020-10-27  3:06   ` Viresh Kumar
  2020-10-27  8:47   ` Zhang Rui
  1 sibling, 0 replies; 18+ messages in thread
From: Viresh Kumar @ 2020-10-27  3:06 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM, LKML, Srinivas Pandruvada, Zhang Rui

On 23-10-20, 17:35, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> If the cpufreq policy max limit is changed when intel_pstate operates
> in the passive mode with HWP enabled and the "powersave" governor is
> used on top of it, the HWP max limit is not updated as appropriate.
> 
> Namely, in the "powersave" governor case, the target P-state
> is always equal to the policy min limit, so if the latter does
> not change, intel_cpufreq_adjust_hwp() is not invoked to update
> the HWP Request MSR due to the "target_pstate != old_pstate" check
> in intel_cpufreq_update_pstate(), so the HWP max limit is not
> updated as a result.
> 
> Also, if the CPUFREQ_NEED_UPDATE_LIMITS flag is not set for the
> driver and the target frequency does not change along with the
> policy max limit, the "target_freq == policy->cur" check in
> __cpufreq_driver_target() prevents the driver's ->target() callback
> from being invoked at all, so the HWP max limit is not updated.
> 
> To prevent that occurring, set the CPUFREQ_NEED_UPDATE_LIMITS flag
> in the intel_cpufreq driver structure if HWP is enabled and modify
> intel_cpufreq_update_pstate() to do the "target_pstate != old_pstate"
> check only in the non-HWP case and let intel_cpufreq_adjust_hwp()
> always run in the HWP case (it will update HWP Request only if the
> cached value of the register is different from the new one including
> the limits, so if neither the target P-state value nor the max limit
> changes, the register write will still be avoided).
> 
> Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
> Reported-by: Zhang Rui <rui.zhang@intel.com>
> Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> The v2 is just the intel_pstate changes (without the core changes) and setting
> the new flag.
> 
> ---
>  drivers/cpufreq/intel_pstate.c |   13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> Index: linux-pm/drivers/cpufreq/intel_pstate.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> +++ linux-pm/drivers/cpufreq/intel_pstate.c
> @@ -2550,14 +2550,12 @@ static int intel_cpufreq_update_pstate(s
>  	int old_pstate = cpu->pstate.current_pstate;
>  
>  	target_pstate = intel_pstate_prepare_request(cpu, target_pstate);
> -	if (target_pstate != old_pstate) {
> +	if (hwp_active) {
> +		intel_cpufreq_adjust_hwp(cpu, target_pstate, fast_switch);
> +		cpu->pstate.current_pstate = target_pstate;
> +	} else if (target_pstate != old_pstate) {
> +		intel_cpufreq_adjust_perf_ctl(cpu, target_pstate, fast_switch);
>  		cpu->pstate.current_pstate = target_pstate;
> -		if (hwp_active)
> -			intel_cpufreq_adjust_hwp(cpu, target_pstate,
> -						 fast_switch);
> -		else
> -			intel_cpufreq_adjust_perf_ctl(cpu, target_pstate,
> -						      fast_switch);
>  	}
>  
>  	intel_cpufreq_trace(cpu, fast_switch ? INTEL_PSTATE_TRACE_FAST_SWITCH :
> @@ -3014,6 +3012,7 @@ static int __init intel_pstate_init(void
>  			hwp_mode_bdw = id->driver_data;
>  			intel_pstate.attr = hwp_cpufreq_attrs;
>  			intel_cpufreq.attr = hwp_cpufreq_attrs;
> +			intel_cpufreq.flags |= CPUFREQ_NEED_UPDATE_LIMITS;
>  			if (!default_driver)
>  				default_driver = &intel_pstate;

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] cpufreq: schedutil: Always call drvier if need_freq_update is set
  2020-10-23 15:36 ` [PATCH v2 4/4] cpufreq: schedutil: Always call drvier if need_freq_update is set Rafael J. Wysocki
@ 2020-10-27  4:25   ` Viresh Kumar
  2020-10-27 13:14     ` Rafael J. Wysocki
  2020-10-27  8:47   ` Zhang Rui
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: Viresh Kumar @ 2020-10-27  4:25 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM, LKML, Srinivas Pandruvada, Zhang Rui

Spelling mistake in $subject (driver)

On 23-10-20, 17:36, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Because sugov_update_next_freq() may skip a frequency update even if
> the need_freq_update flag has been set for the policy at hand, policy
> limits updates may not take effect as expected.
> 
> For example, if the intel_pstate driver operates in the passive mode
> with HWP enabled, it needs to update the HWP min and max limits when
> the policy min and max limits change, respectively, but that may not
> happen if the target frequency does not change along with the limit
> at hand.  In particular, if the policy min is changed first, causing
> the target frequency to be adjusted to it, and the policy max limit
> is changed later to the same value, the HWP max limit will not be
> updated to follow it as expected, because the target frequency is
> still equal to the policy min limit and it will not change until
> that limit is updated.
> 
> To address this issue, modify get_next_freq() to clear
> need_freq_update only if the CPUFREQ_NEED_UPDATE_LIMITS flag is
> not set for the cpufreq driver in use (and it should be set for all
> potentially affected drivers) and make sugov_update_next_freq()
> check need_freq_update and continue when it is set regardless of
> whether or not the new target frequency is equal to the old one.
> 
> Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
> Reported-by: Zhang Rui <rui.zhang@intel.com>
> Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> New patch in v2.
> 
> ---
>  kernel/sched/cpufreq_schedutil.c |    8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> Index: linux-pm/kernel/sched/cpufreq_schedutil.c
> ===================================================================
> --- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
> +++ linux-pm/kernel/sched/cpufreq_schedutil.c
> @@ -102,11 +102,12 @@ static bool sugov_should_update_freq(str
>  static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
>  				   unsigned int next_freq)
>  {
> -	if (sg_policy->next_freq == next_freq)
> +	if (sg_policy->next_freq == next_freq && !sg_policy->need_freq_update)
>  		return false;
>  
>  	sg_policy->next_freq = next_freq;
>  	sg_policy->last_freq_update_time = time;
> +	sg_policy->need_freq_update = false;
>  
>  	return true;
>  }
> @@ -164,7 +165,10 @@ static unsigned int get_next_freq(struct
>  	if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
>  		return sg_policy->next_freq;
>  
> -	sg_policy->need_freq_update = false;
> +	if (sg_policy->need_freq_update)
> +		sg_policy->need_freq_update =
> +			cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
> +

The behavior here is a bit different from what we did in cpufreq.c. In cpufreq
core we are _always_ allowing the call to reach the driver's target() routine,
but here we do it only if limits have changed. Wonder if we should have similar
behavior here as well ?

Over that the code here can be rewritten a bit like:

	if (sg_policy->need_freq_update)
                sg_policy->need_freq_update = cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
        else if (freq == sg_policy->cached_raw_freq)
		return sg_policy->next_freq;

-- 
viresh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/4] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode
  2020-10-23 15:35 ` [PATCH v2 2/4] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode Rafael J. Wysocki
  2020-10-27  3:06   ` Viresh Kumar
@ 2020-10-27  8:47   ` Zhang Rui
  1 sibling, 0 replies; 18+ messages in thread
From: Zhang Rui @ 2020-10-27  8:47 UTC (permalink / raw)
  To: Rafael J. Wysocki, Linux PM, Viresh Kumar; +Cc: LKML, Srinivas Pandruvada

On Fri, 2020-10-23 at 17:35 +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> If the cpufreq policy max limit is changed when intel_pstate operates
> in the passive mode with HWP enabled and the "powersave" governor is
> used on top of it, the HWP max limit is not updated as appropriate.
> 
> Namely, in the "powersave" governor case, the target P-state
> is always equal to the policy min limit, so if the latter does
> not change, intel_cpufreq_adjust_hwp() is not invoked to update
> the HWP Request MSR due to the "target_pstate != old_pstate" check
> in intel_cpufreq_update_pstate(), so the HWP max limit is not
> updated as a result.
> 
> Also, if the CPUFREQ_NEED_UPDATE_LIMITS flag is not set for the
> driver and the target frequency does not change along with the
> policy max limit, the "target_freq == policy->cur" check in
> __cpufreq_driver_target() prevents the driver's ->target() callback
> from being invoked at all, so the HWP max limit is not updated.
> 
> To prevent that occurring, set the CPUFREQ_NEED_UPDATE_LIMITS flag
> in the intel_cpufreq driver structure if HWP is enabled and modify
> intel_cpufreq_update_pstate() to do the "target_pstate != old_pstate"
> check only in the non-HWP case and let intel_cpufreq_adjust_hwp()
> always run in the HWP case (it will update HWP Request only if the
> cached value of the register is different from the new one including
> the limits, so if neither the target P-state value nor the max limit
> changes, the register write will still be avoided).
> 
> Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode
> with HWP enabled")
> Reported-by: Zhang Rui <rui.zhang@intel.com>
> Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

I have confirmed that the problem is gone with this patch series
applied.
The HWP register is updated after changing the scaling_max_freq sysfs
attribute, with powersave governor.

Tested-by: Zhang Rui <rui.zhang@intel.com>

thanks,
rui
> ---
> 
> The v2 is just the intel_pstate changes (without the core changes)
> and setting
> the new flag.
> 
> ---
>  drivers/cpufreq/intel_pstate.c |   13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> Index: linux-pm/drivers/cpufreq/intel_pstate.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> +++ linux-pm/drivers/cpufreq/intel_pstate.c
> @@ -2550,14 +2550,12 @@ static int intel_cpufreq_update_pstate(s
>  	int old_pstate = cpu->pstate.current_pstate;
>  
>  	target_pstate = intel_pstate_prepare_request(cpu,
> target_pstate);
> -	if (target_pstate != old_pstate) {
> +	if (hwp_active) {
> +		intel_cpufreq_adjust_hwp(cpu, target_pstate,
> fast_switch);
> +		cpu->pstate.current_pstate = target_pstate;
> +	} else if (target_pstate != old_pstate) {
> +		intel_cpufreq_adjust_perf_ctl(cpu, target_pstate,
> fast_switch);
>  		cpu->pstate.current_pstate = target_pstate;
> -		if (hwp_active)
> -			intel_cpufreq_adjust_hwp(cpu, target_pstate,
> -						 fast_switch);
> -		else
> -			intel_cpufreq_adjust_perf_ctl(cpu,
> target_pstate,
> -						      fast_switch);
>  	}
>  
>  	intel_cpufreq_trace(cpu, fast_switch ?
> INTEL_PSTATE_TRACE_FAST_SWITCH :
> @@ -3014,6 +3012,7 @@ static int __init intel_pstate_init(void
>  			hwp_mode_bdw = id->driver_data;
>  			intel_pstate.attr = hwp_cpufreq_attrs;
>  			intel_cpufreq.attr = hwp_cpufreq_attrs;
> +			intel_cpufreq.flags |=
> CPUFREQ_NEED_UPDATE_LIMITS;
>  			if (!default_driver)
>  				default_driver = &intel_pstate;
>  
> 
> 
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] cpufreq: schedutil: Always call drvier if need_freq_update is set
  2020-10-23 15:36 ` [PATCH v2 4/4] cpufreq: schedutil: Always call drvier if need_freq_update is set Rafael J. Wysocki
  2020-10-27  4:25   ` Viresh Kumar
@ 2020-10-27  8:47   ` Zhang Rui
  2020-10-27 15:35   ` [PATCH v2.1 4/4] cpufreq: schedutil: Always call driver " Rafael J. Wysocki
  2020-10-29 11:12   ` [PATCH v2.2 4/4] cpufreq: schedutil: Always call driver if CPUFREQ_NEED_UPDATE_LIMITS " Rafael J. Wysocki
  3 siblings, 0 replies; 18+ messages in thread
From: Zhang Rui @ 2020-10-27  8:47 UTC (permalink / raw)
  To: Rafael J. Wysocki, Linux PM, Viresh Kumar; +Cc: LKML, Srinivas Pandruvada

On Fri, 2020-10-23 at 17:36 +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Because sugov_update_next_freq() may skip a frequency update even if
> the need_freq_update flag has been set for the policy at hand, policy
> limits updates may not take effect as expected.
> 
> For example, if the intel_pstate driver operates in the passive mode
> with HWP enabled, it needs to update the HWP min and max limits when
> the policy min and max limits change, respectively, but that may not
> happen if the target frequency does not change along with the limit
> at hand.  In particular, if the policy min is changed first, causing
> the target frequency to be adjusted to it, and the policy max limit
> is changed later to the same value, the HWP max limit will not be
> updated to follow it as expected, because the target frequency is
> still equal to the policy min limit and it will not change until
> that limit is updated.
> 
> To address this issue, modify get_next_freq() to clear
> need_freq_update only if the CPUFREQ_NEED_UPDATE_LIMITS flag is
> not set for the cpufreq driver in use (and it should be set for all
> potentially affected drivers) and make sugov_update_next_freq()
> check need_freq_update and continue when it is set regardless of
> whether or not the new target frequency is equal to the old one.
> 
> Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode
> with HWP enabled")
> Reported-by: Zhang Rui <rui.zhang@intel.com>
> Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

I have confirmed that the problem is gone with this patch series
applied.

Tested-by: Zhang Rui <rui.zhang@intel.com>

thanks,
rui

> ---
> 
> New patch in v2.
> 
> ---
>  kernel/sched/cpufreq_schedutil.c |    8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> Index: linux-pm/kernel/sched/cpufreq_schedutil.c
> ===================================================================
> --- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
> +++ linux-pm/kernel/sched/cpufreq_schedutil.c
> @@ -102,11 +102,12 @@ static bool sugov_should_update_freq(str
>  static bool sugov_update_next_freq(struct sugov_policy *sg_policy,
> u64 time,
>  				   unsigned int next_freq)
>  {
> -	if (sg_policy->next_freq == next_freq)
> +	if (sg_policy->next_freq == next_freq && !sg_policy-
> >need_freq_update)
>  		return false;
>  
>  	sg_policy->next_freq = next_freq;
>  	sg_policy->last_freq_update_time = time;
> +	sg_policy->need_freq_update = false;
>  
>  	return true;
>  }
> @@ -164,7 +165,10 @@ static unsigned int get_next_freq(struct
>  	if (freq == sg_policy->cached_raw_freq && !sg_policy-
> >need_freq_update)
>  		return sg_policy->next_freq;
>  
> -	sg_policy->need_freq_update = false;
> +	if (sg_policy->need_freq_update)
> +		sg_policy->need_freq_update =
> +			cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_L
> IMITS);
> +
>  	sg_policy->cached_raw_freq = freq;
>  	return cpufreq_driver_resolve_freq(policy, freq);
>  }
> 
> 
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] cpufreq: schedutil: Always call drvier if need_freq_update is set
  2020-10-27  4:25   ` Viresh Kumar
@ 2020-10-27 13:14     ` Rafael J. Wysocki
  0 siblings, 0 replies; 18+ messages in thread
From: Rafael J. Wysocki @ 2020-10-27 13:14 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Linux PM, LKML, Srinivas Pandruvada, Zhang Rui

On Tue, Oct 27, 2020 at 5:26 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> Spelling mistake in $subject (driver)
>
> On 23-10-20, 17:36, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Because sugov_update_next_freq() may skip a frequency update even if
> > the need_freq_update flag has been set for the policy at hand, policy
> > limits updates may not take effect as expected.
> >
> > For example, if the intel_pstate driver operates in the passive mode
> > with HWP enabled, it needs to update the HWP min and max limits when
> > the policy min and max limits change, respectively, but that may not
> > happen if the target frequency does not change along with the limit
> > at hand.  In particular, if the policy min is changed first, causing
> > the target frequency to be adjusted to it, and the policy max limit
> > is changed later to the same value, the HWP max limit will not be
> > updated to follow it as expected, because the target frequency is
> > still equal to the policy min limit and it will not change until
> > that limit is updated.
> >
> > To address this issue, modify get_next_freq() to clear
> > need_freq_update only if the CPUFREQ_NEED_UPDATE_LIMITS flag is
> > not set for the cpufreq driver in use (and it should be set for all
> > potentially affected drivers) and make sugov_update_next_freq()
> > check need_freq_update and continue when it is set regardless of
> > whether or not the new target frequency is equal to the old one.
> >
> > Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
> > Reported-by: Zhang Rui <rui.zhang@intel.com>
> > Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >
> > New patch in v2.
> >
> > ---
> >  kernel/sched/cpufreq_schedutil.c |    8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > Index: linux-pm/kernel/sched/cpufreq_schedutil.c
> > ===================================================================
> > --- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
> > +++ linux-pm/kernel/sched/cpufreq_schedutil.c
> > @@ -102,11 +102,12 @@ static bool sugov_should_update_freq(str
> >  static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
> >                                  unsigned int next_freq)
> >  {
> > -     if (sg_policy->next_freq == next_freq)
> > +     if (sg_policy->next_freq == next_freq && !sg_policy->need_freq_update)
> >               return false;
> >
> >       sg_policy->next_freq = next_freq;
> >       sg_policy->last_freq_update_time = time;
> > +     sg_policy->need_freq_update = false;
> >
> >       return true;
> >  }
> > @@ -164,7 +165,10 @@ static unsigned int get_next_freq(struct
> >       if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
> >               return sg_policy->next_freq;
> >
> > -     sg_policy->need_freq_update = false;
> > +     if (sg_policy->need_freq_update)
> > +             sg_policy->need_freq_update =
> > +                     cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
> > +
>
> The behavior here is a bit different from what we did in cpufreq.c. In cpufreq
> core we are _always_ allowing the call to reach the driver's target() routine,
> but here we do it only if limits have changed. Wonder if we should have similar
> behavior here as well ?

I didn't think about that, but now that you mentioned it, I think that
this is a good idea.

Will send an updated patch with that implemented shortly.

> Over that the code here can be rewritten a bit like:
>
>         if (sg_policy->need_freq_update)
>                 sg_policy->need_freq_update = cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
>         else if (freq == sg_policy->cached_raw_freq)
>                 return sg_policy->next_freq;

Right, but it will be somewhat different anyway. :-)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2.1 4/4] cpufreq: schedutil: Always call driver if need_freq_update is set
  2020-10-23 15:36 ` [PATCH v2 4/4] cpufreq: schedutil: Always call drvier if need_freq_update is set Rafael J. Wysocki
  2020-10-27  4:25   ` Viresh Kumar
  2020-10-27  8:47   ` Zhang Rui
@ 2020-10-27 15:35   ` Rafael J. Wysocki
  2020-10-28  3:57     ` Viresh Kumar
  2020-10-29 11:12   ` [PATCH v2.2 4/4] cpufreq: schedutil: Always call driver if CPUFREQ_NEED_UPDATE_LIMITS " Rafael J. Wysocki
  3 siblings, 1 reply; 18+ messages in thread
From: Rafael J. Wysocki @ 2020-10-27 15:35 UTC (permalink / raw)
  To: Linux PM, Viresh Kumar; +Cc: LKML, Srinivas Pandruvada, Zhang Rui

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Because sugov_update_next_freq() may skip a frequency update even if
the need_freq_update flag has been set for the policy at hand, policy
limits updates may not take effect as expected.

For example, if the intel_pstate driver operates in the passive mode
with HWP enabled, it needs to update the HWP min and max limits when
the policy min and max limits change, respectively, but that may not
happen if the target frequency does not change along with the limit
at hand.  In particular, if the policy min is changed first, causing
the target frequency to be adjusted to it, and the policy max limit
is changed later to the same value, the HWP max limit will not be
updated to follow it as expected, because the target frequency is
still equal to the policy min limit and it will not change until
that limit is updated.

To address this issue, modify get_next_freq() to let the driver
callback run if the CPUFREQ_NEED_UPDATE_LIMITS cpufreq driver flag
is set regardless of whether or not the new frequency to set is
equal to the previous one.

Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
Reported-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

v2 -> v2.1
   * Fix typo in the subject.
   * Make get_next_freq() and sugov_update_next_freq() ignore the
     sg_policy->next_freq == next_freq case when CPUFREQ_NEED_UPDATE_LIMITS
     is set for the driver.
   * Add Tested-by from Rui (this version lets the driver callback run more
     often than the v2, so the behavior in the Rui's case doesn't change).

---
 kernel/sched/cpufreq_schedutil.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

Index: linux-pm/kernel/sched/cpufreq_schedutil.c
===================================================================
--- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
+++ linux-pm/kernel/sched/cpufreq_schedutil.c
@@ -102,11 +102,12 @@ static bool sugov_should_update_freq(str
 static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
 				   unsigned int next_freq)
 {
-	if (sg_policy->next_freq == next_freq)
+	if (sg_policy->next_freq == next_freq && !sg_policy->need_freq_update)
 		return false;
 
 	sg_policy->next_freq = next_freq;
 	sg_policy->last_freq_update_time = time;
+	sg_policy->need_freq_update = false;
 
 	return true;
 }
@@ -161,10 +162,12 @@ static unsigned int get_next_freq(struct
 
 	freq = map_util_freq(util, freq, max);
 
-	if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
+	if (cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
+		sg_policy->need_freq_update = true;
+	else if (freq == sg_policy->cached_raw_freq &&
+		 !sg_policy->need_freq_update)
 		return sg_policy->next_freq;
 
-	sg_policy->need_freq_update = false;
 	sg_policy->cached_raw_freq = freq;
 	return cpufreq_driver_resolve_freq(policy, freq);
 }




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2.1 4/4] cpufreq: schedutil: Always call driver if need_freq_update is set
  2020-10-27 15:35   ` [PATCH v2.1 4/4] cpufreq: schedutil: Always call driver " Rafael J. Wysocki
@ 2020-10-28  3:57     ` Viresh Kumar
  2020-10-29 10:42       ` Rafael J. Wysocki
  0 siblings, 1 reply; 18+ messages in thread
From: Viresh Kumar @ 2020-10-28  3:57 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM, LKML, Srinivas Pandruvada, Zhang Rui

On 27-10-20, 16:35, Rafael J. Wysocki wrote:
> Index: linux-pm/kernel/sched/cpufreq_schedutil.c
> ===================================================================
> --- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
> +++ linux-pm/kernel/sched/cpufreq_schedutil.c
> @@ -102,11 +102,12 @@ static bool sugov_should_update_freq(str
>  static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
>  				   unsigned int next_freq)
>  {
> -	if (sg_policy->next_freq == next_freq)
> +	if (sg_policy->next_freq == next_freq && !sg_policy->need_freq_update)
>  		return false;
>  
>  	sg_policy->next_freq = next_freq;
>  	sg_policy->last_freq_update_time = time;
> +	sg_policy->need_freq_update = false;
>  
>  	return true;
>  }
> @@ -161,10 +162,12 @@ static unsigned int get_next_freq(struct
>  
>  	freq = map_util_freq(util, freq, max);
>  
> -	if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
> +	if (cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
> +		sg_policy->need_freq_update = true;
> +	else if (freq == sg_policy->cached_raw_freq &&
> +		 !sg_policy->need_freq_update)
>  		return sg_policy->next_freq;
>  
> -	sg_policy->need_freq_update = false;
>  	sg_policy->cached_raw_freq = freq;
>  	return cpufreq_driver_resolve_freq(policy, freq);
>  }

What about just this instead ?

  static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
  				   unsigned int next_freq)
  {
 -	if (sg_policy->next_freq == next_freq)
 +	if (sg_policy->next_freq == next_freq &&
 +          !cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
  		return false;
  
  	sg_policy->next_freq = next_freq;
  	sg_policy->last_freq_update_time = time;
  
  	return true;
  }

-- 
viresh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2.1 4/4] cpufreq: schedutil: Always call driver if need_freq_update is set
  2020-10-28  3:57     ` Viresh Kumar
@ 2020-10-29 10:42       ` Rafael J. Wysocki
  2020-10-29 10:54         ` Viresh Kumar
  0 siblings, 1 reply; 18+ messages in thread
From: Rafael J. Wysocki @ 2020-10-29 10:42 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Linux PM, LKML, Srinivas Pandruvada, Zhang Rui

On Thu, Oct 29, 2020 at 12:10 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 27-10-20, 16:35, Rafael J. Wysocki wrote:
> > Index: linux-pm/kernel/sched/cpufreq_schedutil.c
> > ===================================================================
> > --- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
> > +++ linux-pm/kernel/sched/cpufreq_schedutil.c
> > @@ -102,11 +102,12 @@ static bool sugov_should_update_freq(str
> >  static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
> >                                  unsigned int next_freq)
> >  {
> > -     if (sg_policy->next_freq == next_freq)
> > +     if (sg_policy->next_freq == next_freq && !sg_policy->need_freq_update)
> >               return false;
> >
> >       sg_policy->next_freq = next_freq;
> >       sg_policy->last_freq_update_time = time;
> > +     sg_policy->need_freq_update = false;
> >
> >       return true;
> >  }
> > @@ -161,10 +162,12 @@ static unsigned int get_next_freq(struct
> >
> >       freq = map_util_freq(util, freq, max);
> >
> > -     if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
> > +     if (cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
> > +             sg_policy->need_freq_update = true;
> > +     else if (freq == sg_policy->cached_raw_freq &&
> > +              !sg_policy->need_freq_update)
> >               return sg_policy->next_freq;
> >
> > -     sg_policy->need_freq_update = false;
> >       sg_policy->cached_raw_freq = freq;
> >       return cpufreq_driver_resolve_freq(policy, freq);
> >  }
>
> What about just this instead ?
>
>   static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
>                                    unsigned int next_freq)
>   {
>  -      if (sg_policy->next_freq == next_freq)
>  +      if (sg_policy->next_freq == next_freq &&
>  +          !cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
>                 return false;
>
>         sg_policy->next_freq = next_freq;
>         sg_policy->last_freq_update_time = time;
>
>         return true;
>   }
>

Without any changes in get_next_freq() this is not sufficient, because
get_next_freq() may skip the update too.

If the intention is to always let the driver callback run when
CPUFREQ_NEED_UPDATE_LIMITS is set, then both get_next_freq() and
sugov_update_next_freq() need to be modified.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2.1 4/4] cpufreq: schedutil: Always call driver if need_freq_update is set
  2020-10-29 10:42       ` Rafael J. Wysocki
@ 2020-10-29 10:54         ` Viresh Kumar
  0 siblings, 0 replies; 18+ messages in thread
From: Viresh Kumar @ 2020-10-29 10:54 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Linux PM, LKML, Srinivas Pandruvada, Zhang Rui

On 29-10-20, 11:42, Rafael J. Wysocki wrote:
> On Thu, Oct 29, 2020 at 12:10 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >
> > On 27-10-20, 16:35, Rafael J. Wysocki wrote:
> > > Index: linux-pm/kernel/sched/cpufreq_schedutil.c
> > > ===================================================================
> > > --- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
> > > +++ linux-pm/kernel/sched/cpufreq_schedutil.c
> > > @@ -102,11 +102,12 @@ static bool sugov_should_update_freq(str
> > >  static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
> > >                                  unsigned int next_freq)
> > >  {
> > > -     if (sg_policy->next_freq == next_freq)
> > > +     if (sg_policy->next_freq == next_freq && !sg_policy->need_freq_update)
> > >               return false;
> > >
> > >       sg_policy->next_freq = next_freq;
> > >       sg_policy->last_freq_update_time = time;
> > > +     sg_policy->need_freq_update = false;
> > >
> > >       return true;
> > >  }
> > > @@ -161,10 +162,12 @@ static unsigned int get_next_freq(struct
> > >
> > >       freq = map_util_freq(util, freq, max);
> > >
> > > -     if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
> > > +     if (cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
> > > +             sg_policy->need_freq_update = true;
> > > +     else if (freq == sg_policy->cached_raw_freq &&
> > > +              !sg_policy->need_freq_update)
> > >               return sg_policy->next_freq;
> > >
> > > -     sg_policy->need_freq_update = false;
> > >       sg_policy->cached_raw_freq = freq;
> > >       return cpufreq_driver_resolve_freq(policy, freq);
> > >  }
> >
> > What about just this instead ?
> >
> >   static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
> >                                    unsigned int next_freq)
> >   {
> >  -      if (sg_policy->next_freq == next_freq)
> >  +      if (sg_policy->next_freq == next_freq &&
> >  +          !cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
> >                 return false;
> >
> >         sg_policy->next_freq = next_freq;
> >         sg_policy->last_freq_update_time = time;
> >
> >         return true;
> >   }
> >
> 
> Without any changes in get_next_freq() this is not sufficient, because
> get_next_freq() may skip the update too.
> 
> If the intention is to always let the driver callback run when
> CPUFREQ_NEED_UPDATE_LIMITS is set, then both get_next_freq() and
> sugov_update_next_freq() need to be modified.

Right, my mistake. I was just suggesting that we may not need to touch
need_freq_update at all but just check the flag.

-- 
viresh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2.2 4/4] cpufreq: schedutil: Always call driver if CPUFREQ_NEED_UPDATE_LIMITS is set
  2020-10-23 15:36 ` [PATCH v2 4/4] cpufreq: schedutil: Always call drvier if need_freq_update is set Rafael J. Wysocki
                     ` (2 preceding siblings ...)
  2020-10-27 15:35   ` [PATCH v2.1 4/4] cpufreq: schedutil: Always call driver " Rafael J. Wysocki
@ 2020-10-29 11:12   ` Rafael J. Wysocki
  2020-10-29 11:23     ` Viresh Kumar
  3 siblings, 1 reply; 18+ messages in thread
From: Rafael J. Wysocki @ 2020-10-29 11:12 UTC (permalink / raw)
  To: Linux PM, Viresh Kumar; +Cc: LKML, Srinivas Pandruvada, Zhang Rui

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Because sugov_update_next_freq() may skip a frequency update even if
the need_freq_update flag has been set for the policy at hand, policy
limits updates may not take effect as expected.

For example, if the intel_pstate driver operates in the passive mode
with HWP enabled, it needs to update the HWP min and max limits when
the policy min and max limits change, respectively, but that may not
happen if the target frequency does not change along with the limit
at hand.  In particular, if the policy min is changed first, causing
the target frequency to be adjusted to it, and the policy max limit
is changed later to the same value, the HWP max limit will not be
updated to follow it as expected, because the target frequency is
still equal to the policy min limit and it will not change until
that limit is updated.

To address this issue, modify get_next_freq() to let the driver
callback run if the CPUFREQ_NEED_UPDATE_LIMITS cpufreq driver flag
is set regardless of whether or not the new frequency to set is
equal to the previous one.

Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
Reported-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

v2.1 -> v2.2:
   * Instead of updating need_freq_update if CPUFREQ_NEED_UPDATE_LIMITS is set
     in get_next_freq() and checking it again in sugov_update_next_freq(),
     check CPUFREQ_NEED_UPDATE_LIMITS directly in sugov_update_next_freq().
   * Update the subject.

v2 -> v2.1:
   * Fix typo in the subject.
   * Make get_next_freq() and sugov_update_next_freq() ignore the
     sg_policy->next_freq == next_freq case when CPUFREQ_NEED_UPDATE_LIMITS
     is set for the driver.
   * Add Tested-by from Rui (this version lets the driver callback run more
     often than the v2, so the behavior in the Rui's case doesn't change).

---
 kernel/sched/cpufreq_schedutil.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Index: linux-pm/kernel/sched/cpufreq_schedutil.c
===================================================================
--- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
+++ linux-pm/kernel/sched/cpufreq_schedutil.c
@@ -102,7 +102,8 @@ static bool sugov_should_update_freq(str
 static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
 				   unsigned int next_freq)
 {
-	if (sg_policy->next_freq == next_freq)
+	if (sg_policy->next_freq == next_freq &&
+	    !cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
 		return false;
 
 	sg_policy->next_freq = next_freq;
@@ -161,7 +162,8 @@ static unsigned int get_next_freq(struct
 
 	freq = map_util_freq(util, freq, max);
 
-	if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
+	if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update &&
+	    !cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
 		return sg_policy->next_freq;
 
 	sg_policy->need_freq_update = false;





^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2.2 4/4] cpufreq: schedutil: Always call driver if CPUFREQ_NEED_UPDATE_LIMITS is set
  2020-10-29 11:12   ` [PATCH v2.2 4/4] cpufreq: schedutil: Always call driver if CPUFREQ_NEED_UPDATE_LIMITS " Rafael J. Wysocki
@ 2020-10-29 11:23     ` Viresh Kumar
  2020-10-29 11:29       ` Rafael J. Wysocki
  0 siblings, 1 reply; 18+ messages in thread
From: Viresh Kumar @ 2020-10-29 11:23 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM, LKML, Srinivas Pandruvada, Zhang Rui

On 29-10-20, 12:12, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Because sugov_update_next_freq() may skip a frequency update even if
> the need_freq_update flag has been set for the policy at hand, policy
> limits updates may not take effect as expected.
> 
> For example, if the intel_pstate driver operates in the passive mode
> with HWP enabled, it needs to update the HWP min and max limits when
> the policy min and max limits change, respectively, but that may not
> happen if the target frequency does not change along with the limit
> at hand.  In particular, if the policy min is changed first, causing
> the target frequency to be adjusted to it, and the policy max limit
> is changed later to the same value, the HWP max limit will not be
> updated to follow it as expected, because the target frequency is
> still equal to the policy min limit and it will not change until
> that limit is updated.
> 
> To address this issue, modify get_next_freq() to let the driver
> callback run if the CPUFREQ_NEED_UPDATE_LIMITS cpufreq driver flag
> is set regardless of whether or not the new frequency to set is
> equal to the previous one.
> 
> Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
> Reported-by: Zhang Rui <rui.zhang@intel.com>
> Tested-by: Zhang Rui <rui.zhang@intel.com>
> Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> v2.1 -> v2.2:
>    * Instead of updating need_freq_update if CPUFREQ_NEED_UPDATE_LIMITS is set
>      in get_next_freq() and checking it again in sugov_update_next_freq(),
>      check CPUFREQ_NEED_UPDATE_LIMITS directly in sugov_update_next_freq().
>    * Update the subject.
> 
> v2 -> v2.1:
>    * Fix typo in the subject.
>    * Make get_next_freq() and sugov_update_next_freq() ignore the
>      sg_policy->next_freq == next_freq case when CPUFREQ_NEED_UPDATE_LIMITS
>      is set for the driver.
>    * Add Tested-by from Rui (this version lets the driver callback run more
>      often than the v2, so the behavior in the Rui's case doesn't change).
> 
> ---
>  kernel/sched/cpufreq_schedutil.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> Index: linux-pm/kernel/sched/cpufreq_schedutil.c
> ===================================================================
> --- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
> +++ linux-pm/kernel/sched/cpufreq_schedutil.c
> @@ -102,7 +102,8 @@ static bool sugov_should_update_freq(str
>  static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
>  				   unsigned int next_freq)
>  {
> -	if (sg_policy->next_freq == next_freq)
> +	if (sg_policy->next_freq == next_freq &&
> +	    !cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
>  		return false;

Since sg_policy->next_freq is used elsewhere as well, this is the best
we can do here.

>  	sg_policy->next_freq = next_freq;
> @@ -161,7 +162,8 @@ static unsigned int get_next_freq(struct
>  
>  	freq = map_util_freq(util, freq, max);
>  
> -	if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
> +	if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update &&
> +	    !cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
>  		return sg_policy->next_freq;
>  
>  	sg_policy->need_freq_update = false;

But I was wondering if instead of this we just do this here:

        if (!cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
                sg_policy->cached_raw_freq = freq;

And so the above check will always fail.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2.2 4/4] cpufreq: schedutil: Always call driver if CPUFREQ_NEED_UPDATE_LIMITS is set
  2020-10-29 11:23     ` Viresh Kumar
@ 2020-10-29 11:29       ` Rafael J. Wysocki
  0 siblings, 0 replies; 18+ messages in thread
From: Rafael J. Wysocki @ 2020-10-29 11:29 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Linux PM, LKML, Srinivas Pandruvada, Zhang Rui

On Thu, Oct 29, 2020 at 12:23 PM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 29-10-20, 12:12, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Because sugov_update_next_freq() may skip a frequency update even if
> > the need_freq_update flag has been set for the policy at hand, policy
> > limits updates may not take effect as expected.
> >
> > For example, if the intel_pstate driver operates in the passive mode
> > with HWP enabled, it needs to update the HWP min and max limits when
> > the policy min and max limits change, respectively, but that may not
> > happen if the target frequency does not change along with the limit
> > at hand.  In particular, if the policy min is changed first, causing
> > the target frequency to be adjusted to it, and the policy max limit
> > is changed later to the same value, the HWP max limit will not be
> > updated to follow it as expected, because the target frequency is
> > still equal to the policy min limit and it will not change until
> > that limit is updated.
> >
> > To address this issue, modify get_next_freq() to let the driver
> > callback run if the CPUFREQ_NEED_UPDATE_LIMITS cpufreq driver flag
> > is set regardless of whether or not the new frequency to set is
> > equal to the previous one.
> >
> > Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
> > Reported-by: Zhang Rui <rui.zhang@intel.com>
> > Tested-by: Zhang Rui <rui.zhang@intel.com>
> > Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >
> > v2.1 -> v2.2:
> >    * Instead of updating need_freq_update if CPUFREQ_NEED_UPDATE_LIMITS is set
> >      in get_next_freq() and checking it again in sugov_update_next_freq(),
> >      check CPUFREQ_NEED_UPDATE_LIMITS directly in sugov_update_next_freq().
> >    * Update the subject.
> >
> > v2 -> v2.1:
> >    * Fix typo in the subject.
> >    * Make get_next_freq() and sugov_update_next_freq() ignore the
> >      sg_policy->next_freq == next_freq case when CPUFREQ_NEED_UPDATE_LIMITS
> >      is set for the driver.
> >    * Add Tested-by from Rui (this version lets the driver callback run more
> >      often than the v2, so the behavior in the Rui's case doesn't change).
> >
> > ---
> >  kernel/sched/cpufreq_schedutil.c |    6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > Index: linux-pm/kernel/sched/cpufreq_schedutil.c
> > ===================================================================
> > --- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
> > +++ linux-pm/kernel/sched/cpufreq_schedutil.c
> > @@ -102,7 +102,8 @@ static bool sugov_should_update_freq(str
> >  static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
> >                                  unsigned int next_freq)
> >  {
> > -     if (sg_policy->next_freq == next_freq)
> > +     if (sg_policy->next_freq == next_freq &&
> > +         !cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
> >               return false;
>
> Since sg_policy->next_freq is used elsewhere as well, this is the best
> we can do here.
>
> >       sg_policy->next_freq = next_freq;
> > @@ -161,7 +162,8 @@ static unsigned int get_next_freq(struct
> >
> >       freq = map_util_freq(util, freq, max);
> >
> > -     if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
> > +     if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update &&
> > +         !cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
> >               return sg_policy->next_freq;
> >
> >       sg_policy->need_freq_update = false;
>
> But I was wondering if instead of this we just do this here:
>
>         if (!cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
>                 sg_policy->cached_raw_freq = freq;
>
> And so the above check will always fail.

I wrote it this way, because I want to avoid looking at the driver
flags at all unless the update is going to be skipped.  Otherwise we
may end up fetching a new cache line here every time even if that is
not needed.

> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

Thanks!

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2020-10-29 11:29 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-23 15:21 [PATCH v2 0/4] cpufreq: intel_pstate: Avoid missing HWP max limit updates with powersave governor Rafael J. Wysocki
2020-10-23 15:35 ` [PATCH v2 1/4] cpufreq: Introduce CPUFREQ_NEED_UPDATE_LIMITS driver flag Rafael J. Wysocki
2020-10-27  3:06   ` Viresh Kumar
2020-10-23 15:35 ` [PATCH v2 2/4] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode Rafael J. Wysocki
2020-10-27  3:06   ` Viresh Kumar
2020-10-27  8:47   ` Zhang Rui
2020-10-23 15:35 ` [PATCH v2 3/4] cpufreq: Introduce cpufreq_driver_test_flags() Rafael J. Wysocki
2020-10-23 15:36 ` [PATCH v2 4/4] cpufreq: schedutil: Always call drvier if need_freq_update is set Rafael J. Wysocki
2020-10-27  4:25   ` Viresh Kumar
2020-10-27 13:14     ` Rafael J. Wysocki
2020-10-27  8:47   ` Zhang Rui
2020-10-27 15:35   ` [PATCH v2.1 4/4] cpufreq: schedutil: Always call driver " Rafael J. Wysocki
2020-10-28  3:57     ` Viresh Kumar
2020-10-29 10:42       ` Rafael J. Wysocki
2020-10-29 10:54         ` Viresh Kumar
2020-10-29 11:12   ` [PATCH v2.2 4/4] cpufreq: schedutil: Always call driver if CPUFREQ_NEED_UPDATE_LIMITS " Rafael J. Wysocki
2020-10-29 11:23     ` Viresh Kumar
2020-10-29 11:29       ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).