* [PATCH 0/2] cpufreq: intel_pstate: Avoid missing HWP max limit updates with powersave governor @ 2020-10-22 11:55 Rafael J. Wysocki 2020-10-22 11:57 ` [PATCH 1/2] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode Rafael J. Wysocki 2020-10-22 11:57 ` [PATCH 2/2] cpufreq: Drop restore_freq from struct cpufreq_policy Rafael J. Wysocki 0 siblings, 2 replies; 6+ messages in thread From: Rafael J. Wysocki @ 2020-10-22 11:55 UTC (permalink / raw) To: Linux PM; +Cc: LKML, Viresh Kumar, Srinivas Pandruvada, Zhang Rui Hi All, There is a problem in intel_pstate that if it works in the passive mode with HWP enabled and the "powersave" governor is used on top of it, then changing the policy max frequency doesn't cause the HWP max limit to be updated which is quite confusing. That happens because of two checks, one in the cpufreq core and one in the driver itself, that are there to avoid unnecessary HW/FW updates when the current frequency doesn't change. Of course, that is the case when the policy max limit changes under the "powersave" governor (which sets the current frequency to the policy min limit), but in that particular case, the checks turn out to be harmful. This is dealt with by the first patch. The second one is an optimization that can be done right away on top of the first one. Thanks! ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/2] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode 2020-10-22 11:55 [PATCH 0/2] cpufreq: intel_pstate: Avoid missing HWP max limit updates with powersave governor Rafael J. Wysocki @ 2020-10-22 11:57 ` Rafael J. Wysocki 2020-10-23 6:10 ` Viresh Kumar 2020-10-22 11:57 ` [PATCH 2/2] cpufreq: Drop restore_freq from struct cpufreq_policy Rafael J. Wysocki 1 sibling, 1 reply; 6+ messages in thread From: Rafael J. Wysocki @ 2020-10-22 11:57 UTC (permalink / raw) To: Linux PM; +Cc: LKML, Viresh Kumar, Srinivas Pandruvada, Zhang Rui From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> If the cpufreq policy max limit is changed when intel_pstate operates in the passive mode with HWP enabled and the "powersave" governor is used on top of it, the HWP max limit is not updated as appropriate. Namely, in the "powersave" governor case, the target P-state is always equal to the policy min limit, so if the latter does not change, the "target_freq == policy->cur" check in __cpufreq_driver_target() is "true" and the "target_pstate != old_pstate" check in intel_cpufreq_update_pstate() is "false", so intel_cpufreq_adjust_hwp() is not invoked to update the HWP Request MSR and the HWP max limit is not updated as a result. To prevent that from occurring, modify __cpufreq_driver_target() to do the "target_freq == policy->cur" check only in the frequency table case and change intel_cpufreq_update_pstate() to do the "target_pstate != old_pstate" check only in the non-HWP case and let intel_cpufreq_adjust_hwp() always run in the HWP case (it will update HWP Request only if the cached value of the register is different from the new one including the limits, so if neither the target P-state value nor the max limit changes, the register write will still be avoided). Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled") Reported-by: Zhang Rui <rui.zhang@intel.com> Cc: 5.9+ <stable@vger.kernel.org> # 5.9+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> --- drivers/cpufreq/cpufreq.c | 6 +++--- drivers/cpufreq/intel_pstate.c | 12 +++++------- 2 files changed, 8 insertions(+), 10 deletions(-) Index: linux-pm/drivers/cpufreq/intel_pstate.c =================================================================== --- linux-pm.orig/drivers/cpufreq/intel_pstate.c +++ linux-pm/drivers/cpufreq/intel_pstate.c @@ -2550,14 +2550,12 @@ static int intel_cpufreq_update_pstate(s int old_pstate = cpu->pstate.current_pstate; target_pstate = intel_pstate_prepare_request(cpu, target_pstate); - if (target_pstate != old_pstate) { + if (hwp_active) { + intel_cpufreq_adjust_hwp(cpu, target_pstate, fast_switch); + cpu->pstate.current_pstate = target_pstate; + } else if (target_pstate != old_pstate) { + intel_cpufreq_adjust_perf_ctl(cpu, target_pstate, fast_switch); cpu->pstate.current_pstate = target_pstate; - if (hwp_active) - intel_cpufreq_adjust_hwp(cpu, target_pstate, - fast_switch); - else - intel_cpufreq_adjust_perf_ctl(cpu, target_pstate, - fast_switch); } intel_cpufreq_trace(cpu, fast_switch ? INTEL_PSTATE_TRACE_FAST_SWITCH : Index: linux-pm/drivers/cpufreq/cpufreq.c =================================================================== --- linux-pm.orig/drivers/cpufreq/cpufreq.c +++ linux-pm/drivers/cpufreq/cpufreq.c @@ -2182,6 +2182,9 @@ int __cpufreq_driver_target(struct cpufr pr_debug("target for CPU %u: %u kHz, relation %u, requested %u kHz\n", policy->cpu, target_freq, relation, old_target_freq); + if (cpufreq_driver->target) + return cpufreq_driver->target(policy, target_freq, relation); + /* * This might look like a redundant call as we are checking it again * after finding index. But it is left intentionally for cases where @@ -2194,9 +2197,6 @@ int __cpufreq_driver_target(struct cpufr /* Save last value to restore later on errors */ policy->restore_freq = policy->cur; - if (cpufreq_driver->target) - return cpufreq_driver->target(policy, target_freq, relation); - if (!cpufreq_driver->target_index) return -EINVAL; ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/2] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode 2020-10-22 11:57 ` [PATCH 1/2] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode Rafael J. Wysocki @ 2020-10-23 6:10 ` Viresh Kumar 2020-10-23 11:57 ` Rafael J. Wysocki 0 siblings, 1 reply; 6+ messages in thread From: Viresh Kumar @ 2020-10-23 6:10 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Linux PM, LKML, Srinivas Pandruvada, Zhang Rui On 22-10-20, 13:57, Rafael J. Wysocki wrote: > Index: linux-pm/drivers/cpufreq/cpufreq.c > =================================================================== > --- linux-pm.orig/drivers/cpufreq/cpufreq.c > +++ linux-pm/drivers/cpufreq/cpufreq.c > @@ -2182,6 +2182,9 @@ int __cpufreq_driver_target(struct cpufr > pr_debug("target for CPU %u: %u kHz, relation %u, requested %u kHz\n", > policy->cpu, target_freq, relation, old_target_freq); > > + if (cpufreq_driver->target) > + return cpufreq_driver->target(policy, target_freq, relation); > + > /* > * This might look like a redundant call as we are checking it again > * after finding index. But it is left intentionally for cases where > @@ -2194,9 +2197,6 @@ int __cpufreq_driver_target(struct cpufr > /* Save last value to restore later on errors */ > policy->restore_freq = policy->cur; > > - if (cpufreq_driver->target) > - return cpufreq_driver->target(policy, target_freq, relation); > - > if (!cpufreq_driver->target_index) > return -EINVAL; From what I understood, you want your driver to get notified about policy->min/max changes and right now they are making it work from within the target() callback. Your commit log talks about policy->max and powersave combination, I think the same will be true in case of policy->min and performance ? And also with any other governor (like schedutil) if the target_freq doesn't change for a while. And IMHO, this change is more like a band-aid which is going to remove the check of target != cur for all target() type drivers (which aren't many) and it feels like a penalty on them (which is also there for intel-cpufreq without hwp), and that we will get into the same problem for target_index() drivers as well if they want to do something similar in future, i.e. skip checking for same-freq. Maybe adding a new flag for the cpufreq-driver for force-updates would be a better solution ? Which will make this very much driver dependent. -- viresh ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/2] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode 2020-10-23 6:10 ` Viresh Kumar @ 2020-10-23 11:57 ` Rafael J. Wysocki 0 siblings, 0 replies; 6+ messages in thread From: Rafael J. Wysocki @ 2020-10-23 11:57 UTC (permalink / raw) To: Viresh Kumar Cc: Rafael J. Wysocki, Linux PM, LKML, Srinivas Pandruvada, Zhang Rui On Fri, Oct 23, 2020 at 8:10 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: > > On 22-10-20, 13:57, Rafael J. Wysocki wrote: > > Index: linux-pm/drivers/cpufreq/cpufreq.c > > =================================================================== > > --- linux-pm.orig/drivers/cpufreq/cpufreq.c > > +++ linux-pm/drivers/cpufreq/cpufreq.c > > @@ -2182,6 +2182,9 @@ int __cpufreq_driver_target(struct cpufr > > pr_debug("target for CPU %u: %u kHz, relation %u, requested %u kHz\n", > > policy->cpu, target_freq, relation, old_target_freq); > > > > + if (cpufreq_driver->target) > > + return cpufreq_driver->target(policy, target_freq, relation); > > + > > /* > > * This might look like a redundant call as we are checking it again > > * after finding index. But it is left intentionally for cases where > > @@ -2194,9 +2197,6 @@ int __cpufreq_driver_target(struct cpufr > > /* Save last value to restore later on errors */ > > policy->restore_freq = policy->cur; > > > > - if (cpufreq_driver->target) > > - return cpufreq_driver->target(policy, target_freq, relation); > > - > > if (!cpufreq_driver->target_index) > > return -EINVAL; > > From what I understood, you want your driver to get notified about > policy->min/max changes and right now they are making it work from > within the target() callback. Not exactly. The driver needs to update some internal upper and lower boundary values along with the target freq and skipping the update when target freq doesn't change prevents it from doing that. The policy min and max changes are communicated to the driver via the governor ->limits() callback and that can only call __cpufreq_driver_target() then or defer the update to the next ->fast_switch() invocation. Either way, the driver needs to have a chance to carry out the full update even if the target frequency doesn't change, but the policy min or max limits may have changed. > Your commit log talks about policy->max and powersave combination, Yes, it does, because that's a very clearly visible symptom. > I think the same will be true in case of policy->min and performance ? It might in theory, but it is not in practice, because the HWP min is set to the target freq (and the target freq is already clamped between the policy min and max). But generally speaking you are right, this would be a problem for any driver having to update some internal upper and lower boundary settings along with the target freq. > And also with any other governor (like schedutil) if the target_freq doesn't change for a while. Well, yes. A change of one of the limits that doesn't cause the target to change may be missed in general. > And IMHO, this change is more like a band-aid which is going to remove > the check of target != cur for all target() type drivers (which aren't > many) and it feels like a penalty on them (which is also there for > intel-cpufreq without hwp), and that we will get into the same problem > for target_index() drivers as well if they want to do something > similar in future, i.e. skip checking for same-freq. > > Maybe adding a new flag for the cpufreq-driver for force-updates would > be a better solution ? Which will make this very much driver > dependent. Fair enough, I'll add a driver flag for that. Also I split the patch into the core part and the driver-specific part for clarity. Thanks! ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 2/2] cpufreq: Drop restore_freq from struct cpufreq_policy 2020-10-22 11:55 [PATCH 0/2] cpufreq: intel_pstate: Avoid missing HWP max limit updates with powersave governor Rafael J. Wysocki 2020-10-22 11:57 ` [PATCH 1/2] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode Rafael J. Wysocki @ 2020-10-22 11:57 ` Rafael J. Wysocki 2020-10-23 5:39 ` Viresh Kumar 1 sibling, 1 reply; 6+ messages in thread From: Rafael J. Wysocki @ 2020-10-22 11:57 UTC (permalink / raw) To: Linux PM; +Cc: LKML, Viresh Kumar, Srinivas Pandruvada, Zhang Rui From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> The restore_freq field in struct cpufreq_policy is only used by __target_index() in one place and a local variable in that function may as well be used instead of it, so drop it and modify __target_index() accordingly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> --- drivers/cpufreq/cpufreq.c | 10 +++++----- include/linux/cpufreq.h | 5 ----- 2 files changed, 5 insertions(+), 10 deletions(-) Index: linux-pm/drivers/cpufreq/cpufreq.c =================================================================== --- linux-pm.orig/drivers/cpufreq/cpufreq.c +++ linux-pm/drivers/cpufreq/cpufreq.c @@ -2112,7 +2112,7 @@ static int __target_intermediate(struct static int __target_index(struct cpufreq_policy *policy, int index) { struct cpufreq_freqs freqs = {.old = policy->cur, .flags = 0}; - unsigned int intermediate_freq = 0; + unsigned int restore_freq, intermediate_freq = 0; unsigned int newfreq = policy->freq_table[index].frequency; int retval = -EINVAL; bool notify; @@ -2120,6 +2120,9 @@ static int __target_index(struct cpufreq if (newfreq == policy->cur) return 0; + /* Save last value to restore later on errors */ + restore_freq = policy->cur; + notify = !(cpufreq_driver->flags & CPUFREQ_ASYNC_NOTIFICATION); if (notify) { /* Handle switching to intermediate frequency */ @@ -2157,7 +2160,7 @@ static int __target_index(struct cpufreq */ if (unlikely(retval && intermediate_freq)) { freqs.old = intermediate_freq; - freqs.new = policy->restore_freq; + freqs.new = restore_freq; cpufreq_freq_transition_begin(policy, &freqs); cpufreq_freq_transition_end(policy, &freqs, 0); } @@ -2194,9 +2197,6 @@ int __cpufreq_driver_target(struct cpufr if (target_freq == policy->cur) return 0; - /* Save last value to restore later on errors */ - policy->restore_freq = policy->cur; - if (!cpufreq_driver->target_index) return -EINVAL; Index: linux-pm/include/linux/cpufreq.h =================================================================== --- linux-pm.orig/include/linux/cpufreq.h +++ linux-pm/include/linux/cpufreq.h @@ -65,7 +65,6 @@ struct cpufreq_policy { unsigned int max; /* in kHz */ unsigned int cur; /* in kHz, only needed if cpufreq * governors are used */ - unsigned int restore_freq; /* = policy->cur before transition */ unsigned int suspend_freq; /* freq to set during suspend */ unsigned int policy; /* see above */ @@ -308,10 +307,6 @@ struct cpufreq_driver { /* define one out of two */ int (*setpolicy)(struct cpufreq_policy *policy); - /* - * On failure, should always restore frequency to policy->restore_freq - * (i.e. old freq). - */ int (*target)(struct cpufreq_policy *policy, unsigned int target_freq, unsigned int relation); /* Deprecated */ ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] cpufreq: Drop restore_freq from struct cpufreq_policy 2020-10-22 11:57 ` [PATCH 2/2] cpufreq: Drop restore_freq from struct cpufreq_policy Rafael J. Wysocki @ 2020-10-23 5:39 ` Viresh Kumar 0 siblings, 0 replies; 6+ messages in thread From: Viresh Kumar @ 2020-10-23 5:39 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Linux PM, LKML, Srinivas Pandruvada, Zhang Rui On 22-10-20, 13:57, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > The restore_freq field in struct cpufreq_policy is only used by > __target_index() in one place and a local variable in that function > may as well be used instead of it, so drop it and modify > __target_index() accordingly. > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > --- > drivers/cpufreq/cpufreq.c | 10 +++++----- > include/linux/cpufreq.h | 5 ----- > 2 files changed, 5 insertions(+), 10 deletions(-) Acked-by: Viresh Kumar <viresh.kumar@linaro.org> -- viresh ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2020-10-23 11:58 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-10-22 11:55 [PATCH 0/2] cpufreq: intel_pstate: Avoid missing HWP max limit updates with powersave governor Rafael J. Wysocki 2020-10-22 11:57 ` [PATCH 1/2] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode Rafael J. Wysocki 2020-10-23 6:10 ` Viresh Kumar 2020-10-23 11:57 ` Rafael J. Wysocki 2020-10-22 11:57 ` [PATCH 2/2] cpufreq: Drop restore_freq from struct cpufreq_policy Rafael J. Wysocki 2020-10-23 5:39 ` Viresh Kumar
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).