Hi All,

This is a follow-up to the RFT patch set posted previously:

https://lore.kernel.org/lkml/9956076.F4luUDm1Dq@aspire.rjw.lan/

Patch [1/3] causes intel_pstate to update all policies if it gets a _PPC change
notification and sees a global turbo disable/enable change.

Patch [2/3] adds cpufreq_cpu_acquire() and cpufreq_cpu_release() to reduce
code duplication after the next patch a bit (and Srinivas wanted the rwsem
manipulation to not be done directly by the driver).

Patch [3/3] makes intel_pstate update cpuinfo.max_freq for all policies in
those cases.

I've added Tested-by tags to patches [1/3] and [3/3], because there are only
cosmetic differences between them and what has been tested.

Thanks,
Rafael

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: [PATCH] cpufreq: intel_pstate: Driver-specific handling of _PPC updates

In some cases, the platform firmware disables or enables turbo
frequencies for all CPUs globally before triggering a _PPC change
notification for one of them.  Obviously, that global change affects
all CPUs, not just the notified one, and it needs to be acted upon by
cpufreq.

The intel_pstate driver is able to detect such global changes of the
settings, but it also needs to update policy limits for all CPUs if
that happens, in particular if turbo frequencies are enabled globally
- to allow them to be used.

For this reason, introduce a new cpufreq driver callback to be
invoked on _PPC notifications, if present, instead of simply calling
cpufreq_update_policy() for the notified CPU and make intel_pstate
use it to trigger policy updates for all CPUs in the system if global
settings change.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=200759
Reported-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Tested-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

-> v2:
  * Rename a struct field (the new name is temporary as it is changed by
    patch [3/3] anyway).
  * Use EXPORT_SYMBOL_GPL() for cpufreq_update_limits().

---
 drivers/acpi/processor_perflib.c |    2 +-
 drivers/cpufreq/cpufreq.c        |   16 ++++++++++++++++
 drivers/cpufreq/intel_pstate.c   |   24 ++++++++++++++++++++++++
 include/linux/cpufreq.h          |    4 ++++
 4 files changed, 45 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/acpi/processor_perflib.c
===================================================================
--- linux-pm.orig/drivers/acpi/processor_perflib.c
+++ linux-pm/drivers/acpi/processor_perflib.c
@@ -181,7 +181,7 @@ void acpi_processor_ppc_has_changed(stru
 			acpi_processor_ppc_ost(pr->handle, 0);
 	}
 	if (ret >= 0)
-		cpufreq_update_policy(pr->id);
+		cpufreq_update_limits(pr->id);
 }
 
 int acpi_processor_get_bios_limit(int cpu, unsigned int *limit)
Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -2376,6 +2376,22 @@ unlock:
 }
 EXPORT_SYMBOL(cpufreq_update_policy);
 
+/**
+ * cpufreq_update_limits - Update policy limits for a given CPU.
+ * @cpu: CPU to update the policy limits for.
+ *
+ * Invoke the driver's ->update_limits callback if present or call
+ * cpufreq_update_policy() for @cpu.
+ */
+void cpufreq_update_limits(unsigned int cpu)
+{
+	if (cpufreq_driver->update_limits)
+		cpufreq_driver->update_limits(cpu);
+	else
+		cpufreq_update_policy(cpu);
+}
+EXPORT_SYMBOL_GPL(cpufreq_update_limits);
+
 /*********************************************************************
  *                              BOOST                                *
  *********************************************************************/
Index: linux-pm/drivers/cpufreq/intel_pstate.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/intel_pstate.c
+++ linux-pm/drivers/cpufreq/intel_pstate.c
@@ -179,6 +179,7 @@ struct vid_data {
  *			based on the MSR_IA32_MISC_ENABLE value and whether or
  *			not the maximum reported turbo P-state is different from
  *			the maximum reported non-turbo one.
+ * @turbo_disabled_s:	Saved @turbo_disabled value.
  * @min_perf_pct:	Minimum capacity limit in percent of the maximum turbo
  *			P-state capacity.
  * @max_perf_pct:	Maximum capacity limit in percent of the maximum turbo
@@ -187,6 +188,7 @@ struct vid_data {
 struct global_params {
 	bool no_turbo;
 	bool turbo_disabled;
+	bool turbo_disabled_s;
 	int max_perf_pct;
 	int min_perf_pct;
 };
@@ -894,6 +896,25 @@ static void intel_pstate_update_policies
 		cpufreq_update_policy(cpu);
 }
 
+static void intel_pstate_update_limits(unsigned int cpu)
+{
+	mutex_lock(&intel_pstate_driver_lock);
+
+	update_turbo_state();
+	/*
+	 * If turbo has been turned on or off globally, policy limits for
+	 * all CPUs need to be updated to reflect that.
+	 */
+	if (global.turbo_disabled_s != global.turbo_disabled) {
+		global.turbo_disabled_s = global.turbo_disabled;
+		intel_pstate_update_policies();
+	} else {
+		cpufreq_update_policy(cpu);
+	}
+
+	mutex_unlock(&intel_pstate_driver_lock);
+}
+
 /************************** sysfs begin ************************/
 #define show_one(file_name, object)					\
 	static ssize_t show_##file_name					\
@@ -2135,6 +2156,7 @@ static int __intel_pstate_cpu_init(struc
 	/* cpuinfo and default policy values */
 	policy->cpuinfo.min_freq = cpu->pstate.min_pstate * cpu->pstate.scaling;
 	update_turbo_state();
+	global.turbo_disabled_s = global.turbo_disabled;
 	policy->cpuinfo.max_freq = global.turbo_disabled ?
 			cpu->pstate.max_pstate : cpu->pstate.turbo_pstate;
 	policy->cpuinfo.max_freq *= cpu->pstate.scaling;
@@ -2179,6 +2201,7 @@ static struct cpufreq_driver intel_pstat
 	.init		= intel_pstate_cpu_init,
 	.exit		= intel_pstate_cpu_exit,
 	.stop_cpu	= intel_pstate_stop_cpu,
+	.update_limits	= intel_pstate_update_limits,
 	.name		= "intel_pstate",
 };
 
@@ -2313,6 +2336,7 @@ static struct cpufreq_driver intel_cpufr
 	.init		= intel_cpufreq_cpu_init,
 	.exit		= intel_pstate_cpu_exit,
 	.stop_cpu	= intel_cpufreq_stop_cpu,
+	.update_limits	= intel_pstate_update_limits,
 	.name		= "intel_cpufreq",
 };
 
Index: linux-pm/include/linux/cpufreq.h
===================================================================
--- linux-pm.orig/include/linux/cpufreq.h
+++ linux-pm/include/linux/cpufreq.h
@@ -195,6 +195,7 @@ void disable_cpufreq(void);
 u64 get_cpu_idle_time(unsigned int cpu, u64 *wall, int io_busy);
 int cpufreq_get_policy(struct cpufreq_policy *policy, unsigned int cpu);
 void cpufreq_update_policy(unsigned int cpu);
+void cpufreq_update_limits(unsigned int cpu);
 bool have_governor_per_policy(void);
 struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy);
 void cpufreq_enable_fast_switch(struct cpufreq_policy *policy);
@@ -322,6 +323,9 @@ struct cpufreq_driver {
 	/* should be defined, if possible */
 	unsigned int	(*get)(unsigned int cpu);
 
+	/* Called to update policy limits on firmware notifications. */
+	void		(*update_limits)(unsigned int cpu);
+
 	/* optional */
 	int		(*bios_limit)(int cpu, unsigned int *limit);
 

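For readers who want to see the dispatch logic of patch [1/3] in isolation, here is a minimal user-space sketch. It is not kernel code: every demo_* name, NR_DEMO_CPUS and the update counters are hypothetical stand-ins introduced for illustration only. It models the two pieces the patch adds: a core entry point that prefers an optional driver callback over the generic per-CPU update, and a driver callback that updates every CPU when the saved global turbo state differs from the current one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_DEMO_CPUS 4	/* hypothetical CPU count for this sketch */

/* Hypothetical stand-in for struct cpufreq_driver. */
struct demo_driver {
	void (*update_limits)(unsigned int cpu);	/* optional callback */
};

static unsigned int demo_updates[NR_DEMO_CPUS];	/* per-CPU update counters */
static bool demo_turbo_disabled;		/* current global turbo state */
static bool demo_turbo_disabled_saved;		/* state seen at last update */

/* Stand-in for cpufreq_update_policy(): only counts the call. */
static void demo_update_policy(unsigned int cpu)
{
	demo_updates[cpu]++;
}

/* Driver callback: update all CPUs if the global turbo state flipped. */
static void demo_driver_update_limits(unsigned int cpu)
{
	if (demo_turbo_disabled_saved != demo_turbo_disabled) {
		demo_turbo_disabled_saved = demo_turbo_disabled;
		for (unsigned int i = 0; i < NR_DEMO_CPUS; i++)
			demo_update_policy(i);
	} else {
		demo_update_policy(cpu);
	}
}

/* Core entry point: use the driver callback if set, else the generic path. */
static void demo_update_limits(const struct demo_driver *drv, unsigned int cpu)
{
	if (drv->update_limits)
		drv->update_limits(cpu);
	else
		demo_update_policy(cpu);
}
```

A driver that leaves the callback unset keeps the old behavior (only the notified CPU is updated), which is why the change is transparent to drivers other than intel_pstate. The sketch omits the locking that the real callback does with intel_pstate_driver_lock.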
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

It sometimes is necessary to find a cpufreq policy for a given CPU
and acquire its rwsem (for writing) immediately after that, so
introduce cpufreq_cpu_acquire() as a helper for that and the
complementary cpufreq_cpu_release().

Make cpufreq_update_policy() use the new functions.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

-> v2: New patch.

---
 drivers/cpufreq/cpufreq.c |   56 ++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 47 insertions(+), 9 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -256,6 +256,51 @@ void cpufreq_cpu_put(struct cpufreq_poli
 }
 EXPORT_SYMBOL_GPL(cpufreq_cpu_put);
 
+/**
+ * cpufreq_cpu_release - Unlock a policy and decrement its usage counter.
+ * @policy: cpufreq policy returned by cpufreq_cpu_acquire().
+ */
+static void cpufreq_cpu_release(struct cpufreq_policy *policy)
+{
+	if (WARN_ON(!policy))
+		return;
+
+	lockdep_assert_held(&policy->rwsem);
+
+	up_write(&policy->rwsem);
+
+	cpufreq_cpu_put(policy);
+}
+
+/**
+ * cpufreq_cpu_acquire - Find policy for a CPU, mark it as busy and lock it.
+ * @cpu: CPU to find the policy for.
+ *
+ * Call cpufreq_cpu_get() to get a reference on the cpufreq policy for @cpu and
+ * if the policy returned by it is not NULL, acquire its rwsem for writing.
+ * Return the policy if it is active or release it and return NULL otherwise.
+ *
+ * The policy returned by this function has to be released with the help of
+ * cpufreq_cpu_release() in order to release its rwsem and balance its usage
+ * counter properly.
+ */
+static struct cpufreq_policy *cpufreq_cpu_acquire(unsigned int cpu)
+{
+	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
+
+	if (!policy)
+		return NULL;
+
+	down_write(&policy->rwsem);
+
+	if (policy_is_inactive(policy)) {
+		cpufreq_cpu_release(policy);
+		return NULL;
+	}
+
+	return policy;
+}
+
 /*********************************************************************
  *            EXTERNALLY AFFECTING FREQUENCY CHANGES                 *
  *********************************************************************/
@@ -2343,17 +2388,12 @@ static int cpufreq_set_policy(struct cpu
  */
 void cpufreq_update_policy(unsigned int cpu)
 {
-	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
+	struct cpufreq_policy *policy = cpufreq_cpu_acquire(cpu);
 	struct cpufreq_policy new_policy;
 
 	if (!policy)
 		return;
 
-	down_write(&policy->rwsem);
-
-	if (policy_is_inactive(policy))
-		goto unlock;
-
 	/*
 	 * BIOS might change freq behind our back
 	 * -> ask driver for current freq and notify governors about a change
@@ -2370,9 +2410,7 @@ void cpufreq_update_policy(unsigned int
 	cpufreq_set_policy(policy, &new_policy);
 
 unlock:
-	up_write(&policy->rwsem);
-
-	cpufreq_cpu_put(policy);
+	cpufreq_cpu_release(policy);
 }
 EXPORT_SYMBOL(cpufreq_update_policy);
 

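The acquire/release pairing from patch [2/3] can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions: struct demo_policy, demo_table and the demo_* helpers are hypothetical, the rwsem is modeled by a plain flag (so nothing ever blocks), and the usage counter is a bare integer instead of a kobject reference. What it does preserve is the contract: acquire returns either NULL with nothing held, or an active policy with both the reference and the write lock held, and release undoes both in the reverse order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_DEMO_CPUS 2	/* hypothetical CPU count for this sketch */

/* Hypothetical stand-in for struct cpufreq_policy. */
struct demo_policy {
	int refcount;		/* models the policy usage counter */
	bool write_locked;	/* models policy->rwsem held for writing */
	bool active;		/* models !policy_is_inactive(policy) */
};

/* Per-CPU policy pointers; an entry may be NULL (no policy for that CPU). */
static struct demo_policy *demo_table[NR_DEMO_CPUS];

/* Models cpufreq_cpu_get(): look up the policy and take a reference. */
static struct demo_policy *demo_get(unsigned int cpu)
{
	struct demo_policy *p = cpu < NR_DEMO_CPUS ? demo_table[cpu] : NULL;

	if (p)
		p->refcount++;
	return p;
}

/* Models cpufreq_cpu_release(): drop the write lock, then the reference. */
static void demo_release(struct demo_policy *p)
{
	if (!p)
		return;

	assert(p->write_locked);	/* models lockdep_assert_held() */
	p->write_locked = false;
	p->refcount--;
}

/* Models cpufreq_cpu_acquire(): get, lock, and bail out if inactive. */
static struct demo_policy *demo_acquire(unsigned int cpu)
{
	struct demo_policy *p = demo_get(cpu);

	if (!p)
		return NULL;

	assert(!p->write_locked);	/* a real rwsem would block here */
	p->write_locked = true;

	if (!p->active) {
		demo_release(p);
		return NULL;
	}

	return p;
}
```

Callers then only need a NULL check after demo_acquire() and one demo_release() on exit, which is the simplification cpufreq_update_policy() gets in the patch.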
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

While the cpuinfo.max_freq value doesn't really matter for
intel_pstate in the active mode, in the passive mode it is used by
governors as the maximum physical frequency of the CPU and the
results of governor computations generally depend on it.  Also it
is made available to user space via sysfs and it should match the
current HW configuration.

For this reason, make intel_pstate update cpuinfo.max_freq for all
CPUs if it detects a global change of turbo frequency settings from
"disable" to "enable" or the other way associated with a _PPC change
notification from the platform firmware.

Note that policy_is_inactive(), cpufreq_cpu_acquire(),
cpufreq_cpu_release(), and cpufreq_set_policy() need to be made
available to it for this purpose.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=200759
Reported-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Tested-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

-> v2:
  * Rename a struct field.
  * Use cpufreq_cpu_acquire() and cpufreq_cpu_release() (added by
    patch [2/3]) in intel_pstate_update_max_freq().

---
 drivers/cpufreq/cpufreq.c      |   16 ++++------------
 drivers/cpufreq/intel_pstate.c |   35 +++++++++++++++++++++++++++++------
 include/linux/cpufreq.h        |   10 ++++++++++
 3 files changed, 43 insertions(+), 18 deletions(-)

Index: linux-pm/drivers/cpufreq/intel_pstate.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/intel_pstate.c
+++ linux-pm/drivers/cpufreq/intel_pstate.c
@@ -179,7 +179,7 @@ struct vid_data {
  *			based on the MSR_IA32_MISC_ENABLE value and whether or
  *			not the maximum reported turbo P-state is different from
  *			the maximum reported non-turbo one.
- * @turbo_disabled_s:	Saved @turbo_disabled value.
+ * @turbo_disabled_mf:	The @turbo_disabled value reflected by cpuinfo.max_freq.
  * @min_perf_pct:	Minimum capacity limit in percent of the maximum turbo
  *			P-state capacity.
  * @max_perf_pct:	Maximum capacity limit in percent of the maximum turbo
@@ -188,7 +188,7 @@ struct vid_data {
 struct global_params {
 	bool no_turbo;
 	bool turbo_disabled;
-	bool turbo_disabled_s;
+	bool turbo_disabled_mf;
 	int max_perf_pct;
 	int min_perf_pct;
 };
@@ -896,6 +896,28 @@ static void intel_pstate_update_policies
 		cpufreq_update_policy(cpu);
 }
 
+static void intel_pstate_update_max_freq(unsigned int cpu)
+{
+	struct cpufreq_policy *policy = cpufreq_cpu_acquire(cpu);
+	struct cpufreq_policy new_policy;
+	struct cpudata *cpudata;
+
+	if (!policy)
+		return;
+
+	cpudata = all_cpu_data[cpu];
+	policy->cpuinfo.max_freq = global.turbo_disabled_mf ?
+			cpudata->pstate.max_freq : cpudata->pstate.turbo_freq;
+
+	memcpy(&new_policy, policy, sizeof(*policy));
+	new_policy.max = min(policy->user_policy.max, policy->cpuinfo.max_freq);
+	new_policy.min = min(policy->user_policy.min, new_policy.max);
+
+	cpufreq_set_policy(policy, &new_policy);
+
+	cpufreq_cpu_release(policy);
+}
+
 static void intel_pstate_update_limits(unsigned int cpu)
 {
 	mutex_lock(&intel_pstate_driver_lock);
@@ -905,9 +927,10 @@ static void intel_pstate_update_limits(u
 	 * If turbo has been turned on or off globally, policy limits for
 	 * all CPUs need to be updated to reflect that.
 	 */
-	if (global.turbo_disabled_s != global.turbo_disabled) {
-		global.turbo_disabled_s = global.turbo_disabled;
-		intel_pstate_update_policies();
+	if (global.turbo_disabled_mf != global.turbo_disabled) {
+		global.turbo_disabled_mf = global.turbo_disabled;
+		for_each_possible_cpu(cpu)
+			intel_pstate_update_max_freq(cpu);
 	} else {
 		cpufreq_update_policy(cpu);
 	}
@@ -2156,7 +2179,7 @@ static int __intel_pstate_cpu_init(struc
 	/* cpuinfo and default policy values */
 	policy->cpuinfo.min_freq = cpu->pstate.min_pstate * cpu->pstate.scaling;
 	update_turbo_state();
-	global.turbo_disabled_s = global.turbo_disabled;
+	global.turbo_disabled_mf = global.turbo_disabled;
 	policy->cpuinfo.max_freq = global.turbo_disabled ?
 			cpu->pstate.max_pstate : cpu->pstate.turbo_pstate;
 	policy->cpuinfo.max_freq *= cpu->pstate.scaling;
Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -34,11 +34,6 @@
 
 static LIST_HEAD(cpufreq_policy_list);
 
-static inline bool policy_is_inactive(struct cpufreq_policy *policy)
-{
-	return cpumask_empty(policy->cpus);
-}
-
 /* Macros to iterate over CPU policies */
 #define for_each_suitable_policy(__policy, __active)			 \
 	list_for_each_entry(__policy, &cpufreq_policy_list, policy_list) \
@@ -260,7 +255,7 @@ EXPORT_SYMBOL_GPL(cpufreq_cpu_put);
  * cpufreq_cpu_release - Unlock a policy and decrement its usage counter.
  * @policy: cpufreq policy returned by cpufreq_cpu_acquire().
  */
-static void cpufreq_cpu_release(struct cpufreq_policy *policy)
+void cpufreq_cpu_release(struct cpufreq_policy *policy)
 {
 	if (WARN_ON(!policy))
 		return;
@@ -284,7 +279,7 @@ static void cpufreq_cpu_release(struct c
  * cpufreq_cpu_release() in order to release its rwsem and balance its usage
  * counter properly.
  */
-static struct cpufreq_policy *cpufreq_cpu_acquire(unsigned int cpu)
+struct cpufreq_policy *cpufreq_cpu_acquire(unsigned int cpu)
 {
 	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
 
@@ -720,9 +715,6 @@ static ssize_t show_scaling_cur_freq(str
 	return ret;
 }
 
-static int cpufreq_set_policy(struct cpufreq_policy *policy,
-				struct cpufreq_policy *new_policy);
-
 /**
  * cpufreq_per_cpu_attr_write() / store_##file_name() - sysfs write access
  */
@@ -2280,8 +2272,8 @@ EXPORT_SYMBOL(cpufreq_get_policy);
  *
  * The cpuinfo part of @policy is not updated by this function.
  */
-static int cpufreq_set_policy(struct cpufreq_policy *policy,
-				struct cpufreq_policy *new_policy)
+int cpufreq_set_policy(struct cpufreq_policy *policy,
+		       struct cpufreq_policy *new_policy)
 {
 	struct cpufreq_governor *old_gov;
 	int ret;
Index: linux-pm/include/linux/cpufreq.h
===================================================================
--- linux-pm.orig/include/linux/cpufreq.h
+++ linux-pm/include/linux/cpufreq.h
@@ -178,6 +178,11 @@ static inline struct cpufreq_policy *cpu
 static inline void cpufreq_cpu_put(struct cpufreq_policy *policy) { }
 #endif
 
+static inline bool policy_is_inactive(struct cpufreq_policy *policy)
+{
+	return cpumask_empty(policy->cpus);
+}
+
 static inline bool policy_is_shared(struct cpufreq_policy *policy)
 {
 	return cpumask_weight(policy->cpus) > 1;
@@ -193,7 +198,12 @@ unsigned int cpufreq_quick_get_max(unsig
 void disable_cpufreq(void);
 
 u64 get_cpu_idle_time(unsigned int cpu, u64 *wall, int io_busy);
+
+struct cpufreq_policy *cpufreq_cpu_acquire(unsigned int cpu);
+void cpufreq_cpu_release(struct cpufreq_policy *policy);
 int cpufreq_get_policy(struct cpufreq_policy *policy, unsigned int cpu);
+int cpufreq_set_policy(struct cpufreq_policy *policy,
+		       struct cpufreq_policy *new_policy);
 void cpufreq_update_policy(unsigned int cpu);
 void cpufreq_update_limits(unsigned int cpu);
 bool have_governor_per_policy(void);

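The limit arithmetic in intel_pstate_update_max_freq() from patch [3/3] is easy to check by hand. The sketch below isolates just the two min() steps as plain C; struct demo_limits, demo_min() and demo_clamp() are hypothetical names, and the frequencies mentioned in the note that follows are made-up kHz values rather than data from any real system.

```c
#include <assert.h>

/* Hypothetical container for the resulting policy limits (kHz). */
struct demo_limits {
	unsigned int min;
	unsigned int max;
};

static unsigned int demo_min(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Mirror the two assignments from the patch:
 *   new_policy.max = min(user_policy.max, cpuinfo.max_freq);
 *   new_policy.min = min(user_policy.min, new_policy.max);
 */
static struct demo_limits demo_clamp(unsigned int user_min,
				     unsigned int user_max,
				     unsigned int cpuinfo_max)
{
	struct demo_limits l;

	l.max = demo_min(user_max, cpuinfo_max);
	l.min = demo_min(user_min, l.max);
	return l;
}
```

With a user range of 800000..3900000 kHz, switching cpuinfo.max_freq from a turbo value of 3900000 to a non-turbo value of 2600000 pulls the maximum down to 2600000 and leaves the minimum alone. If the user minimum were 3000000, it would be lowered to 2600000 as well, so the range stays consistent (min <= max).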
On Tuesday 05 Mar 2019 at 11:42:06 (+0100), Rafael J. Wysocki wrote:
> +static void intel_pstate_update_max_freq(unsigned int cpu)
> +{
> +	struct cpufreq_policy *policy = cpufreq_cpu_acquire(cpu);
> +	struct cpufreq_policy new_policy;
> +	struct cpudata *cpudata;
> +
> +	if (!policy)
> +		return;
> +
> +	cpudata = all_cpu_data[cpu];
> +	policy->cpuinfo.max_freq = global.turbo_disabled_mf ?
> +			cpudata->pstate.max_freq : cpudata->pstate.turbo_freq;

I'm not too familiar with how the Intel turbo stuff works, so bear with me,
but is this 'pstate.turbo_freq' constant?

Why not just write it unconditionally into cpuinfo.max_freq? It's not
guaranteed to always be reachable anyways, no? So maybe that's OK to always
report that one regardless of the boost availability?

> +	memcpy(&new_policy, policy, sizeof(*policy));
> +	new_policy.max = min(policy->user_policy.max, policy->cpuinfo.max_freq);
> +	new_policy.min = min(policy->user_policy.min, new_policy.max);
> +
> +	cpufreq_set_policy(policy, &new_policy);
> +
> +	cpufreq_cpu_release(policy);
> +}

Thanks,
Quentin

On Tue, Mar 5, 2019 at 12:43 PM Quentin Perret <quentin.perret@arm.com> wrote:
>
> On Tuesday 05 Mar 2019 at 11:42:06 (+0100), Rafael J. Wysocki wrote:
> > +static void intel_pstate_update_max_freq(unsigned int cpu)
> > +{
> > +	struct cpufreq_policy *policy = cpufreq_cpu_acquire(cpu);
> > +	struct cpufreq_policy new_policy;
> > +	struct cpudata *cpudata;
> > +
> > +	if (!policy)
> > +		return;
> > +
> > +	cpudata = all_cpu_data[cpu];
> > +	policy->cpuinfo.max_freq = global.turbo_disabled_mf ?
> > +			cpudata->pstate.max_freq : cpudata->pstate.turbo_freq;
>
> I'm not too familiar with how the Intel turbo stuff works, so bear with me,
> but is this 'pstate.turbo_freq' constant?

Yes, it is.

> Why not just write it unconditionally into cpuinfo.max_freq? It's not
> guaranteed to always be reachable anyways, no? So maybe that's OK to always
> report that one regardless of the boost availability?

So the concern is that on some systems turbo is disabled permanently
by the platform FW and it doesn't make sense to even take
pstate.turbo_freq into consideration then.

On 05-03-19, 11:23, Rafael J. Wysocki wrote:
> Hi All,
>
> This is a follow-up to the RFT patch set posted previously:
> https://lore.kernel.org/lkml/9956076.F4luUDm1Dq@aspire.rjw.lan/
>
> Patch [1/3] causes intel_pstate to update all policies if it gets a _PPC change
> notification and sees a global turbo disable/enable change.
>
> Patch [2/3] adds cpufreq_cpu_acquire() and cpufreq_cpu_release() to reduce
> code duplication after the next patch a bit (and Srinivas wanted the rwsem
> manipulation to not be done directly by the driver).
>
> Patch [3/3] makes intel_pstate update cpuinfo.max_freq for all policies in
> those cases.
>
> I've added Tested-by tags to patches [1/3] and [3/3], because there are only
> cosmetic differences between them and what has been tested.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

--
viresh