* [PATCH 0/2] cpufreq: intel_pstate: Two fixes related to limis @ 2017-02-28 13:13 Rafael J. Wysocki 2017-02-28 13:14 ` [PATCH 1/2] cpufreq: intel_pstate: Fix global settings in active mode Rafael J. Wysocki ` (2 more replies) 0 siblings, 3 replies; 14+ messages in thread From: Rafael J. Wysocki @ 2017-02-28 13:13 UTC (permalink / raw) To: Linux PM; +Cc: Srinivas Pandruvada, LKML Hi, Both fix some utter confusion in the sysfs interface. The first one makes intel_pstate_update_policies() end up with limits set to the same value that it was set to initially. The second one avoids reinitializing limits in intel_pstate_set_policy() when the moon phase is right and the wind blows from a specific direction. Thanks, Rafael ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/2] cpufreq: intel_pstate: Fix global settings in active mode 2017-02-28 13:13 [PATCH 0/2] cpufreq: intel_pstate: Two fixes related to limis Rafael J. Wysocki @ 2017-02-28 13:14 ` Rafael J. Wysocki 2017-02-28 22:21 ` Rafael J. Wysocki 2017-02-28 13:16 ` [PATCH 2/2] cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy Rafael J. Wysocki 2017-02-28 23:00 ` [PATCH v2 0/3] cpufreq: intel_pstate: Fixes related to limis Rafael J. Wysocki 2 siblings, 1 reply; 14+ messages in thread From: Rafael J. Wysocki @ 2017-02-28 13:14 UTC (permalink / raw) To: Linux PM; +Cc: Srinivas Pandruvada, LKML From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Commit 111b8b3fe4fa (cpufreq: intel_pstate: Always keep all limits settings in sync) changed intel_pstate to invoke cpufreq_update_policy() for every registered CPU on global sysfs attributes updates, but that led to undesirable effects in the active mode if the "performance" P-state selection algorithm is configufred for one CPU and the "powersave" one is chosen for all of the other CPUs. Namely, in that case, the following is possible: # cd /sys/devices/system/cpu/ # cat intel_pstate/max_perf_pct 100 # cat intel_pstate/min_perf_pct 26 # echo performance > cpufreq/policy0/scaling_governor # cat intel_pstate/max_perf_pct 100 # cat intel_pstate/min_perf_pct 100 # echo 94 > intel_pstate/min_perf_pct # cat intel_pstate/min_perf_pct 26 The reason why this happens is because intel_pstate attempts to maintain two sets of global limits in the active mode, one for the "performance" P-state selection algorithm and one for the "powersave" P-state selection algorithm, but the P-state selection algorithms are set per policy, so the global limits cannot reflect all of them at the same time if they are different for different policies. In the particular situation above, the attempt to change min_perf_pct to 94 caused cpufreq_update_policy() to be run for a CPU with the "powersave" P-state selection algorithm and intel_pstate_set_policy() called by it silently switched the global limits to the "powersave" set which finally was reflected by the sysfs interface. To prevent that from happening, modify intel_pstate_update_policies() to always switch back to the set of limits that was used right before it has been invoked. Fixes: 111b8b3fe4fa (cpufreq: intel_pstate: Always keep all limits settings in sync) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> --- drivers/cpufreq/intel_pstate.c | 3 +++ 1 file changed, 3 insertions(+) Index: linux-pm/drivers/cpufreq/intel_pstate.c =================================================================== --- linux-pm.orig/drivers/cpufreq/intel_pstate.c +++ linux-pm/drivers/cpufreq/intel_pstate.c @@ -974,10 +974,13 @@ static int intel_pstate_resume(struct cp static void intel_pstate_update_policies(void) { + struct perf_limits *last_limits = limits; int cpu; for_each_possible_cpu(cpu) cpufreq_update_policy(cpu); + + limits = last_limits; } /************************** debugfs begin ************************/ ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] cpufreq: intel_pstate: Fix global settings in active mode 2017-02-28 13:14 ` [PATCH 1/2] cpufreq: intel_pstate: Fix global settings in active mode Rafael J. Wysocki @ 2017-02-28 22:21 ` Rafael J. Wysocki 0 siblings, 0 replies; 14+ messages in thread From: Rafael J. Wysocki @ 2017-02-28 22:21 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Linux PM, Srinivas Pandruvada, LKML On Tue, Feb 28, 2017 at 2:14 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > Commit 111b8b3fe4fa (cpufreq: intel_pstate: Always keep all > limits settings in sync) changed intel_pstate to invoke > cpufreq_update_policy() for every registered CPU on global sysfs > attributes updates, but that led to undesirable effects in the > active mode if the "performance" P-state selection algorithm is > configufred for one CPU and the "powersave" one is chosen for > all of the other CPUs. > > Namely, in that case, the following is possible: > > # cd /sys/devices/system/cpu/ > # cat intel_pstate/max_perf_pct > 100 > # cat intel_pstate/min_perf_pct > 26 > # echo performance > cpufreq/policy0/scaling_governor > # cat intel_pstate/max_perf_pct > 100 > # cat intel_pstate/min_perf_pct > 100 > # echo 94 > intel_pstate/min_perf_pct > # cat intel_pstate/min_perf_pct > 26 > > The reason why this happens is because intel_pstate attempts to > maintain two sets of global limits in the active mode, one for > the "performance" P-state selection algorithm and one for the > "powersave" P-state selection algorithm, but the P-state selection > algorithms are set per policy, so the global limits cannot reflect > all of them at the same time if they are different for different > policies. > > In the particular situation above, the attempt to change > min_perf_pct to 94 caused cpufreq_update_policy() to be run > for a CPU with the "powersave" P-state selection algorithm > and intel_pstate_set_policy() called by it silently switched the > global limits to the "powersave" set which finally was reflected > by the sysfs interface. > > To prevent that from happening, modify intel_pstate_update_policies() > to always switch back to the set of limits that was used right before > it has been invoked. Scratch this, it's too racy. I'll send a new version shortly. Thanks, Rafael ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 2/2] cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy 2017-02-28 13:13 [PATCH 0/2] cpufreq: intel_pstate: Two fixes related to limis Rafael J. Wysocki 2017-02-28 13:14 ` [PATCH 1/2] cpufreq: intel_pstate: Fix global settings in active mode Rafael J. Wysocki @ 2017-02-28 13:16 ` Rafael J. Wysocki 2017-02-28 23:00 ` [PATCH v2 0/3] cpufreq: intel_pstate: Fixes related to limis Rafael J. Wysocki 2 siblings, 0 replies; 14+ messages in thread From: Rafael J. Wysocki @ 2017-02-28 13:16 UTC (permalink / raw) To: Linux PM; +Cc: Srinivas Pandruvada, LKML From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> If the current P-state selection algorithm is set to "performance" in intel_pstate_set_policy(), the limits may be initialized from scratch, but only if no_turbo is not set and the maximum frequency allowed for the given CPU (i.e. the policy object representing it) is at least equal to the max frequency supported by the CPU. In all of the other cases, the limits will not be updated. For example, the following can happen: # cat intel_pstate/status active # echo performance > cpufreq/policy0/scaling_governor # cat intel_pstate/min_perf_pct 100 # echo 94 > intel_pstate/min_perf_pct # cat intel_pstate/min_perf_pct 100 # cat cpufreq/policy0/scaling_max_freq 3100000 echo 3000000 > cpufreq/policy0/scaling_max_freq # cat intel_pstate/min_perf_pct 94 # echo 95 > intel_pstate/min_perf_pct # cat intel_pstate/min_perf_pct 95 That is confusing for two reasons. First, the initial attempt to change min_perf_pct to 94 seems to have no effect, even though setting the global limits should always work. Second, after changing scaling_max_freq for policy0 the global min_perf_pct attribute shows 94, even though it should have not been affected by that operation in principle. Moreover, the final attempt to change min_perf_pct to 95 worked as expected, because scaling_max_freq for the only policy with scaling_governor equal to "performance" was different from the maximum at that time. To make all that confusion go away, modify intel_pstate_set_policy() so that it doesn't reinitialize the limits at all. At the same time, change intel_pstate_set_performance_limits() to set min_sysfs_pct to 100 in the "performance" limits set so that switching the P-state selection algorithm to "performance" causes intel_pstate/min_perf_pct in sysfs to go to 100 (or whatever value min_sysfs_pct in the "performance" limits is set to later). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> --- drivers/cpufreq/intel_pstate.c | 23 +++++++++-------------- 1 file changed, 9 insertions(+), 14 deletions(-) Index: linux-pm/drivers/cpufreq/intel_pstate.c =================================================================== --- linux-pm.orig/drivers/cpufreq/intel_pstate.c +++ linux-pm/drivers/cpufreq/intel_pstate.c @@ -382,6 +382,7 @@ static void intel_pstate_set_performance intel_pstate_init_limits(limits); limits->min_perf_pct = 100; limits->min_perf = int_ext_tofp(1); + limits->min_sysfs_pct = 100; } static DEFINE_MUTEX(intel_pstate_driver_lock); @@ -2140,16 +2141,11 @@ static int intel_pstate_set_policy(struc mutex_lock(&intel_pstate_limits_lock); if (policy->policy == CPUFREQ_POLICY_PERFORMANCE) { + pr_debug("set performance\n"); if (!perf_limits) { limits = &performance_limits; perf_limits = limits; } - if (policy->max >= policy->cpuinfo.max_freq && - !limits->no_turbo) { - pr_debug("set performance\n"); - intel_pstate_set_performance_limits(perf_limits); - goto out; - } } else { pr_debug("set powersave\n"); if (!perf_limits) { @@ -2160,7 +2156,7 @@ static int intel_pstate_set_policy(struc } intel_pstate_update_perf_limits(policy, perf_limits); - out: + if (cpu->policy == CPUFREQ_POLICY_PERFORMANCE) { /* * NOHZ_FULL CPUs need this as the governor callback may not ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2 0/3] cpufreq: intel_pstate: Fixes related to limis 2017-02-28 13:13 [PATCH 0/2] cpufreq: intel_pstate: Two fixes related to limis Rafael J. Wysocki 2017-02-28 13:14 ` [PATCH 1/2] cpufreq: intel_pstate: Fix global settings in active mode Rafael J. Wysocki 2017-02-28 13:16 ` [PATCH 2/2] cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy Rafael J. Wysocki @ 2017-02-28 23:00 ` Rafael J. Wysocki 2017-02-28 23:07 ` [PATCH v2 1/3] cpufreq: intel_pstate: Fix global settings in active mode Rafael J. Wysocki ` (2 more replies) 2 siblings, 3 replies; 14+ messages in thread From: Rafael J. Wysocki @ 2017-02-28 23:00 UTC (permalink / raw) To: Linux PM; +Cc: Srinivas Pandruvada, LKML Hi, On Tuesday, February 28, 2017 02:13:30 PM Rafael J. Wysocki wrote: > Hi, > > Both fix some utter confusion in the sysfs interface. > > The first one makes intel_pstate_update_policies() end up with limits set to > the same value that it was set to initially. > > The second one avoids reinitializing limits in intel_pstate_set_policy() when > the moon phase is right and the wind blows from a specific direction. Resending an updated series, because patch [1/3] required a rework and I found one more problem (addressed by patch [3/3]). Thanks, Rafael ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2 1/3] cpufreq: intel_pstate: Fix global settings in active mode 2017-02-28 23:00 ` [PATCH v2 0/3] cpufreq: intel_pstate: Fixes related to limis Rafael J. Wysocki @ 2017-02-28 23:07 ` Rafael J. Wysocki 2017-02-28 23:09 ` [PATCH v2 2/3] cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy Rafael J. Wysocki 2017-02-28 23:11 ` [PATCH v2 3/3] cpufreq: intel_pstate: Fix intel_pstate_verify_policy() Rafael J. Wysocki 2 siblings, 0 replies; 14+ messages in thread From: Rafael J. Wysocki @ 2017-02-28 23:07 UTC (permalink / raw) To: Linux PM; +Cc: Srinivas Pandruvada, LKML From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Commit 111b8b3fe4fa (cpufreq: intel_pstate: Always keep all limits settings in sync) changed intel_pstate to invoke cpufreq_update_policy() for every registered CPU on global sysfs attributes updates, but that led to undesirable effects in the active mode if the "performance" P-state selection algorithm is configufred for one CPU and the "powersave" one is chosen for all of the other CPUs. Namely, in that case, the following is possible: # cd /sys/devices/system/cpu/ # cat intel_pstate/max_perf_pct 100 # cat intel_pstate/min_perf_pct 26 # echo performance > cpufreq/policy0/scaling_governor # cat intel_pstate/max_perf_pct 100 # cat intel_pstate/min_perf_pct 100 # echo 94 > intel_pstate/min_perf_pct # cat intel_pstate/min_perf_pct 26 The reason why this happens is because intel_pstate attempts to maintain two sets of global limits in the active mode, one for the "performance" P-state selection algorithm and one for the "powersave" P-state selection algorithm, but the P-state selection algorithms are set per policy, so the global limits cannot reflect all of them at the same time if they are different for different policies. In the particular situation above, the attempt to change min_perf_pct to 94 caused cpufreq_update_policy() to be run for a CPU with the "powersave" P-state selection algorithm and intel_pstate_set_policy() called by it silently switched the global limits to the "powersave" set which finally was reflected by the sysfs interface. To prevent that from happening, modify intel_pstate_update_policies() to always switch back to the set of limits that was used right before it has been invoked. Fixes: 111b8b3fe4fa (cpufreq: intel_pstate: Always keep all limits settings in sync) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> --- -> v2: Save and restore the limits value in intel_pstate_update_policies() under intel_pstate_limits_lock or otherwise (a) it may change in the middle of an update from a concurrent thread and (b) it may not point to the set of limits that has just been updated any more when read outside of the lock. --- drivers/cpufreq/intel_pstate.c | 21 +++++++++++++++------ 1 file changed, 15 insertions(+), 6 deletions(-) Index: linux-pm/drivers/cpufreq/intel_pstate.c =================================================================== --- linux-pm.orig/drivers/cpufreq/intel_pstate.c +++ linux-pm/drivers/cpufreq/intel_pstate.c @@ -973,11 +973,20 @@ static int intel_pstate_resume(struct cp } static void intel_pstate_update_policies(void) + __releases(&intel_pstate_limits_lock) + __acquires(&intel_pstate_limits_lock) { + struct perf_limits *saved_limits = limits; int cpu; + mutex_unlock(&intel_pstate_limits_lock); + for_each_possible_cpu(cpu) cpufreq_update_policy(cpu); + + mutex_lock(&intel_pstate_limits_lock); + + limits = saved_limits; } /************************** debugfs begin ************************/ @@ -1185,10 +1194,10 @@ static ssize_t store_no_turbo(struct kob limits->no_turbo = clamp_t(int, input, 0, 1); - mutex_unlock(&intel_pstate_limits_lock); - intel_pstate_update_policies(); + mutex_unlock(&intel_pstate_limits_lock); + mutex_unlock(&intel_pstate_driver_lock); return count; @@ -1222,10 +1231,10 @@ static ssize_t store_max_perf_pct(struct limits->max_perf_pct); limits->max_perf = div_ext_fp(limits->max_perf_pct, 100); - mutex_unlock(&intel_pstate_limits_lock); - intel_pstate_update_policies(); + mutex_unlock(&intel_pstate_limits_lock); + mutex_unlock(&intel_pstate_driver_lock); return count; @@ -1259,10 +1268,10 @@ static ssize_t store_min_perf_pct(struct limits->min_perf_pct); limits->min_perf = div_ext_fp(limits->min_perf_pct, 100); - mutex_unlock(&intel_pstate_limits_lock); - intel_pstate_update_policies(); + mutex_unlock(&intel_pstate_limits_lock); + mutex_unlock(&intel_pstate_driver_lock); return count; ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2 2/3] cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy 2017-02-28 23:00 ` [PATCH v2 0/3] cpufreq: intel_pstate: Fixes related to limis Rafael J. Wysocki 2017-02-28 23:07 ` [PATCH v2 1/3] cpufreq: intel_pstate: Fix global settings in active mode Rafael J. Wysocki @ 2017-02-28 23:09 ` Rafael J. Wysocki 2017-03-02 17:18 ` Rafael J. Wysocki ` (3 more replies) 2017-02-28 23:11 ` [PATCH v2 3/3] cpufreq: intel_pstate: Fix intel_pstate_verify_policy() Rafael J. Wysocki 2 siblings, 4 replies; 14+ messages in thread From: Rafael J. Wysocki @ 2017-02-28 23:09 UTC (permalink / raw) To: Linux PM; +Cc: Srinivas Pandruvada, LKML From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> If the current P-state selection algorithm is set to "performance" in intel_pstate_set_policy(), the limits may be initialized from scratch, but only if no_turbo is not set and the maximum frequency allowed for the given CPU (i.e. the policy object representing it) is at least equal to the max frequency supported by the CPU. In all of the other cases, the limits will not be updated. For example, the following can happen: # cat intel_pstate/status active # echo performance > cpufreq/policy0/scaling_governor # cat intel_pstate/min_perf_pct 100 # echo 94 > intel_pstate/min_perf_pct # cat intel_pstate/min_perf_pct 100 # cat cpufreq/policy0/scaling_max_freq 3100000 echo 3000000 > cpufreq/policy0/scaling_max_freq # cat intel_pstate/min_perf_pct 94 # echo 95 > intel_pstate/min_perf_pct # cat intel_pstate/min_perf_pct 95 That is confusing for two reasons. First, the initial attempt to change min_perf_pct to 94 seems to have no effect, even though setting the global limits should always work. Second, after changing scaling_max_freq for policy0 the global min_perf_pct attribute shows 94, even though it should have not been affected by that operation in principle. Moreover, the final attempt to change min_perf_pct to 95 worked as expected, because scaling_max_freq for the only policy with scaling_governor equal to "performance" was different from the maximum at that time. To make all that confusion go away, modify intel_pstate_set_policy() so that it doesn't reinitialize the limits at all. At the same time, change intel_pstate_set_performance_limits() to set min_sysfs_pct to 100 in the "performance" limits set so that switching the P-state selection algorithm to "performance" causes intel_pstate/min_perf_pct in sysfs to go to 100 (or whatever value min_sysfs_pct in the "performance" limits is set to later). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> --- -> v2: No changes --- drivers/cpufreq/intel_pstate.c | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) Index: linux-pm/drivers/cpufreq/intel_pstate.c =================================================================== --- linux-pm.orig/drivers/cpufreq/intel_pstate.c +++ linux-pm/drivers/cpufreq/intel_pstate.c @@ -382,6 +382,7 @@ static void intel_pstate_set_performance intel_pstate_init_limits(limits); limits->min_perf_pct = 100; limits->min_perf = int_ext_tofp(1); + limits->min_sysfs_pct = 100; } static DEFINE_MUTEX(intel_pstate_driver_lock); @@ -2146,16 +2147,11 @@ static int intel_pstate_set_policy(struc mutex_lock(&intel_pstate_limits_lock); if (policy->policy == CPUFREQ_POLICY_PERFORMANCE) { + pr_debug("set performance\n"); if (!perf_limits) { limits = &performance_limits; perf_limits = limits; } - if (policy->max >= policy->cpuinfo.max_freq && - !limits->no_turbo) { - pr_debug("set performance\n"); - intel_pstate_set_performance_limits(perf_limits); - goto out; - } } else { pr_debug("set powersave\n"); if (!perf_limits) { @@ -2166,7 +2162,7 @@ static int intel_pstate_set_policy(struc } intel_pstate_update_perf_limits(policy, perf_limits); - out: + if (cpu->policy == CPUFREQ_POLICY_PERFORMANCE) { /* * NOHZ_FULL CPUs need this as the governor callback may not ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 2/3] cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy 2017-02-28 23:09 ` [PATCH v2 2/3] cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy Rafael J. Wysocki @ 2017-03-02 17:18 ` Rafael J. Wysocki 2017-03-02 17:22 ` Rafael J. Wysocki 2017-03-02 22:29 ` [Update][PATCH v3 " Rafael J. Wysocki ` (2 subsequent siblings) 3 siblings, 1 reply; 14+ messages in thread From: Rafael J. Wysocki @ 2017-03-02 17:18 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Linux PM, Srinivas Pandruvada, LKML On Wed, Mar 1, 2017 at 12:09 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > If the current P-state selection algorithm is set to "performance" > in intel_pstate_set_policy(), the limits may be initialized from > scratch, but only if no_turbo is not set and the maximum frequency > allowed for the given CPU (i.e. the policy object representing it) > is at least equal to the max frequency supported by the CPU. In all > of the other cases, the limits will not be updated. > > For example, the following can happen: > > # cat intel_pstate/status > active > # echo performance > cpufreq/policy0/scaling_governor > # cat intel_pstate/min_perf_pct > 100 > # echo 94 > intel_pstate/min_perf_pct > # cat intel_pstate/min_perf_pct > 100 > # cat cpufreq/policy0/scaling_max_freq > 3100000 > echo 3000000 > cpufreq/policy0/scaling_max_freq > # cat intel_pstate/min_perf_pct > 94 > # echo 95 > intel_pstate/min_perf_pct > # cat intel_pstate/min_perf_pct > 95 > > That is confusing for two reasons. First, the initial attempt to > change min_perf_pct to 94 seems to have no effect, even though > setting the global limits should always work. Second, after > changing scaling_max_freq for policy0 the global min_perf_pct > attribute shows 94, even though it should have not been affected > by that operation in principle. > > Moreover, the final attempt to change min_perf_pct to 95 worked > as expected, because scaling_max_freq for the only policy with > scaling_governor equal to "performance" was different from the > maximum at that time. > > To make all that confusion go away, modify intel_pstate_set_policy() > so that it doesn't reinitialize the limits at all. > > At the same time, change intel_pstate_set_performance_limits() to > set min_sysfs_pct to 100 in the "performance" limits set so that > switching the P-state selection algorithm to "performance" causes > intel_pstate/min_perf_pct in sysfs to go to 100 (or whatever value > min_sysfs_pct in the "performance" limits is set to later). > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > --- > > -> v2: No changes > > --- > drivers/cpufreq/intel_pstate.c | 10 +++------- > 1 file changed, 3 insertions(+), 7 deletions(-) > > Index: linux-pm/drivers/cpufreq/intel_pstate.c > =================================================================== > --- linux-pm.orig/drivers/cpufreq/intel_pstate.c > +++ linux-pm/drivers/cpufreq/intel_pstate.c > @@ -382,6 +382,7 @@ static void intel_pstate_set_performance > intel_pstate_init_limits(limits); > limits->min_perf_pct = 100; > limits->min_perf = int_ext_tofp(1); > + limits->min_sysfs_pct = 100; This change breaks the per-CPU limits, so the patch is not correct. Withdrawing. Thanks, Rafael ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 2/3] cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy 2017-03-02 17:18 ` Rafael J. Wysocki @ 2017-03-02 17:22 ` Rafael J. Wysocki 0 siblings, 0 replies; 14+ messages in thread From: Rafael J. Wysocki @ 2017-03-02 17:22 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Rafael J. Wysocki, Linux PM, Srinivas Pandruvada, LKML On Thu, Mar 2, 2017 at 6:18 PM, Rafael J. Wysocki <rafael@kernel.org> wrote: > On Wed, Mar 1, 2017 at 12:09 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: >> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >> >> If the current P-state selection algorithm is set to "performance" >> in intel_pstate_set_policy(), the limits may be initialized from >> scratch, but only if no_turbo is not set and the maximum frequency >> allowed for the given CPU (i.e. the policy object representing it) >> is at least equal to the max frequency supported by the CPU. In all >> of the other cases, the limits will not be updated. >> >> For example, the following can happen: >> >> # cat intel_pstate/status >> active >> # echo performance > cpufreq/policy0/scaling_governor >> # cat intel_pstate/min_perf_pct >> 100 >> # echo 94 > intel_pstate/min_perf_pct >> # cat intel_pstate/min_perf_pct >> 100 >> # cat cpufreq/policy0/scaling_max_freq >> 3100000 >> echo 3000000 > cpufreq/policy0/scaling_max_freq >> # cat intel_pstate/min_perf_pct >> 94 >> # echo 95 > intel_pstate/min_perf_pct >> # cat intel_pstate/min_perf_pct >> 95 >> >> That is confusing for two reasons. First, the initial attempt to >> change min_perf_pct to 94 seems to have no effect, even though >> setting the global limits should always work. Second, after >> changing scaling_max_freq for policy0 the global min_perf_pct >> attribute shows 94, even though it should have not been affected >> by that operation in principle. >> >> Moreover, the final attempt to change min_perf_pct to 95 worked >> as expected, because scaling_max_freq for the only policy with >> scaling_governor equal to "performance" was different from the >> maximum at that time. >> >> To make all that confusion go away, modify intel_pstate_set_policy() >> so that it doesn't reinitialize the limits at all. >> >> At the same time, change intel_pstate_set_performance_limits() to >> set min_sysfs_pct to 100 in the "performance" limits set so that >> switching the P-state selection algorithm to "performance" causes >> intel_pstate/min_perf_pct in sysfs to go to 100 (or whatever value >> min_sysfs_pct in the "performance" limits is set to later). >> >> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >> --- >> >> -> v2: No changes >> >> --- >> drivers/cpufreq/intel_pstate.c | 10 +++------- >> 1 file changed, 3 insertions(+), 7 deletions(-) >> >> Index: linux-pm/drivers/cpufreq/intel_pstate.c >> =================================================================== >> --- linux-pm.orig/drivers/cpufreq/intel_pstate.c >> +++ linux-pm/drivers/cpufreq/intel_pstate.c >> @@ -382,6 +382,7 @@ static void intel_pstate_set_performance >> intel_pstate_init_limits(limits); >> limits->min_perf_pct = 100; >> limits->min_perf = int_ext_tofp(1); >> + limits->min_sysfs_pct = 100; > > This change breaks the per-CPU limits, so the patch is not correct. > > Withdrawing. Actually, it appears to be fixable, so I will send a new version later, most likely. Thanks, Rafael ^ permalink raw reply [flat|nested] 14+ messages in thread
* [Update][PATCH v3 2/3] cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy 2017-02-28 23:09 ` [PATCH v2 2/3] cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy Rafael J. Wysocki 2017-03-02 17:18 ` Rafael J. Wysocki @ 2017-03-02 22:29 ` Rafael J. Wysocki 2017-03-29 22:01 ` Doug Smythies 2017-03-29 22:16 ` Doug Smythies 3 siblings, 0 replies; 14+ messages in thread From: Rafael J. Wysocki @ 2017-03-02 22:29 UTC (permalink / raw) To: Linux PM; +Cc: Srinivas Pandruvada, LKML From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> If the current P-state selection algorithm is set to "performance" in intel_pstate_set_policy(), the limits may be initialized from scratch, but only if no_turbo is not set and the maximum frequency allowed for the given CPU (i.e. the policy object representing it) is at least equal to the max frequency supported by the CPU. In all of the other cases, the limits will not be updated. For example, the following can happen: # cat intel_pstate/status active # echo performance > cpufreq/policy0/scaling_governor # cat intel_pstate/min_perf_pct 100 # echo 94 > intel_pstate/min_perf_pct # cat intel_pstate/min_perf_pct 100 # cat cpufreq/policy0/scaling_max_freq 3100000 echo 3000000 > cpufreq/policy0/scaling_max_freq # cat intel_pstate/min_perf_pct 94 # echo 95 > intel_pstate/min_perf_pct # cat intel_pstate/min_perf_pct 95 That is confusing for two reasons. First, the initial attempt to change min_perf_pct to 94 seems to have no effect, even though setting the global limits should always work. Second, after changing scaling_max_freq for policy0 the global min_perf_pct attribute shows 94, even though it should have not been affected by that operation in principle. Moreover, the final attempt to change min_perf_pct to 95 worked as expected, because scaling_max_freq for the only policy with scaling_governor equal to "performance" was different from the maximum at that time. To make all that confusion go away, modify intel_pstate_set_policy() so that it doesn't reinitialize the limits at all. At the same time, change intel_pstate_set_performance_limits() to set min_sysfs_pct to 100 in the "performance" limits set so that switching the P-state selection algorithm to "performance" causes intel_pstate/min_perf_pct in sysfs to go to 100 (or whatever value min_sysfs_pct in the "performance" limits is set to later). That requires per-CPU limits to be initialized explicitly rather than by copying the global limits to avoid setting min_sysfs_pct in the per-CPU limits to 100. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> --- v2 -> v3: Change the initialization of the per-CPU limits to avoid setting min_sysfs_pct to 100 in them (which would break things). --- drivers/cpufreq/intel_pstate.c | 17 ++++------------- 1 file changed, 4 insertions(+), 13 deletions(-) Index: linux-pm/drivers/cpufreq/intel_pstate.c =================================================================== --- linux-pm.orig/drivers/cpufreq/intel_pstate.c +++ linux-pm/drivers/cpufreq/intel_pstate.c @@ -382,6 +382,7 @@ static void intel_pstate_set_performance intel_pstate_init_limits(limits); limits->min_perf_pct = 100; limits->min_perf = int_ext_tofp(1); + limits->min_sysfs_pct = 100; } static DEFINE_MUTEX(intel_pstate_driver_lock); @@ -2146,16 +2147,11 @@ static int intel_pstate_set_policy(struc mutex_lock(&intel_pstate_limits_lock); if (policy->policy == CPUFREQ_POLICY_PERFORMANCE) { + pr_debug("set performance\n"); if (!perf_limits) { limits = &performance_limits; perf_limits = limits; } - if (policy->max >= policy->cpuinfo.max_freq && - !limits->no_turbo) { - pr_debug("set performance\n"); - intel_pstate_set_performance_limits(perf_limits); - goto out; - } } else { pr_debug("set powersave\n"); if (!perf_limits) { @@ -2166,7 +2162,7 @@ static int intel_pstate_set_policy(struc } intel_pstate_update_perf_limits(policy, perf_limits); - out: + if (cpu->policy == CPUFREQ_POLICY_PERFORMANCE) { /* * NOHZ_FULL CPUs need this as the governor callback may not @@ -2257,13 +2253,8 @@ static int __intel_pstate_cpu_init(struc cpu = all_cpu_data[policy->cpu]; - /* - * We need sane value in the cpu->perf_limits, so inherit from global - * perf_limits limits, which are seeded with values based on the - * CONFIG_CPU_FREQ_DEFAULT_GOV_*, during boot up. - */ if (per_cpu_limits) - memcpy(cpu->perf_limits, limits, sizeof(struct perf_limits)); + intel_pstate_init_limits(cpu->perf_limits); policy->min = cpu->pstate.min_pstate * cpu->pstate.scaling; policy->max = cpu->pstate.turbo_pstate * cpu->pstate.scaling; ^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [Update][PATCH v3 2/3] cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy 2017-02-28 23:09 ` [PATCH v2 2/3] cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy Rafael J. Wysocki 2017-03-02 17:18 ` Rafael J. Wysocki 2017-03-02 22:29 ` [Update][PATCH v3 " Rafael J. Wysocki @ 2017-03-29 22:01 ` Doug Smythies 2017-03-29 22:16 ` Doug Smythies 3 siblings, 0 replies; 14+ messages in thread From: Doug Smythies @ 2017-03-29 22:01 UTC (permalink / raw) To: 'Rafael J. Wysocki' Cc: 'Srinivas Pandruvada', 'LKML', 'Linux PM' Hi Rafael, Sorry for the delay, but I didn't notice until today that this commit causes a regression, at least in my computer. I have not figured out exactly what is wrong, as I must admit I am finding these policy interactions difficult to follow. On 2017.03.02 14:29 Rafael J. Wysocki wrote: >From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > If the current P-state selection algorithm is set to "performance" > in intel_pstate_set_policy(), the limits may be initialized from ... [cut] ... Going back to kernel 4.11-rc1 I get this after boot**: $ uname -a Linux s15 4.11.0-rc1-stock #217 SMP Sun Mar 5 15:34:38 PST 2017 x86_64 x86_64 x86_64 GNU/Linux $ grep . /sys/devices/system/cpu/intel_pstate/* /sys/devices/system/cpu/intel_pstate/max_perf_pct:100 <<< Correct /sys/devices/system/cpu/intel_pstate/min_perf_pct:43 <<< Correct /sys/devices/system/cpu/intel_pstate/no_turbo:0 /sys/devices/system/cpu/intel_pstate/num_pstates:23 /sys/devices/system/cpu/intel_pstate/status:active /sys/devices/system/cpu/intel_pstate/turbo_pct:18 $ grep . /sys/devices/system/cpu/cpu0/cpufreq/* /sys/devices/system/cpu/cpu0/cpufreq/affected_cpus:0 grep: /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq: Permission denied /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq:3800000 <<< Correct /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq:1600000 <<< Correct /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_transition_latency:4294967295 /sys/devices/system/cpu/cpu0/cpufreq/related_cpus:0 /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors:performance powersave /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq:1600805 /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver:intel_pstate /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:powersave <<< Notice /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq:3800000 <<< Correct /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq:1600000 <<< Correct /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed:<unsupported> $ sudo rdmsr --bitfield 15:8 -d -a 0x199 <<< Requested P-States 16 16 17 17 16 16 16 16 Going back to kernel 4.11-rc2 I get this after boot**: ~$ uname -a Linux s15 4.11.0-rc2-stock #218 SMP Sun Mar 12 23:57:44 PDT 2017 x86_64 x86_64 x86_64 GNU/Linux $ grep . /sys/devices/system/cpu/intel_pstate/* /sys/devices/system/cpu/intel_pstate/max_perf_pct:100 <<< Correct /sys/devices/system/cpu/intel_pstate/min_perf_pct:100 <<< Incorrect /sys/devices/system/cpu/intel_pstate/no_turbo:0 /sys/devices/system/cpu/intel_pstate/num_pstates:23 /sys/devices/system/cpu/intel_pstate/status:active /sys/devices/system/cpu/intel_pstate/turbo_pct:18 $ grep . /sys/devices/system/cpu/cpu0/cpufreq/* /sys/devices/system/cpu/cpu0/cpufreq/affected_cpus:0 grep: /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq: Permission denied /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq:3800000 <<< Correct /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq:1600000 <<< Correct /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_transition_latency:4294967295 /sys/devices/system/cpu/cpu0/cpufreq/related_cpus:0 /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors:performance powersave /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq:1600805 /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver:intel_pstate /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:powersave <<< Notice /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq:3800000 <<< Correct /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq:3800000 <<< Incorrect /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed:<unsupported> $ sudo rdmsr --bitfield 15:8 -d -a 0x199 <<< Requested P-States 38 <<<< All are pinned, system is idle 38 38 38 38 38 38 38 **: After boot means > 1 minute after boot, because my distro (Ubtunu) starts up using the performance governor and then changes to powersave after 1 minute. ... Doug ^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [Update][PATCH v3 2/3] cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy 2017-02-28 23:09 ` [PATCH v2 2/3] cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy Rafael J. Wysocki ` (2 preceding siblings ...) 2017-03-29 22:01 ` Doug Smythies @ 2017-03-29 22:16 ` Doug Smythies 2017-03-29 22:17 ` Rafael J. Wysocki 3 siblings, 1 reply; 14+ messages in thread From: Doug Smythies @ 2017-03-29 22:16 UTC (permalink / raw) To: 'Rafael J. Wysocki' Cc: 'Srinivas Pandruvada', 'LKML', 'Linux PM', 'Doug Smythies' Hi Rafael, Disregard. I see this was fixed in kernel 4.11-rc4. While kernel 4.11-rc4 was where I thought I started from earlier, actually I was running 4.11-rc2 at the time. sorry for the noise. On 2017.03.29 15:01 Doug Smythies wrote: > Sorry for the delay, but I didn't notice until today that this > commit causes a regression, at least in my computer. > > I have not figured out exactly what is wrong, as I must > admit I am finding these policy interactions difficult > to follow. > > On 2017.03.02 14:29 Rafael J. Wysocki wrote: >>From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >> >> If the current P-state selection algorithm is set to "performance" >> in intel_pstate_set_policy(), the limits may be initialized from > > ... [cut] ... > > Going back to kernel 4.11-rc1 I get this after boot**: > > $ uname -a > Linux s15 4.11.0-rc1-stock #217 SMP Sun Mar 5 15:34:38 PST 2017 x86_64 x86_64 x86_64 GNU/Linux ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Update][PATCH v3 2/3] cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy 2017-03-29 22:16 ` Doug Smythies @ 2017-03-29 22:17 ` Rafael J. Wysocki 0 siblings, 0 replies; 14+ messages in thread From: Rafael J. Wysocki @ 2017-03-29 22:17 UTC (permalink / raw) To: Doug Smythies; +Cc: Rafael J. Wysocki, Srinivas Pandruvada, LKML, Linux PM On Thu, Mar 30, 2017 at 12:16 AM, Doug Smythies <dsmythies@telus.net> wrote: > Hi Rafael, > > Disregard. > I see this was fixed in kernel 4.11-rc4. > While kernel 4.11-rc4 was where I thought I started from earlier, > actually I was running 4.11-rc2 at the time. > > sorry for the noise. No worries. Thanks, Rafael ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2 3/3] cpufreq: intel_pstate: Fix intel_pstate_verify_policy() 2017-02-28 23:00 ` [PATCH v2 0/3] cpufreq: intel_pstate: Fixes related to limis Rafael J. Wysocki 2017-02-28 23:07 ` [PATCH v2 1/3] cpufreq: intel_pstate: Fix global settings in active mode Rafael J. Wysocki 2017-02-28 23:09 ` [PATCH v2 2/3] cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy Rafael J. Wysocki @ 2017-02-28 23:11 ` Rafael J. Wysocki 2 siblings, 0 replies; 14+ messages in thread From: Rafael J. Wysocki @ 2017-02-28 23:11 UTC (permalink / raw) To: Linux PM; +Cc: Srinivas Pandruvada, LKML From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> The code added to intel_pstate_verify_policy() by commit 1443ebbacfd7 (cpufreq: intel_pstate: Fix sysfs limits enforcement for performance policy) should use perf_limits instead of limits, because otherwise setting global limits via sysfs may affect policies inconsistently. For example, in the sequence of shell commands below, the scaling_min_freq attribute for policy1 and policy2 should be affected in the same way, because scaling_governor is set in the same way for both of them: # cat cpufreq/policy1/scaling_governor powersave # cat cpufreq/policy2/scaling_governor powersave # echo performance > cpufreq/policy0/scaling_governor # echo 94 > intel_pstate/min_perf_pct # cat cpufreq/policy0/scaling_min_freq 2914000 # cat cpufreq/policy1/scaling_min_freq 2914000 # cat cpufreq/policy2/scaling_min_freq 800000 The are affected differently, because intel_pstate_verify_policy() is invoked with limits set to &performance_limits (left behind by policy0) for policy1 and with limits set to &powersave_limits (left behind by policy1) for policy2. Since perf_limits is set to the set of limits matching the policy being updated, using it instead of limits fixes the inconsistency. Fixes: 1443ebbacfd7 (cpufreq: intel_pstate: Fix sysfs limits enforcement for performance policy) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> --- drivers/cpufreq/intel_pstate.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) Index: linux-pm/drivers/cpufreq/intel_pstate.c =================================================================== --- linux-pm.orig/drivers/cpufreq/intel_pstate.c +++ linux-pm/drivers/cpufreq/intel_pstate.c @@ -2208,9 +2208,9 @@ static int intel_pstate_verify_policy(st unsigned int max_freq, min_freq; max_freq = policy->cpuinfo.max_freq * - limits->max_sysfs_pct / 100; + perf_limits->max_sysfs_pct / 100; min_freq = policy->cpuinfo.max_freq * - limits->min_sysfs_pct / 100; + perf_limits->min_sysfs_pct / 100; cpufreq_verify_within_limits(policy, min_freq, max_freq); } ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2017-03-29 22:17 UTC | newest] Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-02-28 13:13 [PATCH 0/2] cpufreq: intel_pstate: Two fixes related to limis Rafael J. Wysocki 2017-02-28 13:14 ` [PATCH 1/2] cpufreq: intel_pstate: Fix global settings in active mode Rafael J. Wysocki 2017-02-28 22:21 ` Rafael J. Wysocki 2017-02-28 13:16 ` [PATCH 2/2] cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy Rafael J. Wysocki 2017-02-28 23:00 ` [PATCH v2 0/3] cpufreq: intel_pstate: Fixes related to limis Rafael J. Wysocki 2017-02-28 23:07 ` [PATCH v2 1/3] cpufreq: intel_pstate: Fix global settings in active mode Rafael J. Wysocki 2017-02-28 23:09 ` [PATCH v2 2/3] cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy Rafael J. Wysocki 2017-03-02 17:18 ` Rafael J. Wysocki 2017-03-02 17:22 ` Rafael J. Wysocki 2017-03-02 22:29 ` [Update][PATCH v3 " Rafael J. Wysocki 2017-03-29 22:01 ` Doug Smythies 2017-03-29 22:16 ` Doug Smythies 2017-03-29 22:17 ` Rafael J. Wysocki 2017-02-28 23:11 ` [PATCH v2 3/3] cpufreq: intel_pstate: Fix intel_pstate_verify_policy() Rafael J. Wysocki
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).