linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] cpufreq: intel_pstate: Address some HWP-related oddities
@ 2020-08-20 16:35 Rafael J. Wysocki
  2020-08-20 16:36 ` [PATCH 1/4] cpufreq: intel_pstate: Refuse to turn off with HWP enabled Rafael J. Wysocki
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Rafael J. Wysocki @ 2020-08-20 16:35 UTC (permalink / raw)
  To: Linux PM, Srinivas Pandruvada; +Cc: LKML, Doug Smythies

Hi All,

The purpose of this series is to address some peculiarities related to
taking CPUs offline/online and switching between different operation
modes with HWP enabled that have become visible after allowing the
driver to work in the passive mode with HWP enabled in 5.9-rc1 (and
one that was there earlier, but can be addressed easily after the
changes madein 5.9-rc1).

Please refer to the patch changelogs for details.

For easier testing/review, the series is available from the git branch at:

 git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
 intel_pstate-testing

Thanks,
Rafael




^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/4] cpufreq: intel_pstate: Refuse to turn off with HWP enabled
  2020-08-20 16:35 [PATCH 0/4] cpufreq: intel_pstate: Address some HWP-related oddities Rafael J. Wysocki
@ 2020-08-20 16:36 ` Rafael J. Wysocki
  2020-08-20 16:37 ` [PATCH 2/4] cpufreq: intel_pstate: Always return last EPP value from sysfs Rafael J. Wysocki
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Rafael J. Wysocki @ 2020-08-20 16:36 UTC (permalink / raw)
  To: Linux PM; +Cc: Srinivas Pandruvada, LKML, Doug Smythies

From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>

After commit f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive
mode with HWP enabled") it is possible to change the driver status
to "off" via sysfs with HWP enabled, which effectively causes the
driver to unregister itself, but HWP remains active and it forces the
minimum performance, so even if another cpufreq driver is loaded,
it will not be able to control the CPU frequency.

For this reason, make the driver refuse to change the status to
"off" with HWP enabled.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/intel_pstate.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index e0220a6fbc69..bcda1e700a73 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -2716,9 +2716,15 @@ static int intel_pstate_update_status(const char *buf, size_t size)
 {
 	int ret;
 
-	if (size == 3 && !strncmp(buf, "off", size))
-		return intel_pstate_driver ?
-			intel_pstate_unregister_driver() : -EINVAL;
+	if (size == 3 && !strncmp(buf, "off", size)) {
+		if (!intel_pstate_driver)
+			return -EINVAL;
+
+		if (hwp_active)
+			return -EBUSY;
+
+		return intel_pstate_unregister_driver();
+	}
 
 	if (size == 6 && !strncmp(buf, "active", size)) {
 		if (intel_pstate_driver) {
-- 
2.26.2





^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/4] cpufreq: intel_pstate: Always return last EPP value from sysfs
  2020-08-20 16:35 [PATCH 0/4] cpufreq: intel_pstate: Address some HWP-related oddities Rafael J. Wysocki
  2020-08-20 16:36 ` [PATCH 1/4] cpufreq: intel_pstate: Refuse to turn off with HWP enabled Rafael J. Wysocki
@ 2020-08-20 16:37 ` Rafael J. Wysocki
  2020-08-20 16:38 ` [PATCH 3/4] cpufreq: intel_pstate: Add ->offline and ->online callbacks Rafael J. Wysocki
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Rafael J. Wysocki @ 2020-08-20 16:37 UTC (permalink / raw)
  To: Linux PM; +Cc: Srinivas Pandruvada, LKML, Doug Smythies

From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>

Make the energy_performance_preference policy attribute in sysfs
always return the last EPP value written to it instead of the one
currently in the HWP Request MSR to avoid possible confusion when
the performance scaling algorithm is used in the active mode with
HWP enabled (in which case the EPP is forced to 0 regardless of
what value it has been set to via sysfs).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/intel_pstate.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index bcda1e700a73..3d18934fa975 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -606,13 +606,10 @@ static const unsigned int epp_values[] = {
 
 static int intel_pstate_get_energy_pref_index(struct cpudata *cpu_data, int *raw_epp)
 {
-	s16 epp;
+	s16 epp = cpu_data->epp_cached;
 	int index = -EINVAL;
 
 	*raw_epp = 0;
-	epp = intel_pstate_get_epp(cpu_data, 0);
-	if (epp < 0)
-		return epp;
 
 	if (boot_cpu_has(X86_FEATURE_HWP_EPP)) {
 		if (epp == HWP_EPP_PERFORMANCE)
@@ -644,6 +641,8 @@ static int intel_pstate_get_energy_pref_index(struct cpudata *cpu_data, int *raw
 
 static int intel_pstate_set_epp(struct cpudata *cpu, u32 epp)
 {
+	int ret;
+
 	/*
 	 * Use the cached HWP Request MSR value, because in the active mode the
 	 * register itself may be updated by intel_pstate_hwp_boost_up() or
@@ -659,7 +658,11 @@ static int intel_pstate_set_epp(struct cpudata *cpu, u32 epp)
 	 * function, so it cannot run in parallel with the update below.
 	 */
 	WRITE_ONCE(cpu->hwp_req_cached, value);
-	return wrmsrl_on_cpu(cpu->cpu, MSR_HWP_REQUEST, value);
+	ret = wrmsrl_on_cpu(cpu->cpu, MSR_HWP_REQUEST, value);
+	if (!ret)
+		cpu->epp_cached = epp;
+
+	return ret;
 }
 
 static int intel_pstate_set_energy_pref_index(struct cpudata *cpu_data,
@@ -762,10 +765,8 @@ static ssize_t store_energy_performance_preference(
 			cpufreq_stop_governor(policy);
 			ret = intel_pstate_set_epp(cpu, epp);
 			err = cpufreq_start_governor(policy);
-			if (!ret) {
-				cpu->epp_cached = epp;
+			if (!ret)
 				ret = err;
-			}
 		}
 	}
 
@@ -2378,6 +2379,13 @@ static int intel_pstate_cpu_init(struct cpufreq_policy *policy)
 	 */
 	policy->policy = CPUFREQ_POLICY_POWERSAVE;
 
+
+	if (hwp_active) {
+		struct cpudata *cpu = all_cpu_data[policy->cpu];
+
+		cpu->epp_cached = intel_pstate_get_epp(cpu, 0);
+	}
+
 	return 0;
 }
 
@@ -2585,7 +2593,7 @@ static int intel_cpufreq_cpu_init(struct cpufreq_policy *policy)
 		policy->transition_delay_us = INTEL_CPUFREQ_TRANSITION_DELAY_HWP;
 		rdmsrl_on_cpu(cpu->cpu, MSR_HWP_REQUEST, &value);
 		WRITE_ONCE(cpu->hwp_req_cached, value);
-		cpu->epp_cached = (value & GENMASK_ULL(31, 24)) >> 24;
+		cpu->epp_cached = intel_pstate_get_epp(cpu, value);
 	} else {
 		turbo_max = cpu->pstate.turbo_pstate;
 		policy->transition_delay_us = INTEL_CPUFREQ_TRANSITION_DELAY;
-- 
2.26.2





^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/4] cpufreq: intel_pstate: Add ->offline and ->online callbacks
  2020-08-20 16:35 [PATCH 0/4] cpufreq: intel_pstate: Address some HWP-related oddities Rafael J. Wysocki
  2020-08-20 16:36 ` [PATCH 1/4] cpufreq: intel_pstate: Refuse to turn off with HWP enabled Rafael J. Wysocki
  2020-08-20 16:37 ` [PATCH 2/4] cpufreq: intel_pstate: Always return last EPP value from sysfs Rafael J. Wysocki
@ 2020-08-20 16:38 ` Rafael J. Wysocki
  2020-08-22  0:47   ` Doug Smythies
  2020-08-20 16:38 ` [PATCH 4/4] cpufreq: intel_pstate: Free memory only when turning off Rafael J. Wysocki
  2020-08-22  0:47 ` [PATCH 0/4] cpufreq: intel_pstate: Address some HWP-related oddities Doug Smythies
  4 siblings, 1 reply; 9+ messages in thread
From: Rafael J. Wysocki @ 2020-08-20 16:38 UTC (permalink / raw)
  To: Linux PM; +Cc: Srinivas Pandruvada, LKML, Doug Smythies

From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>

Add ->offline and ->online driver callbacksto to do the cleanup
before taking a CPU offline and to restore its working configuration
when it goes back online, respectively, to avoid invoking the ->init
callback on every CPU online which is quite a bit of unnecessary
overhead.

Define ->offline and ->online so that they can be used in the
passive as well as in the active mode and because ->offline will
do the majority of ->stop_cpu work, the passive mode does not
need that callback any more, so drop it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/intel_pstate.c | 38 ++++++++++++++++++++++++++++------
 1 file changed, 32 insertions(+), 6 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 3d18934fa975..aca0587b176f 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -2297,28 +2297,51 @@ static int intel_pstate_verify_policy(struct cpufreq_policy_data *policy)
 	return 0;
 }
 
-static void intel_cpufreq_stop_cpu(struct cpufreq_policy *policy)
+static int intel_pstate_cpu_offline(struct cpufreq_policy *policy)
 {
+	pr_debug("CPU %d going offline\n", policy->cpu);
+
+	intel_pstate_exit_perf_limits(policy);
+
+	/*
+	 * If the CPU is an SMT thread and it goes offline with the performance
+	 * settings different from the minimum, it will prevent its sibling
+	 * from getting to lower performance levels, so force the minimum
+	 * performance on CPU offline to prevent that form happening.
+	 */
 	if (hwp_active)
 		intel_pstate_hwp_force_min_perf(policy->cpu);
 	else
 		intel_pstate_set_min_pstate(all_cpu_data[policy->cpu]);
+
+	return 0;
+}
+
+static int intel_pstate_cpu_online(struct cpufreq_policy *policy)
+{
+	pr_debug("CPU %d going online\n", policy->cpu);
+
+	intel_pstate_init_acpi_perf_limits(policy);
+
+	if (hwp_active)
+		wrmsrl_on_cpu(policy->cpu, MSR_HWP_REQUEST,
+			      all_cpu_data[policy->cpu]->hwp_req_cached);
+
+	return 0;
 }
 
 static void intel_pstate_stop_cpu(struct cpufreq_policy *policy)
 {
-	pr_debug("CPU %d exiting\n", policy->cpu);
+	pr_debug("CPU %d stopping\n", policy->cpu);
 
 	intel_pstate_clear_update_util_hook(policy->cpu);
 	if (hwp_active)
 		intel_pstate_hwp_save_state(policy);
-
-	intel_cpufreq_stop_cpu(policy);
 }
 
 static int intel_pstate_cpu_exit(struct cpufreq_policy *policy)
 {
-	intel_pstate_exit_perf_limits(policy);
+	pr_debug("CPU %d exiting\n", policy->cpu);
 
 	policy->fast_switch_possible = false;
 
@@ -2398,6 +2421,8 @@ static struct cpufreq_driver intel_pstate = {
 	.init		= intel_pstate_cpu_init,
 	.exit		= intel_pstate_cpu_exit,
 	.stop_cpu	= intel_pstate_stop_cpu,
+	.offline	= intel_pstate_cpu_offline,
+	.online		= intel_pstate_cpu_online,
 	.update_limits	= intel_pstate_update_limits,
 	.name		= "intel_pstate",
 };
@@ -2652,7 +2677,8 @@ static struct cpufreq_driver intel_cpufreq = {
 	.fast_switch	= intel_cpufreq_fast_switch,
 	.init		= intel_cpufreq_cpu_init,
 	.exit		= intel_cpufreq_cpu_exit,
-	.stop_cpu	= intel_cpufreq_stop_cpu,
+	.offline	= intel_pstate_cpu_offline,
+	.online		= intel_pstate_cpu_online,
 	.update_limits	= intel_pstate_update_limits,
 	.name		= "intel_cpufreq",
 };
-- 
2.26.2





^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 4/4] cpufreq: intel_pstate: Free memory only when turning off
  2020-08-20 16:35 [PATCH 0/4] cpufreq: intel_pstate: Address some HWP-related oddities Rafael J. Wysocki
                   ` (2 preceding siblings ...)
  2020-08-20 16:38 ` [PATCH 3/4] cpufreq: intel_pstate: Add ->offline and ->online callbacks Rafael J. Wysocki
@ 2020-08-20 16:38 ` Rafael J. Wysocki
  2020-08-22  0:47 ` [PATCH 0/4] cpufreq: intel_pstate: Address some HWP-related oddities Doug Smythies
  4 siblings, 0 replies; 9+ messages in thread
From: Rafael J. Wysocki @ 2020-08-20 16:38 UTC (permalink / raw)
  To: Linux PM; +Cc: Srinivas Pandruvada, LKML, Doug Smythies

From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>

When intel_pstate switches the operation mode from "active" to
"passive" or the other way around, freeing its data structures
representing CPUs and allocating them again from scratch is not
necessary and wasteful.  Moreover, if these data structures are
preserved, the cached HWP Request MSR value from there may be
written to the MSR to start with to reinitialize it and help to
restore the EPP value set previously (it is set to 0xFF when CPUs
go offline to allow their SMT siblings to use the full range of
EPP values and that also happens when the driver gets unregistered).

Accordingly, modify the driver to only do a full cleanup on driver
object registration errors and when its status is changed to "off"
via sysfs and to write the cached HWP Request MSR value back to
the MSR on CPU init if the data structure representing the given
CPU is still there.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/intel_pstate.c | 51 +++++++++++++---------------------
 1 file changed, 19 insertions(+), 32 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index aca0587b176f..a7c6491f2b4a 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -2091,30 +2091,32 @@ static int intel_pstate_init_cpu(unsigned int cpunum)
 
 	cpu = all_cpu_data[cpunum];
 
-	if (!cpu) {
+	if (cpu) {
+		if (hwp_active)
+			wrmsrl_on_cpu(cpunum, MSR_HWP_REQUEST,
+				      cpu->hwp_req_cached);
+	} else {
 		cpu = kzalloc(sizeof(*cpu), GFP_KERNEL);
 		if (!cpu)
 			return -ENOMEM;
 
 		all_cpu_data[cpunum] = cpu;
 
+		cpu->cpu = cpunum;
+
 		cpu->epp_default = -EINVAL;
 		cpu->epp_powersave = -EINVAL;
 		cpu->epp_saved = -EINVAL;
-	}
-
-	cpu = all_cpu_data[cpunum];
 
-	cpu->cpu = cpunum;
+		if (hwp_active) {
+			const struct x86_cpu_id *id;
 
-	if (hwp_active) {
-		const struct x86_cpu_id *id;
+			intel_pstate_hwp_enable(cpu);
 
-		intel_pstate_hwp_enable(cpu);
-
-		id = x86_match_cpu(intel_pstate_hwp_boost_ids);
-		if (id && intel_pstate_acpi_pm_profile_server())
-			hwp_boost = true;
+			id = x86_match_cpu(intel_pstate_hwp_boost_ids);
+			if (id && intel_pstate_acpi_pm_profile_server())
+				hwp_boost = true;
+		}
 	}
 
 	intel_pstate_get_cpu_pstates(cpu);
@@ -2701,9 +2703,6 @@ static void intel_pstate_driver_cleanup(void)
 	}
 	put_online_cpus();
 
-	if (intel_pstate_driver == &intel_pstate)
-		intel_pstate_sysfs_hide_hwp_dynamic_boost();
-
 	intel_pstate_driver = NULL;
 }
 
@@ -2729,14 +2728,6 @@ static int intel_pstate_register_driver(struct cpufreq_driver *driver)
 	return 0;
 }
 
-static int intel_pstate_unregister_driver(void)
-{
-	cpufreq_unregister_driver(intel_pstate_driver);
-	intel_pstate_driver_cleanup();
-
-	return 0;
-}
-
 static ssize_t intel_pstate_show_status(char *buf)
 {
 	if (!intel_pstate_driver)
@@ -2748,8 +2739,6 @@ static ssize_t intel_pstate_show_status(char *buf)
 
 static int intel_pstate_update_status(const char *buf, size_t size)
 {
-	int ret;
-
 	if (size == 3 && !strncmp(buf, "off", size)) {
 		if (!intel_pstate_driver)
 			return -EINVAL;
@@ -2757,7 +2746,8 @@ static int intel_pstate_update_status(const char *buf, size_t size)
 		if (hwp_active)
 			return -EBUSY;
 
-		return intel_pstate_unregister_driver();
+		cpufreq_unregister_driver(intel_pstate_driver);
+		intel_pstate_driver_cleanup();
 	}
 
 	if (size == 6 && !strncmp(buf, "active", size)) {
@@ -2765,9 +2755,7 @@ static int intel_pstate_update_status(const char *buf, size_t size)
 			if (intel_pstate_driver == &intel_pstate)
 				return 0;
 
-			ret = intel_pstate_unregister_driver();
-			if (ret)
-				return ret;
+			cpufreq_unregister_driver(intel_pstate_driver);
 		}
 
 		return intel_pstate_register_driver(&intel_pstate);
@@ -2778,9 +2766,8 @@ static int intel_pstate_update_status(const char *buf, size_t size)
 			if (intel_pstate_driver == &intel_cpufreq)
 				return 0;
 
-			ret = intel_pstate_unregister_driver();
-			if (ret)
-				return ret;
+			cpufreq_unregister_driver(intel_pstate_driver);
+			intel_pstate_sysfs_hide_hwp_dynamic_boost();
 		}
 
 		return intel_pstate_register_driver(&intel_cpufreq);
-- 
2.26.2





^ permalink raw reply related	[flat|nested] 9+ messages in thread

* RE: [PATCH 0/4] cpufreq: intel_pstate: Address some HWP-related oddities
  2020-08-20 16:35 [PATCH 0/4] cpufreq: intel_pstate: Address some HWP-related oddities Rafael J. Wysocki
                   ` (3 preceding siblings ...)
  2020-08-20 16:38 ` [PATCH 4/4] cpufreq: intel_pstate: Free memory only when turning off Rafael J. Wysocki
@ 2020-08-22  0:47 ` Doug Smythies
  2020-08-24 14:06   ` Rafael J. Wysocki
  4 siblings, 1 reply; 9+ messages in thread
From: Doug Smythies @ 2020-08-22  0:47 UTC (permalink / raw)
  To: 'Rafael J. Wysocki', 'Srinivas Pandruvada'
  Cc: 'LKML', 'Linux PM'

Hi Rafael,

On 2020.08.20 09:35 Rafael J. Wysocki wrote:
> 
> The purpose of this series is to address some peculiarities related to
> taking CPUs offline/online and switching between different operation
> modes with HWP enabled that have become visible after allowing the
> driver to work in the passive mode with HWP enabled in 5.9-rc1 (and
> one that was there earlier, but can be addressed easily after the
> changes madein 5.9-rc1).
> 
> Please refer to the patch changelogs for details.
> 
> For easier testing/review, the series is available from the git branch at:
> 
>  git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
>  intel_pstate-testing

Thanks for that.

There still seems to be a problem with EPP getting messed up.
I have not been able to find the exact spot in the code.

One problem is that EPP can end up as 0, and thereafter stays
at 0. In sysfs terms, it ends up as "performance" and thereafter
stays as "performance". Meanwhile I never modified it, and it started
as "balance_performance".

It happens when changing from active to passive if the governor is performance.
If the governor is not performance things work as expected.

Another problem is that EPP will end up as 128 when changing from passive
to active. This erroneous condition is cleared by changing the governor to
powersave and back to performance. It also doesn't occur the first
time after boot, when booting to intel_cpufreq/performance/HWP.
(confused yet?) The sysfs value is O.K. during this.

Supporting data:
Example 1:

Grub:
GRUB_CMDLINE_LINUX_DEFAULT="ipv6.disable=1 consoleblank=450 msr.allow_writes=on cpuidle.governor=teo"

So I boot to intel_pstate/performance/HWP:

# /home/doug/c/msr-decoder (always edited for only relevant parts)
6.) 0x774: IA32_HWP_REQUEST:    CPU 0-5 :
    raw: 00002E2E : 00002E2E : 00002E2E : 00002E2E : 00002E2E : 00002E2E :
    epp:        0 :        0 :        0 :        0 :        0 :        0 :

# cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_driver
intel_pstate
# cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
performance
# cat  /sys/devices/system/cpu/cpufreq/policy3/energy_performance_preference
balance_performance

# echo passive > /sys/devices/system/cpu/intel_pstate/status

Note: the following results are incorrect:

# cat  /sys/devices/system/cpu/cpufreq/policy3/energy_performance_preference
performance
# echo "ondemand" | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
ondemand
# cat  /sys/devices/system/cpu/cpufreq/policy3/energy_performance_preference
performance
# /home/doug/c/msr-decoder
6.) 0x774: IA32_HWP_REQUEST:    CPU 0-5 :
    raw: 00002E08 : 00002E08 : 00002E08 : 00002E0B : 00002E13 : 00002E08 :
    epp:        0 :        0 :        0 :        0 :        0 :        0 :

# echo active > /sys/devices/system/cpu/intel_pstate/status
# cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
performance
# cat  /sys/devices/system/cpu/cpufreq/policy3/energy_performance_preference
performance
# echo "powersave" | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
powersave
# cat  /sys/devices/system/cpu/cpufreq/policy3/energy_performance_preference
performance
# cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
powersave
# cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_driver
intel_pstate

Example 2:
Grub:
GRUB_CMDLINE_LINUX_DEFAULT="ipv6.disable=1 consoleblank=450 intel_pstate=passive msr.allow_writes=on cpuidle.governor=teo"

So I boot to intel_cpufreq/performance/HWP:

# /home/doug/c/msr-decoder
6.) 0x774: IA32_HWP_REQUEST:    CPU 0-5 :
    raw: 80002E2E : 80002E2E : 80002E2E : 80002E2E : 80002E2E : 80002E2E :
    epp:      128 :      128 :      128 :      128 :      128 :      128 :

# cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_driver
intel_cpufreq
# cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
performance
# cat /sys/devices/system/cpu/cpufreq/policy3/energy_performance_preference
balance_performance
# echo active > /sys/devices/system/cpu/intel_pstate/status
# cat /sys/devices/system/cpu/cpufreq/policy3/energy_performance_preference
balance_performance
# cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
performance
# cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_driver
intel_pstate
# /home/doug/c/msr-decoder
6.) 0x774: IA32_HWP_REQUEST:    CPU 0-5 :
    raw: 00002E2E : 00002E2E : 00002E2E : 00002E2E : 00002E2E : 00002E2E :
    epp:        0 :        0 :        0 :        0 :        0 :        0 :

# echo passive > /sys/devices/system/cpu/intel_pstate/status
# /home/doug/c/msr-decoder
6.) 0x774: IA32_HWP_REQUEST:    CPU 0-5 :
    raw: 80002E2E : 80002E2E : 80002E2E : 80002E2E : 80002E2E : 80002E2E :
    epp:      128 :      128 :      128 :      128 :      128 :      128 :
# echo active > /sys/devices/system/cpu/intel_pstate/status

Note: the following results are incorrect:

# /home/doug/c/msr-decoder
6.) 0x774: IA32_HWP_REQUEST:    CPU 0-5 :
    raw: 80002E2E : 80002E2E : 80002E2E : 80002E2E : 80002E2E : 80002E2E :
    epp:      128 :      128 :      128 :      128 :      128 :      128 :
# cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
performance
# cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_driver
intel_pstate
# cat /sys/devices/system/cpu/cpufreq/policy3/energy_performance_preference
balance_performance

Note: But the problem can be cleared:

# echo "powersave" | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
powersave
# echo "performance" | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
performance
# /home/doug/c/msr-decoder
6.) 0x774: IA32_HWP_REQUEST:    CPU 0-5 :
    raw: 00002E2E : 00002E2E : 00002E2E : 00002E2E : 00002E2E : 00002E2E :
    epp:        0 :        0 :        0 :        0 :        0 :        0 :

... Doug



^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH 3/4] cpufreq: intel_pstate: Add ->offline and ->online callbacks
  2020-08-20 16:38 ` [PATCH 3/4] cpufreq: intel_pstate: Add ->offline and ->online callbacks Rafael J. Wysocki
@ 2020-08-22  0:47   ` Doug Smythies
  2020-08-24 13:40     ` [PATCH v2 " Rafael J. Wysocki
  0 siblings, 1 reply; 9+ messages in thread
From: Doug Smythies @ 2020-08-22  0:47 UTC (permalink / raw)
  To: 'Rafael J. Wysocki'
  Cc: 'Srinivas Pandruvada', 'LKML', 'Linux PM'

Hi Rafael,

Just annoying typo type feedback.

On 2020.08.20 09:38 Rafael J. Wysocki wrote:

> 
> From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
> 
> Add ->offline and ->online driver callbacksto to do the cleanup

"to to" and suggest this:

Add ->offline and ->online driver callbacks to cleanup

> before taking a CPU offline and to restore its working configuration
> when it goes back online, respectively, to avoid invoking the ->init
> callback on every CPU online which is quite a bit of unnecessary
> overhead.
> 
> Define ->offline and ->online so that they can be used in the
> passive as well as in the active mode and because ->offline will

  passive mode

> do the majority of ->stop_cpu work, the passive mode does not
> need that callback any more, so drop it.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpufreq/intel_pstate.c | 38 ++++++++++++++++++++++++++++------
>  1 file changed, 32 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index 3d18934fa975..aca0587b176f 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -2297,28 +2297,51 @@ static int intel_pstate_verify_policy(struct cpufreq_policy_data *policy)
>  	return 0;
>  }
> 
> -static void intel_cpufreq_stop_cpu(struct cpufreq_policy *policy)
> +static int intel_pstate_cpu_offline(struct cpufreq_policy *policy)
>  {
> +	pr_debug("CPU %d going offline\n", policy->cpu);
> +
> +	intel_pstate_exit_perf_limits(policy);
> +
> +	/*
> +	 * If the CPU is an SMT thread and it goes offline with the performance
> +	 * settings different from the minimum, it will prevent its sibling
> +	 * from getting to lower performance levels, so force the minimum
> +	 * performance on CPU offline to prevent that form happening.

form/from

> +	 */
>  	if (hwp_active)
>  		intel_pstate_hwp_force_min_perf(policy->cpu);
>  	else
>  		intel_pstate_set_min_pstate(all_cpu_data[policy->cpu]);
> +
> +	return 0;
> +}
> +
> +static int intel_pstate_cpu_online(struct cpufreq_policy *policy)
> +{
> +	pr_debug("CPU %d going online\n", policy->cpu);
> +
> +	intel_pstate_init_acpi_perf_limits(policy);
> +
> +	if (hwp_active)
> +		wrmsrl_on_cpu(policy->cpu, MSR_HWP_REQUEST,
> +			      all_cpu_data[policy->cpu]->hwp_req_cached);
> +
> +	return 0;
>  }
> 
>  static void intel_pstate_stop_cpu(struct cpufreq_policy *policy)
>  {
> -	pr_debug("CPU %d exiting\n", policy->cpu);
> +	pr_debug("CPU %d stopping\n", policy->cpu);
> 
>  	intel_pstate_clear_update_util_hook(policy->cpu);
>  	if (hwp_active)
>  		intel_pstate_hwp_save_state(policy);
> -
> -	intel_cpufreq_stop_cpu(policy);
>  }
> 
>  static int intel_pstate_cpu_exit(struct cpufreq_policy *policy)
>  {
> -	intel_pstate_exit_perf_limits(policy);
> +	pr_debug("CPU %d exiting\n", policy->cpu);
> 
>  	policy->fast_switch_possible = false;
> 
> @@ -2398,6 +2421,8 @@ static struct cpufreq_driver intel_pstate = {
>  	.init		= intel_pstate_cpu_init,
>  	.exit		= intel_pstate_cpu_exit,
>  	.stop_cpu	= intel_pstate_stop_cpu,
> +	.offline	= intel_pstate_cpu_offline,
> +	.online		= intel_pstate_cpu_online,
>  	.update_limits	= intel_pstate_update_limits,
>  	.name		= "intel_pstate",
>  };
> @@ -2652,7 +2677,8 @@ static struct cpufreq_driver intel_cpufreq = {
>  	.fast_switch	= intel_cpufreq_fast_switch,
>  	.init		= intel_cpufreq_cpu_init,
>  	.exit		= intel_cpufreq_cpu_exit,
> -	.stop_cpu	= intel_cpufreq_stop_cpu,
> +	.offline	= intel_pstate_cpu_offline,
> +	.online		= intel_pstate_cpu_online,
>  	.update_limits	= intel_pstate_update_limits,
>  	.name		= "intel_cpufreq",
>  };
> --
> 2.26.2
> 
> 



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2 3/4] cpufreq: intel_pstate: Add ->offline and ->online callbacks
  2020-08-22  0:47   ` Doug Smythies
@ 2020-08-24 13:40     ` Rafael J. Wysocki
  0 siblings, 0 replies; 9+ messages in thread
From: Rafael J. Wysocki @ 2020-08-24 13:40 UTC (permalink / raw)
  To: Doug Smythies
  Cc: 'Srinivas Pandruvada', 'LKML', 'Linux PM'

On Saturday, August 22, 2020 2:47:16 AM CEST Doug Smythies wrote:
> Hi Rafael,
> 
> Just annoying typo type feedback.

Thanks!

Please find a corrected patch below.

Cheers!

---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: [PATCH] cpufreq: intel_pstate: Add ->offline and ->online callbacks

Add ->offline and ->online driver callbacks to prepare for taking a
CPU offline and to restore its working configuration when it goes
back online, respectively, to avoid invoking the ->init callback on
every CPU online which is quite a bit of unnecessary overhead.

Define ->offline and ->online so that they can be used in the
passive mode as well as in the active mode and because ->offline
will do the majority of ->stop_cpu work, the passive mode does
not need that callback any more, so drop it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/intel_pstate.c |   38 ++++++++++++++++++++++++++++++++------
 1 file changed, 32 insertions(+), 6 deletions(-)

Index: linux-pm/drivers/cpufreq/intel_pstate.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/intel_pstate.c
+++ linux-pm/drivers/cpufreq/intel_pstate.c
@@ -2297,28 +2297,51 @@ static int intel_pstate_verify_policy(st
 	return 0;
 }
 
-static void intel_cpufreq_stop_cpu(struct cpufreq_policy *policy)
+static int intel_pstate_cpu_offline(struct cpufreq_policy *policy)
 {
+	pr_debug("CPU %d going offline\n", policy->cpu);
+
+	intel_pstate_exit_perf_limits(policy);
+
+	/*
+	 * If the CPU is an SMT thread and it goes offline with the performance
+	 * settings different from the minimum, it will prevent its sibling
+	 * from getting to lower performance levels, so force the minimum
+	 * performance on CPU offline to prevent that from happening.
+	 */
 	if (hwp_active)
 		intel_pstate_hwp_force_min_perf(policy->cpu);
 	else
 		intel_pstate_set_min_pstate(all_cpu_data[policy->cpu]);
+
+	return 0;
+}
+
+static int intel_pstate_cpu_online(struct cpufreq_policy *policy)
+{
+	pr_debug("CPU %d going online\n", policy->cpu);
+
+	intel_pstate_init_acpi_perf_limits(policy);
+
+	if (hwp_active)
+		wrmsrl_on_cpu(policy->cpu, MSR_HWP_REQUEST,
+			      all_cpu_data[policy->cpu]->hwp_req_cached);
+
+	return 0;
 }
 
 static void intel_pstate_stop_cpu(struct cpufreq_policy *policy)
 {
-	pr_debug("CPU %d exiting\n", policy->cpu);
+	pr_debug("CPU %d stopping\n", policy->cpu);
 
 	intel_pstate_clear_update_util_hook(policy->cpu);
 	if (hwp_active)
 		intel_pstate_hwp_save_state(policy);
-
-	intel_cpufreq_stop_cpu(policy);
 }
 
 static int intel_pstate_cpu_exit(struct cpufreq_policy *policy)
 {
-	intel_pstate_exit_perf_limits(policy);
+	pr_debug("CPU %d exiting\n", policy->cpu);
 
 	policy->fast_switch_possible = false;
 
@@ -2398,6 +2421,8 @@ static struct cpufreq_driver intel_pstat
 	.init		= intel_pstate_cpu_init,
 	.exit		= intel_pstate_cpu_exit,
 	.stop_cpu	= intel_pstate_stop_cpu,
+	.offline	= intel_pstate_cpu_offline,
+	.online		= intel_pstate_cpu_online,
 	.update_limits	= intel_pstate_update_limits,
 	.name		= "intel_pstate",
 };
@@ -2652,7 +2677,8 @@ static struct cpufreq_driver intel_cpufr
 	.fast_switch	= intel_cpufreq_fast_switch,
 	.init		= intel_cpufreq_cpu_init,
 	.exit		= intel_cpufreq_cpu_exit,
-	.stop_cpu	= intel_cpufreq_stop_cpu,
+	.offline	= intel_pstate_cpu_offline,
+	.online		= intel_pstate_cpu_online,
 	.update_limits	= intel_pstate_update_limits,
 	.name		= "intel_cpufreq",
 };




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/4] cpufreq: intel_pstate: Address some HWP-related oddities
  2020-08-22  0:47 ` [PATCH 0/4] cpufreq: intel_pstate: Address some HWP-related oddities Doug Smythies
@ 2020-08-24 14:06   ` Rafael J. Wysocki
  0 siblings, 0 replies; 9+ messages in thread
From: Rafael J. Wysocki @ 2020-08-24 14:06 UTC (permalink / raw)
  To: Doug Smythies
  Cc: 'Srinivas Pandruvada', 'LKML', 'Linux PM'

On Saturday, August 22, 2020 2:47:05 AM CEST Doug Smythies wrote:
> Hi Rafael,
> 
> On 2020.08.20 09:35 Rafael J. Wysocki wrote:
> > 
> > The purpose of this series is to address some peculiarities related to
> > taking CPUs offline/online and switching between different operation
> > modes with HWP enabled that have become visible after allowing the
> > driver to work in the passive mode with HWP enabled in 5.9-rc1 (and
> > one that was there earlier, but can be addressed easily after the
> > changes madein 5.9-rc1).
> > 
> > Please refer to the patch changelogs for details.
> > 
> > For easier testing/review, the series is available from the git branch at:
> > 
> >  git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
> >  intel_pstate-testing
> 
> Thanks for that.
> 
> There still seems to be a problem with EPP getting messed up.
> I have not been able to find the exact spot in the code.
> 
> One problem is that EPP can end up as 0, and thereafter stays
> at 0. In sysfs terms, it ends up as "performance" and thereafter
> stays as "performance". Meanwhile I never modified it, and it started
> as "balance_performance".
> 
> It happens when changing from active to passive if the governor is performance.
> If the governor is not performance things work as expected.

One change is missing from the patches in the $subject series and IMO it
doesn't belong to any of them, so please find it appended below (on top
of the $subject series).

With it applied, this particular issue should go away.

> Another problem is that EPP will end up as 128 when changing from passive
> to active.

I don't seem to be able to reproduce this (at least not without involving
system-wide suspend/resume).

Cheers!

---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: [PATCH] cpufreq: intel_pstate: Restore cached EPP value during offline

Because hwp_req_cached contains the effective EPP value (0) when the
"performance" scaling algorithm is used in the active mode, replace
it with the cached EPP value during CPU offline to prevent it from
being used (unexpectedly) after switching over from the active mode
to the passive mode.

Also rename intel_pstate_hwp_force_min_perf() because it will do more
than just forcing the minimum performance now.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/intel_pstate.c |   19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

Index: linux-pm/drivers/cpufreq/intel_pstate.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/intel_pstate.c
+++ linux-pm/drivers/cpufreq/intel_pstate.c
@@ -904,12 +904,23 @@ skip_epp:
 	wrmsrl_on_cpu(cpu, MSR_HWP_REQUEST, value);
 }
 
-static void intel_pstate_hwp_force_min_perf(int cpu)
+static void intel_pstate_hwp_offline(int cpu)
 {
-	u64 value;
+	struct cpudata *cpudata = all_cpu_data[cpu];
+	u64 value = READ_ONCE(cpudata->hwp_req_cached);
 	int min_perf;
 
-	value = all_cpu_data[cpu]->hwp_req_cached;
+	if (boot_cpu_has(X86_FEATURE_HWP_EPP)) {
+		/*
+		 * In case the EPP has been set to "performance" by the
+		 * active mode "performance" scaling algorithm, replace that
+		 * temporary value with the cached EPP one.
+		 */
+		value &= ~GENMASK_ULL(31, 24);
+		value |= HWP_ENERGY_PERF_PREFERENCE(cpudata->epp_cached);
+		WRITE_ONCE(cpudata->hwp_req_cached, value);
+	}
+
 	value &= ~GENMASK_ULL(31, 0);
 	min_perf = HWP_LOWEST_PERF(all_cpu_data[cpu]->hwp_cap_cached);
 
@@ -2312,7 +2323,7 @@ static int intel_pstate_cpu_offline(stru
 	 * performance on CPU offline to prevent that from happening.
 	 */
 	if (hwp_active)
-		intel_pstate_hwp_force_min_perf(policy->cpu);
+		intel_pstate_hwp_offline(policy->cpu);
 	else
 		intel_pstate_set_min_pstate(all_cpu_data[policy->cpu]);
 




^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-08-24 14:06 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-20 16:35 [PATCH 0/4] cpufreq: intel_pstate: Address some HWP-related oddities Rafael J. Wysocki
2020-08-20 16:36 ` [PATCH 1/4] cpufreq: intel_pstate: Refuse to turn off with HWP enabled Rafael J. Wysocki
2020-08-20 16:37 ` [PATCH 2/4] cpufreq: intel_pstate: Always return last EPP value from sysfs Rafael J. Wysocki
2020-08-20 16:38 ` [PATCH 3/4] cpufreq: intel_pstate: Add ->offline and ->online callbacks Rafael J. Wysocki
2020-08-22  0:47   ` Doug Smythies
2020-08-24 13:40     ` [PATCH v2 " Rafael J. Wysocki
2020-08-20 16:38 ` [PATCH 4/4] cpufreq: intel_pstate: Free memory only when turning off Rafael J. Wysocki
2020-08-22  0:47 ` [PATCH 0/4] cpufreq: intel_pstate: Address some HWP-related oddities Doug Smythies
2020-08-24 14:06   ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).