* [PATCH V7 0/7] amd-pstate preferred core
@ 2023-09-18  8:14 Meng Li
  2023-09-18  8:14 ` [PATCH V7 1/7] x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion Meng Li
                   ` (10 more replies)
  0 siblings, 11 replies; 19+ messages in thread
From: Meng Li @ 2023-09-18  8:14 UTC (permalink / raw)
  To: Rafael J . Wysocki, Huang Rui
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
	Viresh Kumar, Borislav Petkov, Meng Li

Hi all:

Core frequency is subject to process variation in semiconductors.
Not all cores can reach the maximum frequency within the
infrastructure limits. Consequently, AMD has redefined the concept of
the maximum frequency of a part: only a fraction of the cores can reach
it. To find the best process scheduling policy for a given scenario,
the OS needs to know the core ordering, which the platform reports through
the highest performance capability register of the CPPC interface.

Earlier implementations of amd-pstate preferred core supported only a static
core ranking and targeted performance. The driver can now change the
preferred core dynamically, based on workload and platform conditions and
accounting for thermals and aging.

The amd-pstate driver uses the functions and data structures provided by
the ITMT architecture to let the scheduler favor cores
which can reach a higher frequency at a lower voltage.
We call this amd-pstate preferred core.

sched_set_itmt_core_prio() is called to set the priorities and
sched_set_itmt_support() is called to enable the ITMT feature.
The amd-pstate driver uses the highest performance value to indicate
the priority of a CPU: a higher value means a higher priority.
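
To illustrate the ordering these priorities imply, here is a minimal
user-space sketch in C. It is not the scheduler's implementation and not
part of this series; struct core_rank and order_cores() are hypothetical
names chosen for the example. It only sorts cores so that the one with the
largest highest performance value comes first.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-core record; not a kernel structure. */
struct core_rank {
	unsigned int cpu;
	uint32_t highest_perf;	/* value reported through CPPC */
};

/* qsort comparator: larger highest_perf first, lower CPU number breaks ties. */
static int core_rank_cmp(const void *a, const void *b)
{
	const struct core_rank *x = a;
	const struct core_rank *y = b;

	if (x->highest_perf != y->highest_perf)
		return x->highest_perf > y->highest_perf ? -1 : 1;
	if (x->cpu != y->cpu)
		return x->cpu < y->cpu ? -1 : 1;
	return 0;
}

/* Order cores from most preferred to least preferred. */
static void order_cores(struct core_rank *cores, size_t n)
{
	qsort(cores, n, sizeof(*cores), core_rank_cmp);
}
```

With made-up sample values 166, 196, 181 and 196 for CPUs 0 to 3, the
resulting order is CPU 1, 3, 2, 0.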

The amd-pstate driver provides an initial core ordering at boot time.
It relies on the CPPC interface to communicate the core ranking to the
operating system and scheduler, so that the OS schedules processes on the
cores with the highest performance first. When the amd-pstate
driver receives a notification that the highest performance has changed, it
updates the core ranking.
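
The update step can be pictured with a small user-space sketch. It is a
simplification under stated assumptions: NR_CORES, ranking[] and
update_ranking() are hypothetical names, and the real driver keeps this
state per CPU in its own data structure. The point is only that the
priority is pushed to the scheduler when the reported value actually
differs from the cached one.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CORES 4	/* hypothetical core count for this sketch */

/* Last known highest performance value per core. */
static uint32_t ranking[NR_CORES];

/*
 * Record a newly reported highest performance value for @cpu.
 * Returns true when the cached ranking changed, which is the point at
 * which a driver would hand the new priority to the scheduler.
 */
static bool update_ranking(unsigned int cpu, uint32_t cur_high)
{
	if (cpu >= NR_CORES)
		return false;

	if (ranking[cpu] == cur_high)
		return false;

	ranking[cpu] = cur_high;
	return true;
}
```

A repeated notification carrying the same value is therefore a no-op in
this sketch.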

Changes from V6->V7:
- x86:
- - Modify kconfig about X86_AMD_PSTATE.
- cpufreq: amd-pstate:
- - modify incorrect comments about scheduler_work().
- - convert highest_perf data type.
- - modify preferred core init when cpu init and online.
- acpi: cppc:
- - modify link of CPPC highest performance.
- cpufreq:
- - modify link of CPPC highest performance changed.

Changes from V5->V6:
- cpufreq: amd-pstate:
- - modify the wrong tag order.
- - modify warning about hw_prefcore sysfs attribute.
- - delete duplicate comments.
- - modify the variable name cppc_highest_perf to prefcore_ranking.
- - modify judgment conditions for setting highest_perf.
- - modify sysfs attribute for CPPC highest perf to pr_debug message.
- Documentation: amd-pstate:
- - modify warning: title underline too short.

Changes from V4->V5:
- cpufreq: amd-pstate:
- - modify sysfs attribute for CPPC highest perf.
- - modify warning about comments.
- - rebase linux-next.
- cpufreq:
- - Modify warning about function declarations.
- Documentation: amd-pstate:
- - align with ``amd-pstate``.

Changes from V3->V4:
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.

Changes from V2->V3:
- x86:
- - Modify kconfig and description.
- cpufreq: amd-pstate: 
- - Add Co-developed-by tag in commit message.
- cpufreq:
- - Modify commit message.
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.

Changes from V1->V2:
- acpi: cppc:
- - Add reference link.
- cpufreq:
- - Modify link error.
- cpufreq: amd-pstate: 
- - Init the priorities of all online CPUs
- - Use a single variable to represent the status of preferred core.
- Documentation:
- - Default enabled preferred core.
- Documentation: amd-pstate: 
- - Modify inappropriate descriptions.
- - Default enabled preferred core.
- - Use a single variable to represent the status of preferred core.

Meng Li (7):
  x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion.
  acpi: cppc: Add get the highest performance cppc control
  cpufreq: amd-pstate: Enable amd-pstate preferred core supporting.
  cpufreq: Add a notification message that the highest perf has changed
  cpufreq: amd-pstate: Update amd-pstate preferred core ranking
    dynamically
  Documentation: amd-pstate: introduce amd-pstate preferred core
  Documentation: introduce amd-pstate preferrd core mode kernel command
    line options

 .../admin-guide/kernel-parameters.txt         |   5 +
 Documentation/admin-guide/pm/amd-pstate.rst   |  58 +++++-
 arch/x86/Kconfig                              |   5 +-
 drivers/acpi/cppc_acpi.c                      |  13 ++
 drivers/acpi/processor_driver.c               |   6 +
 drivers/cpufreq/amd-pstate.c                  | 197 ++++++++++++++++--
 drivers/cpufreq/cpufreq.c                     |  13 ++
 include/acpi/cppc_acpi.h                      |   5 +
 include/linux/amd-pstate.h                    |   6 +
 include/linux/cpufreq.h                       |   5 +
 10 files changed, 291 insertions(+), 22 deletions(-)

-- 
2.34.1



* [PATCH V7 1/7] x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion.
  2023-09-18  8:14 [PATCH V7 0/7] amd-pstate preferred core Meng Li
@ 2023-09-18  8:14 ` Meng Li
  2023-09-18  8:14 ` [PATCH V7 2/7] acpi: cppc: Add get the highest performance cppc control Meng Li
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Meng Li @ 2023-09-18  8:14 UTC (permalink / raw)
  To: Rafael J . Wysocki, Huang Rui
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
	Viresh Kumar, Borislav Petkov, Meng Li

The amd-pstate driver also uses SCHED_MC_PRIO, so drop the CPU_SUP_INTEL
requirement from its dependencies to allow compilation in kernels
without Intel CPU support.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Meng Li <li.meng@amd.com>
---
 arch/x86/Kconfig | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 982b777eadc7..c37ef2e6940b 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1054,8 +1054,9 @@ config SCHED_MC
 
 config SCHED_MC_PRIO
 	bool "CPU core priorities scheduler support"
-	depends on SCHED_MC && CPU_SUP_INTEL
-	select X86_INTEL_PSTATE
+	depends on SCHED_MC
+	select X86_INTEL_PSTATE if CPU_SUP_INTEL
+	select X86_AMD_PSTATE if CPU_SUP_AMD && ACPI
 	select CPU_FREQ
 	default y
 	help
-- 
2.34.1



* [PATCH V7 2/7] acpi: cppc: Add get the highest performance cppc control
  2023-09-18  8:14 [PATCH V7 0/7] amd-pstate preferred core Meng Li
  2023-09-18  8:14 ` [PATCH V7 1/7] x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion Meng Li
@ 2023-09-18  8:14 ` Meng Li
  2023-09-18  8:14 ` [PATCH V7 3/7] cpufreq: amd-pstate: Enable amd-pstate preferred core supporting Meng Li
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Meng Li @ 2023-09-18  8:14 UTC (permalink / raw)
  To: Rafael J . Wysocki, Huang Rui
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
	Viresh Kumar, Borislav Petkov, Meng Li, Wyes Karny

Add support for getting the highest performance to the
generic CPPC driver. This enables downstream drivers
such as amd-pstate to discover and use these values.

Please refer to the ACPI specification for details on CPPC
continuous performance control.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Meng Li <li.meng@amd.com>
Link: https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html?highlight=cppc#highest-performance
---
 drivers/acpi/cppc_acpi.c | 13 +++++++++++++
 include/acpi/cppc_acpi.h |  5 +++++
 2 files changed, 18 insertions(+)

diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index 7ff269a78c20..ad388a0e8484 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -1154,6 +1154,19 @@ int cppc_get_nominal_perf(int cpunum, u64 *nominal_perf)
 	return cppc_get_perf(cpunum, NOMINAL_PERF, nominal_perf);
 }
 
+/**
+ * cppc_get_highest_perf - Get the highest performance register value.
+ * @cpunum: CPU from which to get highest performance.
+ * @highest_perf: Return address.
+ *
+ * Return: 0 for success, -EIO otherwise.
+ */
+int cppc_get_highest_perf(int cpunum, u64 *highest_perf)
+{
+	return cppc_get_perf(cpunum, HIGHEST_PERF, highest_perf);
+}
+EXPORT_SYMBOL_GPL(cppc_get_highest_perf);
+
 /**
  * cppc_get_epp_perf - Get the epp register value.
  * @cpunum: CPU from which to get epp preference value.
diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
index 6126c977ece0..c0b69ffe7bdb 100644
--- a/include/acpi/cppc_acpi.h
+++ b/include/acpi/cppc_acpi.h
@@ -139,6 +139,7 @@ struct cppc_cpudata {
 #ifdef CONFIG_ACPI_CPPC_LIB
 extern int cppc_get_desired_perf(int cpunum, u64 *desired_perf);
 extern int cppc_get_nominal_perf(int cpunum, u64 *nominal_perf);
+extern int cppc_get_highest_perf(int cpunum, u64 *highest_perf);
 extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs);
 extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls);
 extern int cppc_set_enable(int cpu, bool enable);
@@ -165,6 +166,10 @@ static inline int cppc_get_nominal_perf(int cpunum, u64 *nominal_perf)
 {
 	return -ENOTSUPP;
 }
+static inline int cppc_get_highest_perf(int cpunum, u64 *highest_perf)
+{
+	return -ENOTSUPP;
+}
 static inline int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs)
 {
 	return -ENOTSUPP;
-- 
2.34.1



* [PATCH V7 3/7] cpufreq: amd-pstate: Enable amd-pstate preferred core supporting.
  2023-09-18  8:14 [PATCH V7 0/7] amd-pstate preferred core Meng Li
  2023-09-18  8:14 ` [PATCH V7 1/7] x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion Meng Li
  2023-09-18  8:14 ` [PATCH V7 2/7] acpi: cppc: Add get the highest performance cppc control Meng Li
@ 2023-09-18  8:14 ` Meng Li
  2023-09-20  2:43   ` Huang Rui
  2023-09-18  8:14 ` [PATCH V7 4/7] cpufreq: Add a notification message that the highest perf has changed Meng Li
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 19+ messages in thread
From: Meng Li @ 2023-09-18  8:14 UTC (permalink / raw)
  To: Rafael J . Wysocki, Huang Rui
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
	Viresh Kumar, Borislav Petkov, Meng Li

The amd-pstate driver uses the functions and data structures
provided by the ITMT architecture to let the scheduler
favor cores which can reach a higher frequency
at a lower voltage. We call this amd-pstate preferred core.

Here sched_set_itmt_core_prio() is called to set priorities and
sched_set_itmt_support() is called to enable ITMT feature.
amd-pstate driver uses the highest performance value to indicate
the priority of CPU. The higher value has a higher priority.

The initial core rankings are set up by amd-pstate when the
system boots.
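
As a reading aid for amd_pstate_init_prefcore() in the diff below, here is
a simplified user-space sketch of its decision logic. It is an
approximation, not the driver code: struct prefcore_state and
prefcore_observe() are hypothetical names, and the sketch returns a flag
where the driver queues a work item. A core reporting 255 is taken to mean
the feature is unsupported, and core-priority scheduling is only worth
enabling once two cores report different values.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_CPPC_PERF 255	/* mirrors AMD_PSTATE_MAX_CPPC_PERF in the patch */

/* Hypothetical bookkeeping for the values seen so far. */
struct prefcore_state {
	uint32_t max_seen;
	uint32_t min_seen;
	bool supported;	/* cleared once a core reports MAX_CPPC_PERF */
	bool differ;	/* set once two cores report different values */
};

static void prefcore_state_init(struct prefcore_state *s)
{
	s->max_seen = 0;
	s->min_seen = UINT32_MAX;
	s->supported = true;
	s->differ = false;
}

/*
 * Feed one core's highest performance value into the state.
 * Returns true when the cores seen so far differ, i.e. when enabling
 * core-priority scheduling would make sense in this sketch.
 */
static bool prefcore_observe(struct prefcore_state *s, uint32_t highest_perf)
{
	if (!s->supported)
		return false;

	if (highest_perf == MAX_CPPC_PERF) {
		s->supported = false;
		return false;
	}

	if (highest_perf > s->max_seen)
		s->max_seen = highest_perf;
	if (highest_perf < s->min_seen)
		s->min_seen = highest_perf;
	if (s->max_seen > s->min_seen)
		s->differ = true;

	return s->differ;
}
```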

Add a device attribute for hardware preferred core. It checks
whether the processor and power firmware support the preferred core
feature.

Add a device attribute for preferred core. It can be set to enabled
only when the hardware supports preferred core and the user sets
`enabled` in the early parameter.

Add a new early parameter, `amd_prefcore=disable`, to allow the user
to disable preferred core.
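
The parameter handling in the diff below boils down to a single string
comparison. A minimal user-space sketch, with hypothetical names
(prefcore_enabled, parse_prefcore_param) standing in for the driver's
static flag and its early_param handler:

```c
#include <stdbool.h>
#include <string.h>

/* Enabled by default, as in the patch. */
static bool prefcore_enabled = true;

/*
 * Handle the value passed to the parameter. Only "disable" has an
 * effect; any other value leaves the default untouched.
 */
static int parse_prefcore_param(const char *str)
{
	if (str && strcmp(str, "disable") == 0)
		prefcore_enabled = false;

	return 0;
}
```

The NULL check is an addition for the sketch so that it can be called
safely outside the kernel's parameter parsing.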

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Co-developed-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Meng Li <li.meng@amd.com>
---
 drivers/cpufreq/amd-pstate.c | 163 +++++++++++++++++++++++++++++++----
 1 file changed, 147 insertions(+), 16 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 9a1e194d5cf8..050e23594057 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -37,6 +37,7 @@
 #include <linux/uaccess.h>
 #include <linux/static_call.h>
 #include <linux/amd-pstate.h>
+#include <linux/topology.h>
 
 #include <acpi/processor.h>
 #include <acpi/cppc_acpi.h>
@@ -49,6 +50,8 @@
 
 #define AMD_PSTATE_TRANSITION_LATENCY	20000
 #define AMD_PSTATE_TRANSITION_DELAY	1000
+#define AMD_PSTATE_PREFCORE_THRESHOLD	166
+#define AMD_PSTATE_MAX_CPPC_PERF	255
 
 /*
  * TODO: We need more time to fine tune processors with shared memory solution
@@ -65,6 +68,12 @@ static struct cpufreq_driver amd_pstate_epp_driver;
 static int cppc_state = AMD_PSTATE_UNDEFINED;
 static bool cppc_enabled;
 
+/* HW preferred core feature is supported */
+static bool hw_prefcore = true;
+
+/* Preferred core feature is supported */
+static bool prefcore = true;
+
 /*
  * AMD Energy Preference Performance (EPP)
  * The EPP is used in the CCLK DPM controller to drive
@@ -290,23 +299,21 @@ static inline int amd_pstate_enable(bool enable)
 static int pstate_init_perf(struct amd_cpudata *cpudata)
 {
 	u64 cap1;
-	u32 highest_perf;
 
 	int ret = rdmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1,
 				     &cap1);
 	if (ret)
 		return ret;
 
-	/*
-	 * TODO: Introduce AMD specific power feature.
-	 *
-	 * CPPC entry doesn't indicate the highest performance in some ASICs.
+	/* For platforms that do not support the preferred core feature, the
+	 * highest_perf may be configured with 166 or 255. To avoid calculating
+	 * the max frequency wrongly, take the AMD_CPPC_HIGHEST_PERF(cap1) value
+	 * as the default max perf.
 	 */
-	highest_perf = amd_get_highest_perf();
-	if (highest_perf > AMD_CPPC_HIGHEST_PERF(cap1))
-		highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
-
-	WRITE_ONCE(cpudata->highest_perf, highest_perf);
+	if (hw_prefcore)
+		WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
+	else
+		WRITE_ONCE(cpudata->highest_perf, AMD_CPPC_HIGHEST_PERF(cap1));
 
 	WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1));
 	WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1));
@@ -318,17 +325,15 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
 static int cppc_init_perf(struct amd_cpudata *cpudata)
 {
 	struct cppc_perf_caps cppc_perf;
-	u32 highest_perf;
 
 	int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
 	if (ret)
 		return ret;
 
-	highest_perf = amd_get_highest_perf();
-	if (highest_perf > cppc_perf.highest_perf)
-		highest_perf = cppc_perf.highest_perf;
-
-	WRITE_ONCE(cpudata->highest_perf, highest_perf);
+	if (hw_prefcore)
+		WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
+	else
+		WRITE_ONCE(cpudata->highest_perf, cppc_perf.highest_perf);
 
 	WRITE_ONCE(cpudata->nominal_perf, cppc_perf.nominal_perf);
 	WRITE_ONCE(cpudata->lowest_nonlinear_perf,
@@ -676,6 +681,90 @@ static void amd_perf_ctl_reset(unsigned int cpu)
 	wrmsrl_on_cpu(cpu, MSR_AMD_PERF_CTL, 0);
 }
 
+/*
+ * Enabling amd-pstate preferred core can't be done directly from cpufreq callbacks
+ * due to locking, so queue the work for later.
+ */
+static void amd_pstste_sched_prefcore_workfn(struct work_struct *work)
+{
+	sched_set_itmt_support();
+}
+static DECLARE_WORK(sched_prefcore_work, amd_pstste_sched_prefcore_workfn);
+
+/*
+ * Get the highest performance register value.
+ * @cpu: CPU from which to get highest performance.
+ * @highest_perf: Return address.
+ *
+ * Return: 0 for success, -EIO otherwise.
+ */
+static int amd_pstate_get_highest_perf(int cpu, u32 *highest_perf)
+{
+	int ret;
+	u64 cppc_highest_perf;
+
+	if (boot_cpu_has(X86_FEATURE_CPPC)) {
+		u64 cap1;
+
+		ret = rdmsrl_safe_on_cpu(cpu, MSR_AMD_CPPC_CAP1, &cap1);
+		if (ret)
+			return ret;
+		WRITE_ONCE(*highest_perf, AMD_CPPC_HIGHEST_PERF(cap1));
+	} else {
+		ret = cppc_get_highest_perf(cpu, &cppc_highest_perf);
+		*highest_perf = (u32)(cppc_highest_perf & 0xFFFF);
+	}
+
+	return (ret);
+}
+
+static void amd_pstate_init_prefcore(unsigned int cpu)
+{
+	int ret;
+	u32 highest_perf;
+	static u32 max_highest_perf = 0, min_highest_perf = U32_MAX;
+
+	if (!prefcore)
+		return;
+
+	ret = amd_pstate_get_highest_perf(cpu, &highest_perf);
+	if (ret)
+		return;
+
+	/*
+	 * The priorities can be set regardless of whether or not
+	 * sched_set_itmt_support(true) has been called and it is valid to
+	 * update them at any time after it has been called.
+	 */
+	sched_set_itmt_core_prio(highest_perf, cpu);
+
+	/* check if CPPC preferred core feature is enabled*/
+	if (highest_perf == AMD_PSTATE_MAX_CPPC_PERF) {
+		pr_debug("AMD CPPC preferred core is unsupported!\n");
+		hw_prefcore = false;
+		prefcore = false;
+		return;
+	}
+
+	if (max_highest_perf <= min_highest_perf) {
+		if (highest_perf > max_highest_perf)
+			max_highest_perf = highest_perf;
+
+		if (highest_perf < min_highest_perf)
+			min_highest_perf = highest_perf;
+
+		if (max_highest_perf > min_highest_perf) {
+			/*
+			 * This code can be run during CPU online under the
+			 * CPU hotplug locks, so sched_set_itmt_support()
+			 * cannot be called from here.  Queue up a work item
+			 * to invoke it.
+			 */
+			schedule_work(&sched_prefcore_work);
+		}
+	}
+}
+
 static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
 {
 	int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
@@ -697,6 +786,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
 
 	cpudata->cpu = policy->cpu;
 
+	amd_pstate_init_prefcore(policy->cpu);
+
 	ret = amd_pstate_init_perf(cpudata);
 	if (ret)
 		goto free_cpudata1;
@@ -763,6 +854,22 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
 	return ret;
 }
 
+static int amd_pstate_cpu_online(struct cpufreq_policy *policy)
+{
+	struct amd_cpudata *cpudata = policy->driver_data;
+
+	pr_debug("CPU %d going online\n", cpudata->cpu);
+
+	amd_pstate_init_prefcore(cpudata->cpu);
+
+	return 0;
+}
+
+static int amd_pstate_cpu_offline(struct cpufreq_policy *policy)
+{
+	return 0;
+}
+
 static int amd_pstate_cpu_exit(struct cpufreq_policy *policy)
 {
 	struct amd_cpudata *cpudata = policy->driver_data;
@@ -1037,6 +1144,12 @@ static ssize_t status_store(struct device *a, struct device_attribute *b,
 	return ret < 0 ? ret : count;
 }
 
+static ssize_t prefcore_show(struct device *dev,
+			     struct device_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf, "%s\n", prefcore ? "enabled" : "disabled");
+}
+
 cpufreq_freq_attr_ro(amd_pstate_max_freq);
 cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
 
@@ -1044,6 +1157,7 @@ cpufreq_freq_attr_ro(amd_pstate_highest_perf);
 cpufreq_freq_attr_rw(energy_performance_preference);
 cpufreq_freq_attr_ro(energy_performance_available_preferences);
 static DEVICE_ATTR_RW(status);
+static DEVICE_ATTR_RO(prefcore);
 
 static struct freq_attr *amd_pstate_attr[] = {
 	&amd_pstate_max_freq,
@@ -1063,6 +1177,7 @@ static struct freq_attr *amd_pstate_epp_attr[] = {
 
 static struct attribute *pstate_global_attributes[] = {
 	&dev_attr_status.attr,
+	&dev_attr_prefcore.attr,
 	NULL
 };
 
@@ -1114,6 +1229,8 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
 	cpudata->cpu = policy->cpu;
 	cpudata->epp_policy = 0;
 
+	amd_pstate_init_prefcore(policy->cpu);
+
 	ret = amd_pstate_init_perf(cpudata);
 	if (ret)
 		goto free_cpudata1;
@@ -1285,6 +1402,8 @@ static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
 
 	pr_debug("AMD CPU Core %d going online\n", cpudata->cpu);
 
+	amd_pstate_init_prefcore(cpudata->cpu);
+
 	if (cppc_state == AMD_PSTATE_ACTIVE) {
 		amd_pstate_epp_reenable(cpudata);
 		cpudata->suspended = false;
@@ -1389,6 +1508,8 @@ static struct cpufreq_driver amd_pstate_driver = {
 	.fast_switch    = amd_pstate_fast_switch,
 	.init		= amd_pstate_cpu_init,
 	.exit		= amd_pstate_cpu_exit,
+	.offline	= amd_pstate_cpu_offline,
+	.online		= amd_pstate_cpu_online,
 	.suspend	= amd_pstate_cpu_suspend,
 	.resume		= amd_pstate_cpu_resume,
 	.set_boost	= amd_pstate_set_boost,
@@ -1527,7 +1648,17 @@ static int __init amd_pstate_param(char *str)
 
 	return amd_pstate_set_driver(mode_idx);
 }
+
+static int __init amd_prefcore_param(char *str)
+{
+	if (!strcmp(str, "disable"))
+		prefcore = false;
+
+	return 0;
+}
+
 early_param("amd_pstate", amd_pstate_param);
+early_param("amd_prefcore", amd_prefcore_param);
 
 MODULE_AUTHOR("Huang Rui <ray.huang@amd.com>");
 MODULE_DESCRIPTION("AMD Processor P-state Frequency Driver");
-- 
2.34.1



* [PATCH V7 4/7] cpufreq: Add a notification message that the highest perf has changed
  2023-09-18  8:14 [PATCH V7 0/7] amd-pstate preferred core Meng Li
                   ` (2 preceding siblings ...)
  2023-09-18  8:14 ` [PATCH V7 3/7] cpufreq: amd-pstate: Enable amd-pstate preferred core supporting Meng Li
@ 2023-09-18  8:14 ` Meng Li
  2023-09-18  8:14 ` [PATCH V7 5/7] cpufreq: amd-pstate: Update amd-pstate preferred core ranking dynamically Meng Li
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Meng Li @ 2023-09-18  8:14 UTC (permalink / raw)
  To: Rafael J . Wysocki, Huang Rui
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
	Viresh Kumar, Borislav Petkov, Meng Li

ACPI 6.5 section 8.4.6.1.1.1 specifies that Notify event 0x85 can be
emitted to cause the OSPM to re-evaluate the highest performance
register. Add support for this event.
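
The shape of the change is an event dispatch that forwards to an optional
driver callback. The following user-space sketch shows that pattern only;
struct driver_ops, dispatch_notify() and record_update() are hypothetical
names, and the real code paths are acpi_processor_notify() and
cpufreq_update_highest_perf() in the diff below.

```c
#include <stddef.h>
#include <stdint.h>

#define NOTIFY_HIGHEST_PERF_CHANGED 0x85	/* event value from the text above */

/* Hypothetical driver operations table with one optional callback. */
struct driver_ops {
	void (*update_highest_perf)(unsigned int cpu);
};

/* Example callback that only records what it was asked to update. */
static unsigned int last_updated_cpu;
static int update_calls;

static void record_update(unsigned int cpu)
{
	last_updated_cpu = cpu;
	update_calls++;
}

/*
 * Dispatch a notify event for @cpu.
 * Returns 1 if the event is recognized, 0 otherwise. The callback is
 * invoked only if the driver provides one.
 */
static int dispatch_notify(const struct driver_ops *ops, uint32_t event,
			   unsigned int cpu)
{
	switch (event) {
	case NOTIFY_HIGHEST_PERF_CHANGED:
		if (ops && ops->update_highest_perf)
			ops->update_highest_perf(cpu);
		return 1;
	default:
		return 0;
	}
}
```

Keeping the callback optional means drivers that do not care about the
event need no changes.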

Signed-off-by: Meng Li <li.meng@amd.com>
Link: https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#processor-device-notification-values
---
 drivers/acpi/processor_driver.c |  6 ++++++
 drivers/cpufreq/cpufreq.c       | 13 +++++++++++++
 include/linux/cpufreq.h         |  5 +++++
 3 files changed, 24 insertions(+)

diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c
index 4bd16b3f0781..29b2fb68a35d 100644
--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -27,6 +27,7 @@
 #define ACPI_PROCESSOR_NOTIFY_PERFORMANCE 0x80
 #define ACPI_PROCESSOR_NOTIFY_POWER	0x81
 #define ACPI_PROCESSOR_NOTIFY_THROTTLING	0x82
+#define ACPI_PROCESSOR_NOTIFY_HIGEST_PERF_CHANGED	0x85
 
 MODULE_AUTHOR("Paul Diefenbaugh");
 MODULE_DESCRIPTION("ACPI Processor Driver");
@@ -83,6 +84,11 @@ static void acpi_processor_notify(acpi_handle handle, u32 event, void *data)
 		acpi_bus_generate_netlink_event(device->pnp.device_class,
 						  dev_name(&device->dev), event, 0);
 		break;
+	case ACPI_PROCESSOR_NOTIFY_HIGEST_PERF_CHANGED:
+		cpufreq_update_highest_perf(pr->id);
+		acpi_bus_generate_netlink_event(device->pnp.device_class,
+						  dev_name(&device->dev), event, 0);
+		break;
 	default:
 		acpi_handle_debug(handle, "Unsupported event [0x%x]\n", event);
 		break;
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 60ed89000e82..4ada787ff105 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2718,6 +2718,19 @@ void cpufreq_update_limits(unsigned int cpu)
 }
 EXPORT_SYMBOL_GPL(cpufreq_update_limits);
 
+/**
+ * cpufreq_update_highest_perf - Update highest performance for a given CPU.
+ * @cpu: CPU to update the highest performance for.
+ *
+ * Invoke the driver's ->update_highest_perf callback if present
+ */
+void cpufreq_update_highest_perf(unsigned int cpu)
+{
+	if (cpufreq_driver->update_highest_perf)
+		cpufreq_driver->update_highest_perf(cpu);
+}
+EXPORT_SYMBOL_GPL(cpufreq_update_highest_perf);
+
 /*********************************************************************
  *               BOOST						     *
  *********************************************************************/
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 71d186d6933a..1cc1241fb698 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -235,6 +235,7 @@ int cpufreq_get_policy(struct cpufreq_policy *policy, unsigned int cpu);
 void refresh_frequency_limits(struct cpufreq_policy *policy);
 void cpufreq_update_policy(unsigned int cpu);
 void cpufreq_update_limits(unsigned int cpu);
+void cpufreq_update_highest_perf(unsigned int cpu);
 bool have_governor_per_policy(void);
 bool cpufreq_supports_freq_invariance(void);
 struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy);
@@ -263,6 +264,7 @@ static inline bool cpufreq_supports_freq_invariance(void)
 	return false;
 }
 static inline void disable_cpufreq(void) { }
+static inline void cpufreq_update_highest_perf(unsigned int cpu) { }
 #endif
 
 #ifdef CONFIG_CPU_FREQ_STAT
@@ -380,6 +382,9 @@ struct cpufreq_driver {
 	/* Called to update policy limits on firmware notifications. */
 	void		(*update_limits)(unsigned int cpu);
 
+	/* Called to update highest performance on firmware notifications. */
+	void		(*update_highest_perf)(unsigned int cpu);
+
 	/* optional */
 	int		(*bios_limit)(int cpu, unsigned int *limit);
 
-- 
2.34.1



* [PATCH V7 5/7] cpufreq: amd-pstate: Update amd-pstate preferred core ranking dynamically
  2023-09-18  8:14 [PATCH V7 0/7] amd-pstate preferred core Meng Li
                   ` (3 preceding siblings ...)
  2023-09-18  8:14 ` [PATCH V7 4/7] cpufreq: Add a notification message that the highest perf has changed Meng Li
@ 2023-09-18  8:14 ` Meng Li
  2023-09-18  8:14 ` [PATCH V7 6/7] Documentation: amd-pstate: introduce amd-pstate preferred core Meng Li
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Meng Li @ 2023-09-18  8:14 UTC (permalink / raw)
  To: Rafael J . Wysocki, Huang Rui
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
	Viresh Kumar, Borislav Petkov, Meng Li, Wyes Karny

The platform can change preferred core rankings dynamically,
based on workload and platform conditions and
accounting for thermals and aging.
When this occurs, the CPU priority needs to be updated.

Signed-off-by: Meng Li <li.meng@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
---
 drivers/cpufreq/amd-pstate.c | 34 ++++++++++++++++++++++++++++++++--
 include/linux/amd-pstate.h   |  6 ++++++
 2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 050e23594057..97b1d4674b4f 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -318,6 +318,7 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
 	WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1));
 	WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1));
 	WRITE_ONCE(cpudata->lowest_perf, AMD_CPPC_LOWEST_PERF(cap1));
+	WRITE_ONCE(cpudata->prefcore_ranking, AMD_CPPC_HIGHEST_PERF(cap1));
 
 	return 0;
 }
@@ -339,6 +340,7 @@ static int cppc_init_perf(struct amd_cpudata *cpudata)
 	WRITE_ONCE(cpudata->lowest_nonlinear_perf,
 		   cppc_perf.lowest_nonlinear_perf);
 	WRITE_ONCE(cpudata->lowest_perf, cppc_perf.lowest_perf);
+	WRITE_ONCE(cpudata->prefcore_ranking, cppc_perf.highest_perf);
 
 	if (cppc_state == AMD_PSTATE_ACTIVE)
 		return 0;
@@ -545,7 +547,7 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
 	if (target_perf < capacity)
 		des_perf = DIV_ROUND_UP(cap_perf * target_perf, capacity);
 
-	min_perf = READ_ONCE(cpudata->highest_perf);
+	min_perf = READ_ONCE(cpudata->lowest_perf);
 	if (_min_perf < capacity)
 		min_perf = DIV_ROUND_UP(cap_perf * _min_perf, capacity);
 
@@ -765,6 +767,32 @@ static void amd_pstate_init_prefcore(unsigned int cpu)
 	}
 }
 
+static void amd_pstate_update_highest_perf(unsigned int cpu)
+{
+	struct cpufreq_policy *policy;
+	struct amd_cpudata *cpudata;
+	u32 prev_high = 0, cur_high = 0;
+	int ret;
+
+	if (!prefcore)
+		return;
+
+	ret = amd_pstate_get_highest_perf(cpu, &cur_high);
+	if (ret)
+		return;
+
+	policy = cpufreq_cpu_get(cpu);
+	cpudata = policy->driver_data;
+	prev_high = READ_ONCE(cpudata->prefcore_ranking);
+
+	if (prev_high != cur_high) {
+		WRITE_ONCE(cpudata->prefcore_ranking, cur_high);
+		sched_set_itmt_core_prio(cur_high, cpu);
+	}
+
+	cpufreq_cpu_put(policy);
+}
+
 static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
 {
 	int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
@@ -947,7 +975,7 @@ static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,
 	u32 perf;
 	struct amd_cpudata *cpudata = policy->driver_data;
 
-	perf = READ_ONCE(cpudata->highest_perf);
+	perf = READ_ONCE(cpudata->prefcore_ranking);
 
 	return sysfs_emit(buf, "%u\n", perf);
 }
@@ -1513,6 +1541,7 @@ static struct cpufreq_driver amd_pstate_driver = {
 	.suspend	= amd_pstate_cpu_suspend,
 	.resume		= amd_pstate_cpu_resume,
 	.set_boost	= amd_pstate_set_boost,
+	.update_highest_perf	= amd_pstate_update_highest_perf,
 	.name		= "amd-pstate",
 	.attr		= amd_pstate_attr,
 };
@@ -1527,6 +1556,7 @@ static struct cpufreq_driver amd_pstate_epp_driver = {
 	.online		= amd_pstate_epp_cpu_online,
 	.suspend	= amd_pstate_epp_suspend,
 	.resume		= amd_pstate_epp_resume,
+	.update_highest_perf	= amd_pstate_update_highest_perf,
 	.name		= "amd-pstate-epp",
 	.attr		= amd_pstate_epp_attr,
 };
diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
index 446394f84606..030a6a97c2b9 100644
--- a/include/linux/amd-pstate.h
+++ b/include/linux/amd-pstate.h
@@ -39,11 +39,16 @@ struct amd_aperf_mperf {
  * @cppc_req_cached: cached performance request hints
  * @highest_perf: the maximum performance an individual processor may reach,
  *		  assuming ideal conditions
+ *		  For platforms that do not support the preferred core feature, the
+ *		  highest_perf may be configured with 166 or 255. To avoid calculating
+ *		  the max frequency wrongly, take the fixed value as the highest_perf.
  * @nominal_perf: the maximum sustained performance level of the processor,
  *		  assuming ideal operating conditions
  * @lowest_nonlinear_perf: the lowest performance level at which nonlinear power
  *			   savings are achieved
  * @lowest_perf: the absolute lowest performance level of the processor
+ * @prefcore_ranking: the preferred core ranking, the higher value indicates a higher
+ * 		  priority.
  * @max_freq: the frequency that mapped to highest_perf
  * @min_freq: the frequency that mapped to lowest_perf
  * @nominal_freq: the frequency that mapped to nominal_perf
@@ -70,6 +75,7 @@ struct amd_cpudata {
 	u32	nominal_perf;
 	u32	lowest_nonlinear_perf;
 	u32	lowest_perf;
+	u32     prefcore_ranking;
 
 	u32	max_freq;
 	u32	min_freq;
-- 
2.34.1



* [PATCH V7 6/7] Documentation: amd-pstate: introduce amd-pstate preferred core
  2023-09-18  8:14 [PATCH V7 0/7] amd-pstate preferred core Meng Li
                   ` (4 preceding siblings ...)
  2023-09-18  8:14 ` [PATCH V7 5/7] cpufreq: amd-pstate: Update amd-pstate preferred core ranking dynamically Meng Li
@ 2023-09-18  8:14 ` Meng Li
  2023-09-18  8:14 ` [PATCH V7 7/7] Documentation: introduce amd-pstate preferrd core mode kernel command line options Meng Li
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Meng Li @ 2023-09-18  8:14 UTC (permalink / raw)
  To: Rafael J . Wysocki, Huang Rui
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
	Viresh Kumar, Borislav Petkov, Meng Li

Introduce amd-pstate preferred core.

Check the preferred core state:
$ cat /sys/devices/system/cpu/amd-pstate/prefcore
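
A slightly more defensive variant of the check above, as a sketch only. It
assumes the attribute prints a single word such as "enabled" or "disabled",
as described in the documentation added by this patch; the helper name
prefcore_state is made up for illustration and is not part of the series.

```shell
#!/bin/sh
# Hypothetical helper: print the amd-pstate preferred core state, or
# "unavailable" if the attribute does not exist or is not readable
# (for example, amd-pstate is not loaded or the kernel lacks this series).
# The path can be overridden through the first argument.
prefcore_state() {
	attr="${1:-/sys/devices/system/cpu/amd-pstate/prefcore}"
	if [ -r "$attr" ]; then
		cat "$attr"
	else
		echo "unavailable"
	fi
}

prefcore_state
```

Falling back to "unavailable" keeps the check usable in scripts that also run
on kernels or platforms without the attribute.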

Signed-off-by: Meng Li <li.meng@amd.com>
---
 Documentation/admin-guide/pm/amd-pstate.rst | 58 ++++++++++++++++++++-
 1 file changed, 56 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
index 1cf40f69278c..b729bc6dabd8 100644
--- a/Documentation/admin-guide/pm/amd-pstate.rst
+++ b/Documentation/admin-guide/pm/amd-pstate.rst
@@ -300,8 +300,8 @@ platforms. The AMD P-States mechanism is the more performance and energy
 efficiency frequency management method on AMD processors.
 
 
-AMD Pstate Driver Operation Modes
-=================================
+``amd-pstate`` Driver Operation Modes
+======================================
 
 ``amd_pstate`` CPPC has 3 operation modes: autonomous (active) mode,
 non-autonomous (passive) mode and guided autonomous (guided) mode.
@@ -353,6 +353,48 @@ is activated.  In this mode, driver requests minimum and maximum performance
 level and the platform autonomously selects a performance level in this range
 and appropriate to the current workload.
 
+``amd-pstate`` Preferred Core
+=================================
+
+The core frequency is subject to process variation in semiconductors.
+Not all cores are able to reach the maximum frequency within the
+infrastructure limits. Consequently, AMD has redefined the concept of
+maximum frequency of a part: only a fraction of the cores can reach the
+maximum frequency. To find the best process scheduling policy for a given
+scenario, the OS needs to know the core ordering reported by the platform through
+the highest performance capability register of the CPPC interface.
+
+``amd-pstate`` preferred core enables the scheduler to prefer scheduling on
+cores that can achieve a higher frequency with lower voltage. The preferred
+core rankings can dynamically change based on the workload, platform conditions,
+thermals and ageing.
+
+The priority metric will be initialized by the ``amd-pstate`` driver. The ``amd-pstate``
+driver will also determine whether or not ``amd-pstate`` preferred core is
+supported by the platform.
+
+The ``amd-pstate`` driver provides an initial core ordering when the system boots.
+The platform uses the CPPC interfaces to communicate the core ranking to the
+operating system and scheduler, so that the OS chooses the cores with the
+highest performance first when scheduling processes. When the ``amd-pstate``
+driver receives a message that the highest performance has changed, it
+updates the core ranking and sets the CPU's priority.
+
+``amd-pstate`` Preferred Core Switch
+====================================
+Kernel Parameters
+-----------------
+
+``amd-pstate`` preferred core has two states: enable and disable.
+The state can be selected with a kernel parameter.
+``amd-pstate`` preferred core is enabled by default.
+
+``amd_prefcore=disable``
+
+For systems that support ``amd-pstate`` preferred core, the core rankings will
+always be advertised by the platform. However, the OS can choose to ignore them
+via the kernel parameter ``amd_prefcore=disable``.
+
 User Space Interface in ``sysfs`` - General
 ===========================================
 
@@ -385,6 +427,18 @@ control its functionality at the system level.  They are located in the
         to the operation mode represented by that string - or to be
         unregistered in the "disable" case.
 
+``prefcore``
+	Preferred core state of the driver: "enabled" or "disabled".
+
+	"enabled"
+		The ``amd-pstate`` preferred core is enabled.
+
+	"disabled"
+		The ``amd-pstate`` preferred core is disabled.
+
+
+        This attribute is read-only and reports the state of preferred core.
+
 ``cpupower`` tool support for ``amd-pstate``
 ===============================================
 
-- 
2.34.1



* [PATCH V7 7/7] Documentation: introduce amd-pstate preferrd core mode kernel command line options
  2023-09-18  8:14 [PATCH V7 0/7] amd-pstate preferred core Meng Li
                   ` (5 preceding siblings ...)
  2023-09-18  8:14 ` [PATCH V7 6/7] Documentation: amd-pstate: introduce amd-pstate preferred core Meng Li
@ 2023-09-18  8:14 ` Meng Li
  2023-09-18 17:40 ` [PATCH V7 0/7] amd-pstate preferred core Mario Limonciello
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Meng Li @ 2023-09-18  8:14 UTC (permalink / raw)
  To: Rafael J . Wysocki, Huang Rui
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
	Viresh Kumar, Borislav Petkov, Meng Li, Wyes Karny

The amd-pstate driver supports enabling and disabling preferred core.
It is enabled by default on platforms supporting amd-pstate preferred core.
To disable amd-pstate preferred core, add "amd_prefcore=disable" to the
kernel command line.
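
To confirm whether the option was actually passed at boot, something like the
following sketch could be used. The function name
prefcore_disabled_on_cmdline is hypothetical; the only things taken from this
patch are the option string amd_prefcore=disable and the fact that it is a
kernel command line parameter.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the given kernel
# command line contains amd_prefcore=disable as a whole,
# space-separated option.
prefcore_disabled_on_cmdline() {
	case " $1 " in
	*" amd_prefcore=disable "*) return 0 ;;
	*) return 1 ;;
	esac
}

# Usage: inspect the command line of the running kernel.
if prefcore_disabled_on_cmdline "$(cat /proc/cmdline)"; then
	echo "amd_prefcore=disable was passed at boot"
else
	echo "amd_prefcore=disable was not passed at boot"
fi
```

Padding the string with spaces before matching avoids false positives on
options that merely contain the same text as a substring.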

Signed-off-by: Meng Li <li.meng@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
---
 Documentation/admin-guide/kernel-parameters.txt | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 0a1731a0f0ef..e35b795aa8aa 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -363,6 +363,11 @@
 			  selects a performance level in this range and appropriate
 			  to the current workload.
 
+	amd_prefcore=
+			[X86]
+			disable
+			  Disable amd-pstate preferred core.
+
 	amijoy.map=	[HW,JOY] Amiga joystick support
 			Map of devices attached to JOY0DAT and JOY1DAT
 			Format: <a>,<b>
-- 
2.34.1



* Re: [PATCH V7 0/7] amd-pstate preferred core
  2023-09-18  8:14 [PATCH V7 0/7] amd-pstate preferred core Meng Li
                   ` (6 preceding siblings ...)
  2023-09-18  8:14 ` [PATCH V7 7/7] Documentation: introduce amd-pstate preferrd core mode kernel command line options Meng Li
@ 2023-09-18 17:40 ` Mario Limonciello
  2023-09-19  0:50   ` Meng, Li (Jassmine)
  2023-09-18 17:44 ` Shuah Khan
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 19+ messages in thread
From: Mario Limonciello @ 2023-09-18 17:40 UTC (permalink / raw)
  To: Meng Li, Rafael J . Wysocki, Huang Rui
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Shimmer Huang, Perry Yuan, Xiaojian Du, Viresh Kumar,
	Borislav Petkov

On 9/18/2023 03:14, Meng Li wrote:
> Hi all:
> 
> The core frequency is subjected to the process variation in semiconductors.
> Not all cores are able to reach the maximum frequency respecting the
> infrastructure limits. Consequently, AMD has redefined the concept of
> maximum frequency of a part. This means that a fraction of cores can reach
> maximum frequency. To find the best process scheduling policy for a given
> scenario, OS needs to know the core ordering informed by the platform through
> highest performance capability register of the CPPC interface.
> 
> Earlier implementations of amd-pstate preferred core only support a static
> core ranking and targeted performance. Now it has the ability to dynamically
> change the preferred core based on the workload and platform conditions and
> accounting for thermals and aging.
> 
> Amd-pstate driver utilizes the functions and data structures provided by
> the ITMT architecture to enable the scheduler to favor scheduling on cores
> which can be get a higher frequency with lower voltage.
> We call it amd-pstate preferred core.
> 
> Here sched_set_itmt_core_prio() is called to set priorities and
> sched_set_itmt_support() is called to enable ITMT feature.
> Amd-pstate driver uses the highest performance value to indicate
> the priority of CPU. The higher value has a higher priority.
> 
> Amd-pstate driver will provide an initial core ordering at boot time.
> It relies on the CPPC interface to communicate the core ranking to the
> operating system and scheduler to make sure that OS is choosing the cores
> with highest performance firstly for scheduling the process. When amd-pstate
> driver receives a message with the highest performance change, it will
> update the core ranking.
> 

For the remaining patches missing my tag:

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>

> Changes form V6->V7:
> - x86:
> - - Modify kconfig about X86_AMD_PSTATE.
> - cpufreq: amd-pstate:
> - - modify incorrect comments about scheduler_work().
> - - convert highest_perf data type.
> - - modify preferred core init when cpu init and online.
> - acpi: cppc:
> - - modify link of CPPC highest performance.
> - cpufreq:
> - - modify link of CPPC highest performance changed.
> 
> Changes form V5->V6:
> - cpufreq: amd-pstate:
> - - modify the wrong tag order.
> - - modify warning about hw_prefcore sysfs attribute.
> - - delete duplicate comments.
> - - modify the variable name cppc_highest_perf to prefcore_ranking.
> - - modify judgment conditions for setting highest_perf.
> - - modify sysfs attribute for CPPC highest perf to pr_debug message.
> - Documentation: amd-pstate:
> - - modify warning: title underline too short.
> 
> Changes form V4->V5:
> - cpufreq: amd-pstate:
> - - modify sysfs attribute for CPPC highest perf.
> - - modify warning about comments
> - - rebase linux-next
> - cpufreq:
> - - Moidfy warning about function declarations.
> - Documentation: amd-pstate:
> - - align with ``amd-pstat``
> 
> Changes form V3->V4:
> - Documentation: amd-pstate:
> - - Modify inappropriate descriptions.
> 
> Changes form V2->V3:
> - x86:
> - - Modify kconfig and description.
> - cpufreq: amd-pstate:
> - - Add Co-developed-by tag in commit message.
> - cpufreq:
> - - Modify commit message.
> - Documentation: amd-pstate:
> - - Modify inappropriate descriptions.
> 
> Changes form V1->V2:
> - acpi: cppc:
> - - Add reference link.
> - cpufreq:
> - - Moidfy link error.
> - cpufreq: amd-pstate:
> - - Init the priorities of all online CPUs
> - - Use a single variable to represent the status of preferred core.
> - Documentation:
> - - Default enabled preferred core.
> - Documentation: amd-pstate:
> - - Modify inappropriate descriptions.
> - - Default enabled preferred core.
> - - Use a single variable to represent the status of preferred core.
> 
> Meng Li (7):
>    x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion.
>    acpi: cppc: Add get the highest performance cppc control
>    cpufreq: amd-pstate: Enable amd-pstate preferred core supporting.
>    cpufreq: Add a notification message that the highest perf has changed
>    cpufreq: amd-pstate: Update amd-pstate preferred core ranking
>      dynamically
>    Documentation: amd-pstate: introduce amd-pstate preferred core
>    Documentation: introduce amd-pstate preferrd core mode kernel command
>      line options
> 
>   .../admin-guide/kernel-parameters.txt         |   5 +
>   Documentation/admin-guide/pm/amd-pstate.rst   |  58 +++++-
>   arch/x86/Kconfig                              |   5 +-
>   drivers/acpi/cppc_acpi.c                      |  13 ++
>   drivers/acpi/processor_driver.c               |   6 +
>   drivers/cpufreq/amd-pstate.c                  | 197 ++++++++++++++++--
>   drivers/cpufreq/cpufreq.c                     |  13 ++
>   include/acpi/cppc_acpi.h                      |   5 +
>   include/linux/amd-pstate.h                    |   6 +
>   include/linux/cpufreq.h                       |   5 +
>   10 files changed, 291 insertions(+), 22 deletions(-)
> 



* Re: [PATCH V7 0/7] amd-pstate preferred core
  2023-09-18  8:14 [PATCH V7 0/7] amd-pstate preferred core Meng Li
                   ` (7 preceding siblings ...)
  2023-09-18 17:40 ` [PATCH V7 0/7] amd-pstate preferred core Mario Limonciello
@ 2023-09-18 17:44 ` Shuah Khan
  2023-09-18 18:23   ` Shuah
  2023-09-19 19:01 ` Oleksandr Natalenko
  2023-09-20  2:50 ` Huang Rui
  10 siblings, 1 reply; 19+ messages in thread
From: Shuah Khan @ 2023-09-18 17:44 UTC (permalink / raw)
  To: Meng Li, Rafael J . Wysocki, Huang Rui
  Cc: linux-pm, linux-kernel, x86, linux-acpi, linux-kselftest,
	Nathan Fontenot, Deepak Sharma, Alex Deucher, Mario Limonciello,
	Shimmer Huang, Perry Yuan, Xiaojian Du, Viresh Kumar,
	Borislav Petkov, Shuah Khan

On 9/18/23 02:14, Meng Li wrote:
> Hi all:
> 
> The core frequency is subjected to the process variation in semiconductors.
> Not all cores are able to reach the maximum frequency respecting the
> infrastructure limits. Consequently, AMD has redefined the concept of
> maximum frequency of a part. This means that a fraction of cores can reach
> maximum frequency. To find the best process scheduling policy for a given
> scenario, OS needs to know the core ordering informed by the platform through
> highest performance capability register of the CPPC interface.
> 
> Earlier implementations of amd-pstate preferred core only support a static
> core ranking and targeted performance. Now it has the ability to dynamically
> change the preferred core based on the workload and platform conditions and
> accounting for thermals and aging.
> 
> Amd-pstate driver utilizes the functions and data structures provided by
> the ITMT architecture to enable the scheduler to favor scheduling on cores
> which can be get a higher frequency with lower voltage.
> We call it amd-pstate preferred core.
> 
> Here sched_set_itmt_core_prio() is called to set priorities and
> sched_set_itmt_support() is called to enable ITMT feature.
> Amd-pstate driver uses the highest performance value to indicate
> the priority of CPU. The higher value has a higher priority.
> 
> Amd-pstate driver will provide an initial core ordering at boot time.
> It relies on the CPPC interface to communicate the core ranking to the
> operating system and scheduler to make sure that OS is choosing the cores
> with highest performance firstly for scheduling the process. When amd-pstate
> driver receives a message with the highest performance change, it will
> update the core ranking.
> 
> Changes form V6->V7:
> - x86:
> - - Modify kconfig about X86_AMD_PSTATE.
> - cpufreq: amd-pstate:
> - - modify incorrect comments about scheduler_work().
> - - convert highest_perf data type.
> - - modify preferred core init when cpu init and online.
> - acpi: cppc:
> - - modify link of CPPC highest performance.
> - cpufreq:
> - - modify link of CPPC highest performance changed.
> 

This series is now in linux-kselftest next branch for Linux 6.7-rc1.

If there are any changes and/or fixes, please send patches on top of
linux-kselftest next branch.

thanks,
-- Shuah



* Re: [PATCH V7 0/7] amd-pstate preferred core
  2023-09-18 17:44 ` Shuah Khan
@ 2023-09-18 18:23   ` Shuah
  0 siblings, 0 replies; 19+ messages in thread
From: Shuah @ 2023-09-18 18:23 UTC (permalink / raw)
  To: Shuah Khan, Meng Li, Rafael J . Wysocki, Huang Rui
  Cc: linux-pm, linux-kernel, x86, linux-acpi, linux-kselftest,
	Nathan Fontenot, Deepak Sharma, Alex Deucher, Mario Limonciello,
	Shimmer Huang, Perry Yuan, Xiaojian Du, Viresh Kumar,
	Borislav Petkov, Shuah Khan

On 9/18/23 11:44, Shuah Khan wrote:
> On 9/18/23 02:14, Meng Li wrote:
>> Hi all:
>>
>> The core frequency is subjected to the process variation in semiconductors.
>> Not all cores are able to reach the maximum frequency respecting the
>> infrastructure limits. Consequently, AMD has redefined the concept of
>> maximum frequency of a part. This means that a fraction of cores can reach
>> maximum frequency. To find the best process scheduling policy for a given
>> scenario, OS needs to know the core ordering informed by the platform through
>> highest performance capability register of the CPPC interface.
>>
>> Earlier implementations of amd-pstate preferred core only support a static
>> core ranking and targeted performance. Now it has the ability to dynamically
>> change the preferred core based on the workload and platform conditions and
>> accounting for thermals and aging.
>>
>> Amd-pstate driver utilizes the functions and data structures provided by
>> the ITMT architecture to enable the scheduler to favor scheduling on cores
>> which can be get a higher frequency with lower voltage.
>> We call it amd-pstate preferred core.
>>
>> Here sched_set_itmt_core_prio() is called to set priorities and
>> sched_set_itmt_support() is called to enable ITMT feature.
>> Amd-pstate driver uses the highest performance value to indicate
>> the priority of CPU. The higher value has a higher priority.
>>
>> Amd-pstate driver will provide an initial core ordering at boot time.
>> It relies on the CPPC interface to communicate the core ranking to the
>> operating system and scheduler to make sure that OS is choosing the cores
>> with highest performance firstly for scheduling the process. When amd-pstate
>> driver receives a message with the highest performance change, it will
>> update the core ranking.
>>
>> Changes form V6->V7:
>> - x86:
>> - - Modify kconfig about X86_AMD_PSTATE.
>> - cpufreq: amd-pstate:
>> - - modify incorrect comments about scheduler_work().
>> - - convert highest_perf data type.
>> - - modify preferred core init when cpu init and online.
>> - acpi: cppc:
>> - - modify link of CPPC highest performance.
>> - cpufreq:
>> - - modify link of CPPC highest performance changed.
>>
> 
> This series is now in linux-kselftest next branch for Linux 6.7-rc1.
> 
> If there are any changes and/or fixes, please send patches on top of
> linux-kselftest next branch.
> 

Sorry for the mix-up. Wrong series - my bad. This series isn't
in linux-kselftest next.

thanks,
-- Shuah



* RE: [PATCH V7 0/7] amd-pstate preferred core
  2023-09-18 17:40 ` [PATCH V7 0/7] amd-pstate preferred core Mario Limonciello
@ 2023-09-19  0:50   ` Meng, Li (Jassmine)
  0 siblings, 0 replies; 19+ messages in thread
From: Meng, Li (Jassmine) @ 2023-09-19  0:50 UTC (permalink / raw)
  To: Limonciello, Mario, Rafael J . Wysocki, Huang, Ray
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Fontenot, Nathan, Sharma, Deepak, Deucher,
	Alexander, Huang, Shimmer, Yuan, Perry, Du, Xiaojian,
	Viresh Kumar, Borislav Petkov

Hi Mario:

> -----Original Message-----
> From: Limonciello, Mario <Mario.Limonciello@amd.com>
> Sent: Tuesday, September 19, 2023 1:41 AM
> To: Meng, Li (Jassmine) <Li.Meng@amd.com>; Rafael J . Wysocki
> <rafael.j.wysocki@intel.com>; Huang, Ray <Ray.Huang@amd.com>
> Cc: linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org;
> x86@kernel.org; linux-acpi@vger.kernel.org; Shuah Khan
> <skhan@linuxfoundation.org>; linux-kselftest@vger.kernel.org; Fontenot,
> Nathan <Nathan.Fontenot@amd.com>; Sharma, Deepak
> <Deepak.Sharma@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Huang, Shimmer
> <Shimmer.Huang@amd.com>; Yuan, Perry <Perry.Yuan@amd.com>; Du,
> Xiaojian <Xiaojian.Du@amd.com>; Viresh Kumar <viresh.kumar@linaro.org>;
> Borislav Petkov <bp@alien8.de>
> Subject: Re: [PATCH V7 0/7] amd-pstate preferred core
>
> On 9/18/2023 03:14, Meng Li wrote:
> > Hi all:
> >
> > The core frequency is subjected to the process variation in semiconductors.
> > Not all cores are able to reach the maximum frequency respecting the
> > infrastructure limits. Consequently, AMD has redefined the concept of
> > maximum frequency of a part. This means that a fraction of cores can
> > reach maximum frequency. To find the best process scheduling policy
> > for a given scenario, OS needs to know the core ordering informed by
> > the platform through highest performance capability register of the CPPC
> interface.
> >
> > Earlier implementations of amd-pstate preferred core only support a
> > static core ranking and targeted performance. Now it has the ability
> > to dynamically change the preferred core based on the workload and
> > platform conditions and accounting for thermals and aging.
> >
> > Amd-pstate driver utilizes the functions and data structures provided
> > by the ITMT architecture to enable the scheduler to favor scheduling
> > on cores which can be get a higher frequency with lower voltage.
> > We call it amd-pstate preferred core.
> >
> > Here sched_set_itmt_core_prio() is called to set priorities and
> > sched_set_itmt_support() is called to enable ITMT feature.
> > Amd-pstate driver uses the highest performance value to indicate the
> > priority of CPU. The higher value has a higher priority.
> >
> > Amd-pstate driver will provide an initial core ordering at boot time.
> > It relies on the CPPC interface to communicate the core ranking to the
> > operating system and scheduler to make sure that OS is choosing the
> > cores with highest performance firstly for scheduling the process.
> > When amd-pstate driver receives a message with the highest performance
> > change, it will update the core ranking.
> >
>
> For the remaining patches missing my tag:
>
> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
>
[Meng, Li (Jassmine)]
Thank you very much!
I will add the Reviewed-by tag to the remaining patches.

> > Changes form V6->V7:
> > - x86:
> > - - Modify kconfig about X86_AMD_PSTATE.
> > - cpufreq: amd-pstate:
> > - - modify incorrect comments about scheduler_work().
> > - - convert highest_perf data type.
> > - - modify preferred core init when cpu init and online.
> > - acpi: cppc:
> > - - modify link of CPPC highest performance.
> > - cpufreq:
> > - - modify link of CPPC highest performance changed.
> >
> > Changes form V5->V6:
> > - cpufreq: amd-pstate:
> > - - modify the wrong tag order.
> > - - modify warning about hw_prefcore sysfs attribute.
> > - - delete duplicate comments.
> > - - modify the variable name cppc_highest_perf to prefcore_ranking.
> > - - modify judgment conditions for setting highest_perf.
> > - - modify sysfs attribute for CPPC highest perf to pr_debug message.
> > - Documentation: amd-pstate:
> > - - modify warning: title underline too short.
> >
> > Changes form V4->V5:
> > - cpufreq: amd-pstate:
> > - - modify sysfs attribute for CPPC highest perf.
> > - - modify warning about comments
> > - - rebase linux-next
> > - cpufreq:
> > - - Moidfy warning about function declarations.
> > - Documentation: amd-pstate:
> > - - align with ``amd-pstat``
> >
> > Changes form V3->V4:
> > - Documentation: amd-pstate:
> > - - Modify inappropriate descriptions.
> >
> > Changes form V2->V3:
> > - x86:
> > - - Modify kconfig and description.
> > - cpufreq: amd-pstate:
> > - - Add Co-developed-by tag in commit message.
> > - cpufreq:
> > - - Modify commit message.
> > - Documentation: amd-pstate:
> > - - Modify inappropriate descriptions.
> >
> > Changes form V1->V2:
> > - acpi: cppc:
> > - - Add reference link.
> > - cpufreq:
> > - - Moidfy link error.
> > - cpufreq: amd-pstate:
> > - - Init the priorities of all online CPUs
> > - - Use a single variable to represent the status of preferred core.
> > - Documentation:
> > - - Default enabled preferred core.
> > - Documentation: amd-pstate:
> > - - Modify inappropriate descriptions.
> > - - Default enabled preferred core.
> > - - Use a single variable to represent the status of preferred core.
> >
> > Meng Li (7):
> >    x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion.
> >    acpi: cppc: Add get the highest performance cppc control
> >    cpufreq: amd-pstate: Enable amd-pstate preferred core supporting.
> >    cpufreq: Add a notification message that the highest perf has changed
> >    cpufreq: amd-pstate: Update amd-pstate preferred core ranking
> >      dynamically
> >    Documentation: amd-pstate: introduce amd-pstate preferred core
> >    Documentation: introduce amd-pstate preferrd core mode kernel
> command
> >      line options
> >
> >   .../admin-guide/kernel-parameters.txt         |   5 +
> >   Documentation/admin-guide/pm/amd-pstate.rst   |  58 +++++-
> >   arch/x86/Kconfig                              |   5 +-
> >   drivers/acpi/cppc_acpi.c                      |  13 ++
> >   drivers/acpi/processor_driver.c               |   6 +
> >   drivers/cpufreq/amd-pstate.c                  | 197 ++++++++++++++++--
> >   drivers/cpufreq/cpufreq.c                     |  13 ++
> >   include/acpi/cppc_acpi.h                      |   5 +
> >   include/linux/amd-pstate.h                    |   6 +
> >   include/linux/cpufreq.h                       |   5 +
> >   10 files changed, 291 insertions(+), 22 deletions(-)
> >



* Re: [PATCH V7 0/7] amd-pstate preferred core
  2023-09-18  8:14 [PATCH V7 0/7] amd-pstate preferred core Meng Li
                   ` (8 preceding siblings ...)
  2023-09-18 17:44 ` Shuah Khan
@ 2023-09-19 19:01 ` Oleksandr Natalenko
  2023-09-20 16:56   ` Mario Limonciello
  2023-09-20  2:50 ` Huang Rui
  10 siblings, 1 reply; 19+ messages in thread
From: Oleksandr Natalenko @ 2023-09-19 19:01 UTC (permalink / raw)
  To: Rafael J . Wysocki, Huang Rui, Meng Li
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
	Viresh Kumar, Borislav Petkov, Meng Li

Hello.

On Monday, 18 September 2023 10:14:00 CEST Meng Li wrote:
> Hi all:
> 
> The core frequency is subjected to the process variation in semiconductors.
> Not all cores are able to reach the maximum frequency respecting the
> infrastructure limits. Consequently, AMD has redefined the concept of
> maximum frequency of a part. This means that a fraction of cores can reach
> maximum frequency. To find the best process scheduling policy for a given
> scenario, OS needs to know the core ordering informed by the platform through
> highest performance capability register of the CPPC interface.
> 
> Earlier implementations of amd-pstate preferred core only support a static
> core ranking and targeted performance. Now it has the ability to dynamically
> change the preferred core based on the workload and platform conditions and
> accounting for thermals and aging.
> 
> Amd-pstate driver utilizes the functions and data structures provided by
> the ITMT architecture to enable the scheduler to favor scheduling on cores
> which can be get a higher frequency with lower voltage.
> We call it amd-pstate preferred core.
> 
> Here sched_set_itmt_core_prio() is called to set priorities and
> sched_set_itmt_support() is called to enable ITMT feature.
> Amd-pstate driver uses the highest performance value to indicate
> the priority of CPU. The higher value has a higher priority.
> 
> Amd-pstate driver will provide an initial core ordering at boot time.
> It relies on the CPPC interface to communicate the core ranking to the
> operating system and scheduler to make sure that OS is choosing the cores
> with highest performance firstly for scheduling the process. When amd-pstate
> driver receives a message with the highest performance change, it will
> update the core ranking.
> 
> Changes form V6->V7:
> - x86:
> - - Modify kconfig about X86_AMD_PSTATE.
> - cpufreq: amd-pstate:
> - - modify incorrect comments about scheduler_work().
> - - convert highest_perf data type.
> - - modify preferred core init when cpu init and online.
> - acpi: cppc:
> - - modify link of CPPC highest performance.
> - cpufreq:
> - - modify link of CPPC highest performance changed.
> 
> Changes form V5->V6:
> - cpufreq: amd-pstate:
> - - modify the wrong tag order.
> - - modify warning about hw_prefcore sysfs attribute.
> - - delete duplicate comments.
> - - modify the variable name cppc_highest_perf to prefcore_ranking.
> - - modify judgment conditions for setting highest_perf.
> - - modify sysfs attribute for CPPC highest perf to pr_debug message.
> - Documentation: amd-pstate:
> - - modify warning: title underline too short.
> 
> Changes form V4->V5:
> - cpufreq: amd-pstate:
> - - modify sysfs attribute for CPPC highest perf.
> - - modify warning about comments
> - - rebase linux-next
> - cpufreq: 
> - - Moidfy warning about function declarations.
> - Documentation: amd-pstate:
> - - align with ``amd-pstat``
> 
> Changes form V3->V4:
> - Documentation: amd-pstate:
> - - Modify inappropriate descriptions.
> 
> Changes form V2->V3:
> - x86:
> - - Modify kconfig and description.
> - cpufreq: amd-pstate: 
> - - Add Co-developed-by tag in commit message.
> - cpufreq:
> - - Modify commit message.
> - Documentation: amd-pstate:
> - - Modify inappropriate descriptions.
> 
> Changes form V1->V2:
> - acpi: cppc:
> - - Add reference link.
> - cpufreq:
> - - Moidfy link error.
> - cpufreq: amd-pstate: 
> - - Init the priorities of all online CPUs
> - - Use a single variable to represent the status of preferred core.
> - Documentation:
> - - Default enabled preferred core.
> - Documentation: amd-pstate: 
> - - Modify inappropriate descriptions.
> - - Default enabled preferred core.
> - - Use a single variable to represent the status of preferred core.
> 
> Meng Li (7):
>   x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion.
>   acpi: cppc: Add get the highest performance cppc control
>   cpufreq: amd-pstate: Enable amd-pstate preferred core supporting.
>   cpufreq: Add a notification message that the highest perf has changed
>   cpufreq: amd-pstate: Update amd-pstate preferred core ranking
>     dynamically
>   Documentation: amd-pstate: introduce amd-pstate preferred core
>   Documentation: introduce amd-pstate preferrd core mode kernel command
>     line options
> 
>  .../admin-guide/kernel-parameters.txt         |   5 +
>  Documentation/admin-guide/pm/amd-pstate.rst   |  58 +++++-
>  arch/x86/Kconfig                              |   5 +-
>  drivers/acpi/cppc_acpi.c                      |  13 ++
>  drivers/acpi/processor_driver.c               |   6 +
>  drivers/cpufreq/amd-pstate.c                  | 197 ++++++++++++++++--
>  drivers/cpufreq/cpufreq.c                     |  13 ++
>  include/acpi/cppc_acpi.h                      |   5 +
>  include/linux/amd-pstate.h                    |   6 +
>  include/linux/cpufreq.h                       |   5 +
>  10 files changed, 291 insertions(+), 22 deletions(-)

When applied on top of v6.5.3 this breaks turbo on my 5950X after suspend/resume cycle. Please see the scenario description below.

If I boot v6.5.3 + this patchset, then `turbostat` reports ~4.9 GHz on core 0 where `taskset -c 0 dd if=/dev/zero of=/dev/null` is being run.

After I suspend the machine and then resume it, and run `dd` again, `turbostat` reports the core to be capped to a stock frequency of ~3.4 GHz. Rebooting the machine fixes this, and the CPU can boost again.

If this patchset is reverted, then the CPU can turbo after suspend/resume cycle just fine.

I'm using `amd_pstate=guided`.

Is this behaviour expected?

Thanks.

-- 
Oleksandr Natalenko (post-factum)


* Re: [PATCH V7 3/7] cpufreq: amd-pstate: Enable amd-pstate preferred core supporting.
  2023-09-18  8:14 ` [PATCH V7 3/7] cpufreq: amd-pstate: Enable amd-pstate preferred core supporting Meng Li
@ 2023-09-20  2:43   ` Huang Rui
  0 siblings, 0 replies; 19+ messages in thread
From: Huang Rui @ 2023-09-20  2:43 UTC (permalink / raw)
  To: Meng, Li (Jassmine)
  Cc: Rafael J . Wysocki, linux-pm, linux-kernel, x86, linux-acpi,
	Shuah Khan, linux-kselftest, Fontenot, Nathan, Sharma, Deepak,
	Deucher, Alexander, Limonciello, Mario, Huang, Shimmer, Yuan,
	Perry, Du, Xiaojian, Viresh Kumar, Borislav Petkov

On Mon, Sep 18, 2023 at 04:14:03PM +0800, Meng, Li (Jassmine) wrote:
> amd-pstate driver utilizes the functions and data structures
> provided by the ITMT architecture to enable the scheduler to
> favor scheduling on cores which can be get a higher frequency
> with lower voltage. We call it amd-pstate preferrred core.
> 
> Here sched_set_itmt_core_prio() is called to set priorities and
> sched_set_itmt_support() is called to enable ITMT feature.
> amd-pstate driver uses the highest performance value to indicate
> the priority of CPU. The higher value has a higher priority.
> 
> The initial core rankings are set up by amd-pstate when the
> system boots.
> 
> Add device attribute for hardware preferred core. It will check
> if the processor and power firmware support preferred core
> feature.
> 
> Add device attribute for preferred core. Only when hardware
> supports preferred core and user set `enabled` in early parameter,
> it can be set to enabled.
> 
> Add one new early parameter `disable` to allow user to disable
> the preferred core.
> 
> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
> Co-developed-by: Perry Yuan <Perry.Yuan@amd.com>
> Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> Signed-off-by: Meng Li <li.meng@amd.com>
> ---
>  drivers/cpufreq/amd-pstate.c | 163 +++++++++++++++++++++++++++++++----
>  1 file changed, 147 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 9a1e194d5cf8..050e23594057 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -37,6 +37,7 @@
>  #include <linux/uaccess.h>
>  #include <linux/static_call.h>
>  #include <linux/amd-pstate.h>
> +#include <linux/topology.h>
>  
>  #include <acpi/processor.h>
>  #include <acpi/cppc_acpi.h>
> @@ -49,6 +50,8 @@
>  
>  #define AMD_PSTATE_TRANSITION_LATENCY	20000
>  #define AMD_PSTATE_TRANSITION_DELAY	1000
> +#define AMD_PSTATE_PREFCORE_THRESHOLD	166
> +#define AMD_PSTATE_MAX_CPPC_PERF	255
>  
>  /*
>   * TODO: We need more time to fine tune processors with shared memory solution
> @@ -65,6 +68,12 @@ static struct cpufreq_driver amd_pstate_epp_driver;
>  static int cppc_state = AMD_PSTATE_UNDEFINED;
>  static bool cppc_enabled;
>  
> +/*HW preferred Core featue is supported*/
> +static bool hw_prefcore = true;
> +
> +/*Preferred Core featue is supported*/
> +static bool prefcore = true;
> +
>  /*
>   * AMD Energy Preference Performance (EPP)
>   * The EPP is used in the CCLK DPM controller to drive
> @@ -290,23 +299,21 @@ static inline int amd_pstate_enable(bool enable)
>  static int pstate_init_perf(struct amd_cpudata *cpudata)
>  {
>  	u64 cap1;
> -	u32 highest_perf;
>  
>  	int ret = rdmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1,
>  				     &cap1);
>  	if (ret)
>  		return ret;
>  
> -	/*
> -	 * TODO: Introduce AMD specific power feature.
> -	 *
> -	 * CPPC entry doesn't indicate the highest performance in some ASICs.
> +	/* For platforms that do not support the preferred core feature, the
> +	 * highest_pef may be configured with 166 or 255, to avoid max frequency
> +	 * calculated wrongly. we take the AMD_CPPC_HIGHEST_PERF(cap1) value as
> +	 * the default max perf.
>  	 */
> -	highest_perf = amd_get_highest_perf();
> -	if (highest_perf > AMD_CPPC_HIGHEST_PERF(cap1))
> -		highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
> -
> -	WRITE_ONCE(cpudata->highest_perf, highest_perf);
> +	if (hw_prefcore)
> +		WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
> +	else
> +		WRITE_ONCE(cpudata->highest_perf, AMD_CPPC_HIGHEST_PERF(cap1));
>  
>  	WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1));
>  	WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1));
> @@ -318,17 +325,15 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
>  static int cppc_init_perf(struct amd_cpudata *cpudata)
>  {
>  	struct cppc_perf_caps cppc_perf;
> -	u32 highest_perf;
>  
>  	int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
>  	if (ret)
>  		return ret;
>  
> -	highest_perf = amd_get_highest_perf();
> -	if (highest_perf > cppc_perf.highest_perf)
> -		highest_perf = cppc_perf.highest_perf;
> -
> -	WRITE_ONCE(cpudata->highest_perf, highest_perf);
> +	if (hw_prefcore)
> +		WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
> +	else
> +		WRITE_ONCE(cpudata->highest_perf, cppc_perf.highest_perf);
>  
>  	WRITE_ONCE(cpudata->nominal_perf, cppc_perf.nominal_perf);
>  	WRITE_ONCE(cpudata->lowest_nonlinear_perf,
> @@ -676,6 +681,90 @@ static void amd_perf_ctl_reset(unsigned int cpu)
>  	wrmsrl_on_cpu(cpu, MSR_AMD_PERF_CTL, 0);
>  }
>  
> +/*
> + * Set amd-pstate preferred core enable can't be done directly from cpufreq callbacks
> + * due to locking, so queue the work for later.
> + */
> +static void amd_pstste_sched_prefcore_workfn(struct work_struct *work)
> +{
> +	sched_set_itmt_support();
> +}
> +static DECLARE_WORK(sched_prefcore_work, amd_pstste_sched_prefcore_workfn);
> +
> +/*
> + * Get the highest performance register value.
> + * @cpu: CPU from which to get highest performance.
> + * @highest_perf: Return address.
> + *
> + * Return: 0 for success, -EIO otherwise.
> + */
> +static int amd_pstate_get_highest_perf(int cpu, u32 *highest_perf)
> +{
> +	int ret;
> +	u64 cppc_highest_perf;
> +
> +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> +		u64 cap1;
> +
> +		ret = rdmsrl_safe_on_cpu(cpu, MSR_AMD_CPPC_CAP1, &cap1);
> +		if (ret)
> +			return ret;
> +		WRITE_ONCE(*highest_perf, AMD_CPPC_HIGHEST_PERF(cap1));
> +	} else {
> +		ret = cppc_get_highest_perf(cpu, &cppc_highest_perf);
> +		*highest_perf = (u32)(cppc_highest_perf & 0xFFFF);
> +	}
> +
> +	return (ret);
> +}
> +
> +static void amd_pstate_init_prefcore(unsigned int cpu)
> +{
> +	int ret;
> +	u32 highest_perf;
> +	static u32 max_highest_perf = 0, min_highest_perf = U32_MAX;
> +
> +	if (!prefcore)
> +		return;
> +
> +	ret = amd_pstate_get_highest_perf(cpu, &highest_perf);
> +	if (ret)
> +		return;
> +
> +	/*
> +	 * The priorities can be set regardless of whether or not
> +	 * sched_set_itmt_support(true) has been called and it is valid to
> +	 * update them at any time after it has been called.
> +	 */
> +	sched_set_itmt_core_prio(highest_perf, cpu);
> +
> +	/* check if CPPC preferred core feature is enabled*/
> +	if (highest_perf == AMD_PSTATE_MAX_CPPC_PERF) {
> +		pr_debug("AMD CPPC preferred core is unsupported!\n");
> +		hw_prefcore = false;
> +		prefcore = false;

The problem I commented on in the version linked below is still there.
amd_pstate_init_prefcore() is called from amd_pstate_cpu_init(), which
runs for each CPU, so hw_prefcore/prefcore will be overwritten during
the last CPU's initialization.

https://lore.kernel.org/linux-pm/ZPiEM+gusure7vKy@amd.com/
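
As an illustration only, the concern can be modelled in plain userspace C. This is a minimal sketch of one possible reading of the comment, not driver code: `struct cpu_model`, `hw_prefcore_global`, `init_cpu_shared_flag()` and `init_cpu_private_flag()` are hypothetical names, and only the constants 166 and 255 are taken from the quoted patch. It assumes, purely for demonstration, that CPUs could report different highest-performance values, one of them 255.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values mirror the constants in the quoted patch. */
#define PREFCORE_THRESHOLD 166
#define MAX_CPPC_PERF      255

/* Hypothetical stand-in for a file-scope flag shared by all CPUs. */
static bool hw_prefcore_global = true;

/* Hypothetical per-CPU state; not the driver's struct amd_cpudata. */
struct cpu_model {
	uint32_t raw_highest_perf; /* value reported by the platform */
	uint32_t highest_perf;     /* value the model ends up using */
	bool hw_prefcore;          /* per-CPU copy of the decision */
};

/*
 * Shared-flag variant: each CPU's init may clear the global flag, so
 * the value a CPU sees depends on which CPUs were initialized before it.
 */
static void init_cpu_shared_flag(struct cpu_model *c)
{
	if (c->raw_highest_perf == MAX_CPPC_PERF)
		hw_prefcore_global = false;

	c->highest_perf = hw_prefcore_global ? PREFCORE_THRESHOLD
					     : c->raw_highest_perf;
}

/*
 * Per-CPU variant (an assumption, not the fix adopted upstream): the
 * decision is stored with the CPU, so init order no longer matters.
 */
static void init_cpu_private_flag(struct cpu_model *c)
{
	c->hw_prefcore = (c->raw_highest_perf != MAX_CPPC_PERF);
	c->highest_perf = c->hw_prefcore ? PREFCORE_THRESHOLD
					 : c->raw_highest_perf;
}
```

With three modelled CPUs reporting 200, 255 and 210 in that order, the shared-flag variant caps the first CPU at 166 but leaves the third at 210, because the second CPU cleared the flag in between. The per-CPU variant treats the first and third CPU identically.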

Thanks,
Ray

> +		return;
> +	}
> +
> +	if (max_highest_perf <= min_highest_perf) {
> +		if (highest_perf > max_highest_perf)
> +			max_highest_perf = highest_perf;
> +
> +		if (highest_perf < min_highest_perf)
> +			min_highest_perf = highest_perf;
> +
> +		if (max_highest_perf > min_highest_perf) {
> +			/*
> +			 * This code can be run during CPU online under the
> +			 * CPU hotplug locks, so sched_set_itmt_support()
> +			 * cannot be called from here.  Queue up a work item
> +			 * to invoke it.
> +			 */
> +			schedule_work(&sched_prefcore_work);
> +		}
> +	}
> +}
> +
>  static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>  {
>  	int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
> @@ -697,6 +786,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>  
>  	cpudata->cpu = policy->cpu;
>  
> +	amd_pstate_init_prefcore(policy->cpu);
> +
>  	ret = amd_pstate_init_perf(cpudata);
>  	if (ret)
>  		goto free_cpudata1;
> @@ -763,6 +854,22 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>  	return ret;
>  }
>  
> +static int amd_pstate_cpu_online(struct cpufreq_policy *policy)
> +{
> +	struct amd_cpudata *cpudata = policy->driver_data;
> +
> +	pr_debug("CPU %d going online\n", cpudata->cpu);
> +
> +	amd_pstate_init_prefcore(cpudata->cpu);
> +
> +	return 0;
> +}
> +
> +static int amd_pstate_cpu_offline(struct cpufreq_policy *policy)
> +{
> +	return 0;
> +}
> +
>  static int amd_pstate_cpu_exit(struct cpufreq_policy *policy)
>  {
>  	struct amd_cpudata *cpudata = policy->driver_data;
> @@ -1037,6 +1144,12 @@ static ssize_t status_store(struct device *a, struct device_attribute *b,
>  	return ret < 0 ? ret : count;
>  }
>  
> +static ssize_t prefcore_show(struct device *dev,
> +			     struct device_attribute *attr, char *buf)
> +{
> +	return sysfs_emit(buf, "%s\n", prefcore ? "enabled" : "disabled");
> +}
> +
>  cpufreq_freq_attr_ro(amd_pstate_max_freq);
>  cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
>  
> @@ -1044,6 +1157,7 @@ cpufreq_freq_attr_ro(amd_pstate_highest_perf);
>  cpufreq_freq_attr_rw(energy_performance_preference);
>  cpufreq_freq_attr_ro(energy_performance_available_preferences);
>  static DEVICE_ATTR_RW(status);
> +static DEVICE_ATTR_RO(prefcore);
>  
>  static struct freq_attr *amd_pstate_attr[] = {
>  	&amd_pstate_max_freq,
> @@ -1063,6 +1177,7 @@ static struct freq_attr *amd_pstate_epp_attr[] = {
>  
>  static struct attribute *pstate_global_attributes[] = {
>  	&dev_attr_status.attr,
> +	&dev_attr_prefcore.attr,
>  	NULL
>  };
>  
> @@ -1114,6 +1229,8 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>  	cpudata->cpu = policy->cpu;
>  	cpudata->epp_policy = 0;
>  
> +	amd_pstate_init_prefcore(policy->cpu);
> +
>  	ret = amd_pstate_init_perf(cpudata);
>  	if (ret)
>  		goto free_cpudata1;
> @@ -1285,6 +1402,8 @@ static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
>  
>  	pr_debug("AMD CPU Core %d going online\n", cpudata->cpu);
>  
> +	amd_pstate_init_prefcore(cpudata->cpu);
> +
>  	if (cppc_state == AMD_PSTATE_ACTIVE) {
>  		amd_pstate_epp_reenable(cpudata);
>  		cpudata->suspended = false;
> @@ -1389,6 +1508,8 @@ static struct cpufreq_driver amd_pstate_driver = {
>  	.fast_switch    = amd_pstate_fast_switch,
>  	.init		= amd_pstate_cpu_init,
>  	.exit		= amd_pstate_cpu_exit,
> +	.offline	= amd_pstate_cpu_offline,
> +	.online		= amd_pstate_cpu_online,
>  	.suspend	= amd_pstate_cpu_suspend,
>  	.resume		= amd_pstate_cpu_resume,
>  	.set_boost	= amd_pstate_set_boost,
> @@ -1527,7 +1648,17 @@ static int __init amd_pstate_param(char *str)
>  
>  	return amd_pstate_set_driver(mode_idx);
>  }
> +
> +static int __init amd_prefcore_param(char *str)
> +{
> +	if (!strcmp(str, "disable"))
> +		prefcore = false;
> +
> +	return 0;
> +}
> +
>  early_param("amd_pstate", amd_pstate_param);
> +early_param("amd_prefcore", amd_prefcore_param);
>  
>  MODULE_AUTHOR("Huang Rui <ray.huang@amd.com>");
>  MODULE_DESCRIPTION("AMD Processor P-state Frequency Driver");
> -- 
> 2.34.1
> 

* Re: [PATCH V7 0/7] amd-pstate preferred core
  2023-09-18  8:14 [PATCH V7 0/7] amd-pstate preferred core Meng Li
                   ` (9 preceding siblings ...)
  2023-09-19 19:01 ` Oleksandr Natalenko
@ 2023-09-20  2:50 ` Huang Rui
  10 siblings, 0 replies; 19+ messages in thread
From: Huang Rui @ 2023-09-20  2:50 UTC (permalink / raw)
  To: Meng, Li (Jassmine)
  Cc: Rafael J . Wysocki, linux-pm, linux-kernel, x86, linux-acpi,
	Shuah Khan, linux-kselftest, Fontenot, Nathan, Sharma, Deepak,
	Deucher, Alexander, Limonciello, Mario, Huang, Shimmer, Yuan,
	Perry, Du, Xiaojian, Viresh Kumar, Borislav Petkov

On Mon, Sep 18, 2023 at 04:14:00PM +0800, Meng, Li (Jassmine) wrote:
> Hi all:
> 
> The core frequency is subjected to the process variation in semiconductors.
> Not all cores are able to reach the maximum frequency respecting the
> infrastructure limits. Consequently, AMD has redefined the concept of
> maximum frequency of a part. This means that a fraction of cores can reach
> maximum frequency. To find the best process scheduling policy for a given
> scenario, OS needs to know the core ordering informed by the platform through
> highest performance capability register of the CPPC interface.
> 
> Earlier implementations of amd-pstate preferred core only support a static
> core ranking and targeted performance. Now it has the ability to dynamically
> change the preferred core based on the workload and platform conditions and
> accounting for thermals and aging.
> 
> Amd-pstate driver utilizes the functions and data structures provided by
> the ITMT architecture to enable the scheduler to favor scheduling on cores
> which can be get a higher frequency with lower voltage.
> We call it amd-pstate preferred core.
> 
> Here sched_set_itmt_core_prio() is called to set priorities and
> sched_set_itmt_support() is called to enable ITMT feature.
> Amd-pstate driver uses the highest performance value to indicate
> the priority of CPU. The higher value has a higher priority.
> 
> Amd-pstate driver will provide an initial core ordering at boot time.
> It relies on the CPPC interface to communicate the core ranking to the
> operating system and scheduler to make sure that OS is choosing the cores
> with highest performance firstly for scheduling the process. When amd-pstate
> driver receives a message with the highest performance change, it will
> update the core ranking.
> 
> Changes form V6->V7:
> - x86:
> - - Modify kconfig about X86_AMD_PSTATE.
> - cpufreq: amd-pstate:
> - - modify incorrect comments about scheduler_work().
> - - convert highest_perf data type.
> - - modify preferred core init when cpu init and online.
> - acpi: cppc:
> - - modify link of CPPC highest performance.
> - cpufreq:
> - - modify link of CPPC highest performance changed.
> 
> Changes form V5->V6:
> - cpufreq: amd-pstate:
> - - modify the wrong tag order.
> - - modify warning about hw_prefcore sysfs attribute.
> - - delete duplicate comments.
> - - modify the variable name cppc_highest_perf to prefcore_ranking.
> - - modify judgment conditions for setting highest_perf.
> - - modify sysfs attribute for CPPC highest perf to pr_debug message.
> - Documentation: amd-pstate:
> - - modify warning: title underline too short.

Apart from the comment on patch 3, the others look good to me.

Please feel free to add my RB to the other patches:

Reviewed-by: Huang Rui <ray.huang@amd.com>

> 
> Changes form V4->V5:
> - cpufreq: amd-pstate:
> - - modify sysfs attribute for CPPC highest perf.
> - - modify warning about comments
> - - rebase linux-next
> - cpufreq: 
> - - Moidfy warning about function declarations.
> - Documentation: amd-pstate:
> - - align with ``amd-pstat``
> 
> Changes form V3->V4:
> - Documentation: amd-pstate:
> - - Modify inappropriate descriptions.
> 
> Changes form V2->V3:
> - x86:
> - - Modify kconfig and description.
> - cpufreq: amd-pstate: 
> - - Add Co-developed-by tag in commit message.
> - cpufreq:
> - - Modify commit message.
> - Documentation: amd-pstate:
> - - Modify inappropriate descriptions.
> 
> Changes form V1->V2:
> - acpi: cppc:
> - - Add reference link.
> - cpufreq:
> - - Moidfy link error.
> - cpufreq: amd-pstate: 
> - - Init the priorities of all online CPUs
> - - Use a single variable to represent the status of preferred core.
> - Documentation:
> - - Default enabled preferred core.
> - Documentation: amd-pstate: 
> - - Modify inappropriate descriptions.
> - - Default enabled preferred core.
> - - Use a single variable to represent the status of preferred core.
> 
> Meng Li (7):
>   x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion.
>   acpi: cppc: Add get the highest performance cppc control
>   cpufreq: amd-pstate: Enable amd-pstate preferred core supporting.
>   cpufreq: Add a notification message that the highest perf has changed
>   cpufreq: amd-pstate: Update amd-pstate preferred core ranking
>     dynamically
>   Documentation: amd-pstate: introduce amd-pstate preferred core
>   Documentation: introduce amd-pstate preferrd core mode kernel command
>     line options
> 
>  .../admin-guide/kernel-parameters.txt         |   5 +
>  Documentation/admin-guide/pm/amd-pstate.rst   |  58 +++++-
>  arch/x86/Kconfig                              |   5 +-
>  drivers/acpi/cppc_acpi.c                      |  13 ++
>  drivers/acpi/processor_driver.c               |   6 +
>  drivers/cpufreq/amd-pstate.c                  | 197 ++++++++++++++++--
>  drivers/cpufreq/cpufreq.c                     |  13 ++
>  include/acpi/cppc_acpi.h                      |   5 +
>  include/linux/amd-pstate.h                    |   6 +
>  include/linux/cpufreq.h                       |   5 +
>  10 files changed, 291 insertions(+), 22 deletions(-)
> 
> -- 
> 2.34.1
> 

* Re: [PATCH V7 0/7] amd-pstate preferred core
  2023-09-19 19:01 ` Oleksandr Natalenko
@ 2023-09-20 16:56   ` Mario Limonciello
  2023-09-20 19:34     ` Oleksandr Natalenko
  0 siblings, 1 reply; 19+ messages in thread
From: Mario Limonciello @ 2023-09-20 16:56 UTC (permalink / raw)
  To: Oleksandr Natalenko, Huang Rui, Meng Li
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Shimmer Huang, Perry Yuan, Xiaojian Du, Viresh Kumar,
	Borislav Petkov, Rafael J . Wysocki

On 9/19/2023 14:01, Oleksandr Natalenko wrote:
>> Meng Li (7):
>>    x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion.
>>    acpi: cppc: Add get the highest performance cppc control
>>    cpufreq: amd-pstate: Enable amd-pstate preferred core supporting.
>>    cpufreq: Add a notification message that the highest perf has changed
>>    cpufreq: amd-pstate: Update amd-pstate preferred core ranking
>>      dynamically
>>    Documentation: amd-pstate: introduce amd-pstate preferred core
>>    Documentation: introduce amd-pstate preferrd core mode kernel command
>>      line options
>>
>>   .../admin-guide/kernel-parameters.txt         |   5 +
>>   Documentation/admin-guide/pm/amd-pstate.rst   |  58 +++++-
>>   arch/x86/Kconfig                              |   5 +-
>>   drivers/acpi/cppc_acpi.c                      |  13 ++
>>   drivers/acpi/processor_driver.c               |   6 +
>>   drivers/cpufreq/amd-pstate.c                  | 197 ++++++++++++++++--
>>   drivers/cpufreq/cpufreq.c                     |  13 ++
>>   include/acpi/cppc_acpi.h                      |   5 +
>>   include/linux/amd-pstate.h                    |   6 +
>>   include/linux/cpufreq.h                       |   5 +
>>   10 files changed, 291 insertions(+), 22 deletions(-)
> 
> When applied on top of v6.5.3 this breaks turbo on my 5950X after suspend/resume cycle. Please see the scenario description below.
> 
> If I boot v6.5.3 + this patchset, then `turbostat` reports ~4.9 GHz on core 0 where `taskset -c 0 dd if=/dev/zero of=/dev/null` is being run.
> 
> After I suspend the machine and then resume it, and run `dd` again, `turbostat` reports the core to be capped to a stock frequency of ~3.4 GHz. Rebooting the machine fixes this, and the CPU can boost again.
> 
> If this patchset is reverted, then the CPU can turbo after suspend/resume cycle just fine.
> 
> I'm using `amd_pstate=guided`.
> 
> Is this behaviour expected?

To help confirm where the issue is, can I ask you to do three 
experiments with the patch series applied:

1) 'amd_pstate=active' on your kernel command line.
2) 'amd_pstate=active amd_prefcore=disable' on your kernel command line.
3) 'amd_pstate=guided amd_prefcore=disable' on your kernel command line.

Looking through the code, I anticipate from your report that it 
reproduces on "1" but not "2" and "3".

Meng,

Can you try to repro?

I think that it's probably a call to amd_pstate_init_prefcore() missing
from amd_pstate_cpu_resume() and also amd_pstate_epp_resume().
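
To make the hypothesis concrete, here is a toy userspace model of which lifecycle paths run the preferred-core init hook. It is a sketch under stated assumptions, not kernel code: `toy_cpu`, `toy_init_prefcore()`, `toy_handle_event()` and the `init_on_resume` switch are hypothetical names. The only facts taken from the quoted patch are that the init and online callbacks call amd_pstate_init_prefcore() and that the resume callbacks are not touched. The model does not explain why the frequency ends up capped; it only counts hook invocations.

```c
#include <stdbool.h>

/* Hypothetical lifecycle events, loosely mirroring cpufreq callbacks. */
enum lifecycle_event { EV_INIT, EV_ONLINE, EV_SUSPEND, EV_RESUME };

/* Hypothetical per-CPU bookkeeping for the model. */
struct toy_cpu {
	int prefcore_init_calls; /* how often the init hook has run */
};

/* Stand-in for the preferred-core init hook; it only counts calls. */
static void toy_init_prefcore(struct toy_cpu *c)
{
	c->prefcore_init_calls++;
}

/*
 * Dispatch one event. With init_on_resume == false the model matches
 * the patch as posted (hook runs on init and online only); with true
 * it matches the change suggested in the message above.
 */
static void toy_handle_event(struct toy_cpu *c, enum lifecycle_event ev,
			     bool init_on_resume)
{
	switch (ev) {
	case EV_INIT:
	case EV_ONLINE:
		toy_init_prefcore(c);
		break;
	case EV_RESUME:
		if (init_on_resume)
			toy_init_prefcore(c);
		break;
	case EV_SUSPEND:
		break;
	}
}
```

Feeding the sequence init, suspend, resume through the model runs the hook once when `init_on_resume` is false and twice when it is true, which is the difference the suggested change would make on the resume path.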


* Re: [PATCH V7 0/7] amd-pstate preferred core
  2023-09-20 16:56   ` Mario Limonciello
@ 2023-09-20 19:34     ` Oleksandr Natalenko
  2023-09-20 20:11       ` Mario Limonciello
  0 siblings, 1 reply; 19+ messages in thread
From: Oleksandr Natalenko @ 2023-09-20 19:34 UTC (permalink / raw)
  To: Huang Rui, Meng Li, Mario Limonciello
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Shimmer Huang, Perry Yuan, Xiaojian Du, Viresh Kumar,
	Borislav Petkov, Rafael J . Wysocki

Hello.

On Wednesday, 20 September 2023 18:56:09 CEST Mario Limonciello wrote:
> > When applied on top of v6.5.3 this breaks turbo on my 5950X after suspend/resume cycle. Please see the scenario description below.
> > 
> > If I boot v6.5.3 + this patchset, then `turbostat` reports ~4.9 GHz on core 0 where `taskset -c 0 dd if=/dev/zero of=/dev/null` is being run.
> > 
> > After I suspend the machine and then resume it, and run `dd` again, `turbostat` reports the core to be capped to a stock frequency of ~3.4 GHz. Rebooting the machine fixes this, and the CPU can boost again.
> > 
> > If this patchset is reverted, then the CPU can turbo after suspend/resume cycle just fine.
> > 
> > I'm using `amd_pstate=guided`.
> > 
> > Is this behaviour expected?
> 
> To help confirm where the issue is, can I ask you to do three 
> experiments with the patch series applied:
> 
> 1) 'amd_pstate=active' on your kernel command line.

The issue is reproducible. If I toggle the governor in cpupower to `powersave` and back to `performance`, boost is restored.

> 2) 'amd_pstate=active amd_prefcore=disable' on your kernel command line.

The issue is not reproducible.

> 3) 'amd_pstate=guided amd_prefcore=disable' on your kernel command line.

The issue is not reproducible.

I should also mention that in my initial configuration I use `amd_pstate=guided` and `schedutil`. If I switch to `performance` after a suspend/resume cycle, the boost is restored. However, if I switch back to `schedutil`, the frequency is capped again.

Does this info help?

> Looking through the code, I anticipate from your report that it 
> reproduces on "1" but not "2" and "3".
> 
> Meng,
> 
> Can you try to repro?
> 
> I think that it's probably a call to amd_pstate_init_prefcore() missing
> from amd_pstate_cpu_resume() and also amd_pstate_epp_resume().

-- 
Oleksandr Natalenko (post-factum)

* Re: [PATCH V7 0/7] amd-pstate preferred core
  2023-09-20 19:34     ` Oleksandr Natalenko
@ 2023-09-20 20:11       ` Mario Limonciello
  2023-09-21  5:51         ` Meng, Li (Jassmine)
  0 siblings, 1 reply; 19+ messages in thread
From: Mario Limonciello @ 2023-09-20 20:11 UTC (permalink / raw)
  To: Oleksandr Natalenko, Huang Rui, Meng Li
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Shimmer Huang, Perry Yuan, Xiaojian Du, Viresh Kumar,
	Borislav Petkov, Rafael J . Wysocki

On 9/20/2023 14:34, Oleksandr Natalenko wrote:
> Hello.
> 
> On Wednesday, 20 September 2023 18:56:09 CEST Mario Limonciello wrote:
>>> When applied on top of v6.5.3 this breaks turbo on my 5950X after suspend/resume cycle. Please see the scenario description below.
>>>
>>> If I boot v6.5.3 + this patchset, then `turbostat` reports ~4.9 GHz on core 0 where `taskset -c 0 dd if=/dev/zero of=/dev/null` is being run.
>>>
>>> After I suspend the machine and then resume it, and run `dd` again, `turbostat` reports the core to be capped to a stock frequency of ~3.4 GHz. Rebooting the machine fixes this, and the CPU can boost again.
>>>
>>> If this patchset is reverted, then the CPU can turbo after suspend/resume cycle just fine.
>>>
>>> I'm using `amd_pstate=guided`.
>>>
>>> Is this behaviour expected?
>>
>> To help confirm where the issue is, can I ask you to do three
>> experiments with the patch series applied:
>>
>> 1) 'amd_pstate=active' on your kernel command line.
> 
> The issue is reproducible. If I toggle the governor in cpupower to `powersave` and back to `performance`, boost is restored.
> 
>> 2) 'amd_pstate=active amd_prefcore=disable' on your kernel command line.
> 
> The issue is not reproducible.
> 
>> 3) 'amd_pstate=guided amd_prefcore=disable' on your kernel command line.
> 
> The issue is not reproducible.
> 
> I should also mention that in my initial configuration I use `amd_pstate=guided` and `schedutil`. If I switch to `performance` after suspend-resume cycle, the boost is restored. However, if I switch back to `schedutil`, the freq is capped.
> 
> Does this info help?
> 

Yeah, it matches my expectations for this issue you reported.
Thanks!

Jassmine can dig into a fix for another spin of this series.

>> Looking through the code, I anticipate from your report that it
>> reproduces on "1" but not "2" and "3".
>>
>> Meng,
>>
>> Can you try to repro?
>>
>> I think that it's probably a call to amd_pstate_init_prefcore() missing
>> from amd_pstate_cpu_resume() and also amd_pstate_epp_resume().
> 


* RE: [PATCH V7 0/7] amd-pstate preferred core
  2023-09-20 20:11       ` Mario Limonciello
@ 2023-09-21  5:51         ` Meng, Li (Jassmine)
  0 siblings, 0 replies; 19+ messages in thread
From: Meng, Li (Jassmine) @ 2023-09-21  5:51 UTC (permalink / raw)
  To: Limonciello, Mario, Oleksandr Natalenko, Huang, Ray
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Fontenot, Nathan, Sharma, Deepak, Deucher,
	Alexander, Huang, Shimmer, Yuan, Perry, Du, Xiaojian,
	Viresh Kumar, Borislav Petkov, Rafael J . Wysocki

Hi Natalenko and Mario:

> -----Original Message-----
> From: Limonciello, Mario <Mario.Limonciello@amd.com>
> Sent: Thursday, September 21, 2023 4:12 AM
> To: Oleksandr Natalenko <oleksandr@natalenko.name>; Huang, Ray
> <Ray.Huang@amd.com>; Meng, Li (Jassmine) <Li.Meng@amd.com>
> Cc: linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org;
> x86@kernel.org; linux-acpi@vger.kernel.org; Shuah Khan
> <skhan@linuxfoundation.org>; linux-kselftest@vger.kernel.org; Fontenot,
> Nathan <Nathan.Fontenot@amd.com>; Sharma, Deepak
> <Deepak.Sharma@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Huang, Shimmer
> <Shimmer.Huang@amd.com>; Yuan, Perry <Perry.Yuan@amd.com>; Du,
> Xiaojian <Xiaojian.Du@amd.com>; Viresh Kumar <viresh.kumar@linaro.org>;
> Borislav Petkov <bp@alien8.de>; Rafael J . Wysocki
> <rafael.j.wysocki@intel.com>
> Subject: Re: [PATCH V7 0/7] amd-pstate preferred core
>
> On 9/20/2023 14:34, Oleksandr Natalenko wrote:
> > Hello.
> >
> > On Wednesday, 20 September 2023 18:56:09 CEST Mario Limonciello wrote:
> >>> When applied on top of v6.5.3 this breaks turbo on my 5950X after suspend/resume cycle. Please see the scenario description below.
> >>>
> >>> If I boot v6.5.3 + this patchset, then `turbostat` reports ~4.9 GHz on core 0 where `taskset -c 0 dd if=/dev/zero of=/dev/null` is being run.
> >>>
> >>> After I suspend the machine and then resume it, and run `dd` again, `turbostat` reports the core to be capped to a stock frequency of ~3.4 GHz. Rebooting the machine fixes this, and the CPU can boost again.
> >>>
> >>> If this patchset is reverted, then the CPU can turbo after suspend/resume cycle just fine.
> >>>
> >>> I'm using `amd_pstate=guided`.
> >>>
> >>> Is this behaviour expected?
> >>
> >> To help confirm where the issue is, can I ask you to do three
> >> experiments with the patch series applied:
> >>
> >> 1) 'amd_pstate=active' on your kernel command line.
> >
> > The issue is reproducible. If I toggle the governor in cpupower to `powersave` and back to `performance`, boost is restored.
> >
> >> 2) 'amd_pstate=active amd_prefcore=disable' on your kernel command line.
> >
> > The issue is not reproducible.
> >
> >> 3) 'amd_pstate=guided amd_prefcore=disable' on your kernel command line.
> >
> > The issue is not reproducible.
> >
> > I should also mention that in my initial configuration I use `amd_pstate=guided` and `schedutil`. If I switch to `performance` after suspend-resume cycle, the boost is restored. However, if I switch back to `schedutil`, the freq is capped.
> >
> > Does this info help?
> >
>
> Yeah, it matches my expectations for this issue you reported.
> Thanks!
>
> Jassmine can dig into a fix for another spin of this series.
[Meng, Li (Jassmine)]
Thank you very much!
I will fix this issue in the next version of the series.
>
> >> Looking through the code, I anticipate from your report that it
> >> reproduces on "1" but not "2" and "3".
> >>
> >> Meng,
> >>
> >> Can you try to repro?
> >>
> >> I think that it's probably a call to amd_pstate_init_prefcore() missing
> >> from amd_pstate_cpu_resume() and also amd_pstate_epp_resume().
> >


Thread overview: 19+ messages
2023-09-18  8:14 [PATCH V7 0/7] amd-pstate preferred core Meng Li
2023-09-18  8:14 ` [PATCH V7 1/7] x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion Meng Li
2023-09-18  8:14 ` [PATCH V7 2/7] acpi: cppc: Add get the highest performance cppc control Meng Li
2023-09-18  8:14 ` [PATCH V7 3/7] cpufreq: amd-pstate: Enable amd-pstate preferred core supporting Meng Li
2023-09-20  2:43   ` Huang Rui
2023-09-18  8:14 ` [PATCH V7 4/7] cpufreq: Add a notification message that the highest perf has changed Meng Li
2023-09-18  8:14 ` [PATCH V7 5/7] cpufreq: amd-pstate: Update amd-pstate preferred core ranking dynamically Meng Li
2023-09-18  8:14 ` [PATCH V7 6/7] Documentation: amd-pstate: introduce amd-pstate preferred core Meng Li
2023-09-18  8:14 ` [PATCH V7 7/7] Documentation: introduce amd-pstate preferrd core mode kernel command line options Meng Li
2023-09-18 17:40 ` [PATCH V7 0/7] amd-pstate preferred core Mario Limonciello
2023-09-19  0:50   ` Meng, Li (Jassmine)
2023-09-18 17:44 ` Shuah Khan
2023-09-18 18:23   ` Shuah
2023-09-19 19:01 ` Oleksandr Natalenko
2023-09-20 16:56   ` Mario Limonciello
2023-09-20 19:34     ` Oleksandr Natalenko
2023-09-20 20:11       ` Mario Limonciello
2023-09-21  5:51         ` Meng, Li (Jassmine)
2023-09-20  2:50 ` Huang Rui
