linux-acpi.vger.kernel.org archive mirror
* [PATCH V14 0/7] amd-pstate preferred core
@ 2024-01-19  9:04 Meng Li
  2024-01-19  9:04 ` [PATCH V14 1/7] x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion Meng Li
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Meng Li @ 2024-01-19  9:04 UTC (permalink / raw)
  To: Rafael J . Wysocki, Borislav Petkov, Huang Rui
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
	Viresh Kumar, Borislav Petkov, Oleksandr Natalenko, Meng Li

Hi all:

Core frequency is subject to process variation in semiconductors.
Not all cores can reach the maximum frequency while respecting the
infrastructure limits. Consequently, AMD has redefined the concept of the
maximum frequency of a part: only a fraction of the cores can reach it.
To find the best process scheduling policy for a given scenario, the OS
needs to know the core ordering, which the platform reports through the
highest performance capability register of the CPPC interface.

Earlier implementations of amd-pstate preferred core only supported a static
core ranking and targeted performance. The platform can now change the
preferred core dynamically, based on the workload and platform conditions and
accounting for thermals and aging.

The amd-pstate driver utilizes the functions and data structures provided by
the ITMT architecture to enable the scheduler to favor scheduling on cores
which can reach a higher frequency at a lower voltage.
We call it amd-pstate preferred core.

Here sched_set_itmt_core_prio() is called to set priorities and
sched_set_itmt_support() is called to enable the ITMT feature.
The amd-pstate driver uses the highest performance value to indicate
the priority of a CPU. A higher value means a higher priority.

The amd-pstate driver provides an initial core ordering at boot time.
It relies on the CPPC interface to communicate the core ranking to the
operating system and scheduler, so that the OS chooses the cores with the
highest performance first when scheduling a process. When the amd-pstate
driver receives a notification that the highest performance has changed,
it updates the core ranking.
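
The ordering described above can be sketched in userspace C. This is only
an illustrative model, not driver code: struct core_rank, cmp_core_rank()
and rank_cores() are hypothetical names, and the tie-break on CPU number is
an assumption made so that the result is deterministic.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model: one entry per CPU. */
struct core_rank {
	int cpu;                   /* logical CPU number */
	unsigned int highest_perf; /* CPPC highest performance value */
};

/*
 * qsort() comparator: a higher highest_perf sorts first. Ties are broken
 * by the lower CPU number (an assumption of this sketch).
 */
static int cmp_core_rank(const void *a, const void *b)
{
	const struct core_rank *x = a;
	const struct core_rank *y = b;

	if (x->highest_perf != y->highest_perf)
		return x->highest_perf > y->highest_perf ? -1 : 1;
	return (x->cpu > y->cpu) - (x->cpu < y->cpu);
}

/* Sort the array in place so that the most preferred core comes first. */
static void rank_cores(struct core_rank *cores, size_t n)
{
	qsort(cores, n, sizeof(*cores), cmp_core_rank);
}
```

In the driver itself no sorting takes place: each CPU's value is passed to
sched_set_itmt_core_prio() and the scheduler does the comparison.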

Changes from V13->V14:
- cpufreq:
- - fix build error without CONFIG_CPU_FREQ

Changes from V12->V13:
- ACPI: CPPC:
- - modify commit message.
- - modify handle function of the notify(0x85).
- cpufreq: amd-pstate:
- - implement update_limits() callback function.
- x86:
- - pick up Acked-By flag added by Petkov.

Changes from V11->V12:
- all:
- - pick up Reviewed-By flag added by Perry.
- cpufreq: amd-pstate:
- - rebase the latest linux-next and fixed conflicts.
- - fixed the issue about cpudata without init in amd_pstate_update_highest_perf().

Changes from V10->V11:
- cpufreq: amd-pstate:
- - per Perry's comments, replace the string with str_enabled_disabled().

Changes from V9->V10:
- cpufreq: amd-pstate:
- - add a check for highest_perf. When it is less than 255, the
  preferred core feature is enabled and the priority is set.
- - delete "static u32 max_highest_perf" etc., because amd-pstate
  preferred core does not require special handling for hotplug.

Changes from V8->V9:
- all:
- - pick up Tested-By flag added by Oleksandr.
- cpufreq: amd-pstate:
- - pick up Reviewed-By flag added by Wyes.
- - ignore modification of bug.
- - add an attribute of prefcore_ranking.
- - modify data type conversion from u32 to int.
- Documentation: amd-pstate:
- - pick up Reviewed-By flag added by Wyes.

Changes from V7->V8:
- all:
- - pick up Reviewed-By flag added by Mario and Ray.
- cpufreq: amd-pstate:
- - embed hw_prefcore into the cpudata structure.
- - delete preferred core init from cpu online/off.

Changes from V6->V7:
- x86:
- - Modify kconfig about X86_AMD_PSTATE.
- cpufreq: amd-pstate:
- - modify incorrect comments about scheduler_work().
- - convert highest_perf data type.
- - modify preferred core init when cpu init and online.
- ACPI: CPPC:
- - modify link of CPPC highest performance.
- cpufreq:
- - modify link of CPPC highest performance changed.

Changes from V5->V6:
- cpufreq: amd-pstate:
- - modify the wrong tag order.
- - modify warning about hw_prefcore sysfs attribute.
- - delete duplicate comments.
- - modify the variable name cppc_highest_perf to prefcore_ranking.
- - modify judgment conditions for setting highest_perf.
- - modify sysfs attribute for CPPC highest perf to pr_debug message.
- Documentation: amd-pstate:
- - modify warning: title underline too short.

Changes from V4->V5:
- cpufreq: amd-pstate:
- - modify sysfs attribute for CPPC highest perf.
- - modify warning about comments
- - rebase linux-next
- cpufreq:
- - Modify warning about function declarations.
- Documentation: amd-pstate:
- - align with ``amd-pstate``

Changes from V3->V4:
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.

Changes from V2->V3:
- x86:
- - Modify kconfig and description.
- cpufreq: amd-pstate: 
- - Add Co-developed-by tag in commit message.
- cpufreq:
- - Modify commit message.
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.

Changes from V1->V2:
- ACPI: CPPC:
- - Add reference link.
- cpufreq:
- - Modify link error.
- cpufreq: amd-pstate: 
- - Init the priorities of all online CPUs
- - Use a single variable to represent the status of preferred core.
- Documentation:
- - Default enabled preferred core.
- Documentation: amd-pstate: 
- - Modify inappropriate descriptions.
- - Default enabled preferred core.
- - Use a single variable to represent the status of preferred core.

Meng Li (7):
  x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion.
  ACPI: CPPC: Add get the highest performance cppc control
  cpufreq: amd-pstate: Enable amd-pstate preferred core supporting.
  cpufreq: Add a notification message that the highest perf has changed
  cpufreq: amd-pstate: Update amd-pstate preferred core ranking
    dynamically
  Documentation: amd-pstate: introduce amd-pstate preferred core
  Documentation: introduce amd-pstate preferrd core mode kernel command
    line options

 .../admin-guide/kernel-parameters.txt         |   5 +
 Documentation/admin-guide/pm/amd-pstate.rst   |  59 +++++-
 arch/x86/Kconfig                              |   5 +-
 drivers/acpi/cppc_acpi.c                      |  13 ++
 drivers/acpi/processor_driver.c               |   6 +
 drivers/cpufreq/amd-pstate.c                  | 183 +++++++++++++++++-
 include/acpi/cppc_acpi.h                      |   5 +
 include/linux/amd-pstate.h                    |  10 +
 include/linux/cpufreq.h                       |   1 +
 9 files changed, 275 insertions(+), 12 deletions(-)

-- 
2.34.1



* [PATCH V14 1/7] x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion.
  2024-01-19  9:04 [PATCH V14 0/7] amd-pstate preferred core Meng Li
@ 2024-01-19  9:04 ` Meng Li
  2024-01-19  9:04 ` [PATCH V14 2/7] ACPI: CPPC: Add get the highest performance cppc control Meng Li
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Meng Li @ 2024-01-19  9:04 UTC (permalink / raw)
  To: Rafael J . Wysocki, Borislav Petkov, Huang Rui
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
	Viresh Kumar, Borislav Petkov, Oleksandr Natalenko, Meng Li,
	Perry Yuan

The amd-pstate driver also uses SCHED_MC_PRIO, so drop the CPU_SUP_INTEL
requirement from its dependencies to allow compilation in kernels
without Intel CPU support.

Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Meng Li <li.meng@amd.com>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
---
 arch/x86/Kconfig | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 53f2e7797b1d..8dfb08ceb6ec 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1051,8 +1051,9 @@ config SCHED_MC
 
 config SCHED_MC_PRIO
 	bool "CPU core priorities scheduler support"
-	depends on SCHED_MC && CPU_SUP_INTEL
-	select X86_INTEL_PSTATE
+	depends on SCHED_MC
+	select X86_INTEL_PSTATE if CPU_SUP_INTEL
+	select X86_AMD_PSTATE if CPU_SUP_AMD && ACPI
 	select CPU_FREQ
 	default y
 	help
-- 
2.34.1



* [PATCH V14 2/7] ACPI: CPPC: Add get the highest performance cppc control
  2024-01-19  9:04 [PATCH V14 0/7] amd-pstate preferred core Meng Li
  2024-01-19  9:04 ` [PATCH V14 1/7] x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion Meng Li
@ 2024-01-19  9:04 ` Meng Li
  2024-01-19  9:04 ` [PATCH V14 3/7] cpufreq: amd-pstate: Enable amd-pstate preferred core supporting Meng Li
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Meng Li @ 2024-01-19  9:04 UTC (permalink / raw)
  To: Rafael J . Wysocki, Borislav Petkov, Huang Rui
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
	Viresh Kumar, Borislav Petkov, Oleksandr Natalenko, Meng Li,
	Wyes Karny, Perry Yuan

Add support for getting the highest performance to the
generic CPPC driver. This enables downstream drivers
such as amd-pstate to discover and use these values.

Please refer to Chapter 8.4.6.1.1.1. Highest Performance
of ACPI Specification 6.5 for details on continuous
performance control of CPPC. Also see the Link below.

Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Meng Li <li.meng@amd.com>
Link: https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html?highlight=cppc#highest-performance
---
 drivers/acpi/cppc_acpi.c | 13 +++++++++++++
 include/acpi/cppc_acpi.h |  5 +++++
 2 files changed, 18 insertions(+)

diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index d155a86a8614..a50e70abdf19 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -1157,6 +1157,19 @@ int cppc_get_nominal_perf(int cpunum, u64 *nominal_perf)
 	return cppc_get_perf(cpunum, NOMINAL_PERF, nominal_perf);
 }
 
+/**
+ * cppc_get_highest_perf - Get the highest performance register value.
+ * @cpunum: CPU from which to get highest performance.
+ * @highest_perf: Return address.
+ *
+ * Return: 0 for success, -EIO otherwise.
+ */
+int cppc_get_highest_perf(int cpunum, u64 *highest_perf)
+{
+	return cppc_get_perf(cpunum, HIGHEST_PERF, highest_perf);
+}
+EXPORT_SYMBOL_GPL(cppc_get_highest_perf);
+
 /**
  * cppc_get_epp_perf - Get the epp register value.
  * @cpunum: CPU from which to get epp preference value.
diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
index 3a0995f8bce8..930b6afba6f4 100644
--- a/include/acpi/cppc_acpi.h
+++ b/include/acpi/cppc_acpi.h
@@ -139,6 +139,7 @@ struct cppc_cpudata {
 #ifdef CONFIG_ACPI_CPPC_LIB
 extern int cppc_get_desired_perf(int cpunum, u64 *desired_perf);
 extern int cppc_get_nominal_perf(int cpunum, u64 *nominal_perf);
+extern int cppc_get_highest_perf(int cpunum, u64 *highest_perf);
 extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs);
 extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls);
 extern int cppc_set_enable(int cpu, bool enable);
@@ -167,6 +168,10 @@ static inline int cppc_get_nominal_perf(int cpunum, u64 *nominal_perf)
 {
 	return -ENOTSUPP;
 }
+static inline int cppc_get_highest_perf(int cpunum, u64 *highest_perf)
+{
+	return -ENOTSUPP;
+}
 static inline int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs)
 {
 	return -ENOTSUPP;
-- 
2.34.1



* [PATCH V14 3/7] cpufreq: amd-pstate: Enable amd-pstate preferred core supporting.
  2024-01-19  9:04 [PATCH V14 0/7] amd-pstate preferred core Meng Li
  2024-01-19  9:04 ` [PATCH V14 1/7] x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion Meng Li
  2024-01-19  9:04 ` [PATCH V14 2/7] ACPI: CPPC: Add get the highest performance cppc control Meng Li
@ 2024-01-19  9:04 ` Meng Li
  2024-01-19  9:04 ` [PATCH V14 4/7] cpufreq: Add a notification message that the highest perf has changed Meng Li
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Meng Li @ 2024-01-19  9:04 UTC (permalink / raw)
  To: Rafael J . Wysocki, Borislav Petkov, Huang Rui
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
	Viresh Kumar, Borislav Petkov, Oleksandr Natalenko, Meng Li,
	Wyes Karny

The amd-pstate driver utilizes the functions and data structures
provided by the ITMT architecture to enable the scheduler to
favor scheduling on cores which can reach a higher frequency
at a lower voltage. We call it amd-pstate preferred core.

Here sched_set_itmt_core_prio() is called to set priorities and
sched_set_itmt_support() is called to enable the ITMT feature.
The amd-pstate driver uses the highest performance value to indicate
the priority of a CPU. A higher value means a higher priority.

The initial core rankings are set up by amd-pstate when the
system boots.

Add a variable hw_prefcore to the cpudata structure. It records
whether the processor and power firmware support the preferred
core feature.

Add a new early parameter, `amd_prefcore=disable`, to allow the
user to disable the preferred core.

The amd-pstate driver supports the preferred core feature only
when the hardware supports it and the user has not disabled it
through the early parameter.
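
The enable condition can be modelled in plain userspace C. This is a
sketch that mirrors the checks in the diff below, not the driver code
itself: prefcore_param_enabled(), hw_prefcore_supported() and
prefcore_active() are hypothetical helper names, and treating a missing
parameter (NULL) as "enabled" is an assumption that matches the default
value of amd_pstate_prefcore.

```c
#include <stdbool.h>
#include <string.h>

#define CPPC_MAX_PERF 255u /* U8_MAX in the patch */

/*
 * Hypothetical helper: interpret the value given to amd_prefcore=.
 * Only the exact string "disable" turns the feature off; anything else,
 * including no parameter at all (NULL), leaves the default of enabled.
 */
static bool prefcore_param_enabled(const char *arg)
{
	return !(arg && strcmp(arg, "disable") == 0);
}

/* The hardware is treated as supporting preferred core when highest_perf < 255. */
static bool hw_prefcore_supported(unsigned int highest_perf)
{
	return highest_perf < CPPC_MAX_PERF;
}

/* Preferred core is in use only when both conditions hold. */
static bool prefcore_active(unsigned int highest_perf, const char *arg)
{
	return hw_prefcore_supported(highest_perf) &&
	       prefcore_param_enabled(arg);
}
```

For example, prefcore_active(166, NULL) is true, while
prefcore_active(255, NULL) and prefcore_active(166, "disable") are false.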

Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Co-developed-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Meng Li <li.meng@amd.com>
---
 drivers/cpufreq/amd-pstate.c | 131 ++++++++++++++++++++++++++++++++---
 include/linux/amd-pstate.h   |   4 ++
 2 files changed, 127 insertions(+), 8 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 1f6186475715..9c2790753f99 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -37,6 +37,7 @@
 #include <linux/uaccess.h>
 #include <linux/static_call.h>
 #include <linux/amd-pstate.h>
+#include <linux/topology.h>
 
 #include <acpi/processor.h>
 #include <acpi/cppc_acpi.h>
@@ -49,6 +50,7 @@
 
 #define AMD_PSTATE_TRANSITION_LATENCY	20000
 #define AMD_PSTATE_TRANSITION_DELAY	1000
+#define AMD_PSTATE_PREFCORE_THRESHOLD	166
 
 /*
  * TODO: We need more time to fine tune processors with shared memory solution
@@ -64,6 +66,7 @@ static struct cpufreq_driver amd_pstate_driver;
 static struct cpufreq_driver amd_pstate_epp_driver;
 static int cppc_state = AMD_PSTATE_UNDEFINED;
 static bool cppc_enabled;
+static bool amd_pstate_prefcore = true;
 
 /*
  * AMD Energy Preference Performance (EPP)
@@ -297,13 +300,14 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
 	if (ret)
 		return ret;
 
-	/*
-	 * TODO: Introduce AMD specific power feature.
-	 *
-	 * CPPC entry doesn't indicate the highest performance in some ASICs.
+	/* For platforms that do not support the preferred core feature, the
+	 * highest_perf may be configured with 166 or 255. To avoid calculating
+	 * the max frequency wrongly, we take the AMD_CPPC_HIGHEST_PERF(cap1)
+	 * value as the default max perf.
 	 */
-	highest_perf = amd_get_highest_perf();
-	if (highest_perf > AMD_CPPC_HIGHEST_PERF(cap1))
+	if (cpudata->hw_prefcore)
+		highest_perf = AMD_PSTATE_PREFCORE_THRESHOLD;
+	else
 		highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
 
 	WRITE_ONCE(cpudata->highest_perf, highest_perf);
@@ -324,8 +328,9 @@ static int cppc_init_perf(struct amd_cpudata *cpudata)
 	if (ret)
 		return ret;
 
-	highest_perf = amd_get_highest_perf();
-	if (highest_perf > cppc_perf.highest_perf)
+	if (cpudata->hw_prefcore)
+		highest_perf = AMD_PSTATE_PREFCORE_THRESHOLD;
+	else
 		highest_perf = cppc_perf.highest_perf;
 
 	WRITE_ONCE(cpudata->highest_perf, highest_perf);
@@ -706,6 +711,80 @@ static void amd_perf_ctl_reset(unsigned int cpu)
 	wrmsrl_on_cpu(cpu, MSR_AMD_PERF_CTL, 0);
 }
 
+/*
+ * Set amd-pstate preferred core enable can't be done directly from cpufreq callbacks
+ * due to locking, so queue the work for later.
+ */
+static void amd_pstste_sched_prefcore_workfn(struct work_struct *work)
+{
+	sched_set_itmt_support();
+}
+static DECLARE_WORK(sched_prefcore_work, amd_pstste_sched_prefcore_workfn);
+
+/*
+ * Get the highest performance register value.
+ * @cpu: CPU from which to get highest performance.
+ * @highest_perf: Return address.
+ *
+ * Return: 0 for success, -EIO otherwise.
+ */
+static int amd_pstate_get_highest_perf(int cpu, u32 *highest_perf)
+{
+	int ret;
+
+	if (boot_cpu_has(X86_FEATURE_CPPC)) {
+		u64 cap1;
+
+		ret = rdmsrl_safe_on_cpu(cpu, MSR_AMD_CPPC_CAP1, &cap1);
+		if (ret)
+			return ret;
+		WRITE_ONCE(*highest_perf, AMD_CPPC_HIGHEST_PERF(cap1));
+	} else {
+		u64 cppc_highest_perf;
+
+		ret = cppc_get_highest_perf(cpu, &cppc_highest_perf);
+		if (ret)
+			return ret;
+		WRITE_ONCE(*highest_perf, cppc_highest_perf);
+	}
+
+	return ret;
+}
+
+#define CPPC_MAX_PERF	U8_MAX
+
+static void amd_pstate_init_prefcore(struct amd_cpudata *cpudata)
+{
+	int ret, prio;
+	u32 highest_perf;
+
+	ret = amd_pstate_get_highest_perf(cpudata->cpu, &highest_perf);
+	if (ret)
+		return;
+
+	cpudata->hw_prefcore = true;
+	/* check if CPPC preferred core feature is enabled*/
+	if (highest_perf < CPPC_MAX_PERF)
+		prio = (int)highest_perf;
+	else {
+		pr_debug("AMD CPPC preferred core is unsupported!\n");
+		cpudata->hw_prefcore = false;
+		return;
+	}
+
+	if (!amd_pstate_prefcore)
+		return;
+
+	/*
+	 * The priorities can be set regardless of whether or not
+	 * sched_set_itmt_support(true) has been called and it is valid to
+	 * update them at any time after it has been called.
+	 */
+	sched_set_itmt_core_prio(prio, cpudata->cpu);
+
+	schedule_work(&sched_prefcore_work);
+}
+
 static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
 {
 	int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
@@ -727,6 +806,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
 
 	cpudata->cpu = policy->cpu;
 
+	amd_pstate_init_prefcore(cpudata);
+
 	ret = amd_pstate_init_perf(cpudata);
 	if (ret)
 		goto free_cpudata1;
@@ -877,6 +958,17 @@ static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,
 	return sysfs_emit(buf, "%u\n", perf);
 }
 
+static ssize_t show_amd_pstate_hw_prefcore(struct cpufreq_policy *policy,
+					   char *buf)
+{
+	bool hw_prefcore;
+	struct amd_cpudata *cpudata = policy->driver_data;
+
+	hw_prefcore = READ_ONCE(cpudata->hw_prefcore);
+
+	return sysfs_emit(buf, "%s\n", str_enabled_disabled(hw_prefcore));
+}
+
 static ssize_t show_energy_performance_available_preferences(
 				struct cpufreq_policy *policy, char *buf)
 {
@@ -1074,18 +1166,27 @@ static ssize_t status_store(struct device *a, struct device_attribute *b,
 	return ret < 0 ? ret : count;
 }
 
+static ssize_t prefcore_show(struct device *dev,
+			     struct device_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf, "%s\n", str_enabled_disabled(amd_pstate_prefcore));
+}
+
 cpufreq_freq_attr_ro(amd_pstate_max_freq);
 cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
 
 cpufreq_freq_attr_ro(amd_pstate_highest_perf);
+cpufreq_freq_attr_ro(amd_pstate_hw_prefcore);
 cpufreq_freq_attr_rw(energy_performance_preference);
 cpufreq_freq_attr_ro(energy_performance_available_preferences);
 static DEVICE_ATTR_RW(status);
+static DEVICE_ATTR_RO(prefcore);
 
 static struct freq_attr *amd_pstate_attr[] = {
 	&amd_pstate_max_freq,
 	&amd_pstate_lowest_nonlinear_freq,
 	&amd_pstate_highest_perf,
+	&amd_pstate_hw_prefcore,
 	NULL,
 };
 
@@ -1093,6 +1194,7 @@ static struct freq_attr *amd_pstate_epp_attr[] = {
 	&amd_pstate_max_freq,
 	&amd_pstate_lowest_nonlinear_freq,
 	&amd_pstate_highest_perf,
+	&amd_pstate_hw_prefcore,
 	&energy_performance_preference,
 	&energy_performance_available_preferences,
 	NULL,
@@ -1100,6 +1202,7 @@ static struct freq_attr *amd_pstate_epp_attr[] = {
 
 static struct attribute *pstate_global_attributes[] = {
 	&dev_attr_status.attr,
+	&dev_attr_prefcore.attr,
 	NULL
 };
 
@@ -1151,6 +1254,8 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
 	cpudata->cpu = policy->cpu;
 	cpudata->epp_policy = 0;
 
+	amd_pstate_init_prefcore(cpudata);
+
 	ret = amd_pstate_init_perf(cpudata);
 	if (ret)
 		goto free_cpudata1;
@@ -1568,7 +1673,17 @@ static int __init amd_pstate_param(char *str)
 
 	return amd_pstate_set_driver(mode_idx);
 }
+
+static int __init amd_prefcore_param(char *str)
+{
+	if (!strcmp(str, "disable"))
+		amd_pstate_prefcore = false;
+
+	return 0;
+}
+
 early_param("amd_pstate", amd_pstate_param);
+early_param("amd_prefcore", amd_prefcore_param);
 
 MODULE_AUTHOR("Huang Rui <ray.huang@amd.com>");
 MODULE_DESCRIPTION("AMD Processor P-state Frequency Driver");
diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
index 6ad02ad9c7b4..68fc1bd8d851 100644
--- a/include/linux/amd-pstate.h
+++ b/include/linux/amd-pstate.h
@@ -52,6 +52,9 @@ struct amd_aperf_mperf {
  * @prev: Last Aperf/Mperf/tsc count value read from register
  * @freq: current cpu frequency value
  * @boost_supported: check whether the Processor or SBIOS supports boost mode
+ * @hw_prefcore: check whether HW supports preferred core feature.
+ * 		  Only when hw_prefcore and early prefcore param are true,
+ * 		  AMD P-State driver supports preferred core feature.
  * @epp_policy: Last saved policy used to set energy-performance preference
  * @epp_cached: Cached CPPC energy-performance preference value
  * @policy: Cpufreq policy value
@@ -85,6 +88,7 @@ struct amd_cpudata {
 
 	u64	freq;
 	bool	boost_supported;
+	bool	hw_prefcore;
 
 	/* EPP feature related attributes*/
 	s16	epp_policy;
-- 
2.34.1



* [PATCH V14 4/7] cpufreq: Add a notification message that the highest perf has changed
  2024-01-19  9:04 [PATCH V14 0/7] amd-pstate preferred core Meng Li
                   ` (2 preceding siblings ...)
  2024-01-19  9:04 ` [PATCH V14 3/7] cpufreq: amd-pstate: Enable amd-pstate preferred core supporting Meng Li
@ 2024-01-19  9:04 ` Meng Li
  2024-01-19  9:05 ` [PATCH V14 5/7] cpufreq: amd-pstate: Update amd-pstate preferred core ranking dynamically Meng Li
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Meng Li @ 2024-01-19  9:04 UTC (permalink / raw)
  To: Rafael J . Wysocki, Borislav Petkov, Huang Rui
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
	Viresh Kumar, Borislav Petkov, Oleksandr Natalenko, Meng Li,
	Perry Yuan

The BIOS issues notify 0x85 to tell the OS that the highest
performance has changed, which affects the ranking of the preferred
cores. The amd-pstate driver then sets the priority of the cores
based on the new preferred core ranking.
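
As a rough userspace illustration of the dispatch added below: the
notification values are the ones defined in processor_driver.c, but the
shortened macro names, enum notify_action and classify_notify() are
hypothetical and exist only in this sketch.

```c
#include <stdint.h>

/* Processor notification values, as defined in processor_driver.c. */
#define NOTIFY_PERFORMANCE          0x80
#define NOTIFY_POWER                0x81
#define NOTIFY_THROTTLING           0x82
#define NOTIFY_HIGHEST_PERF_CHANGED 0x85

/* Hypothetical action codes, used only by this sketch. */
enum notify_action {
	ACTION_UNSUPPORTED,   /* falls through to the default case */
	ACTION_OTHER,         /* handled by a pre-existing case */
	ACTION_UPDATE_LIMITS, /* the new case: call cpufreq_update_limits() */
};

/* Classify a notification value the way the switch statement routes it. */
static enum notify_action classify_notify(uint32_t event)
{
	switch (event) {
	case NOTIFY_HIGHEST_PERF_CHANGED:
		return ACTION_UPDATE_LIMITS;
	case NOTIFY_PERFORMANCE:
	case NOTIFY_POWER:
	case NOTIFY_THROTTLING:
		return ACTION_OTHER;
	default:
		return ACTION_UNSUPPORTED;
	}
}
```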

Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Meng Li <li.meng@amd.com>
Link: https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#processor-device-notification-values
---
 drivers/acpi/processor_driver.c | 6 ++++++
 include/linux/cpufreq.h         | 1 +
 2 files changed, 7 insertions(+)

diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c
index 4bd16b3f0781..67db60eda370 100644
--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -27,6 +27,7 @@
 #define ACPI_PROCESSOR_NOTIFY_PERFORMANCE 0x80
 #define ACPI_PROCESSOR_NOTIFY_POWER	0x81
 #define ACPI_PROCESSOR_NOTIFY_THROTTLING	0x82
+#define ACPI_PROCESSOR_NOTIFY_HIGEST_PERF_CHANGED	0x85
 
 MODULE_AUTHOR("Paul Diefenbaugh");
 MODULE_DESCRIPTION("ACPI Processor Driver");
@@ -83,6 +84,11 @@ static void acpi_processor_notify(acpi_handle handle, u32 event, void *data)
 		acpi_bus_generate_netlink_event(device->pnp.device_class,
 						  dev_name(&device->dev), event, 0);
 		break;
+	case ACPI_PROCESSOR_NOTIFY_HIGEST_PERF_CHANGED:
+		cpufreq_update_limits(pr->id);
+		acpi_bus_generate_netlink_event(device->pnp.device_class,
+						  dev_name(&device->dev), event, 0);
+		break;
 	default:
 		acpi_handle_debug(handle, "Unsupported event [0x%x]\n", event);
 		break;
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index afda5f24d3dd..9bebeec24abb 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -263,6 +263,7 @@ static inline bool cpufreq_supports_freq_invariance(void)
 	return false;
 }
 static inline void disable_cpufreq(void) { }
+static inline void cpufreq_update_limits(unsigned int cpu) { }
 #endif
 
 #ifdef CONFIG_CPU_FREQ_STAT
-- 
2.34.1



* [PATCH V14 5/7] cpufreq: amd-pstate: Update amd-pstate preferred core ranking dynamically
  2024-01-19  9:04 [PATCH V14 0/7] amd-pstate preferred core Meng Li
                   ` (3 preceding siblings ...)
  2024-01-19  9:04 ` [PATCH V14 4/7] cpufreq: Add a notification message that the highest perf has changed Meng Li
@ 2024-01-19  9:05 ` Meng Li
  2024-01-19  9:05 ` [PATCH V14 6/7] Documentation: amd-pstate: introduce amd-pstate preferred core Meng Li
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Meng Li @ 2024-01-19  9:05 UTC (permalink / raw)
  To: Rafael J . Wysocki, Borislav Petkov, Huang Rui
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
	Viresh Kumar, Borislav Petkov, Oleksandr Natalenko, Meng Li,
	Wyes Karny, Perry Yuan

Preferred core rankings can be changed dynamically by the
platform based on the workload and platform conditions and
accounting for thermals and aging.
When this occurs, the CPU priorities need to be updated.
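
The update step can be modelled in userspace C as follows. This is a
simplified sketch of the comparison made in amd_pstate_update_limits()
below, without locking or policy handling: struct cpu_state and
update_ranking() are hypothetical names, and sched_prio merely stands in
for the value that the driver passes to sched_set_itmt_core_prio().

```c
#include <stdbool.h>

#define CPPC_MAX_PERF 255u /* U8_MAX in the patch */

/* Hypothetical per-CPU state for this sketch. */
struct cpu_state {
	unsigned int prefcore_ranking; /* last ranking read from the platform */
	int sched_prio;                /* priority last handed to the scheduler */
};

/*
 * Record a newly read highest performance value.
 * Returns true if the ranking changed. The scheduler priority is only
 * refreshed when the new value is below CPPC_MAX_PERF.
 */
static bool update_ranking(struct cpu_state *st, unsigned int cur_high)
{
	if (st->prefcore_ranking == cur_high)
		return false;

	st->prefcore_ranking = cur_high;
	if (cur_high < CPPC_MAX_PERF)
		st->sched_prio = (int)cur_high;

	return true;
}
```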

Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Meng Li <li.meng@amd.com>
---
 drivers/cpufreq/amd-pstate.c | 52 ++++++++++++++++++++++++++++++++++++
 include/linux/amd-pstate.h   |  6 +++++
 2 files changed, 58 insertions(+)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 9c2790753f99..3034b8ff3f8e 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -315,6 +315,7 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
 	WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1));
 	WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1));
 	WRITE_ONCE(cpudata->lowest_perf, AMD_CPPC_LOWEST_PERF(cap1));
+	WRITE_ONCE(cpudata->prefcore_ranking, AMD_CPPC_HIGHEST_PERF(cap1));
 	WRITE_ONCE(cpudata->min_limit_perf, AMD_CPPC_LOWEST_PERF(cap1));
 	return 0;
 }
@@ -339,6 +340,7 @@ static int cppc_init_perf(struct amd_cpudata *cpudata)
 	WRITE_ONCE(cpudata->lowest_nonlinear_perf,
 		   cppc_perf.lowest_nonlinear_perf);
 	WRITE_ONCE(cpudata->lowest_perf, cppc_perf.lowest_perf);
+	WRITE_ONCE(cpudata->prefcore_ranking, cppc_perf.highest_perf);
 	WRITE_ONCE(cpudata->min_limit_perf, cppc_perf.lowest_perf);
 
 	if (cppc_state == AMD_PSTATE_ACTIVE)
@@ -785,6 +787,40 @@ static void amd_pstate_init_prefcore(struct amd_cpudata *cpudata)
 	schedule_work(&sched_prefcore_work);
 }
 
+static void amd_pstate_update_limits(unsigned int cpu)
+{
+	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
+	struct amd_cpudata *cpudata = policy->driver_data;
+	u32 prev_high = 0, cur_high = 0;
+	int ret;
+	bool highest_perf_changed = false;
+
+	mutex_lock(&amd_pstate_driver_lock);
+	if ((!amd_pstate_prefcore) || (!cpudata->hw_prefcore))
+		goto free_cpufreq_put;
+
+	ret = amd_pstate_get_highest_perf(cpu, &cur_high);
+	if (ret)
+		goto free_cpufreq_put;
+
+	prev_high = READ_ONCE(cpudata->prefcore_ranking);
+	if (prev_high != cur_high) {
+		highest_perf_changed = true;
+		WRITE_ONCE(cpudata->prefcore_ranking, cur_high);
+
+		if (cur_high < CPPC_MAX_PERF)
+			sched_set_itmt_core_prio((int)cur_high, cpu);
+	}
+
+free_cpufreq_put:
+	cpufreq_cpu_put(policy);
+
+	if (!highest_perf_changed)
+		cpufreq_update_policy(cpu);
+
+	mutex_unlock(&amd_pstate_driver_lock);
+}
+
 static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
 {
 	int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
@@ -958,6 +994,17 @@ static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,
 	return sysfs_emit(buf, "%u\n", perf);
 }
 
+static ssize_t show_amd_pstate_prefcore_ranking(struct cpufreq_policy *policy,
+						char *buf)
+{
+	u32 perf;
+	struct amd_cpudata *cpudata = policy->driver_data;
+
+	perf = READ_ONCE(cpudata->prefcore_ranking);
+
+	return sysfs_emit(buf, "%u\n", perf);
+}
+
 static ssize_t show_amd_pstate_hw_prefcore(struct cpufreq_policy *policy,
 					   char *buf)
 {
@@ -1176,6 +1223,7 @@ cpufreq_freq_attr_ro(amd_pstate_max_freq);
 cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
 
 cpufreq_freq_attr_ro(amd_pstate_highest_perf);
+cpufreq_freq_attr_ro(amd_pstate_prefcore_ranking);
 cpufreq_freq_attr_ro(amd_pstate_hw_prefcore);
 cpufreq_freq_attr_rw(energy_performance_preference);
 cpufreq_freq_attr_ro(energy_performance_available_preferences);
@@ -1186,6 +1234,7 @@ static struct freq_attr *amd_pstate_attr[] = {
 	&amd_pstate_max_freq,
 	&amd_pstate_lowest_nonlinear_freq,
 	&amd_pstate_highest_perf,
+	&amd_pstate_prefcore_ranking,
 	&amd_pstate_hw_prefcore,
 	NULL,
 };
@@ -1194,6 +1243,7 @@ static struct freq_attr *amd_pstate_epp_attr[] = {
 	&amd_pstate_max_freq,
 	&amd_pstate_lowest_nonlinear_freq,
 	&amd_pstate_highest_perf,
+	&amd_pstate_prefcore_ranking,
 	&amd_pstate_hw_prefcore,
 	&energy_performance_preference,
 	&energy_performance_available_preferences,
@@ -1538,6 +1588,7 @@ static struct cpufreq_driver amd_pstate_driver = {
 	.suspend	= amd_pstate_cpu_suspend,
 	.resume		= amd_pstate_cpu_resume,
 	.set_boost	= amd_pstate_set_boost,
+	.update_limits	= amd_pstate_update_limits,
 	.name		= "amd-pstate",
 	.attr		= amd_pstate_attr,
 };
@@ -1552,6 +1603,7 @@ static struct cpufreq_driver amd_pstate_epp_driver = {
 	.online		= amd_pstate_epp_cpu_online,
 	.suspend	= amd_pstate_epp_suspend,
 	.resume		= amd_pstate_epp_resume,
+	.update_limits	= amd_pstate_update_limits,
 	.name		= "amd-pstate-epp",
 	.attr		= amd_pstate_epp_attr,
 };
diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
index 68fc1bd8d851..d21838835abd 100644
--- a/include/linux/amd-pstate.h
+++ b/include/linux/amd-pstate.h
@@ -39,11 +39,16 @@ struct amd_aperf_mperf {
  * @cppc_req_cached: cached performance request hints
  * @highest_perf: the maximum performance an individual processor may reach,
  *		  assuming ideal conditions
+ *		  For platforms that do not support the preferred core feature, the
+ *		  highest_perf may be configured with 166 or 255. To avoid calculating
+ *		  the max frequency wrongly, we take the fixed value as the highest_perf.
  * @nominal_perf: the maximum sustained performance level of the processor,
  *		  assuming ideal operating conditions
  * @lowest_nonlinear_perf: the lowest performance level at which nonlinear power
  *			   savings are achieved
  * @lowest_perf: the absolute lowest performance level of the processor
+ * @prefcore_ranking: the preferred core ranking; a higher value indicates a higher
+ *		  priority.
  * @max_freq: the frequency that mapped to highest_perf
  * @min_freq: the frequency that mapped to lowest_perf
  * @nominal_freq: the frequency that mapped to nominal_perf
@@ -73,6 +78,7 @@ struct amd_cpudata {
 	u32	nominal_perf;
 	u32	lowest_nonlinear_perf;
 	u32	lowest_perf;
+	u32     prefcore_ranking;
 	u32     min_limit_perf;
 	u32     max_limit_perf;
 	u32     min_limit_freq;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH V14 6/7] Documentation: amd-pstate: introduce amd-pstate preferred core
  2024-01-19  9:04 [PATCH V14 0/7] amd-pstate preferred core Meng Li
                   ` (4 preceding siblings ...)
  2024-01-19  9:05 ` [PATCH V14 5/7] cpufreq: amd-pstate: Update amd-pstate preferred core ranking dynamically Meng Li
@ 2024-01-19  9:05 ` Meng Li
  2024-01-19  9:05 ` [PATCH V14 7/7] Documentation: introduce amd-pstate preferrd core mode kernel command line options Meng Li
  2024-01-29 15:18 ` [PATCH V14 0/7] amd-pstate preferred core Rafael J. Wysocki
  7 siblings, 0 replies; 19+ messages in thread
From: Meng Li @ 2024-01-19  9:05 UTC (permalink / raw)
  To: Rafael J . Wysocki, Borislav Petkov, Huang Rui
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
	Viresh Kumar, Borislav Petkov, Oleksandr Natalenko, Meng Li,
	Wyes Karny, Perry Yuan

Introduce amd-pstate preferred core.

Check the preferred core state set by the kernel parameter:
$ cat /sys/devices/system/cpu/amd-pstate/prefcore

Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Meng Li <li.meng@amd.com>
---
 Documentation/admin-guide/pm/amd-pstate.rst | 59 ++++++++++++++++++++-
 1 file changed, 57 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
index 9eb26014d34b..0a3aa6b8ffd5 100644
--- a/Documentation/admin-guide/pm/amd-pstate.rst
+++ b/Documentation/admin-guide/pm/amd-pstate.rst
@@ -300,8 +300,8 @@ platforms. The AMD P-States mechanism is the more performance and energy
 efficiency frequency management method on AMD processors.
 
 
-AMD Pstate Driver Operation Modes
-=================================
+``amd-pstate`` Driver Operation Modes
+======================================
 
 ``amd_pstate`` CPPC has 3 operation modes: autonomous (active) mode,
 non-autonomous (passive) mode and guided autonomous (guided) mode.
@@ -353,6 +353,48 @@ is activated.  In this mode, driver requests minimum and maximum performance
 level and the platform autonomously selects a performance level in this range
 and appropriate to the current workload.
 
+``amd-pstate`` Preferred Core
+=================================
+
+The core frequency is subject to process variation in semiconductors.
+Not all cores are able to reach the maximum frequency while respecting the
+infrastructure limits. Consequently, AMD has redefined the concept of
+maximum frequency of a part. This means that only a fraction of cores can reach
+the maximum frequency. To find the best process scheduling policy for a given
+scenario, the OS needs to know the core ordering reported by the platform through
+the highest performance capability register of the CPPC interface.
+
+``amd-pstate`` preferred core enables the scheduler to prefer scheduling on
+cores that can achieve a higher frequency with lower voltage. The preferred
+core rankings can dynamically change based on the workload, platform conditions,
+thermals and ageing.
+
+The priority metric will be initialized by the ``amd-pstate`` driver. The ``amd-pstate``
+driver will also determine whether or not ``amd-pstate`` preferred core is
+supported by the platform.
+
+The ``amd-pstate`` driver provides an initial core ordering when the system boots.
+The platform uses the CPPC interfaces to communicate the core ranking to the
+operating system and scheduler to make sure that the OS chooses the cores
+with the highest performance first when scheduling processes. When the ``amd-pstate``
+driver receives a message with the highest performance change, it will
+update the core ranking and set the CPU's priority.
+
+``amd-pstate`` Preferred Core Switch
+====================================
+Kernel Parameters
+-----------------
+
+``amd-pstate`` preferred core has two states: enable and disable.
+Enable/disable states can be chosen by different kernel parameters.
+``amd-pstate`` preferred core is enabled by default.
+
+``amd_prefcore=disable``
+
+For systems that support ``amd-pstate`` preferred core, the core rankings will
+always be advertised by the platform. However, the OS can choose to ignore them
+via the kernel parameter ``amd_prefcore=disable``.
+
 User Space Interface in ``sysfs`` - General
 ===========================================
 
@@ -385,6 +427,19 @@ control its functionality at the system level.  They are located in the
         to the operation mode represented by that string - or to be
         unregistered in the "disable" case.
 
+``prefcore``
+	Preferred core state of the driver: "enabled" or "disabled".
+
+	"enabled"
+		Enable the ``amd-pstate`` preferred core.
+
+	"disabled"
+		Disable the ``amd-pstate`` preferred core.
+
+
+        This attribute is read-only to check the state of preferred core set
+        by the kernel parameter.
+
 ``cpupower`` tool support for ``amd-pstate``
 ===============================================
 
-- 
2.34.1


* [PATCH V14 7/7] Documentation: introduce amd-pstate preferrd core mode kernel command line options
  2024-01-19  9:04 [PATCH V14 0/7] amd-pstate preferred core Meng Li
                   ` (5 preceding siblings ...)
  2024-01-19  9:05 ` [PATCH V14 6/7] Documentation: amd-pstate: introduce amd-pstate preferred core Meng Li
@ 2024-01-19  9:05 ` Meng Li
  2024-01-29 15:18 ` [PATCH V14 0/7] amd-pstate preferred core Rafael J. Wysocki
  7 siblings, 0 replies; 19+ messages in thread
From: Meng Li @ 2024-01-19  9:05 UTC (permalink / raw)
  To: Rafael J . Wysocki, Borislav Petkov, Huang Rui
  Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
	linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
	Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
	Viresh Kumar, Borislav Petkov, Oleksandr Natalenko, Meng Li,
	Wyes Karny, Perry Yuan

The amd-pstate driver supports enabling/disabling preferred core.
It is enabled by default on platforms supporting amd-pstate preferred core.
Disable amd-pstate preferred core by adding
"amd_prefcore=disable" to the kernel command line.

Signed-off-by: Meng Li <li.meng@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
---
 Documentation/admin-guide/kernel-parameters.txt | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index e0891ac76ab3..88b29efc474f 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -363,6 +363,11 @@
 			  selects a performance level in this range and appropriate
 			  to the current workload.
 
+	amd_prefcore=
+			[X86]
+			disable
+			  Disable amd-pstate preferred core.
+
 	amijoy.map=	[HW,JOY] Amiga joystick support
 			Map of devices attached to JOY0DAT and JOY1DAT
 			Format: <a>,<b>
-- 
2.34.1


* Re: [PATCH V14 0/7] amd-pstate preferred core
  2024-01-19  9:04 [PATCH V14 0/7] amd-pstate preferred core Meng Li
                   ` (6 preceding siblings ...)
  2024-01-19  9:05 ` [PATCH V14 7/7] Documentation: introduce amd-pstate preferrd core mode kernel command line options Meng Li
@ 2024-01-29 15:18 ` Rafael J. Wysocki
  2024-01-29 15:33   ` Borislav Petkov
  7 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2024-01-29 15:18 UTC (permalink / raw)
  To: Meng Li, Borislav Petkov
  Cc: Rafael J . Wysocki, Huang Rui, linux-pm, linux-kernel, x86,
	linux-acpi, Shuah Khan, linux-kselftest, Nathan Fontenot,
	Deepak Sharma, Alex Deucher, Mario Limonciello, Shimmer Huang,
	Perry Yuan, Xiaojian Du, Viresh Kumar, Borislav Petkov,
	Oleksandr Natalenko

On Fri, Jan 19, 2024 at 10:05 AM Meng Li <li.meng@amd.com> wrote:
>
> Hi all:
>
> The core frequency is subjected to the process variation in semiconductors.
> Not all cores are able to reach the maximum frequency respecting the
> infrastructure limits. Consequently, AMD has redefined the concept of
> maximum frequency of a part. This means that a fraction of cores can reach
> maximum frequency. To find the best process scheduling policy for a given
> scenario, OS needs to know the core ordering informed by the platform through
> highest performance capability register of the CPPC interface.
>
> Earlier implementations of amd-pstate preferred core only support a static
> core ranking and targeted performance. Now it has the ability to dynamically
> change the preferred core based on the workload and platform conditions and
> accounting for thermals and aging.
>
> Amd-pstate driver utilizes the functions and data structures provided by
> the ITMT architecture to enable the scheduler to favor scheduling on cores
> which can be get a higher frequency with lower voltage.
> We call it amd-pstate preferred core.
>
> Here sched_set_itmt_core_prio() is called to set priorities and
> sched_set_itmt_support() is called to enable ITMT feature.
> Amd-pstate driver uses the highest performance value to indicate
> the priority of CPU. The higher value has a higher priority.
>
> Amd-pstate driver will provide an initial core ordering at boot time.
> It relies on the CPPC interface to communicate the core ranking to the
> operating system and scheduler to make sure that OS is choosing the cores
> with highest performance firstly for scheduling the process. When amd-pstate
> driver receives a message with the highest performance change, it will
> update the core ranking.

Hi Boris,

You've had comments on the previous version of this.

Have they all been addressed?

* Re: [PATCH V14 0/7] amd-pstate preferred core
  2024-01-29 15:18 ` [PATCH V14 0/7] amd-pstate preferred core Rafael J. Wysocki
@ 2024-01-29 15:33   ` Borislav Petkov
  2024-01-31 13:58     ` Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Borislav Petkov @ 2024-01-29 15:33 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Meng Li, Rafael J . Wysocki, Huang Rui, linux-pm, linux-kernel,
	x86, linux-acpi, Shuah Khan, linux-kselftest, Nathan Fontenot,
	Deepak Sharma, Alex Deucher, Mario Limonciello, Shimmer Huang,
	Perry Yuan, Xiaojian Du, Viresh Kumar, Oleksandr Natalenko

On Mon, Jan 29, 2024 at 04:18:02PM +0100, Rafael J. Wysocki wrote:
> You've had comments on the previous version of this.
> 
> Have they all been addressed?

Yeah, see patch 1.

Thx.


-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

* Re: [PATCH V14 0/7] amd-pstate preferred core
  2024-01-29 15:33   ` Borislav Petkov
@ 2024-01-31 13:58     ` Rafael J. Wysocki
  2024-02-18 16:10       ` Lucas Lee Jing Yi
  0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2024-01-31 13:58 UTC (permalink / raw)
  To: Borislav Petkov, Meng Li
  Cc: Rafael J . Wysocki, Huang Rui, linux-pm, linux-kernel, x86,
	linux-acpi, Shuah Khan, linux-kselftest, Nathan Fontenot,
	Deepak Sharma, Alex Deucher, Mario Limonciello, Shimmer Huang,
	Perry Yuan, Xiaojian Du, Viresh Kumar, Oleksandr Natalenko

On Mon, Jan 29, 2024 at 4:33 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Mon, Jan 29, 2024 at 04:18:02PM +0100, Rafael J. Wysocki wrote:
> > You've had comments on the previous version of this.
> >
> > Have they all been addressed?
>
> Yeah, see patch 1.

Thanks!

So the whole lot has been applied as 6.9 material, with some patch
subjects changed and a couple of changelogs edited.

Thank you!

* Re: [PATCH V14 0/7] amd-pstate preferred core
  2024-01-31 13:58     ` Rafael J. Wysocki
@ 2024-02-18 16:10       ` Lucas Lee Jing Yi
  2024-02-18 16:10         ` [PATCH] [PATCH] amd_pstate: fix erroneous highest_perf value on some CPUs Lucas Lee Jing Yi
  2024-02-19  1:02         ` [PATCH V14 0/7] amd-pstate preferred core Meng, Li (Jassmine)
  0 siblings, 2 replies; 19+ messages in thread
From: Lucas Lee Jing Yi @ 2024-02-18 16:10 UTC (permalink / raw)
  To: rafael
  Cc: Perry.Yuan, Xiaojian.Du, alexander.deucher, bp, deepak.sharma,
	li.meng, linux-acpi, linux-kernel, linux-kselftest, linux-pm,
	mario.limonciello, nathan.fontenot, oleksandr, rafael.j.wysocki,
	ray.huang, shimmer.huang, skhan, viresh.kumar, x86

Dear all,
I have found an issue with the patchset when applying it on 6.7, leading to a large degradation in performance.

On my 7840HS on *STOCK* 6.7, highest_perf is reported as 196, not 166 as assumed in the patchset. Applying the patchset causes highest_perf to be misreported, and hence the maximum frequency as well, at 4.35GHz instead of 5.14GHz, leading to the degradation in performance.
However, on my 5950X, highest_perf is indeed reported as 166 before and after applying the patchset.
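
As a rough illustration of the numbers above, assume the advertised maximum frequency scales linearly with highest_perf for a fixed nominal point. That is a simplification and not the driver's exact code, and scale_freq_mhz() is a hypothetical helper name used only for this sketch:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of amd-pstate): scale a frequency in MHz
 * by the ratio perf / ref_perf using 64-bit integer arithmetic, so the
 * intermediate product cannot overflow.
 */
static uint32_t scale_freq_mhz(uint32_t freq_mhz, uint32_t perf,
			       uint32_t ref_perf)
{
	assert(ref_perf != 0);	/* avoid division by zero */
	return (uint32_t)(((uint64_t)freq_mhz * perf) / ref_perf);
}
```

With the values from this report, scale_freq_mhz(5140, 166, 196) gives 4353, i.e. roughly the 4.35GHz seen once highest_perf is forced to 166 instead of 196.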

Hence, I propose the following patch (should be attached).
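
The idea behind the patch can be sketched as follows. This is only an illustration with a hypothetical helper name, pick_highest_perf(); in the patch itself the first value comes from amd_get_highest_perf() and the second from the CPPC capability (AMD_CPPC_HIGHEST_PERF(cap1) or cppc_perf.highest_perf):

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the logic in the patch: start from the value
 * the kernel helper reports and never exceed what the CPPC capability
 * advertises, instead of assuming a fixed 166.
 */
static uint32_t pick_highest_perf(uint32_t helper_perf, uint32_t cppc_cap_perf)
{
	if (helper_perf > cppc_cap_perf)
		return cppc_cap_perf;
	return helper_perf;
}
```

For example, with illustrative (not measured) inputs, pick_highest_perf(255, 196) yields 196, while pick_highest_perf(166, 196) stays at 166.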

I do apologize for any mistakes as I am new to this and this is my first email on the mailing list.

Cheers!
Lucas

* [PATCH] [PATCH] amd_pstate: fix erroneous highest_perf value on some CPUs
  2024-02-18 16:10       ` Lucas Lee Jing Yi
@ 2024-02-18 16:10         ` Lucas Lee Jing Yi
  2024-02-20  7:23           ` Meng, Li (Jassmine)
  2024-02-20  9:02           ` Oleksandr Natalenko
  2024-02-19  1:02         ` [PATCH V14 0/7] amd-pstate preferred core Meng, Li (Jassmine)
  1 sibling, 2 replies; 19+ messages in thread
From: Lucas Lee Jing Yi @ 2024-02-18 16:10 UTC (permalink / raw)
  To: rafael
  Cc: Perry.Yuan, Xiaojian.Du, alexander.deucher, bp, deepak.sharma,
	li.meng, linux-acpi, linux-kernel, linux-kselftest, linux-pm,
	mario.limonciello, nathan.fontenot, oleksandr, rafael.j.wysocki,
	ray.huang, shimmer.huang, skhan, viresh.kumar, x86,
	Lucas Lee Jing Yi

On a Ryzen 7840HS the highest_perf value is 196, not 166 as AMD assumed.
This causes the advertised max clock speed to be only 4.35GHz instead of 5.14GHz, leading to a large degradation in performance.

Fix the broken assumption and revert to the old logic for getting highest_perf.

TEST:
Geekbench 6 Before Patch:
Single Core:	2325 (-22%)!
Multi Core:	11335 (-10%)

Geekbench 6 AFTER Patch:
Single Core:	2635
Multi Core:	12487

Signed-off-by: Lucas Lee Jing Yi <lucasleeeeeeeee@gmail.com>
---
 drivers/cpufreq/amd-pstate.c | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 08e112444c27..54df68773620 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -50,7 +50,6 @@
 
 #define AMD_PSTATE_TRANSITION_LATENCY	20000
 #define AMD_PSTATE_TRANSITION_DELAY	1000
-#define AMD_PSTATE_PREFCORE_THRESHOLD	166
 
 /*
  * TODO: We need more time to fine tune processors with shared memory solution
@@ -299,15 +298,12 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
 				     &cap1);
 	if (ret)
 		return ret;
-
-	/* For platforms that do not support the preferred core feature, the
-	 * highest_pef may be configured with 166 or 255, to avoid max frequency
-	 * calculated wrongly. we take the AMD_CPPC_HIGHEST_PERF(cap1) value as
-	 * the default max perf.
+ 
+	/* Some CPUs have different highest_perf from others, it is safer 
+	 * to read it than to assume some erroneous value, leading to performance issues.
 	 */
-	if (cpudata->hw_prefcore)
-		highest_perf = AMD_PSTATE_PREFCORE_THRESHOLD;
-	else
+	highest_perf = amd_get_highest_perf();
+	if(highest_perf > AMD_CPPC_HIGHEST_PERF(cap1))
 		highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
 
 	WRITE_ONCE(cpudata->highest_perf, highest_perf);
@@ -329,9 +325,11 @@ static int cppc_init_perf(struct amd_cpudata *cpudata)
 	if (ret)
 		return ret;
 
-	if (cpudata->hw_prefcore)
-		highest_perf = AMD_PSTATE_PREFCORE_THRESHOLD;
-	else
+	/* Some CPUs have different highest_perf from others, it is safer 
+	 * to read it than to assume some erroneous value, leading to performance issues.
+	 */
+	highest_perf = amd_get_highest_perf();
+	if(highest_perf > cppc_perf.highest_perf)
 		highest_perf = cppc_perf.highest_perf;
 
 	WRITE_ONCE(cpudata->highest_perf, highest_perf);
-- 
2.43.2


* RE: [PATCH V14 0/7] amd-pstate preferred core
  2024-02-18 16:10       ` Lucas Lee Jing Yi
  2024-02-18 16:10         ` [PATCH] [PATCH] amd_pstate: fix erroneous highest_perf value on some CPUs Lucas Lee Jing Yi
@ 2024-02-19  1:02         ` Meng, Li (Jassmine)
  1 sibling, 0 replies; 19+ messages in thread
From: Meng, Li (Jassmine) @ 2024-02-19  1:02 UTC (permalink / raw)
  To: Lucas Lee Jing Yi, rafael
  Cc: Yuan, Perry, Du, Xiaojian, Deucher, Alexander, bp, Sharma,
	Deepak, linux-acpi, linux-kernel, linux-kselftest, linux-pm,
	Limonciello, Mario, Fontenot, Nathan, oleksandr,
	rafael.j.wysocki, Huang, Ray, Huang, Shimmer, skhan,
	viresh.kumar, x86

Hi :
Thanks.
I will check this issue and fix it as soon as possible.

> -----Original Message-----
> From: Lucas Lee Jing Yi <lucasleeeeeeeee@gmail.com>
> Sent: Monday, February 19, 2024 12:11 AM
> To: rafael@kernel.org
> Cc: Yuan, Perry <Perry.Yuan@amd.com>; Du, Xiaojian
> <Xiaojian.Du@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; bp@alien8.de; Sharma, Deepak
> <Deepak.Sharma@amd.com>; Meng, Li (Jassmine) <Li.Meng@amd.com>;
> linux-acpi@vger.kernel.org; linux-kernel@vger.kernel.org; linux-
> kselftest@vger.kernel.org; linux-pm@vger.kernel.org; Limonciello, Mario
> <Mario.Limonciello@amd.com>; Fontenot, Nathan
> <Nathan.Fontenot@amd.com>; oleksandr@natalenko.name;
> rafael.j.wysocki@intel.com; Huang, Ray <Ray.Huang@amd.com>; Huang,
> Shimmer <Shimmer.Huang@amd.com>; skhan@linuxfoundation.org;
> viresh.kumar@linaro.org; x86@kernel.org
> Subject: Re: [PATCH V14 0/7] amd-pstate preferred core
>
> Dear all,
> I have found an issue with the patchset when applying on 6.7, leading to a
> large degradation in performance.
>
> On my 7840HS on *STOCK* 6.7 highest_perf is reported as 196, not 166 as
> assumed in the patchset. Applying the patchset causes highest_perf to be
> misreported and hence a misreported maximum frequency as well, at
> 4.35GHz instead of 5.14GHz, leading to the degradation in performance.
> However, On my 5950X, highest_perf is indeed reported as 166 before and
> after applying the patchset.
>
> Hence, I propose the following patch (should be attached).
>
> I do apologize for any mistakes as I am new to this and this is my first email on
> the mailing list.
>
> Cheers!
> Lucas

* RE: [PATCH] [PATCH] amd_pstate: fix erroneous highest_perf value on some CPUs
  2024-02-18 16:10         ` [PATCH] [PATCH] amd_pstate: fix erroneous highest_perf value on some CPUs Lucas Lee Jing Yi
@ 2024-02-20  7:23           ` Meng, Li (Jassmine)
  2024-02-20  9:02           ` Oleksandr Natalenko
  1 sibling, 0 replies; 19+ messages in thread
From: Meng, Li (Jassmine) @ 2024-02-20  7:23 UTC (permalink / raw)
  To: Lucas Lee Jing Yi, rafael
  Cc: Yuan, Perry, Du, Xiaojian, Deucher, Alexander, bp, Sharma,
	Deepak, linux-acpi, linux-kernel, linux-kselftest, linux-pm,
	Limonciello, Mario, Fontenot, Nathan, oleksandr,
	rafael.j.wysocki, Huang, Ray, Huang, Shimmer, skhan,
	viresh.kumar, x86

Hi Lucas:

> -----Original Message-----
> From: Lucas Lee Jing Yi <lucasleeeeeeeee@gmail.com>
> Sent: Monday, February 19, 2024 12:11 AM
> To: rafael@kernel.org
> Cc: Yuan, Perry <Perry.Yuan@amd.com>; Du, Xiaojian
> <Xiaojian.Du@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; bp@alien8.de; Sharma, Deepak
> <Deepak.Sharma@amd.com>; Meng, Li (Jassmine) <Li.Meng@amd.com>;
> linux-acpi@vger.kernel.org; linux-kernel@vger.kernel.org; linux-
> kselftest@vger.kernel.org; linux-pm@vger.kernel.org; Limonciello, Mario
> <Mario.Limonciello@amd.com>; Fontenot, Nathan
> <Nathan.Fontenot@amd.com>; oleksandr@natalenko.name;
> rafael.j.wysocki@intel.com; Huang, Ray <Ray.Huang@amd.com>; Huang,
> Shimmer <Shimmer.Huang@amd.com>; skhan@linuxfoundation.org;
> viresh.kumar@linaro.org; x86@kernel.org; Lucas Lee Jing Yi
> <lucasleeeeeeeee@gmail.com>
> Subject: [PATCH] [PATCH] amd_pstate: fix erroneous highest_perf value on
> some CPUs
>
> On a Ryzen 7840HS the highest_perf value is 196, not 166 as AMD assumed.
> This leads to the advertised max clock speed to only be 4.35ghz instead of
> 5.14ghz , leading to a large degradation in performance.
>
> Fix the broken assumption and revert back to the old logic for getting
> highest_perf.
>
> TEST:
> Geekbench 6 Before Patch:
> Single Core:    2325 (-22%)!
> Multi Core:     11335 (-10%)
>
> Geekbench 6 AFTER Patch:
> Single Core:    2635
> Multi Core:     12487
>
> Signed-off-by: Lucas Lee Jing Yi <lucasleeeeeeeee@gmail.com>
> ---
>  drivers/cpufreq/amd-pstate.c | 22 ++++++++++------------
>  1 file changed, 10 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 08e112444c27..54df68773620 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -50,7 +50,6 @@
>
>  #define AMD_PSTATE_TRANSITION_LATENCY  20000
>  #define AMD_PSTATE_TRANSITION_DELAY    1000
> -#define AMD_PSTATE_PREFCORE_THRESHOLD  166
>
>  /*
>   * TODO: We need more time to fine tune processors with shared memory solution
> @@ -299,15 +298,12 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
>                                      &cap1);
>         if (ret)
>                 return ret;
> -
> -       /* For platforms that do not support the preferred core feature, the
> -        * highest_pef may be configured with 166 or 255, to avoid max frequency
> -        * calculated wrongly. we take the AMD_CPPC_HIGHEST_PERF(cap1) value as
> -        * the default max perf.
> +
> +       /* Some CPUs have different highest_perf from others, it is safer
> +        * to read it than to assume some erroneous value, leading to performance issues.
>          */
> -       if (cpudata->hw_prefcore)
> -               highest_perf = AMD_PSTATE_PREFCORE_THRESHOLD;
> -       else
> +       highest_perf = amd_get_highest_perf();
> +       if(highest_perf > AMD_CPPC_HIGHEST_PERF(cap1))
>                 highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
>
>         WRITE_ONCE(cpudata->highest_perf, highest_perf);
> @@ -329,9 +325,11 @@ static int cppc_init_perf(struct amd_cpudata *cpudata)
>         if (ret)
>                 return ret;
>
> -       if (cpudata->hw_prefcore)
> -               highest_perf = AMD_PSTATE_PREFCORE_THRESHOLD;
> -       else
> +       /* Some CPUs have different highest_perf from others, it is safer
> +        * to read it than to assume some erroneous value, leading to performance issues.
> +        */
> +       highest_perf = amd_get_highest_perf();
> +       if(highest_perf > cppc_perf.highest_perf)
>                 highest_perf = cppc_perf.highest_perf;
>
>         WRITE_ONCE(cpudata->highest_perf, highest_perf);
> --
> 2.43.2

Reviewed-by: Li Meng <li.meng@amd.com>

* Re: [PATCH] [PATCH] amd_pstate: fix erroneous highest_perf value on some CPUs
  2024-02-18 16:10         ` [PATCH] [PATCH] amd_pstate: fix erroneous highest_perf value on some CPUs Lucas Lee Jing Yi
  2024-02-20  7:23           ` Meng, Li (Jassmine)
@ 2024-02-20  9:02           ` Oleksandr Natalenko
  2024-02-21 17:19             ` Lucas Lee Jing Yi
  1 sibling, 1 reply; 19+ messages in thread
From: Oleksandr Natalenko @ 2024-02-20  9:02 UTC (permalink / raw)
  To: rafael, Lucas Lee Jing Yi
  Cc: Perry.Yuan, Xiaojian.Du, alexander.deucher, bp, deepak.sharma,
	li.meng, linux-acpi, linux-kernel, linux-kselftest, linux-pm,
	mario.limonciello, nathan.fontenot, rafael.j.wysocki, ray.huang,
	shimmer.huang, skhan, viresh.kumar, x86, Lucas Lee Jing Yi

Hello.

On neděle 18. února 2024 17:10:31 CET Lucas Lee Jing Yi wrote:
> On a Ryzen 7840HS the highest_perf value is 196, not 166 as AMD assumed.
> This leads to the advertised max clock speed to only be 4.35ghz instead of 5.14ghz , leading to a large degradation in performance.
> 
> Fix the broken assumption and revert back to the old logic for getting highest_perf.
> 
> TEST:
> Geekbench 6 Before Patch:
> Single Core:	2325 (-22%)!
> Multi Core:	11335 (-10%)
> 
> Geekbench 6 AFTER Patch:
> Single Core:	2635
> Multi Core:	12487
> 
> Signed-off-by: Lucas Lee Jing Yi <lucasleeeeeeeee@gmail.com>
> ---
>  drivers/cpufreq/amd-pstate.c | 22 ++++++++++------------
>  1 file changed, 10 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 08e112444c27..54df68773620 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -50,7 +50,6 @@
>  
>  #define AMD_PSTATE_TRANSITION_LATENCY	20000
>  #define AMD_PSTATE_TRANSITION_DELAY	1000
> -#define AMD_PSTATE_PREFCORE_THRESHOLD	166
>  
>  /*
>   * TODO: We need more time to fine tune processors with shared memory solution
> @@ -299,15 +298,12 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
>  				     &cap1);
>  	if (ret)
>  		return ret;
> -
> -	/* For platforms that do not support the preferred core feature, the
> -	 * highest_pef may be configured with 166 or 255, to avoid max frequency
> -	 * calculated wrongly. we take the AMD_CPPC_HIGHEST_PERF(cap1) value as
> -	 * the default max perf.
> + 
> +	/* Some CPUs have different highest_perf from others, it is safer 
> +	 * to read it than to assume some erroneous value, leading to performance issues.
>  	 */
> -	if (cpudata->hw_prefcore)
> -		highest_perf = AMD_PSTATE_PREFCORE_THRESHOLD;
> -	else
> +	highest_perf = amd_get_highest_perf();
> +	if(highest_perf > AMD_CPPC_HIGHEST_PERF(cap1))
>  		highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
>  
>  	WRITE_ONCE(cpudata->highest_perf, highest_perf);
> @@ -329,9 +325,11 @@ static int cppc_init_perf(struct amd_cpudata *cpudata)
>  	if (ret)
>  		return ret;
>  
> -	if (cpudata->hw_prefcore)
> -		highest_perf = AMD_PSTATE_PREFCORE_THRESHOLD;
> -	else
> +	/* Some CPUs have different highest_perf from others, it is safer 
> +	 * to read it than to assume some erroneous value, leading to performance issues.
> +	 */
> +	highest_perf = amd_get_highest_perf();
> +	if(highest_perf > cppc_perf.highest_perf)
>  		highest_perf = cppc_perf.highest_perf;
>  
>  	WRITE_ONCE(cpudata->highest_perf, highest_perf);
> 

Please pay attention to trailing whitespace, whitespace added to blank lines, and the missing whitespace between `if` and the opening `(`.

`scripts/checkpatch.pl` may help you with that.

Thank you.

-- 
Oleksandr Natalenko (post-factum)


* Re: [PATCH] [PATCH] amd_pstate: fix erroneous highest_perf value on some CPUs
  2024-02-20  9:02           ` Oleksandr Natalenko
@ 2024-02-21 17:19             ` Lucas Lee Jing Yi
  2024-02-21 17:19               ` Lucas Lee Jing Yi
  0 siblings, 1 reply; 19+ messages in thread
From: Lucas Lee Jing Yi @ 2024-02-21 17:19 UTC (permalink / raw)
  To: oleksandr
  Cc: Perry.Yuan, Xiaojian.Du, alexander.deucher, bp, deepak.sharma,
	li.meng, linux-acpi, linux-kernel, linux-kselftest, linux-pm,
	lucasleeeeeeeee, mario.limonciello, nathan.fontenot,
	rafael.j.wysocki, rafael, ray.huang, shimmer.huang, skhan,
	viresh.kumar, x86


Hi Oleksandr,

Thanks, I have sent a new patch that addresses the issues highlighted by the script.

Regards,
Lucas

* [PATCH] [PATCH] amd_pstate: fix erroneous highest_perf value on some CPUs
  2024-02-21 17:19             ` Lucas Lee Jing Yi
@ 2024-02-21 17:19               ` Lucas Lee Jing Yi
  2024-02-21 18:47                 ` Mario Limonciello
  0 siblings, 1 reply; 19+ messages in thread
From: Lucas Lee Jing Yi @ 2024-02-21 17:19 UTC (permalink / raw)
  To: oleksandr
  Cc: Perry.Yuan, Xiaojian.Du, alexander.deucher, bp, deepak.sharma,
	li.meng, linux-acpi, linux-kernel, linux-kselftest, linux-pm,
	lucasleeeeeeeee, mario.limonciello, nathan.fontenot,
	rafael.j.wysocki, rafael, ray.huang, shimmer.huang, skhan,
	viresh.kumar, x86

On a Ryzen 7840HS the highest_perf value is 196, not 166 as AMD assumed.
This causes the advertised max clock speed to be only 4.35GHz
instead of 5.14GHz, leading to a large degradation in performance.

Fix the broken assumption and revert to the old logic for
getting highest_perf.

TEST:
Geekbench 6 Before Patch:
Single Core:	2325 (-22%)!
Multi Core:	11335 (-10%)

Geekbench 6 AFTER Patch:
Single Core:	2635
Multi Core:	12487

Signed-off-by: Lucas Lee Jing Yi <lucasleeeeeeeee@gmail.com>
---
 drivers/cpufreq/amd-pstate.c | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 08e112444c27..54df68773620 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -50,7 +50,6 @@
 
 #define AMD_PSTATE_TRANSITION_LATENCY	20000
 #define AMD_PSTATE_TRANSITION_DELAY	1000
-#define AMD_PSTATE_PREFCORE_THRESHOLD	166
 
 /*
  * TODO: We need more time to fine tune processors with shared memory solution
@@ -299,15 +298,12 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
 				     &cap1);
 	if (ret)
 		return ret;
-
-	/* For platforms that do not support the preferred core feature, the
-	 * highest_pef may be configured with 166 or 255, to avoid max frequency
-	 * calculated wrongly. we take the AMD_CPPC_HIGHEST_PERF(cap1) value as
-	 * the default max perf.
+
+	/* Some CPUs have different highest_perf from others, it is safer
+	 * to read it than to assume some erroneous value, leading to performance issues.
 	 */
-	if (cpudata->hw_prefcore)
-		highest_perf = AMD_PSTATE_PREFCORE_THRESHOLD;
-	else
+	highest_perf = amd_get_highest_perf();
+	if (highest_perf > AMD_CPPC_HIGHEST_PERF(cap1))
 		highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
 
 	WRITE_ONCE(cpudata->highest_perf, highest_perf);
@@ -329,9 +325,11 @@ static int cppc_init_perf(struct amd_cpudata *cpudata)
 	if (ret)
 		return ret;
 
-	if (cpudata->hw_prefcore)
-		highest_perf = AMD_PSTATE_PREFCORE_THRESHOLD;
-	else
+	/* Some CPUs have different highest_perf from others, it is safer
+	 * to read it than to assume some erroneous value, leading to performance issues.
+	 */
+	highest_perf = amd_get_highest_perf();
+	if (highest_perf > cppc_perf.highest_perf)
 		highest_perf = cppc_perf.highest_perf;
 
 	WRITE_ONCE(cpudata->highest_perf, highest_perf);
-- 
2.43.2
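
The clamping logic in the two hunks above can be sketched in isolation.
This is a minimal userspace sketch, not kernel code: clamp_highest_perf()
is a hypothetical helper, default_perf stands in for whatever
amd_get_highest_perf() returns, and reported_perf stands in for the value
read from the CPPC capability register. The value 255 in the self-check is
only an illustrative default; 196 is the value reported for the 7840HS in
this thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the kernel): pick the smaller of the
 * driver's default highest_perf and the value the platform reports, which
 * is what the patched init functions do before WRITE_ONCE().
 */
static uint32_t clamp_highest_perf(uint32_t default_perf,
				   uint32_t reported_perf)
{
	if (default_perf > reported_perf)
		return reported_perf;
	return default_perf;
}

/* Illustrative self-check; the input values are assumptions. */
static void clamp_highest_perf_selfcheck(void)
{
	/* Platform reports less than the default: use the reported value. */
	assert(clamp_highest_perf(255, 196) == 196);
	/* Platform reports more than the default: keep the default. */
	assert(clamp_highest_perf(166, 196) == 166);
}
```

Under these assumptions, a default above 196 leaves the reported 196
intact, whereas a hard-coded 166 would cap it.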



* Re: [PATCH] [PATCH] amd_pstate: fix erroneous highest_perf value on some CPUs
  2024-02-21 17:19               ` Lucas Lee Jing Yi
@ 2024-02-21 18:47                 ` Mario Limonciello
  0 siblings, 0 replies; 19+ messages in thread
From: Mario Limonciello @ 2024-02-21 18:47 UTC (permalink / raw)
  To: Lucas Lee Jing Yi, oleksandr
  Cc: Perry.Yuan, Xiaojian.Du, alexander.deucher, bp, deepak.sharma,
	li.meng, linux-acpi, linux-kernel, linux-kselftest, linux-pm,
	nathan.fontenot, rafael.j.wysocki, rafael, ray.huang,
	shimmer.huang, skhan, viresh.kumar, x86

On 2/21/2024 11:19, Lucas Lee Jing Yi wrote:
> On a Ryzen 7840HS the highest_perf value is 196, not 166 as AMD assumed.
> As a result, the advertised max clock speed is only 4.35 GHz instead of
> 5.14 GHz, which causes a large performance degradation.
> 
> Fix the broken assumption and revert to the old logic for getting
> highest_perf.
> 
> TEST:
> Geekbench 6 Before Patch:
> Single Core:	2325 (-22%)!
> Multi Core:	11335 (-10%)
> 
> Geekbench 6 After Patch:
> Single Core:	2635
> Multi Core:	12487
> 

Yes; the max boost for your system should be 5.1GHz according to the 
specification [1].

Would you please open a kernel Bugzilla and attach an acpidump and dmesg 
for your system?  I believe we need to better understand your system's 
situation before deciding on how to correctly approach it.

[1] https://www.amd.com/en/product/13041
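
The 4.35 GHz figure in the report is consistent with a simple
proportional model. The sketch below assumes the advertised maximum
frequency scales linearly with the perf value, so that capping
highest_perf at 166 when the hardware reports 196 scales a 5.14 GHz boost
clock by 166/196. scale_freq_khz() is a hypothetical helper written for
this illustration, not a kernel function, and the linear model is an
assumption rather than a statement of how the driver computes it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: scale a frequency in kHz by perf / ref_perf using
 * integer arithmetic (the result is truncated toward zero).
 */
static uint64_t scale_freq_khz(uint64_t freq_khz, uint32_t perf,
			       uint32_t ref_perf)
{
	assert(ref_perf != 0);	/* avoid division by zero */
	return freq_khz * perf / ref_perf;
}
```

With the numbers from this thread, scale_freq_khz(5140000, 166, 196)
gives 4353265 kHz, which is about 4.35 GHz.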

> Signed-off-by: Lucas Lee Jing Yi <lucasleeeeeeeee@gmail.com>
> ---
>   drivers/cpufreq/amd-pstate.c | 22 ++++++++++------------
>   1 file changed, 10 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 08e112444c27..54df68773620 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -50,7 +50,6 @@
>   
>   #define AMD_PSTATE_TRANSITION_LATENCY	20000
>   #define AMD_PSTATE_TRANSITION_DELAY	1000
> -#define AMD_PSTATE_PREFCORE_THRESHOLD	166
>   
>   /*
>    * TODO: We need more time to fine tune processors with shared memory solution
> @@ -299,15 +298,12 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
>   				     &cap1);
>   	if (ret)
>   		return ret;
> -
> -	/* For platforms that do not support the preferred core feature, the
> -	 * highest_pef may be configured with 166 or 255, to avoid max frequency
> -	 * calculated wrongly. we take the AMD_CPPC_HIGHEST_PERF(cap1) value as
> -	 * the default max perf.
> +
> +	/* Some CPUs have different highest_perf from others, it is safer
> +	 * to read it than to assume some erroneous value, leading to performance issues.
>   	 */
> -	if (cpudata->hw_prefcore)
> -		highest_perf = AMD_PSTATE_PREFCORE_THRESHOLD;
> -	else
> +	highest_perf = amd_get_highest_perf();
> +	if (highest_perf > AMD_CPPC_HIGHEST_PERF(cap1))
>   		highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
>   
>   	WRITE_ONCE(cpudata->highest_perf, highest_perf);
> @@ -329,9 +325,11 @@ static int cppc_init_perf(struct amd_cpudata *cpudata)
>   	if (ret)
>   		return ret;
>   
> -	if (cpudata->hw_prefcore)
> -		highest_perf = AMD_PSTATE_PREFCORE_THRESHOLD;
> -	else
> +	/* Some CPUs have different highest_perf from others, it is safer
> +	 * to read it than to assume some erroneous value, leading to performance issues.
> +	 */
> +	highest_perf = amd_get_highest_perf();
> +	if (highest_perf > cppc_perf.highest_perf)
>   		highest_perf = cppc_perf.highest_perf;
>   
>   	WRITE_ONCE(cpudata->highest_perf, highest_perf);



Thread overview: 19+ messages
2024-01-19  9:04 [PATCH V14 0/7] amd-pstate preferred core Meng Li
2024-01-19  9:04 ` [PATCH V14 1/7] x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion Meng Li
2024-01-19  9:04 ` [PATCH V14 2/7] ACPI: CPPC: Add get the highest performance cppc control Meng Li
2024-01-19  9:04 ` [PATCH V14 3/7] cpufreq: amd-pstate: Enable amd-pstate preferred core supporting Meng Li
2024-01-19  9:04 ` [PATCH V14 4/7] cpufreq: Add a notification message that the highest perf has changed Meng Li
2024-01-19  9:05 ` [PATCH V14 5/7] cpufreq: amd-pstate: Update amd-pstate preferred core ranking dynamically Meng Li
2024-01-19  9:05 ` [PATCH V14 6/7] Documentation: amd-pstate: introduce amd-pstate preferred core Meng Li
2024-01-19  9:05 ` [PATCH V14 7/7] Documentation: introduce amd-pstate preferrd core mode kernel command line options Meng Li
2024-01-29 15:18 ` [PATCH V14 0/7] amd-pstate preferred core Rafael J. Wysocki
2024-01-29 15:33   ` Borislav Petkov
2024-01-31 13:58     ` Rafael J. Wysocki
2024-02-18 16:10       ` Lucas Lee Jing Yi
2024-02-18 16:10         ` [PATCH] [PATCH] amd_pstate: fix erroneous highest_perf value on some CPUs Lucas Lee Jing Yi
2024-02-20  7:23           ` Meng, Li (Jassmine)
2024-02-20  9:02           ` Oleksandr Natalenko
2024-02-21 17:19             ` Lucas Lee Jing Yi
2024-02-21 17:19               ` Lucas Lee Jing Yi
2024-02-21 18:47                 ` Mario Limonciello
2024-02-19  1:02         ` [PATCH V14 0/7] amd-pstate preferred core Meng, Li (Jassmine)
