linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 00/11] Implement AMD Pstate EPP Driver
@ 2022-12-02  7:47 Perry Yuan
  2022-12-02  7:47 ` [PATCH v6 01/11] ACPI: CPPC: Add AMD pstate energy performance preference cppc control Perry Yuan
                   ` (10 more replies)
  0 siblings, 11 replies; 27+ messages in thread
From: Perry Yuan @ 2022-12-02  7:47 UTC (permalink / raw)
  To: rafael.j.wysocki, Mario.Limonciello, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Nathan.Fontenot, Alexander.Deucher, Shimmer.Huang,
	Xiaojian.Du, Li.Meng, wyes.karny, linux-pm, linux-kernel

Hi all,

This patchset implements one new AMD CPU frequency driver
`amd-pstate-epp` instance for better performance and power control.
CPPC has a parameter called energy preference performance (EPP).
The EPP is used in the CCLK DPM controller to drive the frequency that a core
is going to operate during short periods of activity.
EPP values will be utilized for different OS profiles (balanced, performance, power savings).

AMD Energy Performance Preference (EPP) provides a hint to the hardware
if software wants to bias toward performance (0x0) or energy efficiency (0xff)
The lowlevel power firmware will calculate the runtime frequency according to the EPP preference 
value. So the EPP hint will impact the CPU cores frequency responsiveness.

We use the RAPL interface with "perf" tool to get the energy data of the package power.
Performance Per Watt (PPW) Calculation:

The PPW calculation is referred by below paper:
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsoftware.intel.com%2Fcontent%2Fdam%2Fdevelop%2Fexternal%2Fus%2Fen%2Fdocuments%2Fperformance-per-what-paper.pdf&data=04%7C01%7CPerry.Yuan%40amd.com%7Cac66e8ce98044e9b062708d9ab47c8d8%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637729147708574423%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=TPOvCE%2Frbb0ptBreWNxHqOi9YnVhcHGKG88vviDLb00%3D&reserved=0

Below formula is referred from below spec to measure the PPW:

(F / t) / P = F * t / (t * E) = F / E,

"F" is the number of frames per second.
"P" is power measured in watts.
"E" is energy measured in joules.

Gitsouce Benchmark Data on ROME Server CPU
+------------------------------+------------------------------+------------+------------------+
| Kernel Module                | PPW (1 / s * J)              |Energy(J) | PPW Improvement (%)|
+==============================+==============================+============+==================+
| acpi-cpufreq:schedutil       | 5.85658E-05                  | 17074.8    | base             |
+------------------------------+------------------------------+------------+------------------+
| acpi-cpufreq:ondemand        | 5.03079E-05                  | 19877.6    | -14.10%          |
+------------------------------+------------------------------+------------+------------------+
| acpi-cpufreq:performance     | 5.88132E-05                  | 17003      | 0.42%            |
+------------------------------+------------------------------+------------+------------------+
| amd-pstate:ondemand          | 4.60295E-05                  | 21725.2    | -21.41%          |
+------------------------------+------------------------------+------------+------------------+
| amd-pstate:schedutil         | 4.70026E-05                  | 21275.4    | -19.7%           |
+------------------------------+------------------------------+------------+------------------+
| amd-pstate:performance       | 5.80094E-05                  | 17238.6    | -0.95%           |
+------------------------------+------------------------------+------------+------------------+
| EPP:performance              | 5.8292E-05                   | 17155      | -0.47%           |
+------------------------------+------------------------------+------------+------------------+
| EPP: balance performance:    | 6.71709E-05                  | 14887.4    | 14.69%           |
+------------------------------+------------------------------+------------+------------------+
| EPP:power                    | 6.66951E-05                  | 4993.6     | 13.88%           |
+------------------------------+------------------------------+------------+------------------+

Tbench Benchmark Data on ROME Server CPU
+---------------------------------------------+-------------------+--------------+-------------+------------------+
| Kernel Module                               | PPW MB / (s * J)  |Throughput(MB/s)| Energy (J)|PPW Improvement(%)|
+=============================================+===================+==============+=============+==================+
| acpi_cpufreq: schedutil                     | 46.39             | 17191        | 37057.3     | base             |
+---------------------------------------------+-------------------+--------------+-------------+------------------+
| acpi_cpufreq: ondemand                      | 51.51             | 19269.5      | 37406.5     | 11.04 %          |
+---------------------------------------------+-------------------+--------------+-------------+------------------+
| acpi_cpufreq: performance                   | 45.96             | 17063.7      | 37123.7     | -0.74 %          |
+---------------------------------------------+-------------------+--------------+-------------+------------------+
| EPP:powersave: performance(0)               | 54.46             | 20263.1      | 37205       | 17.87 %          |
+---------------------------------------------+-------------------+--------------+-------------+------------------+
| EPP:powersave: balance performance          | 55.03             | 20481.9      | 37221.5     | 19.14 %          |
+---------------------------------------------+-------------------+--------------+-------------+------------------+
| EPP:powersave: balance_power                | 54.43             | 20245.9      | 37194.2     | 17.77 %          |
+---------------------------------------------+-------------------+--------------+-------------+------------------+
| EPP:powersave: power(255)                   | 54.26             | 20181.7      | 37197.4     | 17.40 %          |
+---------------------------------------------+-------------------+--------------+-------------+------------------+
| amd-pstate: schedutil                       | 48.22             | 17844.9      | 37006.6     | 3.80 %           |
+---------------------------------------------+-------------------+--------------+-------------+------------------+
| amd-pstate: ondemand                        | 61.30             | 22988        | 37503.4     | 33.72 %          |
+---------------------------------------------+-------------------+--------------+-------------+------------------+
| amd-pstate: performance                     | 54.52             | 20252.6      | 37147.8     | 17.81 %          |
+---------------------------------------------+-------------------+--------------+-------------+------------------+

change from v5:
 * add one common header `cpufreq_commoncpufreq_common` to extract EPP profiles 
   definition for amd and intel pstate driver.
 * remove the epp_off value to avoid confusion.
 * convert some other sysfs sprintf() function with sysfs_emit() and add onew new patch
 * add acpi pm server priofile detection to enable dynamic boost control
 * fix some code format with checkpatch script
 * move the EPP profile declaration into common header file `cpufreq_common.h`
 * fix commit typos

changes from v4:
 * rebase driver based on the v6.1-rc7
 * remove the builtin changes patch because pstate driver has been
   changed to builtin type by another thread patch
 * update Documentation: amd-pstate: add amd pstate driver mode introduction 
 * replace sprintf with sysfs_emit() instead.
 * fix typo for cppc_set_epp_perf() in cppc_acpi.h header

changes from v3:
 * add one more document update patch for the active and passive mode
   introducion.
 * drive most of the feedbacks from Mario
 * drive feedback from Rafael for the cppc_acpi driver.
 * remove the epp raw data set/get function
 * set the amd-pstate drive by passing kernel parameter
 * set amd-pstate driver disabled by default if no kernel parameter
   input from booting
 * get cppc_set_auto_epp and cppc_set_epp_perf combined
 * pick up reviewed by flag from Mario

changes from v2:
 * change pstate driver as builtin type from module
 * drop patch "export cpufreq cpu release and acquire"
 * squash patch of shared mem into single patch of epp implementation
 * add one new patch to support frequency boost control
 * add patch to expose driver working status checking
 * rebase driver into v6.1-rc4 kernel release
 * move some declaration to amd-pstate.h
 * drive feedback from Mario for the online/offline patch
 * drive feedback from Mario for the suspend/resume patch
 * drive feedback from Ray for the cppc_acpi and some other patches
 * drive feedback from Nathan for the epp patch

changes from v1:
 * rebased to v6.0
 * drive feedbacks from Mario for the suspend/resume patch
 * drive feedbacks from Nathan for the EPP support on msr type
 * fix some typos and code style indent problems
 * update commit comments for patch 4/7
 * change the `epp_enabled` module param name to `epp`
 * set the default epp mode to be false
 * add testing for the x86_energy_perf_policy utility patchset(will
   send that utility patchset with another thread)

v5: https://lore.kernel.org/lkml/20221128170314.2276636-1-perry.yuan@amd.com/
v4: https://lore.kernel.org/lkml/20221110175847.3098728-1-Perry.Yuan@amd.com/
v3: https://lore.kernel.org/all/20221107175705.2207842-1-Perry.Yuan@amd.com/
v2: https://lore.kernel.org/all/20221010162248.348141-1-Perry.Yuan@amd.com/
v1: https://lore.kernel.org/all/20221009071033.21170-1-Perry.Yuan@amd.com/

Perry Yuan (11):
  ACPI: CPPC: Add AMD pstate energy performance preference cppc control
  Documentation: amd-pstate: add EPP profiles introduction
  cpufreq: intel_pstate: use common macro definition for Energy
    Preference Performance(EPP)
  cpufreq: amd_pstate: implement Pstate EPP support for the AMD
    processors
  cpufreq: amd_pstate: implement amd pstate cpu online and offline
    callback
  cpufreq: amd-pstate: implement suspend and resume callbacks
  cpufreq: amd-pstate: add frequency dynamic boost sysfs control
  cpufreq: amd_pstate: add driver working mode status sysfs entry
  Documentation: amd-pstate: add amd pstate driver mode introduction
  Documentation: introduce amd pstate active mode kernel command line
    options
  cpufreq: amd_pstate: convert sprintf with sysfs_emit()

 .../admin-guide/kernel-parameters.txt         |   7 +
 Documentation/admin-guide/pm/amd-pstate.rst   |  45 +-
 arch/x86/include/asm/msr-index.h              |   4 -
 drivers/acpi/cppc_acpi.c                      | 114 ++-
 drivers/cpufreq/amd-pstate.c                  | 886 +++++++++++++++++-
 drivers/cpufreq/intel_pstate.c                |  37 +-
 include/acpi/cppc_acpi.h                      |  12 +
 include/linux/amd-pstate.h                    |  36 +
 include/linux/cpufreq_common.h                |  56 ++
 9 files changed, 1140 insertions(+), 57 deletions(-)
 create mode 100644 include/linux/cpufreq_common.h

-- 
2.34.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v6 01/11] ACPI: CPPC: Add AMD pstate energy performance preference cppc control
  2022-12-02  7:47 [PATCH v6 00/11] Implement AMD Pstate EPP Driver Perry Yuan
@ 2022-12-02  7:47 ` Perry Yuan
  2022-12-02 16:58   ` Limonciello, Mario
  2022-12-02  7:47 ` [PATCH v6 02/11] Documentation: amd-pstate: add EPP profiles introduction Perry Yuan
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: Perry Yuan @ 2022-12-02  7:47 UTC (permalink / raw)
  To: rafael.j.wysocki, Mario.Limonciello, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Nathan.Fontenot, Alexander.Deucher, Shimmer.Huang,
	Xiaojian.Du, Li.Meng, wyes.karny, linux-pm, linux-kernel

From: Perry Yuan <Perry.Yuan@amd.com>

Add support for setting and querying EPP preferences to the generic
CPPC driver.  This enables downstream drivers such as amd-pstate to discover
and use these values

In order to get EPP worked, cppc_get_epp_caps() will query EPP preference
value and cppc_set_epp_perf() will set EPP new value.
Before the EPP works, pstate driver will use cppc_set_auto_epp() to
enable EPP function from firmware firstly.

Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
---
 drivers/acpi/cppc_acpi.c | 114 +++++++++++++++++++++++++++++++++++++--
 include/acpi/cppc_acpi.h |  12 +++++
 2 files changed, 121 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index 093675b1a1ff..37fa75f25f62 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -1093,6 +1093,9 @@ static int cppc_get_perf(int cpunum, enum cppc_regs reg_idx, u64 *perf)
 {
 	struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpunum);
 	struct cpc_register_resource *reg;
+	int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpunum);
+	struct cppc_pcc_data *pcc_ss_data = NULL;
+	int ret = -EINVAL;
 
 	if (!cpc_desc) {
 		pr_debug("No CPC descriptor for CPU:%d\n", cpunum);
@@ -1102,10 +1105,6 @@ static int cppc_get_perf(int cpunum, enum cppc_regs reg_idx, u64 *perf)
 	reg = &cpc_desc->cpc_regs[reg_idx];
 
 	if (CPC_IN_PCC(reg)) {
-		int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpunum);
-		struct cppc_pcc_data *pcc_ss_data = NULL;
-		int ret = 0;
-
 		if (pcc_ss_id < 0)
 			return -EIO;
 
@@ -1125,7 +1124,7 @@ static int cppc_get_perf(int cpunum, enum cppc_regs reg_idx, u64 *perf)
 
 	cpc_read(cpunum, reg, perf);
 
-	return 0;
+	return ret;
 }
 
 /**
@@ -1365,6 +1364,111 @@ int cppc_get_perf_ctrs(int cpunum, struct cppc_perf_fb_ctrs *perf_fb_ctrs)
 }
 EXPORT_SYMBOL_GPL(cppc_get_perf_ctrs);
 
+/**
+ * cppc_get_epp_caps - Get the energy preference register value.
+ * @cpunum: CPU from which to get epp preference level.
+ * @perf_caps: Return address.
+ *
+ * Return: 0 for success, -EIO otherwise.
+ */
+int cppc_get_epp_caps(int cpunum, struct cppc_perf_caps *perf_caps)
+{
+	struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpunum);
+	struct cpc_register_resource *energy_perf_reg;
+	u64 energy_perf;
+
+	if (!cpc_desc) {
+		pr_debug("No CPC descriptor for CPU:%d\n", cpunum);
+		return -ENODEV;
+	}
+
+	energy_perf_reg = &cpc_desc->cpc_regs[ENERGY_PERF];
+
+	if (!CPC_SUPPORTED(energy_perf_reg))
+		pr_warn_once("energy perf reg update is unsupported!\n");
+
+	if (CPC_IN_PCC(energy_perf_reg)) {
+		int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpunum);
+		struct cppc_pcc_data *pcc_ss_data = NULL;
+		int ret = 0;
+
+		if (pcc_ss_id < 0)
+			return -ENODEV;
+
+		pcc_ss_data = pcc_data[pcc_ss_id];
+
+		down_write(&pcc_ss_data->pcc_lock);
+
+		if (send_pcc_cmd(pcc_ss_id, CMD_READ) >= 0) {
+			cpc_read(cpunum, energy_perf_reg, &energy_perf);
+			perf_caps->energy_perf = energy_perf;
+		} else {
+			ret = -EIO;
+		}
+
+		up_write(&pcc_ss_data->pcc_lock);
+
+		return ret;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cppc_get_epp_caps);
+
+/*
+ * Set Energy Performance Preference Register value through
+ * Performance Controls Interface
+ */
+int cppc_set_epp_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls, bool enable)
+{
+	int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpu);
+	struct cpc_register_resource *epp_set_reg;
+	struct cpc_register_resource *auto_sel_reg;
+	struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpu);
+	struct cppc_pcc_data *pcc_ss_data = NULL;
+	int ret = -EINVAL;
+
+	if (!cpc_desc) {
+		pr_debug("No CPC descriptor for CPU:%d\n", cpu);
+		return -ENODEV;
+	}
+
+	auto_sel_reg = &cpc_desc->cpc_regs[AUTO_SEL_ENABLE];
+	epp_set_reg = &cpc_desc->cpc_regs[ENERGY_PERF];
+
+	if (CPC_IN_PCC(epp_set_reg) || CPC_IN_PCC(auto_sel_reg)) {
+		if (pcc_ss_id < 0) {
+			pr_debug("Invalid pcc_ss_id\n");
+			return -ENODEV;
+		}
+
+		if (CPC_SUPPORTED(auto_sel_reg)) {
+			ret = cpc_write(cpu, auto_sel_reg, enable);
+			if (ret)
+				return ret;
+		}
+
+		if (CPC_SUPPORTED(epp_set_reg)) {
+			ret = cpc_write(cpu, epp_set_reg, perf_ctrls->energy_perf);
+			if (ret)
+				return ret;
+		}
+
+		pcc_ss_data = pcc_data[pcc_ss_id];
+
+		down_write(&pcc_ss_data->pcc_lock);
+		/* after writing CPC, transfer the ownership of PCC to platform */
+		ret = send_pcc_cmd(pcc_ss_id, CMD_WRITE);
+		up_write(&pcc_ss_data->pcc_lock);
+	} else {
+		ret = -ENOTSUPP;
+		pr_debug("_CPC in PCC is not supported\n");
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(cppc_set_epp_perf);
+
 /**
  * cppc_set_enable - Set to enable CPPC on the processor by writing the
  * Continuous Performance Control package EnableRegister field.
diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
index c5614444031f..a45bb876a19c 100644
--- a/include/acpi/cppc_acpi.h
+++ b/include/acpi/cppc_acpi.h
@@ -108,12 +108,14 @@ struct cppc_perf_caps {
 	u32 lowest_nonlinear_perf;
 	u32 lowest_freq;
 	u32 nominal_freq;
+	u32 energy_perf;
 };
 
 struct cppc_perf_ctrls {
 	u32 max_perf;
 	u32 min_perf;
 	u32 desired_perf;
+	u32 energy_perf;
 };
 
 struct cppc_perf_fb_ctrs {
@@ -149,6 +151,8 @@ extern bool cpc_ffh_supported(void);
 extern bool cpc_supported_by_cpu(void);
 extern int cpc_read_ffh(int cpunum, struct cpc_reg *reg, u64 *val);
 extern int cpc_write_ffh(int cpunum, struct cpc_reg *reg, u64 val);
+extern int cppc_get_epp_caps(int cpunum, struct cppc_perf_caps *perf_caps);
+extern int cppc_set_epp_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls, bool enable);
 #else /* !CONFIG_ACPI_CPPC_LIB */
 static inline int cppc_get_desired_perf(int cpunum, u64 *desired_perf)
 {
@@ -202,6 +206,14 @@ static inline int cpc_write_ffh(int cpunum, struct cpc_reg *reg, u64 val)
 {
 	return -ENOTSUPP;
 }
+static inline int cppc_set_epp_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls, bool enable)
+{
+	return -ENOTSUPP;
+}
+static inline int cppc_get_epp_caps(int cpunum, struct cppc_perf_caps *perf_caps)
+{
+	return -ENOTSUPP;
+}
 #endif /* !CONFIG_ACPI_CPPC_LIB */
 
 #endif /* _CPPC_ACPI_H*/
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 02/11] Documentation: amd-pstate: add EPP profiles introduction
  2022-12-02  7:47 [PATCH v6 00/11] Implement AMD Pstate EPP Driver Perry Yuan
  2022-12-02  7:47 ` [PATCH v6 01/11] ACPI: CPPC: Add AMD pstate energy performance preference cppc control Perry Yuan
@ 2022-12-02  7:47 ` Perry Yuan
  2022-12-02  7:47 ` [PATCH v6 03/11] cpufreq: intel_pstate: use common macro definition for Energy Preference Performance(EPP) Perry Yuan
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 27+ messages in thread
From: Perry Yuan @ 2022-12-02  7:47 UTC (permalink / raw)
  To: rafael.j.wysocki, Mario.Limonciello, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Nathan.Fontenot, Alexander.Deucher, Shimmer.Huang,
	Xiaojian.Du, Li.Meng, wyes.karny, linux-pm, linux-kernel

From: Perry Yuan <Perry.Yuan@amd.com>

The patch add AMD pstate EPP feature introduction and what EPP
preference supported for AMD processors.

User can get supported list from
energy_performance_available_preferences attribute file, or update
current profile to energy_performance_preference file

1) See all EPP profiles
$ sudo cat /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_available_preferences
default performance balance_performance balance_power power

2) Check current EPP profile
$ sudo cat /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference
performance

3) Set new EPP profile
$ sudo bash -c "echo power > /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference"

Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
---
 Documentation/admin-guide/pm/amd-pstate.rst | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
index 06e23538f79c..33ab8ec8fc2f 100644
--- a/Documentation/admin-guide/pm/amd-pstate.rst
+++ b/Documentation/admin-guide/pm/amd-pstate.rst
@@ -262,6 +262,25 @@ lowest non-linear performance in `AMD CPPC Performance Capability
 <perf_cap_>`_.)
 This attribute is read-only.
 
+``energy_performance_available_preferences``
+
+A list of all the supported EPP preferences that could be used for
+``energy_performance_preference`` on this system.
+These profiles represent different hints that are provided
+to the low-level firmware about the user's desired energy vs efficiency
+tradeoff.  ``default`` represents the epp value is set by platform
+firmware. This attribute is read-only.
+
+``energy_performance_preference``
+
+The current energy performance preference can be read from this attribute.
+and user can change current preference according to energy or performance needs
+Please get all support profiles list from
+``energy_performance_available_preferences`` attribute, all the profiles are
+integer values defined between 0 to 255 when EPP feature is enabled by platform
+firmware, if EPP feature is disabled, driver will ignore the written value
+This attribute is read-write.
+
 Other performance and frequency values can be read back from
 ``/sys/devices/system/cpu/cpuX/acpi_cppc/``, see :ref:`cppc_sysfs`.
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 03/11] cpufreq: intel_pstate: use common macro definition for Energy Preference Performance(EPP)
  2022-12-02  7:47 [PATCH v6 00/11] Implement AMD Pstate EPP Driver Perry Yuan
  2022-12-02  7:47 ` [PATCH v6 01/11] ACPI: CPPC: Add AMD pstate energy performance preference cppc control Perry Yuan
  2022-12-02  7:47 ` [PATCH v6 02/11] Documentation: amd-pstate: add EPP profiles introduction Perry Yuan
@ 2022-12-02  7:47 ` Perry Yuan
  2022-12-05 11:48   ` Huang Rui
  2022-12-02  7:47 ` [PATCH v6 04/11] cpufreq: amd_pstate: implement Pstate EPP support for the AMD processors Perry Yuan
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: Perry Yuan @ 2022-12-02  7:47 UTC (permalink / raw)
  To: rafael.j.wysocki, Mario.Limonciello, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Nathan.Fontenot, Alexander.Deucher, Shimmer.Huang,
	Xiaojian.Du, Li.Meng, wyes.karny, linux-pm, linux-kernel

make the energy preference performance strings and profiles using one
common header for intel_pstate driver, then the amd_pstate epp driver can
use the common header as well. This will simpify the intel_pstate and
amd_pstate driver.

Signed-off-by: Perry Yuan <perry.yuan@amd.com>
---
 arch/x86/include/asm/msr-index.h |  4 ---
 drivers/cpufreq/intel_pstate.c   | 37 +--------------------
 include/linux/cpufreq_common.h   | 56 ++++++++++++++++++++++++++++++++
 3 files changed, 57 insertions(+), 40 deletions(-)
 create mode 100644 include/linux/cpufreq_common.h

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 4a2af82553e4..3983378cff5b 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -472,10 +472,6 @@
 #define HWP_MAX_PERF(x) 		((x & 0xff) << 8)
 #define HWP_DESIRED_PERF(x)		((x & 0xff) << 16)
 #define HWP_ENERGY_PERF_PREFERENCE(x)	(((unsigned long long) x & 0xff) << 24)
-#define HWP_EPP_PERFORMANCE		0x00
-#define HWP_EPP_BALANCE_PERFORMANCE	0x80
-#define HWP_EPP_BALANCE_POWERSAVE	0xC0
-#define HWP_EPP_POWERSAVE		0xFF
 #define HWP_ACTIVITY_WINDOW(x)		((unsigned long long)(x & 0xff3) << 32)
 #define HWP_PACKAGE_CONTROL(x)		((unsigned long long)(x & 0x1) << 42)
 
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index ad9be31753b6..65036ca21719 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -25,6 +25,7 @@
 #include <linux/acpi.h>
 #include <linux/vmalloc.h>
 #include <linux/pm_qos.h>
+#include <linux/cpufreq_common.h>
 #include <trace/events/power.h>
 
 #include <asm/cpu.h>
@@ -628,42 +629,6 @@ static int intel_pstate_set_epb(int cpu, s16 pref)
 	return 0;
 }
 
-/*
- * EPP/EPB display strings corresponding to EPP index in the
- * energy_perf_strings[]
- *	index		String
- *-------------------------------------
- *	0		default
- *	1		performance
- *	2		balance_performance
- *	3		balance_power
- *	4		power
- */
-
-enum energy_perf_value_index {
-	EPP_INDEX_DEFAULT = 0,
-	EPP_INDEX_PERFORMANCE,
-	EPP_INDEX_BALANCE_PERFORMANCE,
-	EPP_INDEX_BALANCE_POWERSAVE,
-	EPP_INDEX_POWERSAVE,
-};
-
-static const char * const energy_perf_strings[] = {
-	[EPP_INDEX_DEFAULT] = "default",
-	[EPP_INDEX_PERFORMANCE] = "performance",
-	[EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
-	[EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
-	[EPP_INDEX_POWERSAVE] = "power",
-	NULL
-};
-static unsigned int epp_values[] = {
-	[EPP_INDEX_DEFAULT] = 0, /* Unused index */
-	[EPP_INDEX_PERFORMANCE] = HWP_EPP_PERFORMANCE,
-	[EPP_INDEX_BALANCE_PERFORMANCE] = HWP_EPP_BALANCE_PERFORMANCE,
-	[EPP_INDEX_BALANCE_POWERSAVE] = HWP_EPP_BALANCE_POWERSAVE,
-	[EPP_INDEX_POWERSAVE] = HWP_EPP_POWERSAVE,
-};
-
 static int intel_pstate_get_energy_pref_index(struct cpudata *cpu_data, int *raw_epp)
 {
 	s16 epp;
diff --git a/include/linux/cpufreq_common.h b/include/linux/cpufreq_common.h
new file mode 100644
index 000000000000..2d14b0b0f55c
--- /dev/null
+++ b/include/linux/cpufreq_common.h
@@ -0,0 +1,56 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * linux/include/linux/cpufreq_common.h
+ *
+ * Copyright (C) 2022 Advanced Micro Devices, Inc.
+ *
+ * Author: Perry Yuan <Perry.Yuan@amd.com>
+ */
+
+#ifndef _LINUX_CPUFREQ_COMMON_H
+#define _LINUX_CPUFREQ_COMMON_H
+
+#include <asm/msr.h>
+/*
+ * EPP/EPB display strings corresponding to EPP index in the
+ * energy_perf_strings[]
+ *	index		String
+ *-------------------------------------
+ *	0		default
+ *	1		performance
+ *	2		balance_performance
+ *	3		balance_power
+ *	4		power
+ */
+
+#define HWP_EPP_PERFORMANCE		0x00
+#define HWP_EPP_BALANCE_PERFORMANCE	0x80
+#define HWP_EPP_BALANCE_POWERSAVE	0xC0
+#define HWP_EPP_POWERSAVE		0xFF
+
+enum energy_perf_value_index {
+	EPP_INDEX_DEFAULT = 0,
+	EPP_INDEX_PERFORMANCE,
+	EPP_INDEX_BALANCE_PERFORMANCE,
+	EPP_INDEX_BALANCE_POWERSAVE,
+	EPP_INDEX_POWERSAVE,
+};
+
+static const char * const energy_perf_strings[] = {
+	[EPP_INDEX_DEFAULT] = "default",
+	[EPP_INDEX_PERFORMANCE] = "performance",
+	[EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
+	[EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
+	[EPP_INDEX_POWERSAVE] = "power",
+	NULL
+};
+
+static unsigned int epp_values[] = {
+	[EPP_INDEX_DEFAULT] = 0, /* Unused index */
+	[EPP_INDEX_PERFORMANCE] = HWP_EPP_PERFORMANCE,
+	[EPP_INDEX_BALANCE_PERFORMANCE] = HWP_EPP_BALANCE_PERFORMANCE,
+	[EPP_INDEX_BALANCE_POWERSAVE] = HWP_EPP_BALANCE_POWERSAVE,
+	[EPP_INDEX_POWERSAVE] = HWP_EPP_POWERSAVE,
+};
+
+#endif /* _LINUX_CPUFREQ_COMMON_H */
\ No newline at end of file
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 04/11] cpufreq: amd_pstate: implement Pstate EPP support for the AMD processors
  2022-12-02  7:47 [PATCH v6 00/11] Implement AMD Pstate EPP Driver Perry Yuan
                   ` (2 preceding siblings ...)
  2022-12-02  7:47 ` [PATCH v6 03/11] cpufreq: intel_pstate: use common macro definition for Energy Preference Performance(EPP) Perry Yuan
@ 2022-12-02  7:47 ` Perry Yuan
  2022-12-02 17:36   ` Limonciello, Mario
  2022-12-05 12:22   ` Huang Rui
  2022-12-02  7:47 ` [PATCH v6 05/11] cpufreq: amd_pstate: implement amd pstate cpu online and offline callback Perry Yuan
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 27+ messages in thread
From: Perry Yuan @ 2022-12-02  7:47 UTC (permalink / raw)
  To: rafael.j.wysocki, Mario.Limonciello, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Nathan.Fontenot, Alexander.Deucher, Shimmer.Huang,
	Xiaojian.Du, Li.Meng, wyes.karny, linux-pm, linux-kernel

From: Perry Yuan <Perry.Yuan@amd.com>

Add EPP driver support for AMD SoCs which support a dedicated MSR for
CPPC.  EPP is used by the DPM controller to configure the frequency that
a core operates at during short periods of activity.

The SoC EPP targets are configured on a scale from 0 to 255 where 0
represents maximum performance and 255 represents maximum efficiency.

The amd-pstate driver exports profile string names to userspace that are
tied to specific EPP values.

The balance_performance string (0x80) provides the best balance for
efficiency versus power on most systems, but users can choose other
strings to meet their needs as well.

$ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_available_preferences
default performance balance_performance balance_power power

$ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_preference
balance_performance

To enable the driver,it needs to add `amd_pstate=active` to kernel
command line and kernel will load the active mode epp driver

Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
---
 drivers/cpufreq/amd-pstate.c | 643 ++++++++++++++++++++++++++++++++++-
 include/linux/amd-pstate.h   |  35 ++
 2 files changed, 672 insertions(+), 6 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 204e39006dda..5989d4d25d0f 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -37,9 +37,12 @@
 #include <linux/uaccess.h>
 #include <linux/static_call.h>
 #include <linux/amd-pstate.h>
+#include <linux/cpufreq_common.h>
 
+#ifdef CONFIG_ACPI
 #include <acpi/processor.h>
 #include <acpi/cppc_acpi.h>
+#endif
 
 #include <asm/msr.h>
 #include <asm/processor.h>
@@ -59,9 +62,130 @@
  * we disable it by default to go acpi-cpufreq on these processors and add a
  * module parameter to be able to enable it manually for debugging.
  */
-static struct cpufreq_driver amd_pstate_driver;
+static bool cppc_active;
 static int cppc_load __initdata;
 
+static struct cpufreq_driver *default_pstate_driver;
+static struct amd_cpudata **all_cpu_data;
+static struct amd_pstate_params global_params;
+
+static DEFINE_MUTEX(amd_pstate_limits_lock);
+static DEFINE_MUTEX(amd_pstate_driver_lock);
+
+static bool cppc_boost __read_mostly;
+struct kobject *amd_pstate_kobj;
+
+#ifdef CONFIG_ACPI_CPPC_LIB
+static s16 amd_pstate_get_epp(struct amd_cpudata *cpudata, u64 cppc_req_cached)
+{
+	s16 epp;
+	struct cppc_perf_caps perf_caps;
+	int ret;
+
+	if (boot_cpu_has(X86_FEATURE_CPPC)) {
+		if (!cppc_req_cached) {
+			epp = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ,
+					&cppc_req_cached);
+			if (epp)
+				return epp;
+		}
+		epp = (cppc_req_cached >> 24) & 0xFF;
+	} else {
+		ret = cppc_get_epp_caps(cpudata->cpu, &perf_caps);
+		if (ret < 0) {
+			pr_debug("Could not retrieve energy perf value (%d)\n", ret);
+			return -EIO;
+		}
+		epp = (s16) perf_caps.energy_perf;
+	}
+
+	return epp;
+}
+#endif
+
+static int amd_pstate_get_energy_pref_index(struct amd_cpudata *cpudata)
+{
+	s16 epp;
+	int index = -EINVAL;
+
+	epp = amd_pstate_get_epp(cpudata, 0);
+	if (epp < 0)
+		return epp;
+
+	switch (epp) {
+	case HWP_EPP_PERFORMANCE:
+		index = EPP_INDEX_PERFORMANCE;
+		break;
+	case HWP_EPP_BALANCE_PERFORMANCE:
+		index = EPP_INDEX_BALANCE_PERFORMANCE;
+		break;
+	case HWP_EPP_BALANCE_POWERSAVE:
+		index = EPP_INDEX_BALANCE_POWERSAVE;
+		break;
+	case HWP_EPP_POWERSAVE:
+		index = EPP_INDEX_POWERSAVE;
+		break;
+	default:
+			break;
+	}
+
+	return index;
+}
+
+#ifdef CONFIG_ACPI_CPPC_LIB
+static int amd_pstate_set_epp(struct amd_cpudata *cpudata, u32 epp)
+{
+	int ret;
+	struct cppc_perf_ctrls perf_ctrls;
+
+	if (boot_cpu_has(X86_FEATURE_CPPC)) {
+		u64 value = READ_ONCE(cpudata->cppc_req_cached);
+
+		value &= ~GENMASK_ULL(31, 24);
+		value |= (u64)epp << 24;
+		WRITE_ONCE(cpudata->cppc_req_cached, value);
+
+		ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
+		if (!ret)
+			cpudata->epp_cached = epp;
+	} else {
+		perf_ctrls.energy_perf = epp;
+		ret = cppc_set_epp_perf(cpudata->cpu, &perf_ctrls, 1);
+		if (ret) {
+			pr_debug("failed to set energy perf value (%d)\n", ret);
+			return ret;
+		}
+		cpudata->epp_cached = epp;
+	}
+
+	return ret;
+}
+
+static int amd_pstate_set_energy_pref_index(struct amd_cpudata *cpudata,
+		int pref_index)
+{
+	int epp = -EINVAL;
+	int ret;
+
+	if (!pref_index) {
+		pr_debug("EPP pref_index is invalid\n");
+		return -EINVAL;
+	}
+
+	if (epp == -EINVAL)
+		epp = epp_values[pref_index];
+
+	if (epp > 0 && cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
+		pr_debug("EPP cannot be set under performance policy\n");
+		return -EBUSY;
+	}
+
+	ret = amd_pstate_set_epp(cpudata, epp);
+
+	return ret;
+}
+#endif
+
 static inline int pstate_enable(bool enable)
 {
 	return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable);
@@ -70,11 +194,21 @@ static inline int pstate_enable(bool enable)
 static int cppc_enable(bool enable)
 {
 	int cpu, ret = 0;
+	struct cppc_perf_ctrls perf_ctrls;
 
 	for_each_present_cpu(cpu) {
 		ret = cppc_set_enable(cpu, enable);
 		if (ret)
 			return ret;
+
+		/* Enable autonomous mode for EPP */
+		if (!cppc_active) {
+			/* Set desired perf as zero to allow EPP firmware control */
+			perf_ctrls.desired_perf = 0;
+			ret = cppc_set_perf(cpu, &perf_ctrls);
+			if (ret)
+				return ret;
+		}
 	}
 
 	return ret;
@@ -417,7 +551,7 @@ static void amd_pstate_boost_init(struct amd_cpudata *cpudata)
 		return;
 
 	cpudata->boost_supported = true;
-	amd_pstate_driver.boost_enabled = true;
+	default_pstate_driver->boost_enabled = true;
 }
 
 static void amd_perf_ctl_reset(unsigned int cpu)
@@ -591,10 +725,61 @@ static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,
 	return sprintf(&buf[0], "%u\n", perf);
 }
 
+static ssize_t show_energy_performance_available_preferences(
+				struct cpufreq_policy *policy, char *buf)
+{
+	int i = 0;
+	int ret = 0;
+
+	while (energy_perf_strings[i] != NULL)
+		ret += sysfs_emit(buf, "%s", energy_perf_strings[i++]);
+
+	ret += sysfs_emit(buf, "\n");
+
+	return ret;
+}
+
+static ssize_t store_energy_performance_preference(
+		struct cpufreq_policy *policy, const char *buf, size_t count)
+{
+	struct amd_cpudata *cpudata = policy->driver_data;
+	char str_preference[21];
+	ssize_t ret;
+
+	ret = sscanf(buf, "%20s", str_preference);
+	if (ret != 1)
+		return -EINVAL;
+
+	ret = match_string(energy_perf_strings, -1, str_preference);
+	if (ret < 0)
+		return -EINVAL;
+
+	mutex_lock(&amd_pstate_limits_lock);
+	ret = amd_pstate_set_energy_pref_index(cpudata, ret);
+	mutex_unlock(&amd_pstate_limits_lock);
+
+	return ret ?: count;
+}
+
+static ssize_t show_energy_performance_preference(
+				struct cpufreq_policy *policy, char *buf)
+{
+	struct amd_cpudata *cpudata = policy->driver_data;
+	int preference;
+
+	preference = amd_pstate_get_energy_pref_index(cpudata);
+	if (preference < 0)
+		return preference;
+
+	return  sysfs_emit(buf, "%s\n", energy_perf_strings[preference]);
+}
+
 cpufreq_freq_attr_ro(amd_pstate_max_freq);
 cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
 
 cpufreq_freq_attr_ro(amd_pstate_highest_perf);
+cpufreq_freq_attr_rw(energy_performance_preference);
+cpufreq_freq_attr_ro(energy_performance_available_preferences);
 
 static struct freq_attr *amd_pstate_attr[] = {
 	&amd_pstate_max_freq,
@@ -603,6 +788,429 @@ static struct freq_attr *amd_pstate_attr[] = {
 	NULL,
 };
 
+static struct freq_attr *amd_pstate_epp_attr[] = {
+	&amd_pstate_max_freq,
+	&amd_pstate_lowest_nonlinear_freq,
+	&amd_pstate_highest_perf,
+	&energy_performance_preference,
+	&energy_performance_available_preferences,
+	NULL,
+};
+
+static inline void update_boost_state(void)
+{
+	u64 misc_en;
+	struct amd_cpudata *cpudata;
+
+	cpudata = all_cpu_data[0];
+	rdmsrl(MSR_K7_HWCR, misc_en);
+	global_params.cppc_boost_disabled = misc_en & BIT_ULL(25);
+}
+
+static bool amd_pstate_acpi_pm_profile_server(void)
+{
+	if (acpi_gbl_FADT.preferred_profile == PM_ENTERPRISE_SERVER ||
+	    acpi_gbl_FADT.preferred_profile == PM_PERFORMANCE_SERVER)
+		return true;
+
+	return false;
+}
+
+static int amd_pstate_init_cpu(unsigned int cpunum)
+{
+	struct amd_cpudata *cpudata;
+
+	cpudata = all_cpu_data[cpunum];
+	if (!cpudata) {
+		cpudata = kzalloc(sizeof(*cpudata), GFP_KERNEL);
+		if (!cpudata)
+			return -ENOMEM;
+		WRITE_ONCE(all_cpu_data[cpunum], cpudata);
+
+		cpudata->cpu = cpunum;
+
+		if (cppc_active) {
+			if (amd_pstate_acpi_pm_profile_server())
+				cppc_boost = true;
+		}
+
+	}
+	cpudata->epp_powersave = -EINVAL;
+	cpudata->epp_policy = 0;
+	pr_debug("controlling: cpu %d\n", cpunum);
+	return 0;
+}
+
+static int __amd_pstate_cpu_init(struct cpufreq_policy *policy)
+{
+	int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
+	struct amd_cpudata *cpudata;
+	struct device *dev;
+	int rc;
+	u64 value;
+
+	rc = amd_pstate_init_cpu(policy->cpu);
+	if (rc)
+		return rc;
+
+	cpudata = all_cpu_data[policy->cpu];
+
+	dev = get_cpu_device(policy->cpu);
+	if (!dev)
+		goto free_cpudata1;
+
+	rc = amd_pstate_init_perf(cpudata);
+	if (rc)
+		goto free_cpudata1;
+
+	min_freq = amd_get_min_freq(cpudata);
+	max_freq = amd_get_max_freq(cpudata);
+	nominal_freq = amd_get_nominal_freq(cpudata);
+	lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
+	if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
+		dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
+				min_freq, max_freq);
+		ret = -EINVAL;
+		goto free_cpudata1;
+	}
+
+	policy->min = min_freq;
+	policy->max = max_freq;
+
+	policy->cpuinfo.min_freq = min_freq;
+	policy->cpuinfo.max_freq = max_freq;
+	/* It will be updated by governor */
+	policy->cur = policy->cpuinfo.min_freq;
+
+	/* Initial processor data capability frequencies */
+	cpudata->max_freq = max_freq;
+	cpudata->min_freq = min_freq;
+	cpudata->nominal_freq = nominal_freq;
+	cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
+
+	policy->driver_data = cpudata;
+
+	update_boost_state();
+	cpudata->epp_cached = amd_pstate_get_epp(cpudata, value);
+
+	policy->min = policy->cpuinfo.min_freq;
+	policy->max = policy->cpuinfo.max_freq;
+
+	if (boot_cpu_has(X86_FEATURE_CPPC))
+		policy->fast_switch_possible = true;
+
+	if (boot_cpu_has(X86_FEATURE_CPPC)) {
+		ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, &value);
+		if (ret)
+			return ret;
+		WRITE_ONCE(cpudata->cppc_req_cached, value);
+
+		ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1, &value);
+		if (ret)
+			return ret;
+		WRITE_ONCE(cpudata->cppc_cap1_cached, value);
+	}
+	amd_pstate_boost_init(cpudata);
+
+	return 0;
+
+free_cpudata1:
+	kfree(cpudata);
+	return ret;
+}
+
+static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
+{
+	int ret;
+
+	ret = __amd_pstate_cpu_init(policy);
+	if (ret)
+		return ret;
+	/*
+	 * Set the policy to powersave to provide a valid fallback value in case
+	 * the default cpufreq governor is neither powersave nor performance.
+	 */
+	policy->policy = CPUFREQ_POLICY_POWERSAVE;
+
+	return 0;
+}
+
+static int amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy)
+{
+	pr_debug("CPU %d exiting\n", policy->cpu);
+	policy->fast_switch_possible = false;
+	return 0;
+}
+
+static void amd_pstate_update_max_freq(unsigned int cpu)
+{
+	struct cpufreq_policy *policy = policy = cpufreq_cpu_get(cpu);
+
+	if (!policy)
+		return;
+
+	refresh_frequency_limits(policy);
+	cpufreq_cpu_put(policy);
+}
+
+static void amd_pstate_epp_update_limits(unsigned int cpu)
+{
+	mutex_lock(&amd_pstate_driver_lock);
+	update_boost_state();
+	if (global_params.cppc_boost_disabled) {
+		for_each_possible_cpu(cpu)
+			amd_pstate_update_max_freq(cpu);
+	} else {
+		cpufreq_update_policy(cpu);
+	}
+	mutex_unlock(&amd_pstate_driver_lock);
+}
+
+static int cppc_boost_hold_time_ns = 3 * NSEC_PER_MSEC;
+
+static inline void amd_pstate_boost_up(struct amd_cpudata *cpudata)
+{
+	u64 hwp_req = READ_ONCE(cpudata->cppc_req_cached);
+	u64 hwp_cap = READ_ONCE(cpudata->cppc_cap1_cached);
+	u32 max_limit = (hwp_req & 0xff);
+	u32 min_limit = (hwp_req & 0xff00) >> 8;
+	u32 boost_level1;
+
+	/* If max and min are equal or already at max, nothing to boost */
+	if (max_limit == min_limit)
+		return;
+
+	/* Set boost max and min to initial value */
+	if (!cpudata->cppc_boost_min)
+		cpudata->cppc_boost_min = min_limit;
+
+	boost_level1 = ((AMD_CPPC_NOMINAL_PERF(hwp_cap) + min_limit) >> 1);
+
+	if (cpudata->cppc_boost_min < boost_level1)
+		cpudata->cppc_boost_min = boost_level1;
+	else if (cpudata->cppc_boost_min < AMD_CPPC_NOMINAL_PERF(hwp_cap))
+		cpudata->cppc_boost_min = AMD_CPPC_NOMINAL_PERF(hwp_cap);
+	else if (cpudata->cppc_boost_min == AMD_CPPC_NOMINAL_PERF(hwp_cap))
+		cpudata->cppc_boost_min = max_limit;
+	else
+		return;
+
+	hwp_req &= ~AMD_CPPC_MIN_PERF(~0L);
+	hwp_req |= AMD_CPPC_MIN_PERF(cpudata->cppc_boost_min);
+	wrmsrl(MSR_AMD_CPPC_REQ, hwp_req);
+	cpudata->last_update = cpudata->sample.time;
+}
+
+static inline void amd_pstate_boost_down(struct amd_cpudata *cpudata)
+{
+	bool expired;
+
+	if (cpudata->cppc_boost_min) {
+		expired = time_after64(cpudata->sample.time, cpudata->last_update +
+					cppc_boost_hold_time_ns);
+
+		if (expired) {
+			wrmsrl(MSR_AMD_CPPC_REQ, cpudata->cppc_req_cached);
+			cpudata->cppc_boost_min = 0;
+		}
+	}
+
+	cpudata->last_update = cpudata->sample.time;
+}
+
+static inline void amd_pstate_boost_update_util(struct amd_cpudata *cpudata,
+						      u64 time)
+{
+	cpudata->sample.time = time;
+	if (smp_processor_id() != cpudata->cpu)
+		return;
+
+	if (cpudata->sched_flags & SCHED_CPUFREQ_IOWAIT) {
+		bool do_io = false;
+
+		cpudata->sched_flags = 0;
+		/*
+		 * Set iowait_boost flag and update time. Since IO WAIT flag
+		 * is set all the time, we can't just conclude that there is
+		 * some IO bound activity is scheduled on this CPU with just
+		 * one occurrence. If we receive at least two in two
+		 * consecutive ticks, then we treat as boost candidate.
+		 * This is leveraged from Intel Pstate driver.
+		 */
+		if (time_before64(time, cpudata->last_io_update + 2 * TICK_NSEC))
+			do_io = true;
+
+		cpudata->last_io_update = time;
+
+		if (do_io)
+			amd_pstate_boost_up(cpudata);
+
+	} else {
+		amd_pstate_boost_down(cpudata);
+	}
+}
+
+static inline void amd_pstate_cppc_update_hook(struct update_util_data *data,
+						u64 time, unsigned int flags)
+{
+	struct amd_cpudata *cpudata = container_of(data,
+				struct amd_cpudata, update_util);
+
+	cpudata->sched_flags |= flags;
+
+	if (smp_processor_id() == cpudata->cpu)
+		amd_pstate_boost_update_util(cpudata, time);
+}
+
+static void amd_pstate_clear_update_util_hook(unsigned int cpu)
+{
+	struct amd_cpudata *cpudata = all_cpu_data[cpu];
+
+	if (!cpudata->update_util_set)
+		return;
+
+	cpufreq_remove_update_util_hook(cpu);
+	cpudata->update_util_set = false;
+	synchronize_rcu();
+}
+
+static void amd_pstate_set_update_util_hook(unsigned int cpu_num)
+{
+	struct amd_cpudata *cpudata = all_cpu_data[cpu_num];
+
+	if (!cppc_boost) {
+		if (cpudata->update_util_set)
+			amd_pstate_clear_update_util_hook(cpudata->cpu);
+		return;
+	}
+
+	if (cpudata->update_util_set)
+		return;
+
+	cpudata->sample.time = 0;
+	cpufreq_add_update_util_hook(cpu_num, &cpudata->update_util,
+						amd_pstate_cppc_update_hook);
+	cpudata->update_util_set = true;
+}
+
+static void amd_pstate_epp_init(unsigned int cpu)
+{
+	struct amd_cpudata *cpudata = all_cpu_data[cpu];
+	u32 max_perf, min_perf;
+	u64 value;
+	s16 epp;
+	int ret;
+
+	max_perf = READ_ONCE(cpudata->highest_perf);
+	min_perf = READ_ONCE(cpudata->lowest_perf);
+
+	value = READ_ONCE(cpudata->cppc_req_cached);
+
+	if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
+		min_perf = max_perf;
+
+	/* Initial min/max values for CPPC Performance Controls Register */
+	value &= ~AMD_CPPC_MIN_PERF(~0L);
+	value |= AMD_CPPC_MIN_PERF(min_perf);
+
+	value &= ~AMD_CPPC_MAX_PERF(~0L);
+	value |= AMD_CPPC_MAX_PERF(max_perf);
+
+	/* CPPC EPP feature require to set zero to the desire perf bit */
+	value &= ~AMD_CPPC_DES_PERF(~0L);
+	value |= AMD_CPPC_DES_PERF(0);
+
+	if (cpudata->epp_policy == cpudata->policy)
+		goto skip_epp;
+
+	cpudata->epp_policy = cpudata->policy;
+
+	if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
+		epp = amd_pstate_get_epp(cpudata, value);
+		cpudata->epp_powersave = epp;
+		if (epp < 0)
+			goto skip_epp;
+		/* force the epp value to be zero for performance policy */
+		epp = 0;
+	} else {
+		if (cpudata->epp_powersave < 0)
+			goto skip_epp;
+		/* Get BIOS pre-defined epp value */
+		epp = amd_pstate_get_epp(cpudata, value);
+		if (epp)
+			goto skip_epp;
+		epp = cpudata->epp_powersave;
+	}
+	/* Set initial EPP value */
+	if (boot_cpu_has(X86_FEATURE_CPPC)) {
+		value &= ~GENMASK_ULL(31, 24);
+		value |= (u64)epp << 24;
+	}
+
+skip_epp:
+	WRITE_ONCE(cpudata->cppc_req_cached, value);
+	ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
+	if (!ret)
+		cpudata->epp_cached = epp;
+}
+
+static void amd_pstate_set_max_limits(struct amd_cpudata *cpudata)
+{
+	u64 hwp_cap = READ_ONCE(cpudata->cppc_cap1_cached);
+	u64 hwp_req = READ_ONCE(cpudata->cppc_req_cached);
+	u32 max_limit = (hwp_cap >> 24) & 0xff;
+
+	hwp_req &= ~AMD_CPPC_MIN_PERF(~0L);
+	hwp_req |= AMD_CPPC_MIN_PERF(max_limit);
+	wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, hwp_req);
+}
+
+static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
+{
+	struct amd_cpudata *cpudata;
+
+	if (!policy->cpuinfo.max_freq)
+		return -ENODEV;
+
+	pr_debug("set_policy: cpuinfo.max %u policy->max %u\n",
+				policy->cpuinfo.max_freq, policy->max);
+
+	cpudata = all_cpu_data[policy->cpu];
+	cpudata->policy = policy->policy;
+
+	if (boot_cpu_has(X86_FEATURE_CPPC)) {
+		mutex_lock(&amd_pstate_limits_lock);
+
+		if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
+			amd_pstate_clear_update_util_hook(policy->cpu);
+			amd_pstate_set_max_limits(cpudata);
+		} else {
+			amd_pstate_set_update_util_hook(policy->cpu);
+		}
+
+		if (boot_cpu_has(X86_FEATURE_CPPC))
+			amd_pstate_epp_init(policy->cpu);
+
+		mutex_unlock(&amd_pstate_limits_lock);
+	}
+
+	return 0;
+}
+
+static void amd_pstate_verify_cpu_policy(struct amd_cpudata *cpudata,
+					   struct cpufreq_policy_data *policy)
+{
+	update_boost_state();
+	cpufreq_verify_within_cpu_limits(policy);
+}
+
+static int amd_pstate_epp_verify_policy(struct cpufreq_policy_data *policy)
+{
+	amd_pstate_verify_cpu_policy(all_cpu_data[policy->cpu], policy);
+	pr_debug("policy_max =%d, policy_min=%d\n", policy->max, policy->min);
+	return 0;
+}
+
 static struct cpufreq_driver amd_pstate_driver = {
 	.flags		= CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
 	.verify		= amd_pstate_verify,
@@ -616,8 +1224,20 @@ static struct cpufreq_driver amd_pstate_driver = {
 	.attr		= amd_pstate_attr,
 };
 
+static struct cpufreq_driver amd_pstate_epp_driver = {
+	.flags		= CPUFREQ_CONST_LOOPS,
+	.verify		= amd_pstate_epp_verify_policy,
+	.setpolicy	= amd_pstate_epp_set_policy,
+	.init		= amd_pstate_epp_cpu_init,
+	.exit		= amd_pstate_epp_cpu_exit,
+	.update_limits	= amd_pstate_epp_update_limits,
+	.name		= "amd_pstate_epp",
+	.attr		= amd_pstate_epp_attr,
+};
+
 static int __init amd_pstate_init(void)
 {
+	static struct amd_cpudata **cpudata;
 	int ret;
 
 	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
@@ -644,7 +1264,8 @@ static int __init amd_pstate_init(void)
 	/* capability check */
 	if (boot_cpu_has(X86_FEATURE_CPPC)) {
 		pr_debug("AMD CPPC MSR based functionality is supported\n");
-		amd_pstate_driver.adjust_perf = amd_pstate_adjust_perf;
+		if (!cppc_active)
+			default_pstate_driver->adjust_perf = amd_pstate_adjust_perf;
 	} else {
 		pr_debug("AMD CPPC shared memory based functionality is supported\n");
 		static_call_update(amd_pstate_enable, cppc_enable);
@@ -652,6 +1273,10 @@ static int __init amd_pstate_init(void)
 		static_call_update(amd_pstate_update_perf, cppc_update_perf);
 	}
 
+	cpudata = vzalloc(array_size(sizeof(void *), num_possible_cpus()));
+	if (!cpudata)
+		return -ENOMEM;
+	WRITE_ONCE(all_cpu_data, cpudata);
 	/* enable amd pstate feature */
 	ret = amd_pstate_enable(true);
 	if (ret) {
@@ -659,9 +1284,9 @@ static int __init amd_pstate_init(void)
 		return ret;
 	}
 
-	ret = cpufreq_register_driver(&amd_pstate_driver);
+	ret = cpufreq_register_driver(default_pstate_driver);
 	if (ret)
-		pr_err("failed to register amd_pstate_driver with return %d\n",
+		pr_err("failed to register amd pstate driver with return %d\n",
 		       ret);
 
 	return ret;
@@ -676,8 +1301,14 @@ static int __init amd_pstate_param(char *str)
 	if (!strcmp(str, "disable")) {
 		cppc_load = 0;
 		pr_info("driver is explicitly disabled\n");
-	} else if (!strcmp(str, "passive"))
+	} else if (!strcmp(str, "passive")) {
 		cppc_load = 1;
+		default_pstate_driver = &amd_pstate_driver;
+	} else if (!strcmp(str, "active")) {
+		cppc_active = 1;
+		cppc_load = 1;
+		default_pstate_driver = &amd_pstate_epp_driver;
+	}
 
 	return 0;
 }
diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
index 1c4b8659f171..888af62040f1 100644
--- a/include/linux/amd-pstate.h
+++ b/include/linux/amd-pstate.h
@@ -25,6 +25,7 @@ struct amd_aperf_mperf {
 	u64 aperf;
 	u64 mperf;
 	u64 tsc;
+	u64 time;
 };
 
 /**
@@ -47,6 +48,18 @@ struct amd_aperf_mperf {
  * @prev: Last Aperf/Mperf/tsc count value read from register
  * @freq: current cpu frequency value
  * @boost_supported: check whether the Processor or SBIOS supports boost mode
+ * @epp_powersave: Last saved CPPC energy performance preference
+				when policy switched to performance
+ * @epp_policy: Last saved policy used to set energy-performance preference
+ * @epp_cached: Cached CPPC energy-performance preference value
+ * @policy: Cpufreq policy value
+ * @sched_flags: Store scheduler flags for possible cross CPU update
+ * @update_util_set: CPUFreq utility callback is set
+ * @last_update: Time stamp of the last performance state update
+ * @cppc_boost_min: Last CPPC boosted min performance state
+ * @cppc_cap1_cached: Cached value of the last CPPC Capabilities MSR
+ * @update_util: Cpufreq utility callback information
+ * @sample: the stored performance sample
  *
  * The amd_cpudata is key private data for each CPU thread in AMD P-State, and
  * represents all the attributes and goals that AMD P-State requests at runtime.
@@ -72,6 +85,28 @@ struct amd_cpudata {
 
 	u64	freq;
 	bool	boost_supported;
+
+	/* EPP feature related attributes*/
+	s16	epp_powersave;
+	s16	epp_policy;
+	s16	epp_cached;
+	u32	policy;
+	u32	sched_flags;
+	bool	update_util_set;
+	u64	last_update;
+	u64	last_io_update;
+	u32	cppc_boost_min;
+	u64	cppc_cap1_cached;
+	struct	update_util_data update_util;
+	struct	amd_aperf_mperf sample;
+};
+
+/**
+ * struct amd_pstate_params - global parameters for the performance control
+ * @ cppc_boost_disabled wheher the core performance boost disabled
+ */
+struct amd_pstate_params {
+	bool cppc_boost_disabled;
 };
 
 #endif /* _LINUX_AMD_PSTATE_H */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 05/11] cpufreq: amd_pstate: implement amd pstate cpu online and offline callback
  2022-12-02  7:47 [PATCH v6 00/11] Implement AMD Pstate EPP Driver Perry Yuan
                   ` (3 preceding siblings ...)
  2022-12-02  7:47 ` [PATCH v6 04/11] cpufreq: amd_pstate: implement Pstate EPP support for the AMD processors Perry Yuan
@ 2022-12-02  7:47 ` Perry Yuan
  2022-12-02  7:47 ` [PATCH v6 06/11] cpufreq: amd-pstate: implement suspend and resume callbacks Perry Yuan
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 27+ messages in thread
From: Perry Yuan @ 2022-12-02  7:47 UTC (permalink / raw)
  To: rafael.j.wysocki, Mario.Limonciello, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Nathan.Fontenot, Alexander.Deucher, Shimmer.Huang,
	Xiaojian.Du, Li.Meng, wyes.karny, linux-pm, linux-kernel

From: Perry Yuan <Perry.Yuan@amd.com>

Adds online and offline driver callback support to allow cpu cores go
offline and help to restore the previous working states when core goes
back online later for EPP driver mode.

Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
---
 drivers/cpufreq/amd-pstate.c | 89 ++++++++++++++++++++++++++++++++++++
 include/linux/amd-pstate.h   |  1 +
 2 files changed, 90 insertions(+)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 5989d4d25d0f..7545d83a934b 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -1197,6 +1197,93 @@ static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
 	return 0;
 }
 
+static void amd_pstate_epp_reenable(struct amd_cpudata *cpudata)
+{
+	struct cppc_perf_ctrls perf_ctrls;
+	u64 value, max_perf;
+	int ret;
+
+	ret = amd_pstate_enable(true);
+	if (ret)
+		pr_err("failed to enable amd pstate during resume, return %d\n", ret);
+
+	value = READ_ONCE(cpudata->cppc_req_cached);
+	max_perf = READ_ONCE(cpudata->highest_perf);
+
+	if (boot_cpu_has(X86_FEATURE_CPPC)) {
+		wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
+	} else {
+		perf_ctrls.max_perf = max_perf;
+		perf_ctrls.energy_perf = AMD_CPPC_ENERGY_PERF_PREF(cpudata->epp_cached);
+		cppc_set_perf(cpudata->cpu, &perf_ctrls);
+	}
+}
+
+static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
+{
+	struct amd_cpudata *cpudata = all_cpu_data[policy->cpu];
+
+	pr_debug("AMD CPU Core %d going online\n", cpudata->cpu);
+
+	if (cppc_active) {
+		amd_pstate_epp_reenable(cpudata);
+		cpudata->suspended = false;
+	}
+
+	return 0;
+}
+
+static void amd_pstate_epp_offline(struct cpufreq_policy *policy)
+{
+	struct amd_cpudata *cpudata = all_cpu_data[policy->cpu];
+	struct cppc_perf_ctrls perf_ctrls;
+	int min_perf;
+	u64 value;
+
+	min_perf = READ_ONCE(cpudata->lowest_perf);
+	value = READ_ONCE(cpudata->cppc_req_cached);
+
+	mutex_lock(&amd_pstate_limits_lock);
+	if (boot_cpu_has(X86_FEATURE_CPPC)) {
+		cpudata->epp_policy = CPUFREQ_POLICY_UNKNOWN;
+
+		/* Set max perf same as min perf */
+		value &= ~AMD_CPPC_MAX_PERF(~0L);
+		value |= AMD_CPPC_MAX_PERF(min_perf);
+		value &= ~AMD_CPPC_MIN_PERF(~0L);
+		value |= AMD_CPPC_MIN_PERF(min_perf);
+		wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
+	} else {
+		perf_ctrls.desired_perf = 0;
+		perf_ctrls.max_perf = min_perf;
+		perf_ctrls.energy_perf = AMD_CPPC_ENERGY_PERF_PREF(HWP_EPP_POWERSAVE);
+		cppc_set_perf(cpudata->cpu, &perf_ctrls);
+	}
+	mutex_unlock(&amd_pstate_limits_lock);
+}
+
+static int amd_pstate_cpu_offline(struct cpufreq_policy *policy)
+{
+	struct amd_cpudata *cpudata = all_cpu_data[policy->cpu];
+
+	pr_debug("AMD CPU Core %d going offline\n", cpudata->cpu);
+
+	if (cpudata->suspended)
+		return 0;
+
+	if (cppc_active)
+		amd_pstate_epp_offline(policy);
+
+	return 0;
+}
+
+static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
+{
+	amd_pstate_clear_update_util_hook(policy->cpu);
+
+	return amd_pstate_cpu_offline(policy);
+}
+
 static void amd_pstate_verify_cpu_policy(struct amd_cpudata *cpudata,
 					   struct cpufreq_policy_data *policy)
 {
@@ -1231,6 +1318,8 @@ static struct cpufreq_driver amd_pstate_epp_driver = {
 	.init		= amd_pstate_epp_cpu_init,
 	.exit		= amd_pstate_epp_cpu_exit,
 	.update_limits	= amd_pstate_epp_update_limits,
+	.offline	= amd_pstate_epp_cpu_offline,
+	.online		= amd_pstate_epp_cpu_online,
 	.name		= "amd_pstate_epp",
 	.attr		= amd_pstate_epp_attr,
 };
diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
index 888af62040f1..3dd26a3d104c 100644
--- a/include/linux/amd-pstate.h
+++ b/include/linux/amd-pstate.h
@@ -99,6 +99,7 @@ struct amd_cpudata {
 	u64	cppc_cap1_cached;
 	struct	update_util_data update_util;
 	struct	amd_aperf_mperf sample;
+	bool suspended;
 };
 
 /**
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 06/11] cpufreq: amd-pstate: implement suspend and resume callbacks
  2022-12-02  7:47 [PATCH v6 00/11] Implement AMD Pstate EPP Driver Perry Yuan
                   ` (4 preceding siblings ...)
  2022-12-02  7:47 ` [PATCH v6 05/11] cpufreq: amd_pstate: implement amd pstate cpu online and offline callback Perry Yuan
@ 2022-12-02  7:47 ` Perry Yuan
  2022-12-02  7:47 ` [PATCH v6 07/11] cpufreq: amd-pstate: add frequency dynamic boost sysfs control Perry Yuan
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 27+ messages in thread
From: Perry Yuan @ 2022-12-02  7:47 UTC (permalink / raw)
  To: rafael.j.wysocki, Mario.Limonciello, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Nathan.Fontenot, Alexander.Deucher, Shimmer.Huang,
	Xiaojian.Du, Li.Meng, wyes.karny, linux-pm, linux-kernel

From: Perry Yuan <Perry.Yuan@amd.com>

add suspend and resume support for the AMD processors by amd_pstate_epp
driver instance.

When the CPPC is suspended, EPP driver will set EPP profile to 'power'
profile and set max/min perf to lowest perf value.
When resume happens, it will restore the MSR registers with
previous cached value.

Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
---
 drivers/cpufreq/amd-pstate.c | 40 ++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 7545d83a934b..b3a82cee2e83 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -1284,6 +1284,44 @@ static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
 	return amd_pstate_cpu_offline(policy);
 }
 
+static int amd_pstate_epp_suspend(struct cpufreq_policy *policy)
+{
+	struct amd_cpudata *cpudata = all_cpu_data[policy->cpu];
+	int ret;
+
+	/* avoid suspending when EPP is not enabled */
+	if (!cppc_active)
+		return 0;
+
+	/* set this flag to avoid setting core offline*/
+	cpudata->suspended = true;
+
+	/* disable CPPC in lowlevel firmware */
+	ret = amd_pstate_enable(false);
+	if (ret)
+		pr_err("failed to suspend, return %d\n", ret);
+
+	return 0;
+}
+
+static int amd_pstate_epp_resume(struct cpufreq_policy *policy)
+{
+	struct amd_cpudata *cpudata = all_cpu_data[policy->cpu];
+
+	if (cpudata->suspended) {
+		mutex_lock(&amd_pstate_limits_lock);
+
+		/* enable amd pstate from suspend state*/
+		amd_pstate_epp_reenable(cpudata);
+
+		mutex_unlock(&amd_pstate_limits_lock);
+
+		cpudata->suspended = false;
+	}
+
+	return 0;
+}
+
 static void amd_pstate_verify_cpu_policy(struct amd_cpudata *cpudata,
 					   struct cpufreq_policy_data *policy)
 {
@@ -1320,6 +1358,8 @@ static struct cpufreq_driver amd_pstate_epp_driver = {
 	.update_limits	= amd_pstate_epp_update_limits,
 	.offline	= amd_pstate_epp_cpu_offline,
 	.online		= amd_pstate_epp_cpu_online,
+	.suspend	= amd_pstate_epp_suspend,
+	.resume		= amd_pstate_epp_resume,
 	.name		= "amd_pstate_epp",
 	.attr		= amd_pstate_epp_attr,
 };
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 07/11] cpufreq: amd-pstate: add frequency dynamic boost sysfs control
  2022-12-02  7:47 [PATCH v6 00/11] Implement AMD Pstate EPP Driver Perry Yuan
                   ` (5 preceding siblings ...)
  2022-12-02  7:47 ` [PATCH v6 06/11] cpufreq: amd-pstate: implement suspend and resume callbacks Perry Yuan
@ 2022-12-02  7:47 ` Perry Yuan
  2022-12-02 15:59   ` Limonciello, Mario
  2022-12-02  7:47 ` [PATCH v6 08/11] cpufreq: amd_pstate: add driver working mode status sysfs entry Perry Yuan
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: Perry Yuan @ 2022-12-02  7:47 UTC (permalink / raw)
  To: rafael.j.wysocki, Mario.Limonciello, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Nathan.Fontenot, Alexander.Deucher, Shimmer.Huang,
	Xiaojian.Du, Li.Meng, wyes.karny, linux-pm, linux-kernel

From: Perry Yuan <Perry.Yuan@amd.com>

Add one sysfs entry to control the CPU cores frequency boost state
The attribute file can allow user to set max performance boosted or
keeping at normal perf level.

Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
---
 drivers/cpufreq/amd-pstate.c | 66 ++++++++++++++++++++++++++++++++++--
 1 file changed, 64 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index b3a82cee2e83..6936df6e8642 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -774,12 +774,46 @@ static ssize_t show_energy_performance_preference(
 	return  sysfs_emit(buf, "%s\n", energy_perf_strings[preference]);
 }
 
+static void amd_pstate_update_policies(void)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu)
+		cpufreq_update_policy(cpu);
+}
+
+static ssize_t show_boost(struct kobject *kobj,
+				struct kobj_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf, "%u\n", cppc_boost);
+}
+
+static ssize_t store_boost(struct kobject *a,
+				       struct kobj_attribute *b,
+				       const char *buf, size_t count)
+{
+	bool new_state;
+	int ret;
+
+	ret = kstrtobool(buf, &new_state);
+	if (ret)
+		return -EINVAL;
+
+	mutex_lock(&amd_pstate_driver_lock);
+	cppc_boost = !!new_state;
+	amd_pstate_update_policies();
+	mutex_unlock(&amd_pstate_driver_lock);
+
+	return count;
+}
+
 cpufreq_freq_attr_ro(amd_pstate_max_freq);
 cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
 
 cpufreq_freq_attr_ro(amd_pstate_highest_perf);
 cpufreq_freq_attr_rw(energy_performance_preference);
 cpufreq_freq_attr_ro(energy_performance_available_preferences);
+define_one_global_rw(boost);
 
 static struct freq_attr *amd_pstate_attr[] = {
 	&amd_pstate_max_freq,
@@ -797,6 +831,15 @@ static struct freq_attr *amd_pstate_epp_attr[] = {
 	NULL,
 };
 
+static struct attribute *pstate_global_attributes[] = {
+	&boost.attr,
+	NULL
+};
+
+static const struct attribute_group amd_pstate_global_attr_group = {
+	.attrs = pstate_global_attributes,
+};
+
 static inline void update_boost_state(void)
 {
 	u64 misc_en;
@@ -1415,9 +1458,28 @@ static int __init amd_pstate_init(void)
 
 	ret = cpufreq_register_driver(default_pstate_driver);
 	if (ret)
-		pr_err("failed to register amd pstate driver with return %d\n",
-		       ret);
+		pr_err("failed to register driver with return %d\n", ret);
+
+	amd_pstate_kobj = kobject_create_and_add("amd-pstate", &cpu_subsys.dev_root->kobj);
+	if (!amd_pstate_kobj) {
+		ret = -EINVAL;
+		pr_err("global sysfs registration failed.\n");
+		goto kobject_free;
+	}
+
+	ret = sysfs_create_group(amd_pstate_kobj, &amd_pstate_global_attr_group);
+	if (ret) {
+		pr_err("sysfs attribute export failed with error %d.\n", ret);
+		goto global_attr_free;
+	}
+
+	return ret;
 
+global_attr_free:
+	kobject_put(amd_pstate_kobj);
+kobject_free:
+	kfree(cpudata);
+	cpufreq_unregister_driver(default_pstate_driver);
 	return ret;
 }
 device_initcall(amd_pstate_init);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 08/11] cpufreq: amd_pstate: add driver working mode status sysfs entry
  2022-12-02  7:47 [PATCH v6 00/11] Implement AMD Pstate EPP Driver Perry Yuan
                   ` (6 preceding siblings ...)
  2022-12-02  7:47 ` [PATCH v6 07/11] cpufreq: amd-pstate: add frequency dynamic boost sysfs control Perry Yuan
@ 2022-12-02  7:47 ` Perry Yuan
  2022-12-02 15:57   ` Limonciello, Mario
  2022-12-02  7:47 ` [PATCH v6 09/11] Documentation: amd-pstate: add amd pstate driver mode introduction Perry Yuan
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: Perry Yuan @ 2022-12-02  7:47 UTC (permalink / raw)
  To: rafael.j.wysocki, Mario.Limonciello, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Nathan.Fontenot, Alexander.Deucher, Shimmer.Huang,
	Xiaojian.Du, Li.Meng, wyes.karny, linux-pm, linux-kernel

From: Perry Yuan <Perry.Yuan@amd.com>

While amd-pstate driver was loaded with specific driver mode, it will
need to check which mode is enabled for the pstate driver,add this sysfs
entry to show the current status

$ cat /sys/devices/system/cpu/amd-pstate/status
active

Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
---
 drivers/cpufreq/amd-pstate.c | 44 ++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 6936df6e8642..7f748a579023 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -66,6 +66,8 @@ static bool cppc_active;
 static int cppc_load __initdata;
 
 static struct cpufreq_driver *default_pstate_driver;
+static struct cpufreq_driver amd_pstate_epp_driver;
+static struct cpufreq_driver amd_pstate_driver;
 static struct amd_cpudata **all_cpu_data;
 static struct amd_pstate_params global_params;
 
@@ -807,6 +809,46 @@ static ssize_t store_boost(struct kobject *a,
 	return count;
 }
 
+static ssize_t amd_pstate_show_status(char *buf)
+{
+	if (!default_pstate_driver)
+		return sysfs_emit(buf, "off\n");
+
+	return sysfs_emit(buf, "%s\n", default_pstate_driver == &amd_pstate_epp_driver ?
+					"active" : "passive");
+}
+
+static int amd_pstate_update_status(const char *buf, size_t size)
+{
+	/* FIXME! */
+	return -EOPNOTSUPP;
+}
+
+static ssize_t show_status(struct kobject *kobj,
+			   struct kobj_attribute *attr, char *buf)
+{
+	ssize_t ret;
+
+	mutex_lock(&amd_pstate_driver_lock);
+	ret = amd_pstate_show_status(buf);
+	mutex_unlock(&amd_pstate_driver_lock);
+
+	return ret;
+}
+
+static ssize_t store_status(struct kobject *a, struct kobj_attribute *b,
+			    const char *buf, size_t count)
+{
+	char *p = memchr(buf, '\n', count);
+	int ret;
+
+	mutex_lock(&amd_pstate_driver_lock);
+	ret = amd_pstate_update_status(buf, p ? p - buf : count);
+	mutex_unlock(&amd_pstate_driver_lock);
+
+	return ret < 0 ? ret : count;
+}
+
 cpufreq_freq_attr_ro(amd_pstate_max_freq);
 cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
 
@@ -814,6 +856,7 @@ cpufreq_freq_attr_ro(amd_pstate_highest_perf);
 cpufreq_freq_attr_rw(energy_performance_preference);
 cpufreq_freq_attr_ro(energy_performance_available_preferences);
 define_one_global_rw(boost);
+define_one_global_rw(status);
 
 static struct freq_attr *amd_pstate_attr[] = {
 	&amd_pstate_max_freq,
@@ -833,6 +876,7 @@ static struct freq_attr *amd_pstate_epp_attr[] = {
 
 static struct attribute *pstate_global_attributes[] = {
 	&boost.attr,
+	&status.attr,
 	NULL
 };
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 09/11] Documentation: amd-pstate: add amd pstate driver mode introduction
  2022-12-02  7:47 [PATCH v6 00/11] Implement AMD Pstate EPP Driver Perry Yuan
                   ` (7 preceding siblings ...)
  2022-12-02  7:47 ` [PATCH v6 08/11] cpufreq: amd_pstate: add driver working mode status sysfs entry Perry Yuan
@ 2022-12-02  7:47 ` Perry Yuan
  2022-12-02  7:47 ` [PATCH v6 10/11] Documentation: introduce amd pstate active mode kernel command line options Perry Yuan
  2022-12-02  7:47 ` [PATCH v6 11/11] cpufreq: amd_pstate: convert sprintf with sysfs_emit() Perry Yuan
  10 siblings, 0 replies; 27+ messages in thread
From: Perry Yuan @ 2022-12-02  7:47 UTC (permalink / raw)
  To: rafael.j.wysocki, Mario.Limonciello, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Nathan.Fontenot, Alexander.Deucher, Shimmer.Huang,
	Xiaojian.Du, Li.Meng, wyes.karny, linux-pm, linux-kernel

From: Perry Yuan <Perry.Yuan@amd.com>

Introduce ``amd-pstate`` CPPC has two operation modes:
* CPPC Autonomous (active) mode
* CPPC non-autonomous (passive) mode.
active mode and passive mode can be chosen by different kernel parameters.

Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
---
 Documentation/admin-guide/pm/amd-pstate.rst | 26 +++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
index 33ab8ec8fc2f..62744dae3c5f 100644
--- a/Documentation/admin-guide/pm/amd-pstate.rst
+++ b/Documentation/admin-guide/pm/amd-pstate.rst
@@ -299,8 +299,30 @@ module which supports the new AMD P-States mechanism on most of the future AMD
 platforms. The AMD P-States mechanism is the more performance and energy
 efficiency frequency management method on AMD processors.
 
-Kernel Module Options for ``amd-pstate``
-=========================================
+
+AMD Pstate Driver Operation Modes
+=================================
+
+``amd_pstate`` CPPC has two operation modes: CPPC Autonomous(active) mode and
+CPPC non-autonomous(passive) mode.
+active mode and passive mode can be chosen by different kernel parameters.
+When in Autonomous mode, CPPC ignores requests done in the Desired Performance
+Target register and takes into account only the values set to the Minimum requested
+performance, Maximum requested performance, and Energy Performance Preference
+registers. When Autonomous is disabled, it only considers the Desired Performance Target.
+
+Active Mode
+------------
+
+``amd_pstate=active``
+
+This is the low-level firmware control mode which is implemented by ``amd_pstate_epp``
+driver with ``amd_pstate=active`` passed to the kernel in the command line.
+In this mode, ``amd_pstate_epp`` driver provides a hint to the hardware if software
+wants to bias toward performance (0x0) or energy efficiency (0xff) to the CPPC firmware.
+then CPPC power algorithm will calculate the runtime workload and adjust the realtime
+cores frequency according to the power supply and thermal, core voltage and some other
+hardware conditions.
 
 Passive Mode
 ------------
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 10/11] Documentation: introduce amd pstate active mode kernel command line options
  2022-12-02  7:47 [PATCH v6 00/11] Implement AMD Pstate EPP Driver Perry Yuan
                   ` (8 preceding siblings ...)
  2022-12-02  7:47 ` [PATCH v6 09/11] Documentation: amd-pstate: add amd pstate driver mode introduction Perry Yuan
@ 2022-12-02  7:47 ` Perry Yuan
  2022-12-02  7:47 ` [PATCH v6 11/11] cpufreq: amd_pstate: convert sprintf with sysfs_emit() Perry Yuan
  10 siblings, 0 replies; 27+ messages in thread
From: Perry Yuan @ 2022-12-02  7:47 UTC (permalink / raw)
  To: rafael.j.wysocki, Mario.Limonciello, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Nathan.Fontenot, Alexander.Deucher, Shimmer.Huang,
	Xiaojian.Du, Li.Meng, wyes.karny, linux-pm, linux-kernel

AMD Pstate driver support another firmware based autonomous mode
with "amd_pstate=active" added to the kernel command line.
In autonomous mode SMU firmware decides frequencies at 1 ms timescale
based on workload utilization, usage in other IPs, infrastructure
limits such as power, thermals and so on.

Signed-off-by: Perry Yuan <perry.yuan@amd.com>
---
 Documentation/admin-guide/kernel-parameters.txt | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 42af9ca0127e..73a02816f6f8 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -6970,3 +6970,10 @@
 			  management firmware translates the requests into actual
 			  hardware states (core frequency, data fabric and memory
 			  clocks etc.)
+			active
+			  Use amd_pstate_epp driver instance as the scaling driver,
+			  driver provides a hint to the hardware if software wants
+			  to bias toward performance (0x0) or energy efficiency (0xff)
+			  to the CPPC firmware. then CPPC power algorithm will
+			  calculate the runtime workload and adjust the realtime cores
+			  frequency.
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 11/11] cpufreq: amd_pstate: convert sprintf with sysfs_emit()
  2022-12-02  7:47 [PATCH v6 00/11] Implement AMD Pstate EPP Driver Perry Yuan
                   ` (9 preceding siblings ...)
  2022-12-02  7:47 ` [PATCH v6 10/11] Documentation: introduce amd pstate active mode kernel command line options Perry Yuan
@ 2022-12-02  7:47 ` Perry Yuan
  10 siblings, 0 replies; 27+ messages in thread
From: Perry Yuan @ 2022-12-02  7:47 UTC (permalink / raw)
  To: rafael.j.wysocki, Mario.Limonciello, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Nathan.Fontenot, Alexander.Deucher, Shimmer.Huang,
	Xiaojian.Du, Li.Meng, wyes.karny, linux-pm, linux-kernel

replace the sprintf with a more generic sysfs_emit function

No potential function impact

Signed-off-by: Perry Yuan <perry.yuan@amd.com>
---
 drivers/cpufreq/amd-pstate.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 7f748a579023..28e2dbaad542 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -696,7 +696,7 @@ static ssize_t show_amd_pstate_max_freq(struct cpufreq_policy *policy,
 	if (max_freq < 0)
 		return max_freq;
 
-	return sprintf(&buf[0], "%u\n", max_freq);
+	return sysfs_emit(buf, "%u\n", max_freq);
 }
 
 static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *policy,
@@ -709,7 +709,7 @@ static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *poli
 	if (freq < 0)
 		return freq;
 
-	return sprintf(&buf[0], "%u\n", freq);
+	return sysfs_emit(buf, "%u\n", freq);
 }
 
 /*
@@ -724,7 +724,7 @@ static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,
 
 	perf = READ_ONCE(cpudata->highest_perf);
 
-	return sprintf(&buf[0], "%u\n", perf);
+	return sysfs_emit(buf, "%u\n", perf);
 }
 
 static ssize_t show_energy_performance_available_preferences(
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 08/11] cpufreq: amd_pstate: add driver working mode status sysfs entry
  2022-12-02  7:47 ` [PATCH v6 08/11] cpufreq: amd_pstate: add driver working mode status sysfs entry Perry Yuan
@ 2022-12-02 15:57   ` Limonciello, Mario
  2022-12-08 15:55     ` Yuan, Perry
  0 siblings, 1 reply; 27+ messages in thread
From: Limonciello, Mario @ 2022-12-02 15:57 UTC (permalink / raw)
  To: Perry Yuan, rafael.j.wysocki, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Nathan.Fontenot, Alexander.Deucher, Shimmer.Huang,
	Xiaojian.Du, Li.Meng, wyes.karny, linux-pm, linux-kernel

On 12/2/2022 01:47, Perry Yuan wrote:
> From: Perry Yuan <Perry.Yuan@amd.com>
> 
> While amd-pstate driver was loaded with specific driver mode, it will

The *driver* doesn't need to check which mode is enabled from the sysfs 
file, but userspace may want to check this.  I think you should reword 
this accordingly in the commit message.

> need to check which mode is enabled for the pstate driver,add this sysfs
> entry to show the current status
> 
> $ cat /sys/devices/system/cpu/amd-pstate/status
> active
> 
> Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> ---
>   drivers/cpufreq/amd-pstate.c | 44 ++++++++++++++++++++++++++++++++++++
>   1 file changed, 44 insertions(+)
> 
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 6936df6e8642..7f748a579023 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -66,6 +66,8 @@ static bool cppc_active;
>   static int cppc_load __initdata;
>   
>   static struct cpufreq_driver *default_pstate_driver;
> +static struct cpufreq_driver amd_pstate_epp_driver;
> +static struct cpufreq_driver amd_pstate_driver;
>   static struct amd_cpudata **all_cpu_data;
>   static struct amd_pstate_params global_params;
>   
> @@ -807,6 +809,46 @@ static ssize_t store_boost(struct kobject *a,
>   	return count;
>   }
>   
> +static ssize_t amd_pstate_show_status(char *buf)
> +{
> +	if (!default_pstate_driver)
> +		return sysfs_emit(buf, "off\n");
> +
> +	return sysfs_emit(buf, "%s\n", default_pstate_driver == &amd_pstate_epp_driver ?
> +					"active" : "passive");
> +}
> +
> +static int amd_pstate_update_status(const char *buf, size_t size)
> +{
> +	/* FIXME! */
> +	return -EOPNOTSUPP;
> +}

Why not just fix this as part of the series?  It should be no different 
than unloading the driver and reloading it in the appropriate mode, right?

Is this going to be a short term FIXME/TODO or a long term?  If it's 
long term, it might be better to just make the attribute ro and and have 
just the show callback.  Then when you're ready to make it rw you can 
add the store callback, change it from ro to rw and update documentation 
at that time.

> +
> +static ssize_t show_status(struct kobject *kobj,
> +			   struct kobj_attribute *attr, char *buf)
> +{
> +	ssize_t ret;
> +
> +	mutex_lock(&amd_pstate_driver_lock);
> +	ret = amd_pstate_show_status(buf);
> +	mutex_unlock(&amd_pstate_driver_lock);
> +
> +	return ret;
> +}
> +
> +static ssize_t store_status(struct kobject *a, struct kobj_attribute *b,
> +			    const char *buf, size_t count)
> +{
> +	char *p = memchr(buf, '\n', count);
> +	int ret;
> +
> +	mutex_lock(&amd_pstate_driver_lock);
> +	ret = amd_pstate_update_status(buf, p ? p - buf : count);
> +	mutex_unlock(&amd_pstate_driver_lock);
> +
> +	return ret < 0 ? ret : count;
> +}
> +
>   cpufreq_freq_attr_ro(amd_pstate_max_freq);
>   cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
>   
> @@ -814,6 +856,7 @@ cpufreq_freq_attr_ro(amd_pstate_highest_perf);
>   cpufreq_freq_attr_rw(energy_performance_preference);
>   cpufreq_freq_attr_ro(energy_performance_available_preferences);
>   define_one_global_rw(boost);
> +define_one_global_rw(status);
>   
>   static struct freq_attr *amd_pstate_attr[] = {
>   	&amd_pstate_max_freq,
> @@ -833,6 +876,7 @@ static struct freq_attr *amd_pstate_epp_attr[] = {
>   
>   static struct attribute *pstate_global_attributes[] = {
>   	&boost.attr,
> +	&status.attr,

In the series I didn't see a matching Documentation update for the new 
sysfs file, can you please add one?

>   	NULL
>   };
>   


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 07/11] cpufreq: amd-pstate: add frequency dynamic boost sysfs control
  2022-12-02  7:47 ` [PATCH v6 07/11] cpufreq: amd-pstate: add frequency dynamic boost sysfs control Perry Yuan
@ 2022-12-02 15:59   ` Limonciello, Mario
  2022-12-05 15:19     ` Yuan, Perry
  0 siblings, 1 reply; 27+ messages in thread
From: Limonciello, Mario @ 2022-12-02 15:59 UTC (permalink / raw)
  To: Perry Yuan, rafael.j.wysocki, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Nathan.Fontenot, Alexander.Deucher, Shimmer.Huang,
	Xiaojian.Du, Li.Meng, wyes.karny, linux-pm, linux-kernel

On 12/2/2022 01:47, Perry Yuan wrote:
> From: Perry Yuan <Perry.Yuan@amd.com>
> 
> Add one sysfs entry to control the CPU cores frequency boost state
> The attribute file can allow user to set max performance boosted or
> keeping at normal perf level.
> 
> Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> ---
>   drivers/cpufreq/amd-pstate.c | 66 ++++++++++++++++++++++++++++++++++--
>   1 file changed, 64 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index b3a82cee2e83..6936df6e8642 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -774,12 +774,46 @@ static ssize_t show_energy_performance_preference(
>   	return  sysfs_emit(buf, "%s\n", energy_perf_strings[preference]);
>   }
>   
> +static void amd_pstate_update_policies(void)
> +{
> +	int cpu;
> +
> +	for_each_possible_cpu(cpu)
> +		cpufreq_update_policy(cpu);
> +}
> +
> +static ssize_t show_boost(struct kobject *kobj,
> +				struct kobj_attribute *attr, char *buf)
> +{
> +	return sysfs_emit(buf, "%u\n", cppc_boost);
> +}
> +
> +static ssize_t store_boost(struct kobject *a,
> +				       struct kobj_attribute *b,
> +				       const char *buf, size_t count)
> +{
> +	bool new_state;
> +	int ret;
> +
> +	ret = kstrtobool(buf, &new_state);
> +	if (ret)
> +		return -EINVAL;
> +
> +	mutex_lock(&amd_pstate_driver_lock);
> +	cppc_boost = !!new_state;
> +	amd_pstate_update_policies();
> +	mutex_unlock(&amd_pstate_driver_lock);
> +
> +	return count;
> +}
> +
>   cpufreq_freq_attr_ro(amd_pstate_max_freq);
>   cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
>   
>   cpufreq_freq_attr_ro(amd_pstate_highest_perf);
>   cpufreq_freq_attr_rw(energy_performance_preference);
>   cpufreq_freq_attr_ro(energy_performance_available_preferences);
> +define_one_global_rw(boost);
>   
>   static struct freq_attr *amd_pstate_attr[] = {
>   	&amd_pstate_max_freq,
> @@ -797,6 +831,15 @@ static struct freq_attr *amd_pstate_epp_attr[] = {
>   	NULL,
>   };
>   
> +static struct attribute *pstate_global_attributes[] = {
> +	&boost.attr,

Make sure you update the documentation for the new sysfs attribute.

> +	NULL
> +};
> +
> +static const struct attribute_group amd_pstate_global_attr_group = {
> +	.attrs = pstate_global_attributes,
> +};
> +
>   static inline void update_boost_state(void)
>   {
>   	u64 misc_en;
> @@ -1415,9 +1458,28 @@ static int __init amd_pstate_init(void)
>   
>   	ret = cpufreq_register_driver(default_pstate_driver);
>   	if (ret)
> -		pr_err("failed to register amd pstate driver with return %d\n",
> -		       ret);
> +		pr_err("failed to register driver with return %d\n", ret);
> +
> +	amd_pstate_kobj = kobject_create_and_add("amd-pstate", &cpu_subsys.dev_root->kobj);
> +	if (!amd_pstate_kobj) {
> +		ret = -EINVAL;
> +		pr_err("global sysfs registration failed.\n");
> +		goto kobject_free;
> +	}
> +
> +	ret = sysfs_create_group(amd_pstate_kobj, &amd_pstate_global_attr_group);
> +	if (ret) {
> +		pr_err("sysfs attribute export failed with error %d.\n", ret);
> +		goto global_attr_free;
> +	}
> +
> +	return ret;
>   
> +global_attr_free:
> +	kobject_put(amd_pstate_kobj);
> +kobject_free:
> +	kfree(cpudata);
> +	cpufreq_unregister_driver(default_pstate_driver);
>   	return ret;
>   }
>   device_initcall(amd_pstate_init);


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 01/11] ACPI: CPPC: Add AMD pstate energy performance preference cppc control
  2022-12-02  7:47 ` [PATCH v6 01/11] ACPI: CPPC: Add AMD pstate energy performance preference cppc control Perry Yuan
@ 2022-12-02 16:58   ` Limonciello, Mario
  2022-12-05 15:17     ` Yuan, Perry
  0 siblings, 1 reply; 27+ messages in thread
From: Limonciello, Mario @ 2022-12-02 16:58 UTC (permalink / raw)
  To: Perry Yuan, rafael.j.wysocki, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Nathan.Fontenot, Alexander.Deucher, Shimmer.Huang,
	Xiaojian.Du, Li.Meng, wyes.karny, linux-pm, linux-kernel

The title of this first patch should not reflect amd-pstate, it's 
generic code.

On 12/2/2022 01:47, Perry Yuan wrote:
> From: Perry Yuan <Perry.Yuan@amd.com>
> 
> Add support for setting and querying EPP preferences to the generic
> CPPC driver.  This enables downstream drivers such as amd-pstate to discover
> and use these values
> 
> In order to get EPP worked, cppc_get_epp_caps() will query EPP preference
> value and cppc_set_epp_perf() will set EPP new value.
> Before the EPP works, pstate driver will use cppc_set_auto_epp() to
> enable EPP function from firmware firstly.

My suggestion at rewording this:

Downstream drivers that want to use the new symbols cppc_get_epp_caps 
and cppc_set_epp_perf for querying and setting EPP preferences will need 
to call cppc_set_auto_epp to enable the EPP function first.

> 
> Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> ---
>   drivers/acpi/cppc_acpi.c | 114 +++++++++++++++++++++++++++++++++++++--
>   include/acpi/cppc_acpi.h |  12 +++++
>   2 files changed, 121 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
> index 093675b1a1ff..37fa75f25f62 100644
> --- a/drivers/acpi/cppc_acpi.c
> +++ b/drivers/acpi/cppc_acpi.c
> @@ -1093,6 +1093,9 @@ static int cppc_get_perf(int cpunum, enum cppc_regs reg_idx, u64 *perf)
>   {
>   	struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpunum);
>   	struct cpc_register_resource *reg;
> +	int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpunum);
> +	struct cppc_pcc_data *pcc_ss_data = NULL;
> +	int ret = -EINVAL;
>   
>   	if (!cpc_desc) {
>   		pr_debug("No CPC descriptor for CPU:%d\n", cpunum);
> @@ -1102,10 +1105,6 @@ static int cppc_get_perf(int cpunum, enum cppc_regs reg_idx, u64 *perf)
>   	reg = &cpc_desc->cpc_regs[reg_idx];
>   
>   	if (CPC_IN_PCC(reg)) {
> -		int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpunum);
> -		struct cppc_pcc_data *pcc_ss_data = NULL;
> -		int ret = 0;
> -
>   		if (pcc_ss_id < 0)
>   			return -EIO;
>   
> @@ -1125,7 +1124,7 @@ static int cppc_get_perf(int cpunum, enum cppc_regs reg_idx, u64 *perf)
>   
>   	cpc_read(cpunum, reg, perf);
>   
> -	return 0;
> +	return ret;
>   }
>   
>   /**
> @@ -1365,6 +1364,111 @@ int cppc_get_perf_ctrs(int cpunum, struct cppc_perf_fb_ctrs *perf_fb_ctrs)
>   }
>   EXPORT_SYMBOL_GPL(cppc_get_perf_ctrs);
>   
> +/**
> + * cppc_get_epp_caps - Get the energy preference register value.
> + * @cpunum: CPU from which to get epp preference level.
> + * @perf_caps: Return address.
> + *
> + * Return: 0 for success, -EIO otherwise.
> + */
> +int cppc_get_epp_caps(int cpunum, struct cppc_perf_caps *perf_caps)
> +{
> +	struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpunum);
> +	struct cpc_register_resource *energy_perf_reg;
> +	u64 energy_perf;
> +
> +	if (!cpc_desc) {
> +		pr_debug("No CPC descriptor for CPU:%d\n", cpunum);
> +		return -ENODEV;
> +	}
> +
> +	energy_perf_reg = &cpc_desc->cpc_regs[ENERGY_PERF];
> +
> +	if (!CPC_SUPPORTED(energy_perf_reg))
> +		pr_warn_once("energy perf reg update is unsupported!\n");
> +
> +	if (CPC_IN_PCC(energy_perf_reg)) {
> +		int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpunum);
> +		struct cppc_pcc_data *pcc_ss_data = NULL;
> +		int ret = 0;
> +
> +		if (pcc_ss_id < 0)
> +			return -ENODEV;
> +
> +		pcc_ss_data = pcc_data[pcc_ss_id];
> +
> +		down_write(&pcc_ss_data->pcc_lock);
> +
> +		if (send_pcc_cmd(pcc_ss_id, CMD_READ) >= 0) {
> +			cpc_read(cpunum, energy_perf_reg, &energy_perf);
> +			perf_caps->energy_perf = energy_perf;
> +		} else {
> +			ret = -EIO;
> +		}
> +
> +		up_write(&pcc_ss_data->pcc_lock);
> +
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(cppc_get_epp_caps);
> +
> +/*
> + * Set Energy Performance Preference Register value through
> + * Performance Controls Interface
> + */
> +int cppc_set_epp_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls, bool enable)
> +{
> +	int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpu);
> +	struct cpc_register_resource *epp_set_reg;
> +	struct cpc_register_resource *auto_sel_reg;
> +	struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpu);
> +	struct cppc_pcc_data *pcc_ss_data = NULL;
> +	int ret = -EINVAL;
> +
> +	if (!cpc_desc) {
> +		pr_debug("No CPC descriptor for CPU:%d\n", cpu);
> +		return -ENODEV;
> +	}
> +
> +	auto_sel_reg = &cpc_desc->cpc_regs[AUTO_SEL_ENABLE];
> +	epp_set_reg = &cpc_desc->cpc_regs[ENERGY_PERF];
> +
> +	if (CPC_IN_PCC(epp_set_reg) || CPC_IN_PCC(auto_sel_reg)) {
> +		if (pcc_ss_id < 0) {
> +			pr_debug("Invalid pcc_ss_id\n");
> +			return -ENODEV;
> +		}
> +
> +		if (CPC_SUPPORTED(auto_sel_reg)) {
> +			ret = cpc_write(cpu, auto_sel_reg, enable);
> +			if (ret)
> +				return ret;
> +		}
> +
> +		if (CPC_SUPPORTED(epp_set_reg)) {
> +			ret = cpc_write(cpu, epp_set_reg, perf_ctrls->energy_perf);
> +			if (ret)
> +				return ret;
> +		}
> +
> +		pcc_ss_data = pcc_data[pcc_ss_id];
> +
> +		down_write(&pcc_ss_data->pcc_lock);
> +		/* after writing CPC, transfer the ownership of PCC to platform */
> +		ret = send_pcc_cmd(pcc_ss_id, CMD_WRITE);
> +		up_write(&pcc_ss_data->pcc_lock);
> +	} else {
> +		ret = -ENOTSUPP;
> +		pr_debug("_CPC in PCC is not supported\n");
> +	}
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(cppc_set_epp_perf);
> +
>   /**
>    * cppc_set_enable - Set to enable CPPC on the processor by writing the
>    * Continuous Performance Control package EnableRegister field.
> diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
> index c5614444031f..a45bb876a19c 100644
> --- a/include/acpi/cppc_acpi.h
> +++ b/include/acpi/cppc_acpi.h
> @@ -108,12 +108,14 @@ struct cppc_perf_caps {
>   	u32 lowest_nonlinear_perf;
>   	u32 lowest_freq;
>   	u32 nominal_freq;
> +	u32 energy_perf;
>   };
>   
>   struct cppc_perf_ctrls {
>   	u32 max_perf;
>   	u32 min_perf;
>   	u32 desired_perf;
> +	u32 energy_perf;
>   };
>   
>   struct cppc_perf_fb_ctrs {
> @@ -149,6 +151,8 @@ extern bool cpc_ffh_supported(void);
>   extern bool cpc_supported_by_cpu(void);
>   extern int cpc_read_ffh(int cpunum, struct cpc_reg *reg, u64 *val);
>   extern int cpc_write_ffh(int cpunum, struct cpc_reg *reg, u64 val);
> +extern int cppc_get_epp_caps(int cpunum, struct cppc_perf_caps *perf_caps);
> +extern int cppc_set_epp_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls, bool enable);
>   #else /* !CONFIG_ACPI_CPPC_LIB */
>   static inline int cppc_get_desired_perf(int cpunum, u64 *desired_perf)
>   {
> @@ -202,6 +206,14 @@ static inline int cpc_write_ffh(int cpunum, struct cpc_reg *reg, u64 val)
>   {
>   	return -ENOTSUPP;
>   }
> +static inline int cppc_set_epp_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls, bool enable)
> +{
> +	return -ENOTSUPP;
> +}
> +static inline int cppc_get_epp_caps(int cpunum, struct cppc_perf_caps *perf_caps)
> +{
> +	return -ENOTSUPP;
> +}
>   #endif /* !CONFIG_ACPI_CPPC_LIB */
>   
>   #endif /* _CPPC_ACPI_H*/


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 04/11] cpufreq: amd_pstate: implement Pstate EPP support for the AMD processors
  2022-12-02  7:47 ` [PATCH v6 04/11] cpufreq: amd_pstate: implement Pstate EPP support for the AMD processors Perry Yuan
@ 2022-12-02 17:36   ` Limonciello, Mario
  2022-12-08 15:51     ` Yuan, Perry
  2022-12-05 12:22   ` Huang Rui
  1 sibling, 1 reply; 27+ messages in thread
From: Limonciello, Mario @ 2022-12-02 17:36 UTC (permalink / raw)
  To: Perry Yuan, rafael.j.wysocki, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Nathan.Fontenot, Alexander.Deucher, Shimmer.Huang,
	Xiaojian.Du, Li.Meng, wyes.karny, linux-pm, linux-kernel

On 12/2/2022 01:47, Perry Yuan wrote:
> From: Perry Yuan <Perry.Yuan@amd.com>
> 
> Add EPP driver support for AMD SoCs which support a dedicated MSR for
> CPPC.  EPP is used by the DPM controller to configure the frequency that
> a core operates at during short periods of activity.
> 
> The SoC EPP targets are configured on a scale from 0 to 255 where 0
> represents maximum performance and 255 represents maximum efficiency.
> 
> The amd-pstate driver exports profile string names to userspace that are
> tied to specific EPP values.
> 
> The balance_performance string (0x80) provides the best balance for
> efficiency versus power on most systems, but users can choose other
> strings to meet their needs as well.
> 
> $ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_available_preferences
> default performance balance_performance balance_power power
> 
> $ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_preference
> balance_performance
> 
> To enable the driver,it needs to add `amd_pstate=active` to kernel
> command line and kernel will load the active mode epp driver
> 
> Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> ---
>   drivers/cpufreq/amd-pstate.c | 643 ++++++++++++++++++++++++++++++++++-
>   include/linux/amd-pstate.h   |  35 ++
>   2 files changed, 672 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 204e39006dda..5989d4d25d0f 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -37,9 +37,12 @@
>   #include <linux/uaccess.h>
>   #include <linux/static_call.h>
>   #include <linux/amd-pstate.h>
> +#include <linux/cpufreq_common.h>
>   
> +#ifdef CONFIG_ACPI
>   #include <acpi/processor.h>
>   #include <acpi/cppc_acpi.h>
> +#endif
>   
>   #include <asm/msr.h>
>   #include <asm/processor.h>
> @@ -59,9 +62,130 @@
>    * we disable it by default to go acpi-cpufreq on these processors and add a
>    * module parameter to be able to enable it manually for debugging.
>    */
> -static struct cpufreq_driver amd_pstate_driver;
> +static bool cppc_active;
>   static int cppc_load __initdata;
>   
> +static struct cpufreq_driver *default_pstate_driver;
> +static struct amd_cpudata **all_cpu_data;
> +static struct amd_pstate_params global_params;
> +
> +static DEFINE_MUTEX(amd_pstate_limits_lock);
> +static DEFINE_MUTEX(amd_pstate_driver_lock);
> +
> +static bool cppc_boost __read_mostly;
> +struct kobject *amd_pstate_kobj;
> +
> +#ifdef CONFIG_ACPI_CPPC_LIB
Rather than making this conditional I would think that amd-pstate should 
probably depend on or select CONFIG_ACPI_CPPC_LIB, no?

> +static s16 amd_pstate_get_epp(struct amd_cpudata *cpudata, u64 cppc_req_cached)
> +{
> +	s16 epp;
> +	struct cppc_perf_caps perf_caps;
> +	int ret;
> +
> +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> +		if (!cppc_req_cached) {
> +			epp = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ,
> +					&cppc_req_cached);
> +			if (epp)
> +				return epp;
> +		}
> +		epp = (cppc_req_cached >> 24) & 0xFF;
> +	} else {
> +		ret = cppc_get_epp_caps(cpudata->cpu, &perf_caps);
> +		if (ret < 0) {
> +			pr_debug("Could not retrieve energy perf value (%d)\n", ret);
> +			return -EIO;
> +		}
> +		epp = (s16) perf_caps.energy_perf;
> +	}
> +
> +	return epp;
> +}
> +#endif
> +
> +static int amd_pstate_get_energy_pref_index(struct amd_cpudata *cpudata)
> +{
> +	s16 epp;
> +	int index = -EINVAL;
> +
> +	epp = amd_pstate_get_epp(cpudata, 0);
> +	if (epp < 0)
> +		return epp;
> +
> +	switch (epp) {
> +	case HWP_EPP_PERFORMANCE:
> +		index = EPP_INDEX_PERFORMANCE;
> +		break;
> +	case HWP_EPP_BALANCE_PERFORMANCE:
> +		index = EPP_INDEX_BALANCE_PERFORMANCE;
> +		break;
> +	case HWP_EPP_BALANCE_POWERSAVE:
> +		index = EPP_INDEX_BALANCE_POWERSAVE;
> +		break;
> +	case HWP_EPP_POWERSAVE:
> +		index = EPP_INDEX_POWERSAVE;
> +		break;
> +	default:
> +			break;
> +	}
> +
> +	return index;
> +}
> +
> +#ifdef CONFIG_ACPI_CPPC_LIB
> +static int amd_pstate_set_epp(struct amd_cpudata *cpudata, u32 epp)
> +{
> +	int ret;
> +	struct cppc_perf_ctrls perf_ctrls;
> +
> +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> +		u64 value = READ_ONCE(cpudata->cppc_req_cached);
> +
> +		value &= ~GENMASK_ULL(31, 24);
> +		value |= (u64)epp << 24;
> +		WRITE_ONCE(cpudata->cppc_req_cached, value);
> +
> +		ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
> +		if (!ret)
> +			cpudata->epp_cached = epp;
> +	} else {
> +		perf_ctrls.energy_perf = epp;
> +		ret = cppc_set_epp_perf(cpudata->cpu, &perf_ctrls, 1);
> +		if (ret) {
> +			pr_debug("failed to set energy perf value (%d)\n", ret);
> +			return ret;
> +		}
> +		cpudata->epp_cached = epp;
> +	}
> +
> +	return ret;
> +}
> +
> +static int amd_pstate_set_energy_pref_index(struct amd_cpudata *cpudata,
> +		int pref_index)
> +{
> +	int epp = -EINVAL;
> +	int ret;
> +
> +	if (!pref_index) {
> +		pr_debug("EPP pref_index is invalid\n");
> +		return -EINVAL;
> +	}
> +
> +	if (epp == -EINVAL)
> +		epp = epp_values[pref_index];
> +
> +	if (epp > 0 && cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> +		pr_debug("EPP cannot be set under performance policy\n");
> +		return -EBUSY;
> +	}
> +
> +	ret = amd_pstate_set_epp(cpudata, epp);
> +
> +	return ret;
> +}
> +#endif
> +
>   static inline int pstate_enable(bool enable)
>   {
>   	return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable);
> @@ -70,11 +194,21 @@ static inline int pstate_enable(bool enable)
>   static int cppc_enable(bool enable)
>   {
>   	int cpu, ret = 0;
> +	struct cppc_perf_ctrls perf_ctrls;
>   
>   	for_each_present_cpu(cpu) {
>   		ret = cppc_set_enable(cpu, enable);
>   		if (ret)
>   			return ret;
> +
> +		/* Enable autonomous mode for EPP */
> +		if (!cppc_active) {
> +			/* Set desired perf as zero to allow EPP firmware control */
> +			perf_ctrls.desired_perf = 0;
> +			ret = cppc_set_perf(cpu, &perf_ctrls);
> +			if (ret)
> +				return ret;
> +		}
>   	}
>   
>   	return ret;
> @@ -417,7 +551,7 @@ static void amd_pstate_boost_init(struct amd_cpudata *cpudata)
>   		return;
>   
>   	cpudata->boost_supported = true;
> -	amd_pstate_driver.boost_enabled = true;
> +	default_pstate_driver->boost_enabled = true;
>   }
>   
>   static void amd_perf_ctl_reset(unsigned int cpu)
> @@ -591,10 +725,61 @@ static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,
>   	return sprintf(&buf[0], "%u\n", perf);
>   }
>   
> +static ssize_t show_energy_performance_available_preferences(
> +				struct cpufreq_policy *policy, char *buf)
> +{
> +	int i = 0;
> +	int ret = 0;
> +
> +	while (energy_perf_strings[i] != NULL)
> +		ret += sysfs_emit(buf, "%s", energy_perf_strings[i++]);
> +
> +	ret += sysfs_emit(buf, "\n");
> +
> +	return ret;
> +}
> +
> +static ssize_t store_energy_performance_preference(
> +		struct cpufreq_policy *policy, const char *buf, size_t count)
> +{
> +	struct amd_cpudata *cpudata = policy->driver_data;
> +	char str_preference[21];
> +	ssize_t ret;
> +
> +	ret = sscanf(buf, "%20s", str_preference);
> +	if (ret != 1)
> +		return -EINVAL;
> +
> +	ret = match_string(energy_perf_strings, -1, str_preference);
> +	if (ret < 0)
> +		return -EINVAL;
> +
> +	mutex_lock(&amd_pstate_limits_lock);
> +	ret = amd_pstate_set_energy_pref_index(cpudata, ret);
> +	mutex_unlock(&amd_pstate_limits_lock);
> +
> +	return ret ?: count;
> +}
> +
> +static ssize_t show_energy_performance_preference(
> +				struct cpufreq_policy *policy, char *buf)
> +{
> +	struct amd_cpudata *cpudata = policy->driver_data;
> +	int preference;
> +
> +	preference = amd_pstate_get_energy_pref_index(cpudata);
> +	if (preference < 0)
> +		return preference;
> +
> +	return  sysfs_emit(buf, "%s\n", energy_perf_strings[preference]);

You've got an extra space after "return".

> +}
> +
>   cpufreq_freq_attr_ro(amd_pstate_max_freq);
>   cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
>   
>   cpufreq_freq_attr_ro(amd_pstate_highest_perf);
> +cpufreq_freq_attr_rw(energy_performance_preference);
> +cpufreq_freq_attr_ro(energy_performance_available_preferences);
>   
>   static struct freq_attr *amd_pstate_attr[] = {
>   	&amd_pstate_max_freq,
> @@ -603,6 +788,429 @@ static struct freq_attr *amd_pstate_attr[] = {
>   	NULL,
>   };
>   
> +static struct freq_attr *amd_pstate_epp_attr[] = {
> +	&amd_pstate_max_freq,
> +	&amd_pstate_lowest_nonlinear_freq,
> +	&amd_pstate_highest_perf,
> +	&energy_performance_preference,
> +	&energy_performance_available_preferences,
> +	NULL,
> +};
> +
> +static inline void update_boost_state(void)
> +{
> +	u64 misc_en;
> +	struct amd_cpudata *cpudata;
> +
> +	cpudata = all_cpu_data[0];
> +	rdmsrl(MSR_K7_HWCR, misc_en);
> +	global_params.cppc_boost_disabled = misc_en & BIT_ULL(25);
> +}
> +
> +static bool amd_pstate_acpi_pm_profile_server(void)
> +{
> +	if (acpi_gbl_FADT.preferred_profile == PM_ENTERPRISE_SERVER ||
> +	    acpi_gbl_FADT.preferred_profile == PM_PERFORMANCE_SERVER)
> +		return true;
> +
> +	return false;
> +}

Right now this is only used to control boost, but I wonder if this 
should instad also control the default EPP policy as all?

> +
> +static int amd_pstate_init_cpu(unsigned int cpunum)
> +{
> +	struct amd_cpudata *cpudata;
> +
> +	cpudata = all_cpu_data[cpunum];
> +	if (!cpudata) {
> +		cpudata = kzalloc(sizeof(*cpudata), GFP_KERNEL);
> +		if (!cpudata)
> +			return -ENOMEM;
> +		WRITE_ONCE(all_cpu_data[cpunum], cpudata);
> +
> +		cpudata->cpu = cpunum;
> +
> +		if (cppc_active) {
> +			if (amd_pstate_acpi_pm_profile_server())
> +				cppc_boost = true;
> +		}
> +
> +	}
> +	cpudata->epp_powersave = -EINVAL;
> +	cpudata->epp_policy = 0; > +	pr_debug("controlling: cpu %d\n", cpunum);
> +	return 0;
> +}
> +
> +static int __amd_pstate_cpu_init(struct cpufreq_policy *policy)
> +{
> +	int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
> +	struct amd_cpudata *cpudata;
> +	struct device *dev;
> +	int rc;
> +	u64 value;
> +
> +	rc = amd_pstate_init_cpu(policy->cpu);
> +	if (rc)
> +		return rc;
> +
> +	cpudata = all_cpu_data[policy->cpu];
> +
> +	dev = get_cpu_device(policy->cpu);
> +	if (!dev)
> +		goto free_cpudata1;
> +
> +	rc = amd_pstate_init_perf(cpudata);
> +	if (rc)
> +		goto free_cpudata1;
> +
> +	min_freq = amd_get_min_freq(cpudata);
> +	max_freq = amd_get_max_freq(cpudata);
> +	nominal_freq = amd_get_nominal_freq(cpudata);
> +	lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
> +	if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
> +		dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
> +				min_freq, max_freq);
> +		ret = -EINVAL;
> +		goto free_cpudata1;
> +	}
> +
> +	policy->min = min_freq;
> +	policy->max = max_freq;
> +
> +	policy->cpuinfo.min_freq = min_freq;
> +	policy->cpuinfo.max_freq = max_freq;
> +	/* It will be updated by governor */
> +	policy->cur = policy->cpuinfo.min_freq;
> +
> +	/* Initial processor data capability frequencies */
> +	cpudata->max_freq = max_freq;
> +	cpudata->min_freq = min_freq;
> +	cpudata->nominal_freq = nominal_freq;
> +	cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
> +
> +	policy->driver_data = cpudata;
> +
> +	update_boost_state();
> +	cpudata->epp_cached = amd_pstate_get_epp(cpudata, value);
> +
> +	policy->min = policy->cpuinfo.min_freq;
> +	policy->max = policy->cpuinfo.max_freq;
> +
> +	if (boot_cpu_has(X86_FEATURE_CPPC))
> +		policy->fast_switch_possible = true;
> +
> +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> +		ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, &value);
> +		if (ret)
> +			return ret;
> +		WRITE_ONCE(cpudata->cppc_req_cached, value);
> +
> +		ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1, &value);
> +		if (ret)
> +			return ret;
> +		WRITE_ONCE(cpudata->cppc_cap1_cached, value);
> +	}
> +	amd_pstate_boost_init(cpudata);
> +
> +	return 0;
> +
> +free_cpudata1:
> +	kfree(cpudata);
> +	return ret;
> +}
> +
> +static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> +{
> +	int ret;
> +
> +	ret = __amd_pstate_cpu_init(policy);
> +	if (ret)
> +		return ret;
> +	/*
> +	 * Set the policy to powersave to provide a valid fallback value in case
> +	 * the default cpufreq governor is neither powersave nor performance.
> +	 */
> +	policy->policy = CPUFREQ_POLICY_POWERSAVE;
> +
> +	return 0;
> +}
> +
> +static int amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy)
> +{
> +	pr_debug("CPU %d exiting\n", policy->cpu);
> +	policy->fast_switch_possible = false;
> +	return 0;
> +}
> +
> +static void amd_pstate_update_max_freq(unsigned int cpu)
> +{
> +	struct cpufreq_policy *policy = policy = cpufreq_cpu_get(cpu);
> +
> +	if (!policy)
> +		return;
> +
> +	refresh_frequency_limits(policy);
> +	cpufreq_cpu_put(policy);
> +}
> +
> +static void amd_pstate_epp_update_limits(unsigned int cpu)
> +{
> +	mutex_lock(&amd_pstate_driver_lock);
> +	update_boost_state();
> +	if (global_params.cppc_boost_disabled) {
> +		for_each_possible_cpu(cpu)
> +			amd_pstate_update_max_freq(cpu);
> +	} else {
> +		cpufreq_update_policy(cpu);
> +	}
> +	mutex_unlock(&amd_pstate_driver_lock);
> +}
> +
> +static int cppc_boost_hold_time_ns = 3 * NSEC_PER_MSEC;
> +
> +static inline void amd_pstate_boost_up(struct amd_cpudata *cpudata)
> +{
> +	u64 hwp_req = READ_ONCE(cpudata->cppc_req_cached);
> +	u64 hwp_cap = READ_ONCE(cpudata->cppc_cap1_cached);
> +	u32 max_limit = (hwp_req & 0xff);
> +	u32 min_limit = (hwp_req & 0xff00) >> 8;
> +	u32 boost_level1;
> +
> +	/* If max and min are equal or already at max, nothing to boost */
> +	if (max_limit == min_limit)
> +		return;
> +
> +	/* Set boost max and min to initial value */
> +	if (!cpudata->cppc_boost_min)
> +		cpudata->cppc_boost_min = min_limit;
> +
> +	boost_level1 = ((AMD_CPPC_NOMINAL_PERF(hwp_cap) + min_limit) >> 1);
> +
> +	if (cpudata->cppc_boost_min < boost_level1)
> +		cpudata->cppc_boost_min = boost_level1;
> +	else if (cpudata->cppc_boost_min < AMD_CPPC_NOMINAL_PERF(hwp_cap))
> +		cpudata->cppc_boost_min = AMD_CPPC_NOMINAL_PERF(hwp_cap);
> +	else if (cpudata->cppc_boost_min == AMD_CPPC_NOMINAL_PERF(hwp_cap))
> +		cpudata->cppc_boost_min = max_limit;
> +	else
> +		return;
> +
> +	hwp_req &= ~AMD_CPPC_MIN_PERF(~0L);
> +	hwp_req |= AMD_CPPC_MIN_PERF(cpudata->cppc_boost_min);
> +	wrmsrl(MSR_AMD_CPPC_REQ, hwp_req);
> +	cpudata->last_update = cpudata->sample.time;
> +}
> +
> +static inline void amd_pstate_boost_down(struct amd_cpudata *cpudata)
> +{
> +	bool expired;
> +
> +	if (cpudata->cppc_boost_min) {
> +		expired = time_after64(cpudata->sample.time, cpudata->last_update +
> +					cppc_boost_hold_time_ns);
> +
> +		if (expired) {
> +			wrmsrl(MSR_AMD_CPPC_REQ, cpudata->cppc_req_cached);
> +			cpudata->cppc_boost_min = 0;
> +		}
> +	}
> +
> +	cpudata->last_update = cpudata->sample.time;
> +}
> +
> +static inline void amd_pstate_boost_update_util(struct amd_cpudata *cpudata,
> +						      u64 time)
> +{
> +	cpudata->sample.time = time;
> +	if (smp_processor_id() != cpudata->cpu)
> +		return;
> +
> +	if (cpudata->sched_flags & SCHED_CPUFREQ_IOWAIT) {
> +		bool do_io = false;
> +
> +		cpudata->sched_flags = 0;
> +		/*
> +		 * Set iowait_boost flag and update time. Since IO WAIT flag
> +		 * is set all the time, we can't just conclude that there is
> +		 * some IO bound activity is scheduled on this CPU with just
> +		 * one occurrence. If we receive at least two in two
> +		 * consecutive ticks, then we treat as boost candidate.
> +		 * This is leveraged from Intel Pstate driver.
> +		 */
> +		if (time_before64(time, cpudata->last_io_update + 2 * TICK_NSEC))
> +			do_io = true;
> +
> +		cpudata->last_io_update = time;
> +
> +		if (do_io)
> +			amd_pstate_boost_up(cpudata);
> +
> +	} else {
> +		amd_pstate_boost_down(cpudata);
> +	}
> +}
> +
> +static inline void amd_pstate_cppc_update_hook(struct update_util_data *data,
> +						u64 time, unsigned int flags)
> +{
> +	struct amd_cpudata *cpudata = container_of(data,
> +				struct amd_cpudata, update_util);
> +
> +	cpudata->sched_flags |= flags;
> +
> +	if (smp_processor_id() == cpudata->cpu)
> +		amd_pstate_boost_update_util(cpudata, time);
> +}
> +
> +static void amd_pstate_clear_update_util_hook(unsigned int cpu)
> +{
> +	struct amd_cpudata *cpudata = all_cpu_data[cpu];
> +
> +	if (!cpudata->update_util_set)
> +		return;
> +
> +	cpufreq_remove_update_util_hook(cpu);
> +	cpudata->update_util_set = false;
> +	synchronize_rcu();
> +}
> +
> +static void amd_pstate_set_update_util_hook(unsigned int cpu_num)
> +{
> +	struct amd_cpudata *cpudata = all_cpu_data[cpu_num];
> +
> +	if (!cppc_boost) {
> +		if (cpudata->update_util_set)
> +			amd_pstate_clear_update_util_hook(cpudata->cpu);
> +		return;
> +	}
> +
> +	if (cpudata->update_util_set)
> +		return;
> +
> +	cpudata->sample.time = 0;
> +	cpufreq_add_update_util_hook(cpu_num, &cpudata->update_util,
> +						amd_pstate_cppc_update_hook);
> +	cpudata->update_util_set = true;
> +}
> +
> +static void amd_pstate_epp_init(unsigned int cpu)
> +{
> +	struct amd_cpudata *cpudata = all_cpu_data[cpu];
> +	u32 max_perf, min_perf;
> +	u64 value;
> +	s16 epp;
> +	int ret;
> +
> +	max_perf = READ_ONCE(cpudata->highest_perf);
> +	min_perf = READ_ONCE(cpudata->lowest_perf);
> +
> +	value = READ_ONCE(cpudata->cppc_req_cached);
> +
> +	if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> +		min_perf = max_perf;
> +
> +	/* Initial min/max values for CPPC Performance Controls Register */
> +	value &= ~AMD_CPPC_MIN_PERF(~0L);
> +	value |= AMD_CPPC_MIN_PERF(min_perf);
> +
> +	value &= ~AMD_CPPC_MAX_PERF(~0L);
> +	value |= AMD_CPPC_MAX_PERF(max_perf);
> +
> +	/* CPPC EPP feature require to set zero to the desire perf bit */
> +	value &= ~AMD_CPPC_DES_PERF(~0L);
> +	value |= AMD_CPPC_DES_PERF(0);
> +
> +	if (cpudata->epp_policy == cpudata->policy)
> +		goto skip_epp;
> +
> +	cpudata->epp_policy = cpudata->policy;
> +
> +	if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> +		epp = amd_pstate_get_epp(cpudata, value);
> +		cpudata->epp_powersave = epp;
> +		if (epp < 0)
> +			goto skip_epp;
> +		/* force the epp value to be zero for performance policy */
> +		epp = 0;
> +	} else {
> +		if (cpudata->epp_powersave < 0)
> +			goto skip_epp;
> +		/* Get BIOS pre-defined epp value */
> +		epp = amd_pstate_get_epp(cpudata, value);
> +		if (epp)
> +			goto skip_epp;
> +		epp = cpudata->epp_powersave;
> +	}
> +	/* Set initial EPP value */
> +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> +		value &= ~GENMASK_ULL(31, 24);
> +		value |= (u64)epp << 24;
> +	}
> +
> +skip_epp:
> +	WRITE_ONCE(cpudata->cppc_req_cached, value);
> +	ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
> +	if (!ret)
> +		cpudata->epp_cached = epp;
> +}
> +
> +static void amd_pstate_set_max_limits(struct amd_cpudata *cpudata)
> +{
> +	u64 hwp_cap = READ_ONCE(cpudata->cppc_cap1_cached);
> +	u64 hwp_req = READ_ONCE(cpudata->cppc_req_cached);
> +	u32 max_limit = (hwp_cap >> 24) & 0xff;
> +
> +	hwp_req &= ~AMD_CPPC_MIN_PERF(~0L);
> +	hwp_req |= AMD_CPPC_MIN_PERF(max_limit);
> +	wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, hwp_req);
> +}
> +
> +static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
> +{
> +	struct amd_cpudata *cpudata;
> +
> +	if (!policy->cpuinfo.max_freq)
> +		return -ENODEV;
> +
> +	pr_debug("set_policy: cpuinfo.max %u policy->max %u\n",
> +				policy->cpuinfo.max_freq, policy->max);
> +
> +	cpudata = all_cpu_data[policy->cpu];
> +	cpudata->policy = policy->policy;
> +
> +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> +		mutex_lock(&amd_pstate_limits_lock);
> +
> +		if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> +			amd_pstate_clear_update_util_hook(policy->cpu);
> +			amd_pstate_set_max_limits(cpudata);
> +		} else {
> +			amd_pstate_set_update_util_hook(policy->cpu);
> +		}
> +
> +		if (boot_cpu_has(X86_FEATURE_CPPC))
> +			amd_pstate_epp_init(policy->cpu);
> +
> +		mutex_unlock(&amd_pstate_limits_lock);
> +	}
> +
> +	return 0;
> +}
> +
> +static void amd_pstate_verify_cpu_policy(struct amd_cpudata *cpudata,
> +					   struct cpufreq_policy_data *policy)
> +{
> +	update_boost_state();
> +	cpufreq_verify_within_cpu_limits(policy);
> +}
> +
> +static int amd_pstate_epp_verify_policy(struct cpufreq_policy_data *policy)
> +{
> +	amd_pstate_verify_cpu_policy(all_cpu_data[policy->cpu], policy);
> +	pr_debug("policy_max =%d, policy_min=%d\n", policy->max, policy->min);
> +	return 0;
> +}
> +
>   static struct cpufreq_driver amd_pstate_driver = {
>   	.flags		= CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
>   	.verify		= amd_pstate_verify,
> @@ -616,8 +1224,20 @@ static struct cpufreq_driver amd_pstate_driver = {
>   	.attr		= amd_pstate_attr,
>   };
>   
> +static struct cpufreq_driver amd_pstate_epp_driver = {
> +	.flags		= CPUFREQ_CONST_LOOPS,
> +	.verify		= amd_pstate_epp_verify_policy,
> +	.setpolicy	= amd_pstate_epp_set_policy,
> +	.init		= amd_pstate_epp_cpu_init,
> +	.exit		= amd_pstate_epp_cpu_exit,
> +	.update_limits	= amd_pstate_epp_update_limits,
> +	.name		= "amd_pstate_epp",
> +	.attr		= amd_pstate_epp_attr,
> +};
> +
>   static int __init amd_pstate_init(void)
>   {
> +	static struct amd_cpudata **cpudata;
>   	int ret;
>   
>   	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
> @@ -644,7 +1264,8 @@ static int __init amd_pstate_init(void)
>   	/* capability check */
>   	if (boot_cpu_has(X86_FEATURE_CPPC)) {
>   		pr_debug("AMD CPPC MSR based functionality is supported\n");
> -		amd_pstate_driver.adjust_perf = amd_pstate_adjust_perf;
> +		if (!cppc_active)
> +			default_pstate_driver->adjust_perf = amd_pstate_adjust_perf;
>   	} else {
>   		pr_debug("AMD CPPC shared memory based functionality is supported\n");
>   		static_call_update(amd_pstate_enable, cppc_enable);
> @@ -652,6 +1273,10 @@ static int __init amd_pstate_init(void)
>   		static_call_update(amd_pstate_update_perf, cppc_update_perf);
>   	}
>   
> +	cpudata = vzalloc(array_size(sizeof(void *), num_possible_cpus()));
> +	if (!cpudata)
> +		return -ENOMEM;
> +	WRITE_ONCE(all_cpu_data, cpudata);
>   	/* enable amd pstate feature */
>   	ret = amd_pstate_enable(true);
>   	if (ret) {
> @@ -659,9 +1284,9 @@ static int __init amd_pstate_init(void)
>   		return ret;
>   	}
>   
> -	ret = cpufreq_register_driver(&amd_pstate_driver);
> +	ret = cpufreq_register_driver(default_pstate_driver);
>   	if (ret)
> -		pr_err("failed to register amd_pstate_driver with return %d\n",
> +		pr_err("failed to register amd pstate driver with return %d\n",
>   		       ret);
>   
>   	return ret;
> @@ -676,8 +1301,14 @@ static int __init amd_pstate_param(char *str)
>   	if (!strcmp(str, "disable")) {
>   		cppc_load = 0;
>   		pr_info("driver is explicitly disabled\n");
> -	} else if (!strcmp(str, "passive"))
> +	} else if (!strcmp(str, "passive")) {
>   		cppc_load = 1;
> +		default_pstate_driver = &amd_pstate_driver;
> +	} else if (!strcmp(str, "active")) {
> +		cppc_active = 1;
> +		cppc_load = 1;
> +		default_pstate_driver = &amd_pstate_epp_driver;
> +	}
>   
>   	return 0;
>   }
> diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
> index 1c4b8659f171..888af62040f1 100644
> --- a/include/linux/amd-pstate.h
> +++ b/include/linux/amd-pstate.h
> @@ -25,6 +25,7 @@ struct amd_aperf_mperf {
>   	u64 aperf;
>   	u64 mperf;
>   	u64 tsc;
> +	u64 time;
>   };
>   
>   /**
> @@ -47,6 +48,18 @@ struct amd_aperf_mperf {
>    * @prev: Last Aperf/Mperf/tsc count value read from register
>    * @freq: current cpu frequency value
>    * @boost_supported: check whether the Processor or SBIOS supports boost mode
> + * @epp_powersave: Last saved CPPC energy performance preference
> +				when policy switched to performance
> + * @epp_policy: Last saved policy used to set energy-performance preference
> + * @epp_cached: Cached CPPC energy-performance preference value
> + * @policy: Cpufreq policy value
> + * @sched_flags: Store scheduler flags for possible cross CPU update
> + * @update_util_set: CPUFreq utility callback is set
> + * @last_update: Time stamp of the last performance state update
> + * @cppc_boost_min: Last CPPC boosted min performance state
> + * @cppc_cap1_cached: Cached value of the last CPPC Capabilities MSR
> + * @update_util: Cpufreq utility callback information
> + * @sample: the stored performance sample
>    *
>    * The amd_cpudata is key private data for each CPU thread in AMD P-State, and
>    * represents all the attributes and goals that AMD P-State requests at runtime.
> @@ -72,6 +85,28 @@ struct amd_cpudata {
>   
>   	u64	freq;
>   	bool	boost_supported;
> +
> +	/* EPP feature related attributes*/
> +	s16	epp_powersave;
> +	s16	epp_policy;
> +	s16	epp_cached;
> +	u32	policy;
> +	u32	sched_flags;
> +	bool	update_util_set;
> +	u64	last_update;
> +	u64	last_io_update;
> +	u32	cppc_boost_min;
> +	u64	cppc_cap1_cached;
> +	struct	update_util_data update_util;
> +	struct	amd_aperf_mperf sample;
> +};
> +
> +/**
> + * struct amd_pstate_params - global parameters for the performance control
> + * @ cppc_boost_disabled wheher the core performance boost disabled
> + */
> +struct amd_pstate_params {
> +	bool cppc_boost_disabled;
>   };
>   
>   #endif /* _LINUX_AMD_PSTATE_H */


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 03/11] cpufreq: intel_pstate: use common macro definition for Energy Preference Performance(EPP)
  2022-12-02  7:47 ` [PATCH v6 03/11] cpufreq: intel_pstate: use common macro definition for Energy Preference Performance(EPP) Perry Yuan
@ 2022-12-05 11:48   ` Huang Rui
  2022-12-05 14:40     ` Yuan, Perry
  0 siblings, 1 reply; 27+ messages in thread
From: Huang Rui @ 2022-12-05 11:48 UTC (permalink / raw)
  To: Yuan, Perry
  Cc: rafael.j.wysocki, Limonciello, Mario, viresh.kumar, Sharma,
	Deepak, Fontenot, Nathan, Deucher, Alexander, Huang, Shimmer, Du,
	Xiaojian, Meng, Li (Jassmine),
	Karny, Wyes, linux-pm, linux-kernel

On Fri, Dec 02, 2022 at 03:47:11PM +0800, Yuan, Perry wrote:
> make the energy preference performance strings and profiles using one
> common header for intel_pstate driver, then the amd_pstate epp driver can
> use the common header as well. This will simpify the intel_pstate and
> amd_pstate driver.
> 
> Signed-off-by: Perry Yuan <perry.yuan@amd.com>
> ---
>  arch/x86/include/asm/msr-index.h |  4 ---
>  drivers/cpufreq/intel_pstate.c   | 37 +--------------------
>  include/linux/cpufreq_common.h   | 56 ++++++++++++++++++++++++++++++++

I don't find any specific reason why you have to use another common
cpufreq_common header instead of include/linux/cpufreq.h.

Thanks,
Ray

>  3 files changed, 57 insertions(+), 40 deletions(-)
>  create mode 100644 include/linux/cpufreq_common.h
> 
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 4a2af82553e4..3983378cff5b 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -472,10 +472,6 @@
>  #define HWP_MAX_PERF(x) 		((x & 0xff) << 8)
>  #define HWP_DESIRED_PERF(x)		((x & 0xff) << 16)
>  #define HWP_ENERGY_PERF_PREFERENCE(x)	(((unsigned long long) x & 0xff) << 24)
> -#define HWP_EPP_PERFORMANCE		0x00
> -#define HWP_EPP_BALANCE_PERFORMANCE	0x80
> -#define HWP_EPP_BALANCE_POWERSAVE	0xC0
> -#define HWP_EPP_POWERSAVE		0xFF
>  #define HWP_ACTIVITY_WINDOW(x)		((unsigned long long)(x & 0xff3) << 32)
>  #define HWP_PACKAGE_CONTROL(x)		((unsigned long long)(x & 0x1) << 42)
>  
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index ad9be31753b6..65036ca21719 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -25,6 +25,7 @@
>  #include <linux/acpi.h>
>  #include <linux/vmalloc.h>
>  #include <linux/pm_qos.h>
> +#include <linux/cpufreq_common.h>
>  #include <trace/events/power.h>
>  
>  #include <asm/cpu.h>
> @@ -628,42 +629,6 @@ static int intel_pstate_set_epb(int cpu, s16 pref)
>  	return 0;
>  }
>  
> -/*
> - * EPP/EPB display strings corresponding to EPP index in the
> - * energy_perf_strings[]
> - *	index		String
> - *-------------------------------------
> - *	0		default
> - *	1		performance
> - *	2		balance_performance
> - *	3		balance_power
> - *	4		power
> - */
> -
> -enum energy_perf_value_index {
> -	EPP_INDEX_DEFAULT = 0,
> -	EPP_INDEX_PERFORMANCE,
> -	EPP_INDEX_BALANCE_PERFORMANCE,
> -	EPP_INDEX_BALANCE_POWERSAVE,
> -	EPP_INDEX_POWERSAVE,
> -};
> -
> -static const char * const energy_perf_strings[] = {
> -	[EPP_INDEX_DEFAULT] = "default",
> -	[EPP_INDEX_PERFORMANCE] = "performance",
> -	[EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
> -	[EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
> -	[EPP_INDEX_POWERSAVE] = "power",
> -	NULL
> -};
> -static unsigned int epp_values[] = {
> -	[EPP_INDEX_DEFAULT] = 0, /* Unused index */
> -	[EPP_INDEX_PERFORMANCE] = HWP_EPP_PERFORMANCE,
> -	[EPP_INDEX_BALANCE_PERFORMANCE] = HWP_EPP_BALANCE_PERFORMANCE,
> -	[EPP_INDEX_BALANCE_POWERSAVE] = HWP_EPP_BALANCE_POWERSAVE,
> -	[EPP_INDEX_POWERSAVE] = HWP_EPP_POWERSAVE,
> -};
> -
>  static int intel_pstate_get_energy_pref_index(struct cpudata *cpu_data, int *raw_epp)
>  {
>  	s16 epp;
> diff --git a/include/linux/cpufreq_common.h b/include/linux/cpufreq_common.h
> new file mode 100644
> index 000000000000..2d14b0b0f55c
> --- /dev/null
> +++ b/include/linux/cpufreq_common.h
> @@ -0,0 +1,56 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * linux/include/linux/cpufreq_common.h
> + *
> + * Copyright (C) 2022 Advanced Micro Devices, Inc.
> + *
> + * Author: Perry Yuan <Perry.Yuan@amd.com>
> + */
> +
> +#ifndef _LINUX_CPUFREQ_COMMON_H
> +#define _LINUX_CPUFREQ_COMMON_H
> +
> +#include <asm/msr.h>
> +/*
> + * EPP/EPB display strings corresponding to EPP index in the
> + * energy_perf_strings[]
> + *	index		String
> + *-------------------------------------
> + *	0		default
> + *	1		performance
> + *	2		balance_performance
> + *	3		balance_power
> + *	4		power
> + */
> +
> +#define HWP_EPP_PERFORMANCE		0x00
> +#define HWP_EPP_BALANCE_PERFORMANCE	0x80
> +#define HWP_EPP_BALANCE_POWERSAVE	0xC0
> +#define HWP_EPP_POWERSAVE		0xFF
> +
> +enum energy_perf_value_index {
> +	EPP_INDEX_DEFAULT = 0,
> +	EPP_INDEX_PERFORMANCE,
> +	EPP_INDEX_BALANCE_PERFORMANCE,
> +	EPP_INDEX_BALANCE_POWERSAVE,
> +	EPP_INDEX_POWERSAVE,
> +};
> +
> +static const char * const energy_perf_strings[] = {
> +	[EPP_INDEX_DEFAULT] = "default",
> +	[EPP_INDEX_PERFORMANCE] = "performance",
> +	[EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
> +	[EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
> +	[EPP_INDEX_POWERSAVE] = "power",
> +	NULL
> +};
> +
> +static unsigned int epp_values[] = {
> +	[EPP_INDEX_DEFAULT] = 0, /* Unused index */
> +	[EPP_INDEX_PERFORMANCE] = HWP_EPP_PERFORMANCE,
> +	[EPP_INDEX_BALANCE_PERFORMANCE] = HWP_EPP_BALANCE_PERFORMANCE,
> +	[EPP_INDEX_BALANCE_POWERSAVE] = HWP_EPP_BALANCE_POWERSAVE,
> +	[EPP_INDEX_POWERSAVE] = HWP_EPP_POWERSAVE,
> +};
> +
> +#endif /* _LINUX_CPUFREQ_COMMON_H */
> \ No newline at end of file
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 04/11] cpufreq: amd_pstate: implement Pstate EPP support for the AMD processors
  2022-12-02  7:47 ` [PATCH v6 04/11] cpufreq: amd_pstate: implement Pstate EPP support for the AMD processors Perry Yuan
  2022-12-02 17:36   ` Limonciello, Mario
@ 2022-12-05 12:22   ` Huang Rui
  2022-12-08 15:43     ` Yuan, Perry
  1 sibling, 1 reply; 27+ messages in thread
From: Huang Rui @ 2022-12-05 12:22 UTC (permalink / raw)
  To: Yuan, Perry
  Cc: rafael.j.wysocki, Limonciello, Mario, viresh.kumar, Sharma,
	Deepak, Fontenot, Nathan, Deucher, Alexander, Huang, Shimmer, Du,
	Xiaojian, Meng, Li (Jassmine),
	Karny, Wyes, linux-pm, linux-kernel

On Fri, Dec 02, 2022 at 03:47:12PM +0800, Yuan, Perry wrote:
> From: Perry Yuan <Perry.Yuan@amd.com>
> 

> cpufreq: amd_pstate: implement Pstate EPP support for the AMD processors

I suggest that we align the format like "cpufreq: amd-pstate:" as the
prefix of the subject like previous patches. Because it's there anywhere
even in the documentation file. I think it's not worth to change such as
minor update like amd-pstate to amd_pstate for all the files and comments.
So let's align the previous way.

> Add EPP driver support for AMD SoCs which support a dedicated MSR for
> CPPC.  EPP is used by the DPM controller to configure the frequency that
> a core operates at during short periods of activity.
> 
> The SoC EPP targets are configured on a scale from 0 to 255 where 0
> represents maximum performance and 255 represents maximum efficiency.
> 
> The amd-pstate driver exports profile string names to userspace that are
> tied to specific EPP values.
> 
> The balance_performance string (0x80) provides the best balance for
> efficiency versus power on most systems, but users can choose other
> strings to meet their needs as well.
> 
> $ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_available_preferences
> default performance balance_performance balance_power power
> 
> $ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_preference
> balance_performance
> 
> To enable the driver,it needs to add `amd_pstate=active` to kernel
> command line and kernel will load the active mode epp driver
> 
> Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> ---
>  drivers/cpufreq/amd-pstate.c | 643 ++++++++++++++++++++++++++++++++++-
>  include/linux/amd-pstate.h   |  35 ++
>  2 files changed, 672 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 204e39006dda..5989d4d25d0f 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -37,9 +37,12 @@
>  #include <linux/uaccess.h>
>  #include <linux/static_call.h>
>  #include <linux/amd-pstate.h>
> +#include <linux/cpufreq_common.h>
>  
> +#ifdef CONFIG_ACPI
>  #include <acpi/processor.h>
>  #include <acpi/cppc_acpi.h>
> +#endif

The amd-pstate module depends on ACPI in the kconfig, so we won't need add
CONFIG_ACPI here.

>  
>  #include <asm/msr.h>
>  #include <asm/processor.h>
> @@ -59,9 +62,130 @@
>   * we disable it by default to go acpi-cpufreq on these processors and add a
>   * module parameter to be able to enable it manually for debugging.
>   */
> -static struct cpufreq_driver amd_pstate_driver;
> +static bool cppc_active;
>  static int cppc_load __initdata;
>  
> +static struct cpufreq_driver *default_pstate_driver;
> +static struct amd_cpudata **all_cpu_data;
> +static struct amd_pstate_params global_params;
> +
> +static DEFINE_MUTEX(amd_pstate_limits_lock);
> +static DEFINE_MUTEX(amd_pstate_driver_lock);
> +
> +static bool cppc_boost __read_mostly;
> +struct kobject *amd_pstate_kobj;

Where is it using amd_pstate_kobj?

> +
> +#ifdef CONFIG_ACPI_CPPC_LIB
> +static s16 amd_pstate_get_epp(struct amd_cpudata *cpudata, u64 cppc_req_cached)
> +{
> +	s16 epp;
> +	struct cppc_perf_caps perf_caps;
> +	int ret;
> +
> +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> +		if (!cppc_req_cached) {
> +			epp = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ,
> +					&cppc_req_cached);
> +			if (epp)
> +				return epp;
> +		}
> +		epp = (cppc_req_cached >> 24) & 0xFF;
> +	} else {
> +		ret = cppc_get_epp_caps(cpudata->cpu, &perf_caps);
> +		if (ret < 0) {
> +			pr_debug("Could not retrieve energy perf value (%d)\n", ret);
> +			return -EIO;
> +		}
> +		epp = (s16) perf_caps.energy_perf;
> +	}
> +
> +	return epp;
> +}
> +#endif
> +
> +static int amd_pstate_get_energy_pref_index(struct amd_cpudata *cpudata)
> +{
> +	s16 epp;
> +	int index = -EINVAL;
> +
> +	epp = amd_pstate_get_epp(cpudata, 0);
> +	if (epp < 0)
> +		return epp;
> +
> +	switch (epp) {
> +	case HWP_EPP_PERFORMANCE:
> +		index = EPP_INDEX_PERFORMANCE;
> +		break;
> +	case HWP_EPP_BALANCE_PERFORMANCE:
> +		index = EPP_INDEX_BALANCE_PERFORMANCE;
> +		break;
> +	case HWP_EPP_BALANCE_POWERSAVE:
> +		index = EPP_INDEX_BALANCE_POWERSAVE;
> +		break;
> +	case HWP_EPP_POWERSAVE:
> +		index = EPP_INDEX_POWERSAVE;
> +		break;
> +	default:
> +			break;
> +	}
> +
> +	return index;
> +}
> +
> +#ifdef CONFIG_ACPI_CPPC_LIB
> +static int amd_pstate_set_epp(struct amd_cpudata *cpudata, u32 epp)
> +{
> +	int ret;
> +	struct cppc_perf_ctrls perf_ctrls;
> +
> +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> +		u64 value = READ_ONCE(cpudata->cppc_req_cached);
> +
> +		value &= ~GENMASK_ULL(31, 24);
> +		value |= (u64)epp << 24;
> +		WRITE_ONCE(cpudata->cppc_req_cached, value);
> +
> +		ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
> +		if (!ret)
> +			cpudata->epp_cached = epp;
> +	} else {
> +		perf_ctrls.energy_perf = epp;
> +		ret = cppc_set_epp_perf(cpudata->cpu, &perf_ctrls, 1);
> +		if (ret) {
> +			pr_debug("failed to set energy perf value (%d)\n", ret);
> +			return ret;
> +		}
> +		cpudata->epp_cached = epp;
> +	}
> +
> +	return ret;
> +}
> +
> +static int amd_pstate_set_energy_pref_index(struct amd_cpudata *cpudata,
> +		int pref_index)
> +{
> +	int epp = -EINVAL;
> +	int ret;
> +
> +	if (!pref_index) {
> +		pr_debug("EPP pref_index is invalid\n");
> +		return -EINVAL;
> +	}
> +
> +	if (epp == -EINVAL)
> +		epp = epp_values[pref_index];
> +
> +	if (epp > 0 && cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> +		pr_debug("EPP cannot be set under performance policy\n");
> +		return -EBUSY;
> +	}
> +
> +	ret = amd_pstate_set_epp(cpudata, epp);
> +
> +	return ret;
> +}
> +#endif
> +
>  static inline int pstate_enable(bool enable)
>  {
>  	return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable);
> @@ -70,11 +194,21 @@ static inline int pstate_enable(bool enable)
>  static int cppc_enable(bool enable)
>  {
>  	int cpu, ret = 0;
> +	struct cppc_perf_ctrls perf_ctrls;
>  
>  	for_each_present_cpu(cpu) {
>  		ret = cppc_set_enable(cpu, enable);
>  		if (ret)
>  			return ret;
> +
> +		/* Enable autonomous mode for EPP */
> +		if (!cppc_active) {
> +			/* Set desired perf as zero to allow EPP firmware control */
> +			perf_ctrls.desired_perf = 0;
> +			ret = cppc_set_perf(cpu, &perf_ctrls);
> +			if (ret)
> +				return ret;
> +		}
>  	}
>  
>  	return ret;
> @@ -417,7 +551,7 @@ static void amd_pstate_boost_init(struct amd_cpudata *cpudata)
>  		return;
>  
>  	cpudata->boost_supported = true;
> -	amd_pstate_driver.boost_enabled = true;
> +	default_pstate_driver->boost_enabled = true;
>  }
>  
>  static void amd_perf_ctl_reset(unsigned int cpu)
> @@ -591,10 +725,61 @@ static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,
>  	return sprintf(&buf[0], "%u\n", perf);
>  }
>  
> +static ssize_t show_energy_performance_available_preferences(
> +				struct cpufreq_policy *policy, char *buf)
> +{
> +	int i = 0;
> +	int ret = 0;
> +
> +	while (energy_perf_strings[i] != NULL)
> +		ret += sysfs_emit(buf, "%s", energy_perf_strings[i++]);
> +
> +	ret += sysfs_emit(buf, "\n");
> +
> +	return ret;
> +}
> +
> +static ssize_t store_energy_performance_preference(
> +		struct cpufreq_policy *policy, const char *buf, size_t count)
> +{
> +	struct amd_cpudata *cpudata = policy->driver_data;
> +	char str_preference[21];
> +	ssize_t ret;
> +
> +	ret = sscanf(buf, "%20s", str_preference);
> +	if (ret != 1)
> +		return -EINVAL;
> +
> +	ret = match_string(energy_perf_strings, -1, str_preference);
> +	if (ret < 0)
> +		return -EINVAL;
> +
> +	mutex_lock(&amd_pstate_limits_lock);
> +	ret = amd_pstate_set_energy_pref_index(cpudata, ret);
> +	mutex_unlock(&amd_pstate_limits_lock);
> +
> +	return ret ?: count;
> +}
> +
> +static ssize_t show_energy_performance_preference(
> +				struct cpufreq_policy *policy, char *buf)
> +{
> +	struct amd_cpudata *cpudata = policy->driver_data;
> +	int preference;
> +
> +	preference = amd_pstate_get_energy_pref_index(cpudata);
> +	if (preference < 0)
> +		return preference;
> +
> +	return  sysfs_emit(buf, "%s\n", energy_perf_strings[preference]);
> +}
> +
>  cpufreq_freq_attr_ro(amd_pstate_max_freq);
>  cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
>  
>  cpufreq_freq_attr_ro(amd_pstate_highest_perf);
> +cpufreq_freq_attr_rw(energy_performance_preference);
> +cpufreq_freq_attr_ro(energy_performance_available_preferences);
>  
>  static struct freq_attr *amd_pstate_attr[] = {
>  	&amd_pstate_max_freq,
> @@ -603,6 +788,429 @@ static struct freq_attr *amd_pstate_attr[] = {
>  	NULL,
>  };
>  
> +static struct freq_attr *amd_pstate_epp_attr[] = {
> +	&amd_pstate_max_freq,
> +	&amd_pstate_lowest_nonlinear_freq,
> +	&amd_pstate_highest_perf,
> +	&energy_performance_preference,
> +	&energy_performance_available_preferences,
> +	NULL,
> +};
> +
> +static inline void update_boost_state(void)
> +{
> +	u64 misc_en;
> +	struct amd_cpudata *cpudata;
> +
> +	cpudata = all_cpu_data[0];
> +	rdmsrl(MSR_K7_HWCR, misc_en);
> +	global_params.cppc_boost_disabled = misc_en & BIT_ULL(25);
> +}
> +
> +static bool amd_pstate_acpi_pm_profile_server(void)
> +{
> +	if (acpi_gbl_FADT.preferred_profile == PM_ENTERPRISE_SERVER ||
> +	    acpi_gbl_FADT.preferred_profile == PM_PERFORMANCE_SERVER)
> +		return true;
> +
> +	return false;
> +}
> +
> +static int amd_pstate_init_cpu(unsigned int cpunum)
> +{
> +	struct amd_cpudata *cpudata;
> +
> +	cpudata = all_cpu_data[cpunum];
> +	if (!cpudata) {
> +		cpudata = kzalloc(sizeof(*cpudata), GFP_KERNEL);
> +		if (!cpudata)
> +			return -ENOMEM;
> +		WRITE_ONCE(all_cpu_data[cpunum], cpudata);
> +
> +		cpudata->cpu = cpunum;
> +
> +		if (cppc_active) {
> +			if (amd_pstate_acpi_pm_profile_server())
> +				cppc_boost = true;
> +		}
> +
> +	}
> +	cpudata->epp_powersave = -EINVAL;
> +	cpudata->epp_policy = 0;
> +	pr_debug("controlling: cpu %d\n", cpunum);
> +	return 0;
> +}
> +
> +static int __amd_pstate_cpu_init(struct cpufreq_policy *policy)
> +{
> +	int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
> +	struct amd_cpudata *cpudata;
> +	struct device *dev;
> +	int rc;
> +	u64 value;
> +
> +	rc = amd_pstate_init_cpu(policy->cpu);
> +	if (rc)
> +		return rc;
> +
> +	cpudata = all_cpu_data[policy->cpu];
> +
> +	dev = get_cpu_device(policy->cpu);
> +	if (!dev)
> +		goto free_cpudata1;
> +
> +	rc = amd_pstate_init_perf(cpudata);
> +	if (rc)
> +		goto free_cpudata1;
> +
> +	min_freq = amd_get_min_freq(cpudata);
> +	max_freq = amd_get_max_freq(cpudata);
> +	nominal_freq = amd_get_nominal_freq(cpudata);
> +	lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
> +	if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
> +		dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
> +				min_freq, max_freq);
> +		ret = -EINVAL;
> +		goto free_cpudata1;
> +	}
> +
> +	policy->min = min_freq;
> +	policy->max = max_freq;
> +
> +	policy->cpuinfo.min_freq = min_freq;
> +	policy->cpuinfo.max_freq = max_freq;
> +	/* It will be updated by governor */
> +	policy->cur = policy->cpuinfo.min_freq;
> +
> +	/* Initial processor data capability frequencies */
> +	cpudata->max_freq = max_freq;
> +	cpudata->min_freq = min_freq;
> +	cpudata->nominal_freq = nominal_freq;
> +	cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
> +
> +	policy->driver_data = cpudata;
> +
> +	update_boost_state();
> +	cpudata->epp_cached = amd_pstate_get_epp(cpudata, value);
> +
> +	policy->min = policy->cpuinfo.min_freq;
> +	policy->max = policy->cpuinfo.max_freq;
> +
> +	if (boot_cpu_has(X86_FEATURE_CPPC))
> +		policy->fast_switch_possible = true;
> +
> +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> +		ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, &value);
> +		if (ret)
> +			return ret;
> +		WRITE_ONCE(cpudata->cppc_req_cached, value);
> +
> +		ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1, &value);
> +		if (ret)
> +			return ret;
> +		WRITE_ONCE(cpudata->cppc_cap1_cached, value);
> +	}
> +	amd_pstate_boost_init(cpudata);
> +
> +	return 0;
> +
> +free_cpudata1:
> +	kfree(cpudata);
> +	return ret;
> +}
> +
> +static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> +{
> +	int ret;
> +
> +	ret = __amd_pstate_cpu_init(policy);
> +	if (ret)
> +		return ret;
> +	/*
> +	 * Set the policy to powersave to provide a valid fallback value in case
> +	 * the default cpufreq governor is neither powersave nor performance.
> +	 */
> +	policy->policy = CPUFREQ_POLICY_POWERSAVE;
> +
> +	return 0;
> +}
> +
> +static int amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy)
> +{
> +	pr_debug("CPU %d exiting\n", policy->cpu);
> +	policy->fast_switch_possible = false;
> +	return 0;
> +}
> +
> +static void amd_pstate_update_max_freq(unsigned int cpu)
> +{
> +	struct cpufreq_policy *policy = policy = cpufreq_cpu_get(cpu);
> +
> +	if (!policy)
> +		return;
> +
> +	refresh_frequency_limits(policy);
> +	cpufreq_cpu_put(policy);
> +}
> +
> +static void amd_pstate_epp_update_limits(unsigned int cpu)
> +{
> +	mutex_lock(&amd_pstate_driver_lock);
> +	update_boost_state();
> +	if (global_params.cppc_boost_disabled) {
> +		for_each_possible_cpu(cpu)
> +			amd_pstate_update_max_freq(cpu);
> +	} else {
> +		cpufreq_update_policy(cpu);
> +	}
> +	mutex_unlock(&amd_pstate_driver_lock);
> +}
> +
> +static int cppc_boost_hold_time_ns = 3 * NSEC_PER_MSEC;
> +
> +static inline void amd_pstate_boost_up(struct amd_cpudata *cpudata)
> +{
> +	u64 hwp_req = READ_ONCE(cpudata->cppc_req_cached);
> +	u64 hwp_cap = READ_ONCE(cpudata->cppc_cap1_cached);
> +	u32 max_limit = (hwp_req & 0xff);
> +	u32 min_limit = (hwp_req & 0xff00) >> 8;
> +	u32 boost_level1;
> +
> +	/* If max and min are equal or already at max, nothing to boost */
> +	if (max_limit == min_limit)
> +		return;
> +
> +	/* Set boost max and min to initial value */
> +	if (!cpudata->cppc_boost_min)
> +		cpudata->cppc_boost_min = min_limit;
> +
> +	boost_level1 = ((AMD_CPPC_NOMINAL_PERF(hwp_cap) + min_limit) >> 1);
> +
> +	if (cpudata->cppc_boost_min < boost_level1)
> +		cpudata->cppc_boost_min = boost_level1;
> +	else if (cpudata->cppc_boost_min < AMD_CPPC_NOMINAL_PERF(hwp_cap))
> +		cpudata->cppc_boost_min = AMD_CPPC_NOMINAL_PERF(hwp_cap);
> +	else if (cpudata->cppc_boost_min == AMD_CPPC_NOMINAL_PERF(hwp_cap))
> +		cpudata->cppc_boost_min = max_limit;
> +	else
> +		return;
> +
> +	hwp_req &= ~AMD_CPPC_MIN_PERF(~0L);
> +	hwp_req |= AMD_CPPC_MIN_PERF(cpudata->cppc_boost_min);
> +	wrmsrl(MSR_AMD_CPPC_REQ, hwp_req);
> +	cpudata->last_update = cpudata->sample.time;
> +}
> +
> +static inline void amd_pstate_boost_down(struct amd_cpudata *cpudata)
> +{
> +	bool expired;
> +
> +	if (cpudata->cppc_boost_min) {
> +		expired = time_after64(cpudata->sample.time, cpudata->last_update +
> +					cppc_boost_hold_time_ns);
> +
> +		if (expired) {
> +			wrmsrl(MSR_AMD_CPPC_REQ, cpudata->cppc_req_cached);
> +			cpudata->cppc_boost_min = 0;
> +		}
> +	}
> +
> +	cpudata->last_update = cpudata->sample.time;
> +}
> +
> +static inline void amd_pstate_boost_update_util(struct amd_cpudata *cpudata,
> +						      u64 time)
> +{
> +	cpudata->sample.time = time;
> +	if (smp_processor_id() != cpudata->cpu)
> +		return;
> +
> +	if (cpudata->sched_flags & SCHED_CPUFREQ_IOWAIT) {
> +		bool do_io = false;
> +
> +		cpudata->sched_flags = 0;
> +		/*
> +		 * Set iowait_boost flag and update time. Since IO WAIT flag
> +		 * is set all the time, we can't just conclude that there is
> +		 * some IO bound activity is scheduled on this CPU with just
> +		 * one occurrence. If we receive at least two in two
> +		 * consecutive ticks, then we treat as boost candidate.
> +		 * This is leveraged from Intel Pstate driver.
> +		 */
> +		if (time_before64(time, cpudata->last_io_update + 2 * TICK_NSEC))
> +			do_io = true;
> +
> +		cpudata->last_io_update = time;
> +
> +		if (do_io)
> +			amd_pstate_boost_up(cpudata);
> +
> +	} else {
> +		amd_pstate_boost_down(cpudata);
> +	}
> +}
> +
> +static inline void amd_pstate_cppc_update_hook(struct update_util_data *data,
> +						u64 time, unsigned int flags)
> +{
> +	struct amd_cpudata *cpudata = container_of(data,
> +				struct amd_cpudata, update_util);
> +
> +	cpudata->sched_flags |= flags;
> +
> +	if (smp_processor_id() == cpudata->cpu)
> +		amd_pstate_boost_update_util(cpudata, time);
> +}
> +
> +static void amd_pstate_clear_update_util_hook(unsigned int cpu)
> +{
> +	struct amd_cpudata *cpudata = all_cpu_data[cpu];
> +
> +	if (!cpudata->update_util_set)
> +		return;
> +
> +	cpufreq_remove_update_util_hook(cpu);
> +	cpudata->update_util_set = false;
> +	synchronize_rcu();
> +}
> +
> +static void amd_pstate_set_update_util_hook(unsigned int cpu_num)
> +{
> +	struct amd_cpudata *cpudata = all_cpu_data[cpu_num];
> +
> +	if (!cppc_boost) {
> +		if (cpudata->update_util_set)
> +			amd_pstate_clear_update_util_hook(cpudata->cpu);
> +		return;
> +	}
> +
> +	if (cpudata->update_util_set)
> +		return;
> +
> +	cpudata->sample.time = 0;
> +	cpufreq_add_update_util_hook(cpu_num, &cpudata->update_util,
> +						amd_pstate_cppc_update_hook);
> +	cpudata->update_util_set = true;
> +}
> +
> +static void amd_pstate_epp_init(unsigned int cpu)
> +{
> +	struct amd_cpudata *cpudata = all_cpu_data[cpu];
> +	u32 max_perf, min_perf;
> +	u64 value;
> +	s16 epp;
> +	int ret;
> +
> +	max_perf = READ_ONCE(cpudata->highest_perf);
> +	min_perf = READ_ONCE(cpudata->lowest_perf);
> +
> +	value = READ_ONCE(cpudata->cppc_req_cached);
> +
> +	if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> +		min_perf = max_perf;
> +
> +	/* Initial min/max values for CPPC Performance Controls Register */
> +	value &= ~AMD_CPPC_MIN_PERF(~0L);
> +	value |= AMD_CPPC_MIN_PERF(min_perf);
> +
> +	value &= ~AMD_CPPC_MAX_PERF(~0L);
> +	value |= AMD_CPPC_MAX_PERF(max_perf);
> +
> +	/* CPPC EPP feature require to set zero to the desire perf bit */
> +	value &= ~AMD_CPPC_DES_PERF(~0L);
> +	value |= AMD_CPPC_DES_PERF(0);
> +
> +	if (cpudata->epp_policy == cpudata->policy)
> +		goto skip_epp;
> +
> +	cpudata->epp_policy = cpudata->policy;
> +
> +	if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> +		epp = amd_pstate_get_epp(cpudata, value);
> +		cpudata->epp_powersave = epp;
> +		if (epp < 0)
> +			goto skip_epp;
> +		/* force the epp value to be zero for performance policy */
> +		epp = 0;
> +	} else {
> +		if (cpudata->epp_powersave < 0)
> +			goto skip_epp;
> +		/* Get BIOS pre-defined epp value */
> +		epp = amd_pstate_get_epp(cpudata, value);
> +		if (epp)
> +			goto skip_epp;
> +		epp = cpudata->epp_powersave;
> +	}
> +	/* Set initial EPP value */
> +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> +		value &= ~GENMASK_ULL(31, 24);
> +		value |= (u64)epp << 24;
> +	}
> +
> +skip_epp:
> +	WRITE_ONCE(cpudata->cppc_req_cached, value);
> +	ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
> +	if (!ret)
> +		cpudata->epp_cached = epp;
> +}
> +
> +static void amd_pstate_set_max_limits(struct amd_cpudata *cpudata)
> +{
> +	u64 hwp_cap = READ_ONCE(cpudata->cppc_cap1_cached);
> +	u64 hwp_req = READ_ONCE(cpudata->cppc_req_cached);
> +	u32 max_limit = (hwp_cap >> 24) & 0xff;
> +
> +	hwp_req &= ~AMD_CPPC_MIN_PERF(~0L);
> +	hwp_req |= AMD_CPPC_MIN_PERF(max_limit);
> +	wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, hwp_req);
> +}
> +
> +static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
> +{
> +	struct amd_cpudata *cpudata;
> +
> +	if (!policy->cpuinfo.max_freq)
> +		return -ENODEV;
> +
> +	pr_debug("set_policy: cpuinfo.max %u policy->max %u\n",
> +				policy->cpuinfo.max_freq, policy->max);
> +
> +	cpudata = all_cpu_data[policy->cpu];
> +	cpudata->policy = policy->policy;
> +
> +	if (boot_cpu_has(X86_FEATURE_CPPC)) {

X86_FEATURE_CPPC indicated the MSR support, but I believe the share memory
CPU also has the EPP, right?

> +		mutex_lock(&amd_pstate_limits_lock);
> +
> +		if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> +			amd_pstate_clear_update_util_hook(policy->cpu);
> +			amd_pstate_set_max_limits(cpudata);
> +		} else {
> +			amd_pstate_set_update_util_hook(policy->cpu);

May I know why amd processors also needs to set and update util hook?

> +		}
> +
> +		if (boot_cpu_has(X86_FEATURE_CPPC))

I am confused why if (boot_cpu_has(X86_FEATURE_CPPC) is still inside of a
if (boot_cpu_has(X86_FEATURE_CPPC)).

> +			amd_pstate_epp_init(policy->cpu);
> +
> +		mutex_unlock(&amd_pstate_limits_lock);
> +	}
> +
> +	return 0;
> +}
> +
> +static void amd_pstate_verify_cpu_policy(struct amd_cpudata *cpudata,
> +					   struct cpufreq_policy_data *policy)
> +{
> +	update_boost_state();
> +	cpufreq_verify_within_cpu_limits(policy);
> +}
> +
> +static int amd_pstate_epp_verify_policy(struct cpufreq_policy_data *policy)
> +{
> +	amd_pstate_verify_cpu_policy(all_cpu_data[policy->cpu], policy);
> +	pr_debug("policy_max =%d, policy_min=%d\n", policy->max, policy->min);
> +	return 0;
> +}
> +
>  static struct cpufreq_driver amd_pstate_driver = {
>  	.flags		= CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
>  	.verify		= amd_pstate_verify,
> @@ -616,8 +1224,20 @@ static struct cpufreq_driver amd_pstate_driver = {
>  	.attr		= amd_pstate_attr,
>  };
>  
> +static struct cpufreq_driver amd_pstate_epp_driver = {
> +	.flags		= CPUFREQ_CONST_LOOPS,
> +	.verify		= amd_pstate_epp_verify_policy,
> +	.setpolicy	= amd_pstate_epp_set_policy,
> +	.init		= amd_pstate_epp_cpu_init,
> +	.exit		= amd_pstate_epp_cpu_exit,
> +	.update_limits	= amd_pstate_epp_update_limits,
> +	.name		= "amd_pstate_epp",
> +	.attr		= amd_pstate_epp_attr,
> +};
> +
>  static int __init amd_pstate_init(void)
>  {
> +	static struct amd_cpudata **cpudata;
>  	int ret;
>  
>  	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
> @@ -644,7 +1264,8 @@ static int __init amd_pstate_init(void)
>  	/* capability check */
>  	if (boot_cpu_has(X86_FEATURE_CPPC)) {
>  		pr_debug("AMD CPPC MSR based functionality is supported\n");
> -		amd_pstate_driver.adjust_perf = amd_pstate_adjust_perf;
> +		if (!cppc_active)
> +			default_pstate_driver->adjust_perf = amd_pstate_adjust_perf;
>  	} else {
>  		pr_debug("AMD CPPC shared memory based functionality is supported\n");
>  		static_call_update(amd_pstate_enable, cppc_enable);
> @@ -652,6 +1273,10 @@ static int __init amd_pstate_init(void)
>  		static_call_update(amd_pstate_update_perf, cppc_update_perf);
>  	}
>  
> +	cpudata = vzalloc(array_size(sizeof(void *), num_possible_cpus()));
> +	if (!cpudata)
> +		return -ENOMEM;
> +	WRITE_ONCE(all_cpu_data, cpudata);
>  	/* enable amd pstate feature */
>  	ret = amd_pstate_enable(true);
>  	if (ret) {
> @@ -659,9 +1284,9 @@ static int __init amd_pstate_init(void)
>  		return ret;
>  	}
>  
> -	ret = cpufreq_register_driver(&amd_pstate_driver);
> +	ret = cpufreq_register_driver(default_pstate_driver);
>  	if (ret)
> -		pr_err("failed to register amd_pstate_driver with return %d\n",
> +		pr_err("failed to register amd pstate driver with return %d\n",
>  		       ret);
>  
>  	return ret;
> @@ -676,8 +1301,14 @@ static int __init amd_pstate_param(char *str)
>  	if (!strcmp(str, "disable")) {
>  		cppc_load = 0;
>  		pr_info("driver is explicitly disabled\n");
> -	} else if (!strcmp(str, "passive"))
> +	} else if (!strcmp(str, "passive")) {
>  		cppc_load = 1;
> +		default_pstate_driver = &amd_pstate_driver;
> +	} else if (!strcmp(str, "active")) {
> +		cppc_active = 1;
> +		cppc_load = 1;
> +		default_pstate_driver = &amd_pstate_epp_driver;
> +	}
>  
>  	return 0;
>  }
> diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
> index 1c4b8659f171..888af62040f1 100644
> --- a/include/linux/amd-pstate.h
> +++ b/include/linux/amd-pstate.h
> @@ -25,6 +25,7 @@ struct amd_aperf_mperf {
>  	u64 aperf;
>  	u64 mperf;
>  	u64 tsc;
> +	u64 time;
>  };
>  
>  /**
> @@ -47,6 +48,18 @@ struct amd_aperf_mperf {
>   * @prev: Last Aperf/Mperf/tsc count value read from register
>   * @freq: current cpu frequency value
>   * @boost_supported: check whether the Processor or SBIOS supports boost mode
> + * @epp_powersave: Last saved CPPC energy performance preference
> +				when policy switched to performance
> + * @epp_policy: Last saved policy used to set energy-performance preference
> + * @epp_cached: Cached CPPC energy-performance preference value
> + * @policy: Cpufreq policy value
> + * @sched_flags: Store scheduler flags for possible cross CPU update
> + * @update_util_set: CPUFreq utility callback is set
> + * @last_update: Time stamp of the last performance state update
> + * @cppc_boost_min: Last CPPC boosted min performance state
> + * @cppc_cap1_cached: Cached value of the last CPPC Capabilities MSR
> + * @update_util: Cpufreq utility callback information
> + * @sample: the stored performance sample
>   *
>   * The amd_cpudata is key private data for each CPU thread in AMD P-State, and
>   * represents all the attributes and goals that AMD P-State requests at runtime.
> @@ -72,6 +85,28 @@ struct amd_cpudata {
>  
>  	u64	freq;
>  	bool	boost_supported;
> +
> +	/* EPP feature related attributes*/
> +	s16	epp_powersave;
> +	s16	epp_policy;
> +	s16	epp_cached;
> +	u32	policy;
> +	u32	sched_flags;
> +	bool	update_util_set;
> +	u64	last_update;
> +	u64	last_io_update;
> +	u32	cppc_boost_min;
> +	u64	cppc_cap1_cached;
> +	struct	update_util_data update_util;
> +	struct	amd_aperf_mperf sample;
> +};
> +
> +/**
> + * struct amd_pstate_params - global parameters for the performance control
> + * @ cppc_boost_disabled wheher the core performance boost disabled
> + */
> +struct amd_pstate_params {
> +	bool cppc_boost_disabled;
>  };
>  
>  #endif /* _LINUX_AMD_PSTATE_H */
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v6 03/11] cpufreq: intel_pstate: use common macro definition for Energy Preference Performance(EPP)
  2022-12-05 11:48   ` Huang Rui
@ 2022-12-05 14:40     ` Yuan, Perry
  2022-12-05 16:44       ` Limonciello, Mario
  0 siblings, 1 reply; 27+ messages in thread
From: Yuan, Perry @ 2022-12-05 14:40 UTC (permalink / raw)
  To: Huang, Ray
  Cc: rafael.j.wysocki, Limonciello, Mario, viresh.kumar, Sharma,
	Deepak, Fontenot, Nathan, Deucher, Alexander, Huang, Shimmer, Du,
	Xiaojian, Meng, Li (Jassmine),
	Karny, Wyes, linux-pm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 6535 bytes --]

[AMD Official Use Only - General]

Hi Ray. 

> -----Original Message-----
> From: Huang, Ray <Ray.Huang@amd.com>
> Sent: Monday, December 5, 2022 7:49 PM
> To: Yuan, Perry <Perry.Yuan@amd.com>
> Cc: rafael.j.wysocki@intel.com; Limonciello, Mario
> <Mario.Limonciello@amd.com>; viresh.kumar@linaro.org; Sharma, Deepak
> <Deepak.Sharma@amd.com>; Fontenot, Nathan
> <Nathan.Fontenot@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Huang, Shimmer
> <Shimmer.Huang@amd.com>; Du, Xiaojian <Xiaojian.Du@amd.com>; Meng,
> Li (Jassmine) <Li.Meng@amd.com>; Karny, Wyes <Wyes.Karny@amd.com>;
> linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v6 03/11] cpufreq: intel_pstate: use common macro
> definition for Energy Preference Performance(EPP)
> 
> On Fri, Dec 02, 2022 at 03:47:11PM +0800, Yuan, Perry wrote:
> > make the energy preference performance strings and profiles using one
> > common header for intel_pstate driver, then the amd_pstate epp driver
> > can use the common header as well. This will simpify the intel_pstate
> > and amd_pstate driver.
> >
> > Signed-off-by: Perry Yuan <perry.yuan@amd.com>
> > ---
> >  arch/x86/include/asm/msr-index.h |  4 ---
> >  drivers/cpufreq/intel_pstate.c   | 37 +--------------------
> >  include/linux/cpufreq_common.h   | 56
> ++++++++++++++++++++++++++++++++
> 
> I don't find any specific reason why you have to use another common
> cpufreq_common header instead of include/linux/cpufreq.h.
> 
> Thanks,
> Ray

That is fine for me to use the cpufreq.h to store the common vars.
I will move the declaration to that header file.

Thanks for the review.

Perry. 

> 
> >  3 files changed, 57 insertions(+), 40 deletions(-)  create mode
> > 100644 include/linux/cpufreq_common.h
> >
> > diff --git a/arch/x86/include/asm/msr-index.h
> > b/arch/x86/include/asm/msr-index.h
> > index 4a2af82553e4..3983378cff5b 100644
> > --- a/arch/x86/include/asm/msr-index.h
> > +++ b/arch/x86/include/asm/msr-index.h
> > @@ -472,10 +472,6 @@
> >  #define HWP_MAX_PERF(x) 		((x & 0xff) << 8)
> >  #define HWP_DESIRED_PERF(x)		((x & 0xff) << 16)
> >  #define HWP_ENERGY_PERF_PREFERENCE(x)	(((unsigned long long)
> x & 0xff) << 24)
> > -#define HWP_EPP_PERFORMANCE		0x00
> > -#define HWP_EPP_BALANCE_PERFORMANCE	0x80
> > -#define HWP_EPP_BALANCE_POWERSAVE	0xC0
> > -#define HWP_EPP_POWERSAVE		0xFF
> >  #define HWP_ACTIVITY_WINDOW(x)		((unsigned long
> long)(x & 0xff3) << 32)
> >  #define HWP_PACKAGE_CONTROL(x)		((unsigned long
> long)(x & 0x1) << 42)
> >
> > diff --git a/drivers/cpufreq/intel_pstate.c
> > b/drivers/cpufreq/intel_pstate.c index ad9be31753b6..65036ca21719
> > 100644
> > --- a/drivers/cpufreq/intel_pstate.c
> > +++ b/drivers/cpufreq/intel_pstate.c
> > @@ -25,6 +25,7 @@
> >  #include <linux/acpi.h>
> >  #include <linux/vmalloc.h>
> >  #include <linux/pm_qos.h>
> > +#include <linux/cpufreq_common.h>
> >  #include <trace/events/power.h>
> >
> >  #include <asm/cpu.h>
> > @@ -628,42 +629,6 @@ static int intel_pstate_set_epb(int cpu, s16 pref)
> >  	return 0;
> >  }
> >
> > -/*
> > - * EPP/EPB display strings corresponding to EPP index in the
> > - * energy_perf_strings[]
> > - *	index		String
> > - *-------------------------------------
> > - *	0		default
> > - *	1		performance
> > - *	2		balance_performance
> > - *	3		balance_power
> > - *	4		power
> > - */
> > -
> > -enum energy_perf_value_index {
> > -	EPP_INDEX_DEFAULT = 0,
> > -	EPP_INDEX_PERFORMANCE,
> > -	EPP_INDEX_BALANCE_PERFORMANCE,
> > -	EPP_INDEX_BALANCE_POWERSAVE,
> > -	EPP_INDEX_POWERSAVE,
> > -};
> > -
> > -static const char * const energy_perf_strings[] = {
> > -	[EPP_INDEX_DEFAULT] = "default",
> > -	[EPP_INDEX_PERFORMANCE] = "performance",
> > -	[EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
> > -	[EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
> > -	[EPP_INDEX_POWERSAVE] = "power",
> > -	NULL
> > -};
> > -static unsigned int epp_values[] = {
> > -	[EPP_INDEX_DEFAULT] = 0, /* Unused index */
> > -	[EPP_INDEX_PERFORMANCE] = HWP_EPP_PERFORMANCE,
> > -	[EPP_INDEX_BALANCE_PERFORMANCE] =
> HWP_EPP_BALANCE_PERFORMANCE,
> > -	[EPP_INDEX_BALANCE_POWERSAVE] =
> HWP_EPP_BALANCE_POWERSAVE,
> > -	[EPP_INDEX_POWERSAVE] = HWP_EPP_POWERSAVE,
> > -};
> > -
> >  static int intel_pstate_get_energy_pref_index(struct cpudata
> > *cpu_data, int *raw_epp)  {
> >  	s16 epp;
> > diff --git a/include/linux/cpufreq_common.h
> > b/include/linux/cpufreq_common.h new file mode 100644 index
> > 000000000000..2d14b0b0f55c
> > --- /dev/null
> > +++ b/include/linux/cpufreq_common.h
> > @@ -0,0 +1,56 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * linux/include/linux/cpufreq_common.h
> > + *
> > + * Copyright (C) 2022 Advanced Micro Devices, Inc.
> > + *
> > + * Author: Perry Yuan <Perry.Yuan@amd.com>  */
> > +
> > +#ifndef _LINUX_CPUFREQ_COMMON_H
> > +#define _LINUX_CPUFREQ_COMMON_H
> > +
> > +#include <asm/msr.h>
> > +/*
> > + * EPP/EPB display strings corresponding to EPP index in the
> > + * energy_perf_strings[]
> > + *	index		String
> > + *-------------------------------------
> > + *	0		default
> > + *	1		performance
> > + *	2		balance_performance
> > + *	3		balance_power
> > + *	4		power
> > + */
> > +
> > +#define HWP_EPP_PERFORMANCE		0x00
> > +#define HWP_EPP_BALANCE_PERFORMANCE	0x80
> > +#define HWP_EPP_BALANCE_POWERSAVE	0xC0
> > +#define HWP_EPP_POWERSAVE		0xFF
> > +
> > +enum energy_perf_value_index {
> > +	EPP_INDEX_DEFAULT = 0,
> > +	EPP_INDEX_PERFORMANCE,
> > +	EPP_INDEX_BALANCE_PERFORMANCE,
> > +	EPP_INDEX_BALANCE_POWERSAVE,
> > +	EPP_INDEX_POWERSAVE,
> > +};
> > +
> > +static const char * const energy_perf_strings[] = {
> > +	[EPP_INDEX_DEFAULT] = "default",
> > +	[EPP_INDEX_PERFORMANCE] = "performance",
> > +	[EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
> > +	[EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
> > +	[EPP_INDEX_POWERSAVE] = "power",
> > +	NULL
> > +};
> > +
> > +static unsigned int epp_values[] = {
> > +	[EPP_INDEX_DEFAULT] = 0, /* Unused index */
> > +	[EPP_INDEX_PERFORMANCE] = HWP_EPP_PERFORMANCE,
> > +	[EPP_INDEX_BALANCE_PERFORMANCE] =
> HWP_EPP_BALANCE_PERFORMANCE,
> > +	[EPP_INDEX_BALANCE_POWERSAVE] =
> HWP_EPP_BALANCE_POWERSAVE,
> > +	[EPP_INDEX_POWERSAVE] = HWP_EPP_POWERSAVE, };
> > +
> > +#endif /* _LINUX_CPUFREQ_COMMON_H */
> > \ No newline at end of file
> > --
> > 2.34.1
> >

[-- Attachment #2: winmail.dat --]
[-- Type: application/ms-tnef, Size: 19295 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v6 01/11] ACPI: CPPC: Add AMD pstate energy performance preference cppc control
  2022-12-02 16:58   ` Limonciello, Mario
@ 2022-12-05 15:17     ` Yuan, Perry
  0 siblings, 0 replies; 27+ messages in thread
From: Yuan, Perry @ 2022-12-05 15:17 UTC (permalink / raw)
  To: Limonciello, Mario, rafael.j.wysocki, Huang, Ray, viresh.kumar
  Cc: Sharma, Deepak, Fontenot, Nathan, Deucher, Alexander, Huang,
	Shimmer, Du, Xiaojian, Meng, Li (Jassmine),
	Karny, Wyes, linux-pm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 8355 bytes --]

[AMD Official Use Only - General]



> -----Original Message-----
> From: Limonciello, Mario <Mario.Limonciello@amd.com>
> Sent: Saturday, December 3, 2022 12:58 AM
> To: Yuan, Perry <Perry.Yuan@amd.com>; rafael.j.wysocki@intel.com; Huang,
> Ray <Ray.Huang@amd.com>; viresh.kumar@linaro.org
> Cc: Sharma, Deepak <Deepak.Sharma@amd.com>; Fontenot, Nathan
> <Nathan.Fontenot@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Huang, Shimmer
> <Shimmer.Huang@amd.com>; Du, Xiaojian <Xiaojian.Du@amd.com>; Meng,
> Li (Jassmine) <Li.Meng@amd.com>; Karny, Wyes <Wyes.Karny@amd.com>;
> linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v6 01/11] ACPI: CPPC: Add AMD pstate energy
> performance preference cppc control
> 
> The title of this first patch should not reflect amd-pstate, it's generic code.
> 
> On 12/2/2022 01:47, Perry Yuan wrote:
> > From: Perry Yuan <Perry.Yuan@amd.com>
> >
> > Add support for setting and querying EPP preferences to the generic
> > CPPC driver.  This enables downstream drivers such as amd-pstate to
> > discover and use these values
> >
> > In order to get EPP worked, cppc_get_epp_caps() will query EPP
> > preference value and cppc_set_epp_perf() will set EPP new value.
> > Before the EPP works, pstate driver will use cppc_set_auto_epp() to
> > enable EPP function from firmware firstly.
> 
> My suggestion at rewording this:
> 
> Downstream drivers that want to use the new symbols cppc_get_epp_caps
> and cppc_set_epp_perf for querying and setting EPP preferences will need
> to call cppc_set_auto_epp to enable the EPP function first.
> 

Looks better, will update this to V7.

Thank you!

Perry.  

> >
> > Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> > ---
> >   drivers/acpi/cppc_acpi.c | 114
> +++++++++++++++++++++++++++++++++++++--
> >   include/acpi/cppc_acpi.h |  12 +++++
> >   2 files changed, 121 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c index
> > 093675b1a1ff..37fa75f25f62 100644
> > --- a/drivers/acpi/cppc_acpi.c
> > +++ b/drivers/acpi/cppc_acpi.c
> > @@ -1093,6 +1093,9 @@ static int cppc_get_perf(int cpunum, enum
> cppc_regs reg_idx, u64 *perf)
> >   {
> >   	struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpunum);
> >   	struct cpc_register_resource *reg;
> > +	int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpunum);
> > +	struct cppc_pcc_data *pcc_ss_data = NULL;
> > +	int ret = -EINVAL;
> >
> >   	if (!cpc_desc) {
> >   		pr_debug("No CPC descriptor for CPU:%d\n", cpunum); @@
> -1102,10
> > +1105,6 @@ static int cppc_get_perf(int cpunum, enum cppc_regs reg_idx,
> u64 *perf)
> >   	reg = &cpc_desc->cpc_regs[reg_idx];
> >
> >   	if (CPC_IN_PCC(reg)) {
> > -		int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpunum);
> > -		struct cppc_pcc_data *pcc_ss_data = NULL;
> > -		int ret = 0;
> > -
> >   		if (pcc_ss_id < 0)
> >   			return -EIO;
> >
> > @@ -1125,7 +1124,7 @@ static int cppc_get_perf(int cpunum, enum
> > cppc_regs reg_idx, u64 *perf)
> >
> >   	cpc_read(cpunum, reg, perf);
> >
> > -	return 0;
> > +	return ret;
> >   }
> >
> >   /**
> > @@ -1365,6 +1364,111 @@ int cppc_get_perf_ctrs(int cpunum, struct
> cppc_perf_fb_ctrs *perf_fb_ctrs)
> >   }
> >   EXPORT_SYMBOL_GPL(cppc_get_perf_ctrs);
> >
> > +/**
> > + * cppc_get_epp_caps - Get the energy preference register value.
> > + * @cpunum: CPU from which to get epp preference level.
> > + * @perf_caps: Return address.
> > + *
> > + * Return: 0 for success, -EIO otherwise.
> > + */
> > +int cppc_get_epp_caps(int cpunum, struct cppc_perf_caps *perf_caps) {
> > +	struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpunum);
> > +	struct cpc_register_resource *energy_perf_reg;
> > +	u64 energy_perf;
> > +
> > +	if (!cpc_desc) {
> > +		pr_debug("No CPC descriptor for CPU:%d\n", cpunum);
> > +		return -ENODEV;
> > +	}
> > +
> > +	energy_perf_reg = &cpc_desc->cpc_regs[ENERGY_PERF];
> > +
> > +	if (!CPC_SUPPORTED(energy_perf_reg))
> > +		pr_warn_once("energy perf reg update is unsupported!\n");
> > +
> > +	if (CPC_IN_PCC(energy_perf_reg)) {
> > +		int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpunum);
> > +		struct cppc_pcc_data *pcc_ss_data = NULL;
> > +		int ret = 0;
> > +
> > +		if (pcc_ss_id < 0)
> > +			return -ENODEV;
> > +
> > +		pcc_ss_data = pcc_data[pcc_ss_id];
> > +
> > +		down_write(&pcc_ss_data->pcc_lock);
> > +
> > +		if (send_pcc_cmd(pcc_ss_id, CMD_READ) >= 0) {
> > +			cpc_read(cpunum, energy_perf_reg, &energy_perf);
> > +			perf_caps->energy_perf = energy_perf;
> > +		} else {
> > +			ret = -EIO;
> > +		}
> > +
> > +		up_write(&pcc_ss_data->pcc_lock);
> > +
> > +		return ret;
> > +	}
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(cppc_get_epp_caps);
> > +
> > +/*
> > + * Set Energy Performance Preference Register value through
> > + * Performance Controls Interface
> > + */
> > +int cppc_set_epp_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls,
> > +bool enable) {
> > +	int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpu);
> > +	struct cpc_register_resource *epp_set_reg;
> > +	struct cpc_register_resource *auto_sel_reg;
> > +	struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpu);
> > +	struct cppc_pcc_data *pcc_ss_data = NULL;
> > +	int ret = -EINVAL;
> > +
> > +	if (!cpc_desc) {
> > +		pr_debug("No CPC descriptor for CPU:%d\n", cpu);
> > +		return -ENODEV;
> > +	}
> > +
> > +	auto_sel_reg = &cpc_desc->cpc_regs[AUTO_SEL_ENABLE];
> > +	epp_set_reg = &cpc_desc->cpc_regs[ENERGY_PERF];
> > +
> > +	if (CPC_IN_PCC(epp_set_reg) || CPC_IN_PCC(auto_sel_reg)) {
> > +		if (pcc_ss_id < 0) {
> > +			pr_debug("Invalid pcc_ss_id\n");
> > +			return -ENODEV;
> > +		}
> > +
> > +		if (CPC_SUPPORTED(auto_sel_reg)) {
> > +			ret = cpc_write(cpu, auto_sel_reg, enable);
> > +			if (ret)
> > +				return ret;
> > +		}
> > +
> > +		if (CPC_SUPPORTED(epp_set_reg)) {
> > +			ret = cpc_write(cpu, epp_set_reg, perf_ctrls-
> >energy_perf);
> > +			if (ret)
> > +				return ret;
> > +		}
> > +
> > +		pcc_ss_data = pcc_data[pcc_ss_id];
> > +
> > +		down_write(&pcc_ss_data->pcc_lock);
> > +		/* after writing CPC, transfer the ownership of PCC to
> platform */
> > +		ret = send_pcc_cmd(pcc_ss_id, CMD_WRITE);
> > +		up_write(&pcc_ss_data->pcc_lock);
> > +	} else {
> > +		ret = -ENOTSUPP;
> > +		pr_debug("_CPC in PCC is not supported\n");
> > +	}
> > +
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(cppc_set_epp_perf);
> > +
> >   /**
> >    * cppc_set_enable - Set to enable CPPC on the processor by writing the
> >    * Continuous Performance Control package EnableRegister field.
> > diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h index
> > c5614444031f..a45bb876a19c 100644
> > --- a/include/acpi/cppc_acpi.h
> > +++ b/include/acpi/cppc_acpi.h
> > @@ -108,12 +108,14 @@ struct cppc_perf_caps {
> >   	u32 lowest_nonlinear_perf;
> >   	u32 lowest_freq;
> >   	u32 nominal_freq;
> > +	u32 energy_perf;
> >   };
> >
> >   struct cppc_perf_ctrls {
> >   	u32 max_perf;
> >   	u32 min_perf;
> >   	u32 desired_perf;
> > +	u32 energy_perf;
> >   };
> >
> >   struct cppc_perf_fb_ctrs {
> > @@ -149,6 +151,8 @@ extern bool cpc_ffh_supported(void);
> >   extern bool cpc_supported_by_cpu(void);
> >   extern int cpc_read_ffh(int cpunum, struct cpc_reg *reg, u64 *val);
> >   extern int cpc_write_ffh(int cpunum, struct cpc_reg *reg, u64 val);
> > +extern int cppc_get_epp_caps(int cpunum, struct cppc_perf_caps
> > +*perf_caps); extern int cppc_set_epp_perf(int cpu, struct
> > +cppc_perf_ctrls *perf_ctrls, bool enable);
> >   #else /* !CONFIG_ACPI_CPPC_LIB */
> >   static inline int cppc_get_desired_perf(int cpunum, u64 *desired_perf)
> >   {
> > @@ -202,6 +206,14 @@ static inline int cpc_write_ffh(int cpunum, struct
> cpc_reg *reg, u64 val)
> >   {
> >   	return -ENOTSUPP;
> >   }
> > +static inline int cppc_set_epp_perf(int cpu, struct cppc_perf_ctrls
> > +*perf_ctrls, bool enable) {
> > +	return -ENOTSUPP;
> > +}
> > +static inline int cppc_get_epp_caps(int cpunum, struct cppc_perf_caps
> > +*perf_caps) {
> > +	return -ENOTSUPP;
> > +}
> >   #endif /* !CONFIG_ACPI_CPPC_LIB */
> >
> >   #endif /* _CPPC_ACPI_H*/

[-- Attachment #2: winmail.dat --]
[-- Type: application/ms-tnef, Size: 20003 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v6 07/11] cpufreq: amd-pstate: add frequency dynamic boost sysfs control
  2022-12-02 15:59   ` Limonciello, Mario
@ 2022-12-05 15:19     ` Yuan, Perry
  0 siblings, 0 replies; 27+ messages in thread
From: Yuan, Perry @ 2022-12-05 15:19 UTC (permalink / raw)
  To: Limonciello, Mario, rafael.j.wysocki, Huang, Ray, viresh.kumar
  Cc: Sharma, Deepak, Fontenot, Nathan, Deucher, Alexander, Huang,
	Shimmer, Du, Xiaojian, Meng, Li (Jassmine),
	Karny, Wyes, linux-pm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4506 bytes --]

[AMD Official Use Only - General]



> -----Original Message-----
> From: Limonciello, Mario <Mario.Limonciello@amd.com>
> Sent: Saturday, December 3, 2022 12:00 AM
> To: Yuan, Perry <Perry.Yuan@amd.com>; rafael.j.wysocki@intel.com; Huang,
> Ray <Ray.Huang@amd.com>; viresh.kumar@linaro.org
> Cc: Sharma, Deepak <Deepak.Sharma@amd.com>; Fontenot, Nathan
> <Nathan.Fontenot@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Huang, Shimmer
> <Shimmer.Huang@amd.com>; Du, Xiaojian <Xiaojian.Du@amd.com>; Meng,
> Li (Jassmine) <Li.Meng@amd.com>; Karny, Wyes <Wyes.Karny@amd.com>;
> linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v6 07/11] cpufreq: amd-pstate: add frequency dynamic
> boost sysfs control
> 
> On 12/2/2022 01:47, Perry Yuan wrote:
> > From: Perry Yuan <Perry.Yuan@amd.com>
> >
> > Add one sysfs entry to control the CPU cores frequency boost state The
> > attribute file can allow user to set max performance boosted or
> > keeping at normal perf level.
> >
> > Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> > ---
> >   drivers/cpufreq/amd-pstate.c | 66
> ++++++++++++++++++++++++++++++++++--
> >   1 file changed, 64 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/cpufreq/amd-pstate.c
> > b/drivers/cpufreq/amd-pstate.c index b3a82cee2e83..6936df6e8642
> 100644
> > --- a/drivers/cpufreq/amd-pstate.c
> > +++ b/drivers/cpufreq/amd-pstate.c
> > @@ -774,12 +774,46 @@ static ssize_t
> show_energy_performance_preference(
> >   	return  sysfs_emit(buf, "%s\n", energy_perf_strings[preference]);
> >   }
> >
> > +static void amd_pstate_update_policies(void) {
> > +	int cpu;
> > +
> > +	for_each_possible_cpu(cpu)
> > +		cpufreq_update_policy(cpu);
> > +}
> > +
> > +static ssize_t show_boost(struct kobject *kobj,
> > +				struct kobj_attribute *attr, char *buf) {
> > +	return sysfs_emit(buf, "%u\n", cppc_boost); }
> > +
> > +static ssize_t store_boost(struct kobject *a,
> > +				       struct kobj_attribute *b,
> > +				       const char *buf, size_t count) {
> > +	bool new_state;
> > +	int ret;
> > +
> > +	ret = kstrtobool(buf, &new_state);
> > +	if (ret)
> > +		return -EINVAL;
> > +
> > +	mutex_lock(&amd_pstate_driver_lock);
> > +	cppc_boost = !!new_state;
> > +	amd_pstate_update_policies();
> > +	mutex_unlock(&amd_pstate_driver_lock);
> > +
> > +	return count;
> > +}
> > +
> >   cpufreq_freq_attr_ro(amd_pstate_max_freq);
> >   cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
> >
> >   cpufreq_freq_attr_ro(amd_pstate_highest_perf);
> >   cpufreq_freq_attr_rw(energy_performance_preference);
> >   cpufreq_freq_attr_ro(energy_performance_available_preferences);
> > +define_one_global_rw(boost);
> >
> >   static struct freq_attr *amd_pstate_attr[] = {
> >   	&amd_pstate_max_freq,
> > @@ -797,6 +831,15 @@ static struct freq_attr *amd_pstate_epp_attr[] = {
> >   	NULL,
> >   };
> >
> > +static struct attribute *pstate_global_attributes[] = {
> > +	&boost.attr,
> 
> Make sure you update the documentation for the new sysfs attribute.
> 
Indeed , I missed to update the sysfs docs for this .
Will update kernel doc in V7 soon.

Perry. 

> > +	NULL
> > +};
> > +
> > +static const struct attribute_group amd_pstate_global_attr_group = {
> > +	.attrs = pstate_global_attributes,
> > +};
> > +
> >   static inline void update_boost_state(void)
> >   {
> >   	u64 misc_en;
> > @@ -1415,9 +1458,28 @@ static int __init amd_pstate_init(void)
> >
> >   	ret = cpufreq_register_driver(default_pstate_driver);
> >   	if (ret)
> > -		pr_err("failed to register amd pstate driver with
> return %d\n",
> > -		       ret);
> > +		pr_err("failed to register driver with return %d\n", ret);
> > +
> > +	amd_pstate_kobj = kobject_create_and_add("amd-pstate",
> &cpu_subsys.dev_root->kobj);
> > +	if (!amd_pstate_kobj) {
> > +		ret = -EINVAL;
> > +		pr_err("global sysfs registration failed.\n");
> > +		goto kobject_free;
> > +	}
> > +
> > +	ret = sysfs_create_group(amd_pstate_kobj,
> &amd_pstate_global_attr_group);
> > +	if (ret) {
> > +		pr_err("sysfs attribute export failed with error %d.\n", ret);
> > +		goto global_attr_free;
> > +	}
> > +
> > +	return ret;
> >
> > +global_attr_free:
> > +	kobject_put(amd_pstate_kobj);
> > +kobject_free:
> > +	kfree(cpudata);
> > +	cpufreq_unregister_driver(default_pstate_driver);
> >   	return ret;
> >   }
> >   device_initcall(amd_pstate_init);

[-- Attachment #2: winmail.dat --]
[-- Type: application/ms-tnef, Size: 18736 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v6 03/11] cpufreq: intel_pstate: use common macro definition for Energy Preference Performance(EPP)
  2022-12-05 14:40     ` Yuan, Perry
@ 2022-12-05 16:44       ` Limonciello, Mario
  2022-12-08 14:40         ` Yuan, Perry
  0 siblings, 1 reply; 27+ messages in thread
From: Limonciello, Mario @ 2022-12-05 16:44 UTC (permalink / raw)
  To: Yuan, Perry, Huang, Ray
  Cc: rafael.j.wysocki, viresh.kumar, Sharma, Deepak, Fontenot, Nathan,
	Deucher, Alexander, Huang, Shimmer, Du, Xiaojian, Meng,
	Li (Jassmine),
	Karny, Wyes, linux-pm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 7755 bytes --]

[Public]



> -----Original Message-----
> From: Yuan, Perry <Perry.Yuan@amd.com>
> Sent: Monday, December 5, 2022 08:41
> To: Huang, Ray <Ray.Huang@amd.com>
> Cc: rafael.j.wysocki@intel.com; Limonciello, Mario
> <Mario.Limonciello@amd.com>; viresh.kumar@linaro.org; Sharma, Deepak
> <Deepak.Sharma@amd.com>; Fontenot, Nathan
> <Nathan.Fontenot@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Huang, Shimmer
> <Shimmer.Huang@amd.com>; Du, Xiaojian <Xiaojian.Du@amd.com>; Meng,
> Li (Jassmine) <Li.Meng@amd.com>; Karny, Wyes <Wyes.Karny@amd.com>;
> linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: RE: [PATCH v6 03/11] cpufreq: intel_pstate: use common macro
> definition for Energy Preference Performance(EPP)
> 
> [AMD Official Use Only - General]
> 
> Hi Ray.
> 
> > -----Original Message-----
> > From: Huang, Ray <Ray.Huang@amd.com>
> > Sent: Monday, December 5, 2022 7:49 PM
> > To: Yuan, Perry <Perry.Yuan@amd.com>
> > Cc: rafael.j.wysocki@intel.com; Limonciello, Mario
> > <Mario.Limonciello@amd.com>; viresh.kumar@linaro.org; Sharma, Deepak
> > <Deepak.Sharma@amd.com>; Fontenot, Nathan
> > <Nathan.Fontenot@amd.com>; Deucher, Alexander
> > <Alexander.Deucher@amd.com>; Huang, Shimmer
> > <Shimmer.Huang@amd.com>; Du, Xiaojian <Xiaojian.Du@amd.com>;
> Meng,
> > Li (Jassmine) <Li.Meng@amd.com>; Karny, Wyes
> <Wyes.Karny@amd.com>;
> > linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH v6 03/11] cpufreq: intel_pstate: use common macro
> > definition for Energy Preference Performance(EPP)
> >
> > On Fri, Dec 02, 2022 at 03:47:11PM +0800, Yuan, Perry wrote:
> > > make the energy preference performance strings and profiles using one
> > > common header for intel_pstate driver, then the amd_pstate epp driver
> > > can use the common header as well. This will simpify the intel_pstate
> > > and amd_pstate driver.
> > >
> > > Signed-off-by: Perry Yuan <perry.yuan@amd.com>
> > > ---
> > >  arch/x86/include/asm/msr-index.h |  4 ---
> > >  drivers/cpufreq/intel_pstate.c   | 37 +--------------------
> > >  include/linux/cpufreq_common.h   | 56
> > ++++++++++++++++++++++++++++++++
> >
> > I don't find any specific reason why you have to use another common
> > cpufreq_common header instead of include/linux/cpufreq.h.
> >
> > Thanks,
> > Ray
> 
> That is fine for me to use the cpufreq.h to store the common vars.
> I will move the declaration to that header file.
> 
> Thanks for the review.
> 
> Perry.
> 
> >
> > >  3 files changed, 57 insertions(+), 40 deletions(-)  create mode
> > > 100644 include/linux/cpufreq_common.h
> > >
> > > diff --git a/arch/x86/include/asm/msr-index.h
> > > b/arch/x86/include/asm/msr-index.h
> > > index 4a2af82553e4..3983378cff5b 100644
> > > --- a/arch/x86/include/asm/msr-index.h
> > > +++ b/arch/x86/include/asm/msr-index.h
> > > @@ -472,10 +472,6 @@
> > >  #define HWP_MAX_PERF(x) 		((x & 0xff) << 8)
> > >  #define HWP_DESIRED_PERF(x)		((x & 0xff) << 16)
> > >  #define HWP_ENERGY_PERF_PREFERENCE(x)	(((unsigned long long)
> > x & 0xff) << 24)
> > > -#define HWP_EPP_PERFORMANCE		0x00
> > > -#define HWP_EPP_BALANCE_PERFORMANCE	0x80
> > > -#define HWP_EPP_BALANCE_POWERSAVE	0xC0
> > > -#define HWP_EPP_POWERSAVE		0xFF
> > >  #define HWP_ACTIVITY_WINDOW(x)		((unsigned long
> > long)(x & 0xff3) << 32)
> > >  #define HWP_PACKAGE_CONTROL(x)		((unsigned long
> > long)(x & 0x1) << 42)
> > >
> > > diff --git a/drivers/cpufreq/intel_pstate.c
> > > b/drivers/cpufreq/intel_pstate.c index ad9be31753b6..65036ca21719
> > > 100644
> > > --- a/drivers/cpufreq/intel_pstate.c
> > > +++ b/drivers/cpufreq/intel_pstate.c
> > > @@ -25,6 +25,7 @@
> > >  #include <linux/acpi.h>
> > >  #include <linux/vmalloc.h>
> > >  #include <linux/pm_qos.h>
> > > +#include <linux/cpufreq_common.h>
> > >  #include <trace/events/power.h>
> > >
> > >  #include <asm/cpu.h>
> > > @@ -628,42 +629,6 @@ static int intel_pstate_set_epb(int cpu, s16 pref)
> > >  	return 0;
> > >  }
> > >
> > > -/*
> > > - * EPP/EPB display strings corresponding to EPP index in the
> > > - * energy_perf_strings[]
> > > - *	index		String
> > > - *-------------------------------------
> > > - *	0		default
> > > - *	1		performance
> > > - *	2		balance_performance
> > > - *	3		balance_power
> > > - *	4		power
> > > - */
> > > -
> > > -enum energy_perf_value_index {
> > > -	EPP_INDEX_DEFAULT = 0,
> > > -	EPP_INDEX_PERFORMANCE,
> > > -	EPP_INDEX_BALANCE_PERFORMANCE,
> > > -	EPP_INDEX_BALANCE_POWERSAVE,
> > > -	EPP_INDEX_POWERSAVE,
> > > -};
> > > -
> > > -static const char * const energy_perf_strings[] = {
> > > -	[EPP_INDEX_DEFAULT] = "default",
> > > -	[EPP_INDEX_PERFORMANCE] = "performance",
> > > -	[EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
> > > -	[EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
> > > -	[EPP_INDEX_POWERSAVE] = "power",
> > > -	NULL
> > > -};
> > > -static unsigned int epp_values[] = {
> > > -	[EPP_INDEX_DEFAULT] = 0, /* Unused index */
> > > -	[EPP_INDEX_PERFORMANCE] = HWP_EPP_PERFORMANCE,
> > > -	[EPP_INDEX_BALANCE_PERFORMANCE] =
> > HWP_EPP_BALANCE_PERFORMANCE,
> > > -	[EPP_INDEX_BALANCE_POWERSAVE] =
> > HWP_EPP_BALANCE_POWERSAVE,
> > > -	[EPP_INDEX_POWERSAVE] = HWP_EPP_POWERSAVE,
> > > -};
> > > -
> > >  static int intel_pstate_get_energy_pref_index(struct cpudata
> > > *cpu_data, int *raw_epp)  {
> > >  	s16 epp;
> > > diff --git a/include/linux/cpufreq_common.h
> > > b/include/linux/cpufreq_common.h new file mode 100644 index
> > > 000000000000..2d14b0b0f55c
> > > --- /dev/null
> > > +++ b/include/linux/cpufreq_common.h
> > > @@ -0,0 +1,56 @@
> > > +/* SPDX-License-Identifier: GPL-2.0-only */
> > > +/*
> > > + * linux/include/linux/cpufreq_common.h
> > > + *
> > > + * Copyright (C) 2022 Advanced Micro Devices, Inc.
> > > + *
> > > + * Author: Perry Yuan <Perry.Yuan@amd.com>  */
> > > +
> > > +#ifndef _LINUX_CPUFREQ_COMMON_H
> > > +#define _LINUX_CPUFREQ_COMMON_H
> > > +
> > > +#include <asm/msr.h>
> > > +/*
> > > + * EPP/EPB display strings corresponding to EPP index in the
> > > + * energy_perf_strings[]
> > > + *	index		String
> > > + *-------------------------------------
> > > + *	0		default
> > > + *	1		performance
> > > + *	2		balance_performance
> > > + *	3		balance_power
> > > + *	4		power
> > > + */
> > > +
> > > +#define HWP_EPP_PERFORMANCE		0x00
> > > +#define HWP_EPP_BALANCE_PERFORMANCE	0x80
> > > +#define HWP_EPP_BALANCE_POWERSAVE	0xC0
> > > +#define HWP_EPP_POWERSAVE		0xFF
> > > +
> > > +enum energy_perf_value_index {
> > > +	EPP_INDEX_DEFAULT = 0,
> > > +	EPP_INDEX_PERFORMANCE,
> > > +	EPP_INDEX_BALANCE_PERFORMANCE,
> > > +	EPP_INDEX_BALANCE_POWERSAVE,
> > > +	EPP_INDEX_POWERSAVE,
> > > +};
> > > +
> > > +static const char * const energy_perf_strings[] = {
> > > +	[EPP_INDEX_DEFAULT] = "default",
> > > +	[EPP_INDEX_PERFORMANCE] = "performance",
> > > +	[EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
> > > +	[EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
> > > +	[EPP_INDEX_POWERSAVE] = "power",
> > > +	NULL
> > > +};
> > > +
> > > +static unsigned int epp_values[] = {
> > > +	[EPP_INDEX_DEFAULT] = 0, /* Unused index */
> > > +	[EPP_INDEX_PERFORMANCE] = HWP_EPP_PERFORMANCE,
> > > +	[EPP_INDEX_BALANCE_PERFORMANCE] =
> > HWP_EPP_BALANCE_PERFORMANCE,
> > > +	[EPP_INDEX_BALANCE_POWERSAVE] =
> > HWP_EPP_BALANCE_POWERSAVE,
> > > +	[EPP_INDEX_POWERSAVE] = HWP_EPP_POWERSAVE, };
> > > +

I don't believe you should be making these static in the header file.

> > > +#endif /* _LINUX_CPUFREQ_COMMON_H */
> > > \ No newline at end of file
> > > --
> > > 2.34.1
> > >

[-- Attachment #2: winmail.dat --]
[-- Type: application/ms-tnef, Size: 20563 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v6 03/11] cpufreq: intel_pstate: use common macro definition for Energy Preference Performance(EPP)
  2022-12-05 16:44       ` Limonciello, Mario
@ 2022-12-08 14:40         ` Yuan, Perry
  2022-12-08 16:46           ` Limonciello, Mario
  0 siblings, 1 reply; 27+ messages in thread
From: Yuan, Perry @ 2022-12-08 14:40 UTC (permalink / raw)
  To: Limonciello, Mario, Huang, Ray
  Cc: rafael.j.wysocki, viresh.kumar, Sharma, Deepak, Fontenot, Nathan,
	Deucher, Alexander, Huang, Shimmer, Du, Xiaojian, Meng,
	Li (Jassmine),
	Karny, Wyes, linux-pm, linux-kernel

[Public]

Hi Mario. 

> -----Original Message-----
> From: Limonciello, Mario <Mario.Limonciello@amd.com>
> Sent: Tuesday, December 6, 2022 12:45 AM
> To: Yuan, Perry <Perry.Yuan@amd.com>; Huang, Ray <Ray.Huang@amd.com>
> Cc: rafael.j.wysocki@intel.com; viresh.kumar@linaro.org; Sharma, Deepak
> <Deepak.Sharma@amd.com>; Fontenot, Nathan
> <Nathan.Fontenot@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Huang, Shimmer
> <Shimmer.Huang@amd.com>; Du, Xiaojian <Xiaojian.Du@amd.com>; Meng,
> Li (Jassmine) <Li.Meng@amd.com>; Karny, Wyes <Wyes.Karny@amd.com>;
> linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: RE: [PATCH v6 03/11] cpufreq: intel_pstate: use common macro
> definition for Energy Preference Performance(EPP)
> 
> [Public]
> 
> 
> 
> > -----Original Message-----
> > From: Yuan, Perry <Perry.Yuan@amd.com>
> > Sent: Monday, December 5, 2022 08:41
> > To: Huang, Ray <Ray.Huang@amd.com>
> > Cc: rafael.j.wysocki@intel.com; Limonciello, Mario
> > <Mario.Limonciello@amd.com>; viresh.kumar@linaro.org; Sharma, Deepak
> > <Deepak.Sharma@amd.com>; Fontenot, Nathan
> <Nathan.Fontenot@amd.com>;
> > Deucher, Alexander <Alexander.Deucher@amd.com>; Huang, Shimmer
> > <Shimmer.Huang@amd.com>; Du, Xiaojian <Xiaojian.Du@amd.com>;
> Meng, Li
> > (Jassmine) <Li.Meng@amd.com>; Karny, Wyes <Wyes.Karny@amd.com>;
> > linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org
> > Subject: RE: [PATCH v6 03/11] cpufreq: intel_pstate: use common macro
> > definition for Energy Preference Performance(EPP)
> >
> > [AMD Official Use Only - General]
> >
> > Hi Ray.
> >
> > > -----Original Message-----
> > > From: Huang, Ray <Ray.Huang@amd.com>
> > > Sent: Monday, December 5, 2022 7:49 PM
> > > To: Yuan, Perry <Perry.Yuan@amd.com>
> > > Cc: rafael.j.wysocki@intel.com; Limonciello, Mario
> > > <Mario.Limonciello@amd.com>; viresh.kumar@linaro.org; Sharma,
> Deepak
> > > <Deepak.Sharma@amd.com>; Fontenot, Nathan
> <Nathan.Fontenot@amd.com>;
> > > Deucher, Alexander <Alexander.Deucher@amd.com>; Huang, Shimmer
> > > <Shimmer.Huang@amd.com>; Du, Xiaojian <Xiaojian.Du@amd.com>;
> > Meng,
> > > Li (Jassmine) <Li.Meng@amd.com>; Karny, Wyes
> > <Wyes.Karny@amd.com>;
> > > linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH v6 03/11] cpufreq: intel_pstate: use common
> > > macro definition for Energy Preference Performance(EPP)
> > >
> > > On Fri, Dec 02, 2022 at 03:47:11PM +0800, Yuan, Perry wrote:
> > > > make the energy preference performance strings and profiles using
> > > > one common header for intel_pstate driver, then the amd_pstate epp
> > > > driver can use the common header as well. This will simpify the
> > > > intel_pstate and amd_pstate driver.
> > > >
> > > > Signed-off-by: Perry Yuan <perry.yuan@amd.com>
> > > > ---
> > > >  arch/x86/include/asm/msr-index.h |  4 ---
> > > >  drivers/cpufreq/intel_pstate.c   | 37 +--------------------
> > > >  include/linux/cpufreq_common.h   | 56
> > > ++++++++++++++++++++++++++++++++
> > >
> > > I don't find any specific reason why you have to use another common
> > > cpufreq_common header instead of include/linux/cpufreq.h.
> > >
> > > Thanks,
> > > Ray
> >
> > That is fine for me to use the cpufreq.h to store the common vars.
> > I will move the declaration to that header file.
> >
> > Thanks for the review.
> >
> > Perry.
> >
> > >
> > > >  3 files changed, 57 insertions(+), 40 deletions(-)  create mode
> > > > 100644 include/linux/cpufreq_common.h
> > > >
> > > > diff --git a/arch/x86/include/asm/msr-index.h
> > > > b/arch/x86/include/asm/msr-index.h
> > > > index 4a2af82553e4..3983378cff5b 100644
> > > > --- a/arch/x86/include/asm/msr-index.h
> > > > +++ b/arch/x86/include/asm/msr-index.h
> > > > @@ -472,10 +472,6 @@
> > > >  #define HWP_MAX_PERF(x) 		((x & 0xff) << 8)
> > > >  #define HWP_DESIRED_PERF(x)		((x & 0xff) << 16)
> > > >  #define HWP_ENERGY_PERF_PREFERENCE(x)	(((unsigned long long)
> > > x & 0xff) << 24)
> > > > -#define HWP_EPP_PERFORMANCE		0x00
> > > > -#define HWP_EPP_BALANCE_PERFORMANCE	0x80
> > > > -#define HWP_EPP_BALANCE_POWERSAVE	0xC0
> > > > -#define HWP_EPP_POWERSAVE		0xFF
> > > >  #define HWP_ACTIVITY_WINDOW(x)		((unsigned long
> > > long)(x & 0xff3) << 32)
> > > >  #define HWP_PACKAGE_CONTROL(x)		((unsigned long
> > > long)(x & 0x1) << 42)
> > > >
> > > > diff --git a/drivers/cpufreq/intel_pstate.c
> > > > b/drivers/cpufreq/intel_pstate.c index ad9be31753b6..65036ca21719
> > > > 100644
> > > > --- a/drivers/cpufreq/intel_pstate.c
> > > > +++ b/drivers/cpufreq/intel_pstate.c
> > > > @@ -25,6 +25,7 @@
> > > >  #include <linux/acpi.h>
> > > >  #include <linux/vmalloc.h>
> > > >  #include <linux/pm_qos.h>
> > > > +#include <linux/cpufreq_common.h>
> > > >  #include <trace/events/power.h>
> > > >
> > > >  #include <asm/cpu.h>
> > > > @@ -628,42 +629,6 @@ static int intel_pstate_set_epb(int cpu, s16 pref)
> > > >  	return 0;
> > > >  }
> > > >
> > > > -/*
> > > > - * EPP/EPB display strings corresponding to EPP index in the
> > > > - * energy_perf_strings[]
> > > > - *	index		String
> > > > - *-------------------------------------
> > > > - *	0		default
> > > > - *	1		performance
> > > > - *	2		balance_performance
> > > > - *	3		balance_power
> > > > - *	4		power
> > > > - */
> > > > -
> > > > -enum energy_perf_value_index {
> > > > -	EPP_INDEX_DEFAULT = 0,
> > > > -	EPP_INDEX_PERFORMANCE,
> > > > -	EPP_INDEX_BALANCE_PERFORMANCE,
> > > > -	EPP_INDEX_BALANCE_POWERSAVE,
> > > > -	EPP_INDEX_POWERSAVE,
> > > > -};
> > > > -
> > > > -static const char * const energy_perf_strings[] = {
> > > > -	[EPP_INDEX_DEFAULT] = "default",
> > > > -	[EPP_INDEX_PERFORMANCE] = "performance",
> > > > -	[EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
> > > > -	[EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
> > > > -	[EPP_INDEX_POWERSAVE] = "power",
> > > > -	NULL
> > > > -};
> > > > -static unsigned int epp_values[] = {
> > > > -	[EPP_INDEX_DEFAULT] = 0, /* Unused index */
> > > > -	[EPP_INDEX_PERFORMANCE] = HWP_EPP_PERFORMANCE,
> > > > -	[EPP_INDEX_BALANCE_PERFORMANCE] =
> > > HWP_EPP_BALANCE_PERFORMANCE,
> > > > -	[EPP_INDEX_BALANCE_POWERSAVE] =
> > > HWP_EPP_BALANCE_POWERSAVE,
> > > > -	[EPP_INDEX_POWERSAVE] = HWP_EPP_POWERSAVE,
> > > > -};
> > > > -
> > > >  static int intel_pstate_get_energy_pref_index(struct cpudata
> > > > *cpu_data, int *raw_epp)  {
> > > >  	s16 epp;
> > > > diff --git a/include/linux/cpufreq_common.h
> > > > b/include/linux/cpufreq_common.h new file mode 100644 index
> > > > 000000000000..2d14b0b0f55c
> > > > --- /dev/null
> > > > +++ b/include/linux/cpufreq_common.h
> > > > @@ -0,0 +1,56 @@
> > > > +/* SPDX-License-Identifier: GPL-2.0-only */
> > > > +/*
> > > > + * linux/include/linux/cpufreq_common.h
> > > > + *
> > > > + * Copyright (C) 2022 Advanced Micro Devices, Inc.
> > > > + *
> > > > + * Author: Perry Yuan <Perry.Yuan@amd.com>  */
> > > > +
> > > > +#ifndef _LINUX_CPUFREQ_COMMON_H
> > > > +#define _LINUX_CPUFREQ_COMMON_H
> > > > +
> > > > +#include <asm/msr.h>
> > > > +/*
> > > > + * EPP/EPB display strings corresponding to EPP index in the
> > > > + * energy_perf_strings[]
> > > > + *	index		String
> > > > + *-------------------------------------
> > > > + *	0		default
> > > > + *	1		performance
> > > > + *	2		balance_performance
> > > > + *	3		balance_power
> > > > + *	4		power
> > > > + */
> > > > +
> > > > +#define HWP_EPP_PERFORMANCE		0x00
> > > > +#define HWP_EPP_BALANCE_PERFORMANCE	0x80
> > > > +#define HWP_EPP_BALANCE_POWERSAVE	0xC0
> > > > +#define HWP_EPP_POWERSAVE		0xFF
> > > > +
> > > > +enum energy_perf_value_index {
> > > > +	EPP_INDEX_DEFAULT = 0,
> > > > +	EPP_INDEX_PERFORMANCE,
> > > > +	EPP_INDEX_BALANCE_PERFORMANCE,
> > > > +	EPP_INDEX_BALANCE_POWERSAVE,
> > > > +	EPP_INDEX_POWERSAVE,
> > > > +};
> > > > +
> > > > +static const char * const energy_perf_strings[] = {
> > > > +	[EPP_INDEX_DEFAULT] = "default",
> > > > +	[EPP_INDEX_PERFORMANCE] = "performance",
> > > > +	[EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
> > > > +	[EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
> > > > +	[EPP_INDEX_POWERSAVE] = "power",
> > > > +	NULL
> > > > +};
> > > > +
> > > > +static unsigned int epp_values[] = {
> > > > +	[EPP_INDEX_DEFAULT] = 0, /* Unused index */
> > > > +	[EPP_INDEX_PERFORMANCE] = HWP_EPP_PERFORMANCE,
> > > > +	[EPP_INDEX_BALANCE_PERFORMANCE] =
> > > HWP_EPP_BALANCE_PERFORMANCE,
> > > > +	[EPP_INDEX_BALANCE_POWERSAVE] =
> > > HWP_EPP_BALANCE_POWERSAVE,
> > > > +	[EPP_INDEX_POWERSAVE] = HWP_EPP_POWERSAVE, };
> > > > +
> 
> I don't believe you should be making these static in the header file.

If I do not make these arrays as static type, it will be reporting lots of build errors.

ld: drivers/scsi/lpfc/lpfc_sli.o:(.data+0xa0): multiple definition of `epp_values'; drivers/scsi/lpfc/lpfc_mem.o:(.data+0x0): first defined here
ld: drivers/scsi/lpfc/lpfc_sli.o:(.rodata+0x120): multiple definition of `energy_perf_strings'; drivers/scsi/lpfc/lpfc_mem.o:(.rodata+0x0): first defined here
ld: drivers/scsi/lpfc/lpfc_ct.o:(.data+0x150): multiple definition of `epp_values'; drivers/scsi/lpfc/lpfc_mem.o:(.data+0x0): first defined here
ld: drivers/scsi/lpfc/lpfc_ct.o:(.rodata+0x60): multiple definition of `energy_perf_strings'; drivers/scsi/lpfc/lpfc_mem.o:(.rodata+0x0): first defined here
ld: drivers/scsi/lpfc/lpfc_els.o:(.data+0x0): multiple definition of `epp_values'; drivers/scsi/lpfc/lpfc_mem.o:(.data+0x0): first defined here
 ....


I will check how to get this fixed if we really need set it as static in V8.

Perry.

> 
> > > > +#endif /* _LINUX_CPUFREQ_COMMON_H */
> > > > \ No newline at end of file
> > > > --
> > > > 2.34.1
> > > >

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v6 04/11] cpufreq: amd_pstate: implement Pstate EPP support for the AMD processors
  2022-12-05 12:22   ` Huang Rui
@ 2022-12-08 15:43     ` Yuan, Perry
  0 siblings, 0 replies; 27+ messages in thread
From: Yuan, Perry @ 2022-12-08 15:43 UTC (permalink / raw)
  To: Huang, Ray
  Cc: rafael.j.wysocki, Limonciello, Mario, viresh.kumar, Sharma,
	Deepak, Fontenot, Nathan, Deucher, Alexander, Huang, Shimmer, Du,
	Xiaojian, Meng, Li (Jassmine),
	Karny, Wyes, linux-pm, linux-kernel

[AMD Official Use Only - General]

Hi Ray, 

> -----Original Message-----
> From: Huang, Ray <Ray.Huang@amd.com>
> Sent: Monday, December 5, 2022 8:22 PM
> To: Yuan, Perry <Perry.Yuan@amd.com>
> Cc: rafael.j.wysocki@intel.com; Limonciello, Mario
> <Mario.Limonciello@amd.com>; viresh.kumar@linaro.org; Sharma, Deepak
> <Deepak.Sharma@amd.com>; Fontenot, Nathan
> <Nathan.Fontenot@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Huang, Shimmer
> <Shimmer.Huang@amd.com>; Du, Xiaojian <Xiaojian.Du@amd.com>; Meng,
> Li (Jassmine) <Li.Meng@amd.com>; Karny, Wyes <Wyes.Karny@amd.com>;
> linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v6 04/11] cpufreq: amd_pstate: implement Pstate EPP
> support for the AMD processors
> 
> On Fri, Dec 02, 2022 at 03:47:12PM +0800, Yuan, Perry wrote:
> > From: Perry Yuan <Perry.Yuan@amd.com>
> >
> 
> > cpufreq: amd_pstate: implement Pstate EPP support for the AMD
> > processors
> 
> I suggest that we align the format like "cpufreq: amd-pstate:" as the prefix of
> the subject like previous patches. Because it's there anywhere even in the
> documentation file. I think it's not worth to change such as minor update like
> amd-pstate to amd_pstate for all the files and comments.
> So let's align the previous way.

Agreed , I have revised the title  and part of the kernel docs update patches in V7.
Please take a look at the v7 if you have some other concerns. 


> 
> > Add EPP driver support for AMD SoCs which support a dedicated MSR for
> > CPPC.  EPP is used by the DPM controller to configure the frequency
> > that a core operates at during short periods of activity.
> >
> > The SoC EPP targets are configured on a scale from 0 to 255 where 0
> > represents maximum performance and 255 represents maximum
> efficiency.
> >
> > The amd-pstate driver exports profile string names to userspace that
> > are tied to specific EPP values.
> >
> > The balance_performance string (0x80) provides the best balance for
> > efficiency versus power on most systems, but users can choose other
> > strings to meet their needs as well.
> >
> > $ cat
> >
> /sys/devices/system/cpu/cpufreq/policy0/energy_performance_available_
> p
> > references default performance balance_performance balance_power
> power
> >
> > $ cat
> >
> /sys/devices/system/cpu/cpufreq/policy0/energy_performance_preferenc
> e
> > balance_performance
> >
> > To enable the driver,it needs to add `amd_pstate=active` to kernel
> > command line and kernel will load the active mode epp driver
> >
> > Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> > ---
> >  drivers/cpufreq/amd-pstate.c | 643
> ++++++++++++++++++++++++++++++++++-
> >  include/linux/amd-pstate.h   |  35 ++
> >  2 files changed, 672 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/cpufreq/amd-pstate.c
> > b/drivers/cpufreq/amd-pstate.c index 204e39006dda..5989d4d25d0f
> 100644
> > --- a/drivers/cpufreq/amd-pstate.c
> > +++ b/drivers/cpufreq/amd-pstate.c
> > @@ -37,9 +37,12 @@
> >  #include <linux/uaccess.h>
> >  #include <linux/static_call.h>
> >  #include <linux/amd-pstate.h>
> > +#include <linux/cpufreq_common.h>
> >
> > +#ifdef CONFIG_ACPI
> >  #include <acpi/processor.h>
> >  #include <acpi/cppc_acpi.h>
> > +#endif
> 
> The amd-pstate module depends on ACPI in the kconfig, so we won't need
> add CONFIG_ACPI here.

Removed in V7.


> 
> >
> >  #include <asm/msr.h>
> >  #include <asm/processor.h>
> > @@ -59,9 +62,130 @@
> >   * we disable it by default to go acpi-cpufreq on these processors and add
> a
> >   * module parameter to be able to enable it manually for debugging.
> >   */
> > -static struct cpufreq_driver amd_pstate_driver;
> > +static bool cppc_active;
> >  static int cppc_load __initdata;
> >
> > +static struct cpufreq_driver *default_pstate_driver; static struct
> > +amd_cpudata **all_cpu_data; static struct amd_pstate_params
> > +global_params;
> > +
> > +static DEFINE_MUTEX(amd_pstate_limits_lock);
> > +static DEFINE_MUTEX(amd_pstate_driver_lock);
> > +
> > +static bool cppc_boost __read_mostly; struct kobject
> > +*amd_pstate_kobj;
> 
> Where is it using amd_pstate_kobj?

Moved the amd_pstate_kobj definition to the patch where it is actually used in v7. 

> 
> > +
> > +#ifdef CONFIG_ACPI_CPPC_LIB
> > +static s16 amd_pstate_get_epp(struct amd_cpudata *cpudata, u64
> > +cppc_req_cached) {
> > +	s16 epp;
> > +	struct cppc_perf_caps perf_caps;
> > +	int ret;
> > +
> > +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > +		if (!cppc_req_cached) {
> > +			epp = rdmsrl_on_cpu(cpudata->cpu,
> MSR_AMD_CPPC_REQ,
> > +					&cppc_req_cached);
> > +			if (epp)
> > +				return epp;
> > +		}
> > +		epp = (cppc_req_cached >> 24) & 0xFF;
> > +	} else {
> > +		ret = cppc_get_epp_caps(cpudata->cpu, &perf_caps);
> > +		if (ret < 0) {
> > +			pr_debug("Could not retrieve energy perf value
> (%d)\n", ret);
> > +			return -EIO;
> > +		}
> > +		epp = (s16) perf_caps.energy_perf;
> > +	}
> > +
> > +	return epp;
> > +}
> > +#endif
> > +
> > +static int amd_pstate_get_energy_pref_index(struct amd_cpudata
> > +*cpudata) {
> > +	s16 epp;
> > +	int index = -EINVAL;
> > +
> > +	epp = amd_pstate_get_epp(cpudata, 0);
> > +	if (epp < 0)
> > +		return epp;
> > +
> > +	switch (epp) {
> > +	case HWP_EPP_PERFORMANCE:
> > +		index = EPP_INDEX_PERFORMANCE;
> > +		break;
> > +	case HWP_EPP_BALANCE_PERFORMANCE:
> > +		index = EPP_INDEX_BALANCE_PERFORMANCE;
> > +		break;
> > +	case HWP_EPP_BALANCE_POWERSAVE:
> > +		index = EPP_INDEX_BALANCE_POWERSAVE;
> > +		break;
> > +	case HWP_EPP_POWERSAVE:
> > +		index = EPP_INDEX_POWERSAVE;
> > +		break;
> > +	default:
> > +			break;
> > +	}
> > +
> > +	return index;
> > +}
> > +
> > +#ifdef CONFIG_ACPI_CPPC_LIB
> > +static int amd_pstate_set_epp(struct amd_cpudata *cpudata, u32 epp) {
> > +	int ret;
> > +	struct cppc_perf_ctrls perf_ctrls;
> > +
> > +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > +		u64 value = READ_ONCE(cpudata->cppc_req_cached);
> > +
> > +		value &= ~GENMASK_ULL(31, 24);
> > +		value |= (u64)epp << 24;
> > +		WRITE_ONCE(cpudata->cppc_req_cached, value);
> > +
> > +		ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ,
> value);
> > +		if (!ret)
> > +			cpudata->epp_cached = epp;
> > +	} else {
> > +		perf_ctrls.energy_perf = epp;
> > +		ret = cppc_set_epp_perf(cpudata->cpu, &perf_ctrls, 1);
> > +		if (ret) {
> > +			pr_debug("failed to set energy perf value (%d)\n",
> ret);
> > +			return ret;
> > +		}
> > +		cpudata->epp_cached = epp;
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> > +static int amd_pstate_set_energy_pref_index(struct amd_cpudata
> *cpudata,
> > +		int pref_index)
> > +{
> > +	int epp = -EINVAL;
> > +	int ret;
> > +
> > +	if (!pref_index) {
> > +		pr_debug("EPP pref_index is invalid\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	if (epp == -EINVAL)
> > +		epp = epp_values[pref_index];
> > +
> > +	if (epp > 0 && cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> {
> > +		pr_debug("EPP cannot be set under performance policy\n");
> > +		return -EBUSY;
> > +	}
> > +
> > +	ret = amd_pstate_set_epp(cpudata, epp);
> > +
> > +	return ret;
> > +}
> > +#endif
> > +
> >  static inline int pstate_enable(bool enable)  {
> >  	return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable); @@ -70,11
> +194,21
> > @@ static inline int pstate_enable(bool enable)  static int
> > cppc_enable(bool enable)  {
> >  	int cpu, ret = 0;
> > +	struct cppc_perf_ctrls perf_ctrls;
> >
> >  	for_each_present_cpu(cpu) {
> >  		ret = cppc_set_enable(cpu, enable);
> >  		if (ret)
> >  			return ret;
> > +
> > +		/* Enable autonomous mode for EPP */
> > +		if (!cppc_active) {
> > +			/* Set desired perf as zero to allow EPP firmware
> control */
> > +			perf_ctrls.desired_perf = 0;
> > +			ret = cppc_set_perf(cpu, &perf_ctrls);
> > +			if (ret)
> > +				return ret;
> > +		}
> >  	}
> >
> >  	return ret;
> > @@ -417,7 +551,7 @@ static void amd_pstate_boost_init(struct
> amd_cpudata *cpudata)
> >  		return;
> >
> >  	cpudata->boost_supported = true;
> > -	amd_pstate_driver.boost_enabled = true;
> > +	default_pstate_driver->boost_enabled = true;
> >  }
> >
> >  static void amd_perf_ctl_reset(unsigned int cpu) @@ -591,10 +725,61
> > @@ static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy
> *policy,
> >  	return sprintf(&buf[0], "%u\n", perf);  }
> >
> > +static ssize_t show_energy_performance_available_preferences(
> > +				struct cpufreq_policy *policy, char *buf) {
> > +	int i = 0;
> > +	int ret = 0;
> > +
> > +	while (energy_perf_strings[i] != NULL)
> > +		ret += sysfs_emit(buf, "%s", energy_perf_strings[i++]);
> > +
> > +	ret += sysfs_emit(buf, "\n");
> > +
> > +	return ret;
> > +}
> > +
> > +static ssize_t store_energy_performance_preference(
> > +		struct cpufreq_policy *policy, const char *buf, size_t count) {
> > +	struct amd_cpudata *cpudata = policy->driver_data;
> > +	char str_preference[21];
> > +	ssize_t ret;
> > +
> > +	ret = sscanf(buf, "%20s", str_preference);
> > +	if (ret != 1)
> > +		return -EINVAL;
> > +
> > +	ret = match_string(energy_perf_strings, -1, str_preference);
> > +	if (ret < 0)
> > +		return -EINVAL;
> > +
> > +	mutex_lock(&amd_pstate_limits_lock);
> > +	ret = amd_pstate_set_energy_pref_index(cpudata, ret);
> > +	mutex_unlock(&amd_pstate_limits_lock);
> > +
> > +	return ret ?: count;
> > +}
> > +
> > +static ssize_t show_energy_performance_preference(
> > +				struct cpufreq_policy *policy, char *buf) {
> > +	struct amd_cpudata *cpudata = policy->driver_data;
> > +	int preference;
> > +
> > +	preference = amd_pstate_get_energy_pref_index(cpudata);
> > +	if (preference < 0)
> > +		return preference;
> > +
> > +	return  sysfs_emit(buf, "%s\n", energy_perf_strings[preference]); }
> > +
> >  cpufreq_freq_attr_ro(amd_pstate_max_freq);
> >  cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
> >
> >  cpufreq_freq_attr_ro(amd_pstate_highest_perf);
> > +cpufreq_freq_attr_rw(energy_performance_preference);
> > +cpufreq_freq_attr_ro(energy_performance_available_preferences);
> >
> >  static struct freq_attr *amd_pstate_attr[] = {
> >  	&amd_pstate_max_freq,
> > @@ -603,6 +788,429 @@ static struct freq_attr *amd_pstate_attr[] = {
> >  	NULL,
> >  };
> >
> > +static struct freq_attr *amd_pstate_epp_attr[] = {
> > +	&amd_pstate_max_freq,
> > +	&amd_pstate_lowest_nonlinear_freq,
> > +	&amd_pstate_highest_perf,
> > +	&energy_performance_preference,
> > +	&energy_performance_available_preferences,
> > +	NULL,
> > +};
> > +
> > +static inline void update_boost_state(void) {
> > +	u64 misc_en;
> > +	struct amd_cpudata *cpudata;
> > +
> > +	cpudata = all_cpu_data[0];
> > +	rdmsrl(MSR_K7_HWCR, misc_en);
> > +	global_params.cppc_boost_disabled = misc_en & BIT_ULL(25); }
> > +
> > +static bool amd_pstate_acpi_pm_profile_server(void)
> > +{
> > +	if (acpi_gbl_FADT.preferred_profile == PM_ENTERPRISE_SERVER ||
> > +	    acpi_gbl_FADT.preferred_profile == PM_PERFORMANCE_SERVER)
> > +		return true;
> > +
> > +	return false;
> > +}
> > +
> > +static int amd_pstate_init_cpu(unsigned int cpunum) {
> > +	struct amd_cpudata *cpudata;
> > +
> > +	cpudata = all_cpu_data[cpunum];
> > +	if (!cpudata) {
> > +		cpudata = kzalloc(sizeof(*cpudata), GFP_KERNEL);
> > +		if (!cpudata)
> > +			return -ENOMEM;
> > +		WRITE_ONCE(all_cpu_data[cpunum], cpudata);
> > +
> > +		cpudata->cpu = cpunum;
> > +
> > +		if (cppc_active) {
> > +			if (amd_pstate_acpi_pm_profile_server())
> > +				cppc_boost = true;
> > +		}
> > +
> > +	}
> > +	cpudata->epp_powersave = -EINVAL;
> > +	cpudata->epp_policy = 0;
> > +	pr_debug("controlling: cpu %d\n", cpunum);
> > +	return 0;
> > +}
> > +
> > +static int __amd_pstate_cpu_init(struct cpufreq_policy *policy) {
> > +	int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
> > +	struct amd_cpudata *cpudata;
> > +	struct device *dev;
> > +	int rc;
> > +	u64 value;
> > +
> > +	rc = amd_pstate_init_cpu(policy->cpu);
> > +	if (rc)
> > +		return rc;
> > +
> > +	cpudata = all_cpu_data[policy->cpu];
> > +
> > +	dev = get_cpu_device(policy->cpu);
> > +	if (!dev)
> > +		goto free_cpudata1;
> > +
> > +	rc = amd_pstate_init_perf(cpudata);
> > +	if (rc)
> > +		goto free_cpudata1;
> > +
> > +	min_freq = amd_get_min_freq(cpudata);
> > +	max_freq = amd_get_max_freq(cpudata);
> > +	nominal_freq = amd_get_nominal_freq(cpudata);
> > +	lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
> > +	if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
> > +		dev_err(dev, "min_freq(%d) or max_freq(%d) value is
> incorrect\n",
> > +				min_freq, max_freq);
> > +		ret = -EINVAL;
> > +		goto free_cpudata1;
> > +	}
> > +
> > +	policy->min = min_freq;
> > +	policy->max = max_freq;
> > +
> > +	policy->cpuinfo.min_freq = min_freq;
> > +	policy->cpuinfo.max_freq = max_freq;
> > +	/* It will be updated by governor */
> > +	policy->cur = policy->cpuinfo.min_freq;
> > +
> > +	/* Initial processor data capability frequencies */
> > +	cpudata->max_freq = max_freq;
> > +	cpudata->min_freq = min_freq;
> > +	cpudata->nominal_freq = nominal_freq;
> > +	cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
> > +
> > +	policy->driver_data = cpudata;
> > +
> > +	update_boost_state();
> > +	cpudata->epp_cached = amd_pstate_get_epp(cpudata, value);
> > +
> > +	policy->min = policy->cpuinfo.min_freq;
> > +	policy->max = policy->cpuinfo.max_freq;
> > +
> > +	if (boot_cpu_has(X86_FEATURE_CPPC))
> > +		policy->fast_switch_possible = true;
> > +
> > +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > +		ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ,
> &value);
> > +		if (ret)
> > +			return ret;
> > +		WRITE_ONCE(cpudata->cppc_req_cached, value);
> > +
> > +		ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1,
> &value);
> > +		if (ret)
> > +			return ret;
> > +		WRITE_ONCE(cpudata->cppc_cap1_cached, value);
> > +	}
> > +	amd_pstate_boost_init(cpudata);
> > +
> > +	return 0;
> > +
> > +free_cpudata1:
> > +	kfree(cpudata);
> > +	return ret;
> > +}
> > +
> > +static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy) {
> > +	int ret;
> > +
> > +	ret = __amd_pstate_cpu_init(policy);
> > +	if (ret)
> > +		return ret;
> > +	/*
> > +	 * Set the policy to powersave to provide a valid fallback value in case
> > +	 * the default cpufreq governor is neither powersave nor
> performance.
> > +	 */
> > +	policy->policy = CPUFREQ_POLICY_POWERSAVE;
> > +
> > +	return 0;
> > +}
> > +
> > +static int amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy) {
> > +	pr_debug("CPU %d exiting\n", policy->cpu);
> > +	policy->fast_switch_possible = false;
> > +	return 0;
> > +}
> > +
> > +static void amd_pstate_update_max_freq(unsigned int cpu) {
> > +	struct cpufreq_policy *policy = policy = cpufreq_cpu_get(cpu);
> > +
> > +	if (!policy)
> > +		return;
> > +
> > +	refresh_frequency_limits(policy);
> > +	cpufreq_cpu_put(policy);
> > +}
> > +
> > +static void amd_pstate_epp_update_limits(unsigned int cpu) {
> > +	mutex_lock(&amd_pstate_driver_lock);
> > +	update_boost_state();
> > +	if (global_params.cppc_boost_disabled) {
> > +		for_each_possible_cpu(cpu)
> > +			amd_pstate_update_max_freq(cpu);
> > +	} else {
> > +		cpufreq_update_policy(cpu);
> > +	}
> > +	mutex_unlock(&amd_pstate_driver_lock);
> > +}
> > +
> > +static int cppc_boost_hold_time_ns = 3 * NSEC_PER_MSEC;
> > +
> > +static inline void amd_pstate_boost_up(struct amd_cpudata *cpudata) {
> > +	u64 hwp_req = READ_ONCE(cpudata->cppc_req_cached);
> > +	u64 hwp_cap = READ_ONCE(cpudata->cppc_cap1_cached);
> > +	u32 max_limit = (hwp_req & 0xff);
> > +	u32 min_limit = (hwp_req & 0xff00) >> 8;
> > +	u32 boost_level1;
> > +
> > +	/* If max and min are equal or already at max, nothing to boost */
> > +	if (max_limit == min_limit)
> > +		return;
> > +
> > +	/* Set boost max and min to initial value */
> > +	if (!cpudata->cppc_boost_min)
> > +		cpudata->cppc_boost_min = min_limit;
> > +
> > +	boost_level1 = ((AMD_CPPC_NOMINAL_PERF(hwp_cap) +
> min_limit) >> 1);
> > +
> > +	if (cpudata->cppc_boost_min < boost_level1)
> > +		cpudata->cppc_boost_min = boost_level1;
> > +	else if (cpudata->cppc_boost_min <
> AMD_CPPC_NOMINAL_PERF(hwp_cap))
> > +		cpudata->cppc_boost_min =
> AMD_CPPC_NOMINAL_PERF(hwp_cap);
> > +	else if (cpudata->cppc_boost_min ==
> AMD_CPPC_NOMINAL_PERF(hwp_cap))
> > +		cpudata->cppc_boost_min = max_limit;
> > +	else
> > +		return;
> > +
> > +	hwp_req &= ~AMD_CPPC_MIN_PERF(~0L);
> > +	hwp_req |= AMD_CPPC_MIN_PERF(cpudata->cppc_boost_min);
> > +	wrmsrl(MSR_AMD_CPPC_REQ, hwp_req);
> > +	cpudata->last_update = cpudata->sample.time; }
> > +
> > +static inline void amd_pstate_boost_down(struct amd_cpudata *cpudata)
> > +{
> > +	bool expired;
> > +
> > +	if (cpudata->cppc_boost_min) {
> > +		expired = time_after64(cpudata->sample.time, cpudata-
> >last_update +
> > +					cppc_boost_hold_time_ns);
> > +
> > +		if (expired) {
> > +			wrmsrl(MSR_AMD_CPPC_REQ, cpudata-
> >cppc_req_cached);
> > +			cpudata->cppc_boost_min = 0;
> > +		}
> > +	}
> > +
> > +	cpudata->last_update = cpudata->sample.time; }
> > +
> > +static inline void amd_pstate_boost_update_util(struct amd_cpudata
> *cpudata,
> > +						      u64 time)
> > +{
> > +	cpudata->sample.time = time;
> > +	if (smp_processor_id() != cpudata->cpu)
> > +		return;
> > +
> > +	if (cpudata->sched_flags & SCHED_CPUFREQ_IOWAIT) {
> > +		bool do_io = false;
> > +
> > +		cpudata->sched_flags = 0;
> > +		/*
> > +		 * Set iowait_boost flag and update time. Since IO WAIT flag
> > +		 * is set all the time, we can't just conclude that there is
> > +		 * some IO bound activity is scheduled on this CPU with just
> > +		 * one occurrence. If we receive at least two in two
> > +		 * consecutive ticks, then we treat as boost candidate.
> > +		 * This is leveraged from Intel Pstate driver.
> > +		 */
> > +		if (time_before64(time, cpudata->last_io_update + 2 *
> TICK_NSEC))
> > +			do_io = true;
> > +
> > +		cpudata->last_io_update = time;
> > +
> > +		if (do_io)
> > +			amd_pstate_boost_up(cpudata);
> > +
> > +	} else {
> > +		amd_pstate_boost_down(cpudata);
> > +	}
> > +}
> > +
> > +static inline void amd_pstate_cppc_update_hook(struct update_util_data
> *data,
> > +						u64 time, unsigned int flags)
> > +{
> > +	struct amd_cpudata *cpudata = container_of(data,
> > +				struct amd_cpudata, update_util);
> > +
> > +	cpudata->sched_flags |= flags;
> > +
> > +	if (smp_processor_id() == cpudata->cpu)
> > +		amd_pstate_boost_update_util(cpudata, time); }
> > +
> > +static void amd_pstate_clear_update_util_hook(unsigned int cpu) {
> > +	struct amd_cpudata *cpudata = all_cpu_data[cpu];
> > +
> > +	if (!cpudata->update_util_set)
> > +		return;
> > +
> > +	cpufreq_remove_update_util_hook(cpu);
> > +	cpudata->update_util_set = false;
> > +	synchronize_rcu();
> > +}
> > +
> > +static void amd_pstate_set_update_util_hook(unsigned int cpu_num) {
> > +	struct amd_cpudata *cpudata = all_cpu_data[cpu_num];
> > +
> > +	if (!cppc_boost) {
> > +		if (cpudata->update_util_set)
> > +			amd_pstate_clear_update_util_hook(cpudata->cpu);
> > +		return;
> > +	}
> > +
> > +	if (cpudata->update_util_set)
> > +		return;
> > +
> > +	cpudata->sample.time = 0;
> > +	cpufreq_add_update_util_hook(cpu_num, &cpudata->update_util,
> > +
> 	amd_pstate_cppc_update_hook);
> > +	cpudata->update_util_set = true;
> > +}
> > +
> > +static void amd_pstate_epp_init(unsigned int cpu) {
> > +	struct amd_cpudata *cpudata = all_cpu_data[cpu];
> > +	u32 max_perf, min_perf;
> > +	u64 value;
> > +	s16 epp;
> > +	int ret;
> > +
> > +	max_perf = READ_ONCE(cpudata->highest_perf);
> > +	min_perf = READ_ONCE(cpudata->lowest_perf);
> > +
> > +	value = READ_ONCE(cpudata->cppc_req_cached);
> > +
> > +	if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> > +		min_perf = max_perf;
> > +
> > +	/* Initial min/max values for CPPC Performance Controls Register */
> > +	value &= ~AMD_CPPC_MIN_PERF(~0L);
> > +	value |= AMD_CPPC_MIN_PERF(min_perf);
> > +
> > +	value &= ~AMD_CPPC_MAX_PERF(~0L);
> > +	value |= AMD_CPPC_MAX_PERF(max_perf);
> > +
> > +	/* CPPC EPP feature require to set zero to the desire perf bit */
> > +	value &= ~AMD_CPPC_DES_PERF(~0L);
> > +	value |= AMD_CPPC_DES_PERF(0);
> > +
> > +	if (cpudata->epp_policy == cpudata->policy)
> > +		goto skip_epp;
> > +
> > +	cpudata->epp_policy = cpudata->policy;
> > +
> > +	if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> > +		epp = amd_pstate_get_epp(cpudata, value);
> > +		cpudata->epp_powersave = epp;
> > +		if (epp < 0)
> > +			goto skip_epp;
> > +		/* force the epp value to be zero for performance policy */
> > +		epp = 0;
> > +	} else {
> > +		if (cpudata->epp_powersave < 0)
> > +			goto skip_epp;
> > +		/* Get BIOS pre-defined epp value */
> > +		epp = amd_pstate_get_epp(cpudata, value);
> > +		if (epp)
> > +			goto skip_epp;
> > +		epp = cpudata->epp_powersave;
> > +	}
> > +	/* Set initial EPP value */
> > +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > +		value &= ~GENMASK_ULL(31, 24);
> > +		value |= (u64)epp << 24;
> > +	}
> > +
> > +skip_epp:
> > +	WRITE_ONCE(cpudata->cppc_req_cached, value);
> > +	ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
> > +	if (!ret)
> > +		cpudata->epp_cached = epp;
> > +}
> > +
> > +static void amd_pstate_set_max_limits(struct amd_cpudata *cpudata) {
> > +	u64 hwp_cap = READ_ONCE(cpudata->cppc_cap1_cached);
> > +	u64 hwp_req = READ_ONCE(cpudata->cppc_req_cached);
> > +	u32 max_limit = (hwp_cap >> 24) & 0xff;
> > +
> > +	hwp_req &= ~AMD_CPPC_MIN_PERF(~0L);
> > +	hwp_req |= AMD_CPPC_MIN_PERF(max_limit);
> > +	wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, hwp_req); }
> > +
> > +static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy) {
> > +	struct amd_cpudata *cpudata;
> > +
> > +	if (!policy->cpuinfo.max_freq)
> > +		return -ENODEV;
> > +
> > +	pr_debug("set_policy: cpuinfo.max %u policy->max %u\n",
> > +				policy->cpuinfo.max_freq, policy->max);
> > +
> > +	cpudata = all_cpu_data[policy->cpu];
> > +	cpudata->policy = policy->policy;
> > +
> > +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> 
> X86_FEATURE_CPPC indicated the MSR support, but I believe the share
> memory CPU also has the EPP, right?

Yes,  the shared memory system which are using below preferred profile will add the util hook with another patchset. 
Current I would like to add this to MSR systems in this patchset as tight schedule. 

        if (acpi_gbl_FADT.preferred_profile == PM_ENTERPRISE_SERVER ||
            acpi_gbl_FADT.preferred_profile == PM_PERFORMANCE_SERVER)


> 
> > +		mutex_lock(&amd_pstate_limits_lock);
> > +
> > +		if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> > +			amd_pstate_clear_update_util_hook(policy->cpu);
> > +			amd_pstate_set_max_limits(cpudata);
> > +		} else {
> > +			amd_pstate_set_update_util_hook(policy->cpu);
> 
> May I know why amd processors also needs to set and update util hook?

Like intel already implemented this scheduler hook, it helps to improve the I/O performance, it is not enabled by default for Client CPU.


> 
> > +		}
> > +
> > +		if (boot_cpu_has(X86_FEATURE_CPPC))
> 
> I am confused why if (boot_cpu_has(X86_FEATURE_CPPC) is still inside of a if
> (boot_cpu_has(X86_FEATURE_CPPC)).
> 

Fixed in V7. 

> > +			amd_pstate_epp_init(policy->cpu);
> > +
> > +		mutex_unlock(&amd_pstate_limits_lock);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static void amd_pstate_verify_cpu_policy(struct amd_cpudata *cpudata,
> > +					   struct cpufreq_policy_data *policy)
> {
> > +	update_boost_state();
> > +	cpufreq_verify_within_cpu_limits(policy);
> > +}
> > +
> > +static int amd_pstate_epp_verify_policy(struct cpufreq_policy_data
> > +*policy) {
> > +	amd_pstate_verify_cpu_policy(all_cpu_data[policy->cpu], policy);
> > +	pr_debug("policy_max =%d, policy_min=%d\n", policy->max, policy-
> >min);
> > +	return 0;
> > +}
> > +
> >  static struct cpufreq_driver amd_pstate_driver = {
> >  	.flags		= CPUFREQ_CONST_LOOPS |
> CPUFREQ_NEED_UPDATE_LIMITS,
> >  	.verify		= amd_pstate_verify,
> > @@ -616,8 +1224,20 @@ static struct cpufreq_driver amd_pstate_driver =
> {
> >  	.attr		= amd_pstate_attr,
> >  };
> >
> > +static struct cpufreq_driver amd_pstate_epp_driver = {
> > +	.flags		= CPUFREQ_CONST_LOOPS,
> > +	.verify		= amd_pstate_epp_verify_policy,
> > +	.setpolicy	= amd_pstate_epp_set_policy,
> > +	.init		= amd_pstate_epp_cpu_init,
> > +	.exit		= amd_pstate_epp_cpu_exit,
> > +	.update_limits	= amd_pstate_epp_update_limits,
> > +	.name		= "amd_pstate_epp",
> > +	.attr		= amd_pstate_epp_attr,
> > +};
> > +
> >  static int __init amd_pstate_init(void)  {
> > +	static struct amd_cpudata **cpudata;
> >  	int ret;
> >
> >  	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD) @@ -644,7
> +1264,8 @@
> > static int __init amd_pstate_init(void)
> >  	/* capability check */
> >  	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> >  		pr_debug("AMD CPPC MSR based functionality is
> supported\n");
> > -		amd_pstate_driver.adjust_perf = amd_pstate_adjust_perf;
> > +		if (!cppc_active)
> > +			default_pstate_driver->adjust_perf =
> amd_pstate_adjust_perf;
> >  	} else {
> >  		pr_debug("AMD CPPC shared memory based functionality is
> supported\n");
> >  		static_call_update(amd_pstate_enable, cppc_enable); @@ -
> 652,6
> > +1273,10 @@ static int __init amd_pstate_init(void)
> >  		static_call_update(amd_pstate_update_perf,
> cppc_update_perf);
> >  	}
> >
> > +	cpudata = vzalloc(array_size(sizeof(void *), num_possible_cpus()));
> > +	if (!cpudata)
> > +		return -ENOMEM;
> > +	WRITE_ONCE(all_cpu_data, cpudata);
> >  	/* enable amd pstate feature */
> >  	ret = amd_pstate_enable(true);
> >  	if (ret) {
> > @@ -659,9 +1284,9 @@ static int __init amd_pstate_init(void)
> >  		return ret;
> >  	}
> >
> > -	ret = cpufreq_register_driver(&amd_pstate_driver);
> > +	ret = cpufreq_register_driver(default_pstate_driver);
> >  	if (ret)
> > -		pr_err("failed to register amd_pstate_driver with
> return %d\n",
> > +		pr_err("failed to register amd pstate driver with
> return %d\n",
> >  		       ret);
> >
> >  	return ret;
> > @@ -676,8 +1301,14 @@ static int __init amd_pstate_param(char *str)
> >  	if (!strcmp(str, "disable")) {
> >  		cppc_load = 0;
> >  		pr_info("driver is explicitly disabled\n");
> > -	} else if (!strcmp(str, "passive"))
> > +	} else if (!strcmp(str, "passive")) {
> >  		cppc_load = 1;
> > +		default_pstate_driver = &amd_pstate_driver;
> > +	} else if (!strcmp(str, "active")) {
> > +		cppc_active = 1;
> > +		cppc_load = 1;
> > +		default_pstate_driver = &amd_pstate_epp_driver;
> > +	}
> >
> >  	return 0;
> >  }
> > diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
> > index 1c4b8659f171..888af62040f1 100644
> > --- a/include/linux/amd-pstate.h
> > +++ b/include/linux/amd-pstate.h
> > @@ -25,6 +25,7 @@ struct amd_aperf_mperf {
> >  	u64 aperf;
> >  	u64 mperf;
> >  	u64 tsc;
> > +	u64 time;
> >  };
> >
> >  /**
> > @@ -47,6 +48,18 @@ struct amd_aperf_mperf {
> >   * @prev: Last Aperf/Mperf/tsc count value read from register
> >   * @freq: current cpu frequency value
> >   * @boost_supported: check whether the Processor or SBIOS supports
> > boost mode
> > + * @epp_powersave: Last saved CPPC energy performance preference
> > +				when policy switched to performance
> > + * @epp_policy: Last saved policy used to set energy-performance
> > +preference
> > + * @epp_cached: Cached CPPC energy-performance preference value
> > + * @policy: Cpufreq policy value
> > + * @sched_flags: Store scheduler flags for possible cross CPU update
> > + * @update_util_set: CPUFreq utility callback is set
> > + * @last_update: Time stamp of the last performance state update
> > + * @cppc_boost_min: Last CPPC boosted min performance state
> > + * @cppc_cap1_cached: Cached value of the last CPPC Capabilities MSR
> > + * @update_util: Cpufreq utility callback information
> > + * @sample: the stored performance sample
> >   *
> >   * The amd_cpudata is key private data for each CPU thread in AMD P-
> State, and
> >   * represents all the attributes and goals that AMD P-State requests at
> runtime.
> > @@ -72,6 +85,28 @@ struct amd_cpudata {
> >
> >  	u64	freq;
> >  	bool	boost_supported;
> > +
> > +	/* EPP feature related attributes*/
> > +	s16	epp_powersave;
> > +	s16	epp_policy;
> > +	s16	epp_cached;
> > +	u32	policy;
> > +	u32	sched_flags;
> > +	bool	update_util_set;
> > +	u64	last_update;
> > +	u64	last_io_update;
> > +	u32	cppc_boost_min;
> > +	u64	cppc_cap1_cached;
> > +	struct	update_util_data update_util;
> > +	struct	amd_aperf_mperf sample;
> > +};
> > +
> > +/**
> > + * struct amd_pstate_params - global parameters for the performance
> > +control
> > + * @ cppc_boost_disabled wheher the core performance boost disabled
> > +*/ struct amd_pstate_params {
> > +	bool cppc_boost_disabled;
> >  };
> >
> >  #endif /* _LINUX_AMD_PSTATE_H */
> > --
> > 2.34.1
> >

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v6 04/11] cpufreq: amd_pstate: implement Pstate EPP support for the AMD processors
  2022-12-02 17:36   ` Limonciello, Mario
@ 2022-12-08 15:51     ` Yuan, Perry
  0 siblings, 0 replies; 27+ messages in thread
From: Yuan, Perry @ 2022-12-08 15:51 UTC (permalink / raw)
  To: Limonciello, Mario, rafael.j.wysocki, Huang, Ray, viresh.kumar
  Cc: Sharma, Deepak, Fontenot, Nathan, Deucher, Alexander, Huang,
	Shimmer, Du, Xiaojian, Meng, Li (Jassmine),
	Karny, Wyes, linux-pm, linux-kernel

[AMD Official Use Only - General]

Hi Mario. 

> -----Original Message-----
> From: Limonciello, Mario <Mario.Limonciello@amd.com>
> Sent: Saturday, December 3, 2022 1:36 AM
> To: Yuan, Perry <Perry.Yuan@amd.com>; rafael.j.wysocki@intel.com; Huang,
> Ray <Ray.Huang@amd.com>; viresh.kumar@linaro.org
> Cc: Sharma, Deepak <Deepak.Sharma@amd.com>; Fontenot, Nathan
> <Nathan.Fontenot@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Huang, Shimmer
> <Shimmer.Huang@amd.com>; Du, Xiaojian <Xiaojian.Du@amd.com>; Meng,
> Li (Jassmine) <Li.Meng@amd.com>; Karny, Wyes <Wyes.Karny@amd.com>;
> linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v6 04/11] cpufreq: amd_pstate: implement Pstate EPP
> support for the AMD processors
> 
> On 12/2/2022 01:47, Perry Yuan wrote:
> > From: Perry Yuan <Perry.Yuan@amd.com>
> >
> > Add EPP driver support for AMD SoCs which support a dedicated MSR for
> > CPPC.  EPP is used by the DPM controller to configure the frequency
> > that a core operates at during short periods of activity.
> >
> > The SoC EPP targets are configured on a scale from 0 to 255 where 0
> > represents maximum performance and 255 represents maximum
> efficiency.
> >
> > The amd-pstate driver exports profile string names to userspace that
> > are tied to specific EPP values.
> >
> > The balance_performance string (0x80) provides the best balance for
> > efficiency versus power on most systems, but users can choose other
> > strings to meet their needs as well.
> >
> > $ cat
> >
> /sys/devices/system/cpu/cpufreq/policy0/energy_performance_available_
> p
> > references default performance balance_performance balance_power
> power
> >
> > $ cat
> >
> /sys/devices/system/cpu/cpufreq/policy0/energy_performance_preferenc
> e
> > balance_performance
> >
> > To enable the driver,it needs to add `amd_pstate=active` to kernel
> > command line and kernel will load the active mode epp driver
> >
> > Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> > ---
> >   drivers/cpufreq/amd-pstate.c | 643
> ++++++++++++++++++++++++++++++++++-
> >   include/linux/amd-pstate.h   |  35 ++
> >   2 files changed, 672 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/cpufreq/amd-pstate.c
> > b/drivers/cpufreq/amd-pstate.c index 204e39006dda..5989d4d25d0f
> 100644
> > --- a/drivers/cpufreq/amd-pstate.c
> > +++ b/drivers/cpufreq/amd-pstate.c
> > @@ -37,9 +37,12 @@
> >   #include <linux/uaccess.h>
> >   #include <linux/static_call.h>
> >   #include <linux/amd-pstate.h>
> > +#include <linux/cpufreq_common.h>
> >
> > +#ifdef CONFIG_ACPI
> >   #include <acpi/processor.h>
> >   #include <acpi/cppc_acpi.h>
> > +#endif
> >
> >   #include <asm/msr.h>
> >   #include <asm/processor.h>
> > @@ -59,9 +62,130 @@
> >    * we disable it by default to go acpi-cpufreq on these processors and add
> a
> >    * module parameter to be able to enable it manually for debugging.
> >    */
> > -static struct cpufreq_driver amd_pstate_driver;
> > +static bool cppc_active;
> >   static int cppc_load __initdata;
> >
> > +static struct cpufreq_driver *default_pstate_driver; static struct
> > +amd_cpudata **all_cpu_data; static struct amd_pstate_params
> > +global_params;
> > +
> > +static DEFINE_MUTEX(amd_pstate_limits_lock);
> > +static DEFINE_MUTEX(amd_pstate_driver_lock);
> > +
> > +static bool cppc_boost __read_mostly; struct kobject
> > +*amd_pstate_kobj;
> > +
> > +#ifdef CONFIG_ACPI_CPPC_LIB
> Rather than making this conditional I would think that amd-pstate should
> probably depend on or select CONFIG_ACPI_CPPC_LIB, no?

Yes, the Kconfig select CONFIG_ACPI_CPPC_LIB , I removed the checking in V7. 
Thanks for your review.

Perry.  

> 
> > +static s16 amd_pstate_get_epp(struct amd_cpudata *cpudata, u64
> > +cppc_req_cached) {
> > +	s16 epp;
> > +	struct cppc_perf_caps perf_caps;
> > +	int ret;
> > +
> > +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > +		if (!cppc_req_cached) {
> > +			epp = rdmsrl_on_cpu(cpudata->cpu,
> MSR_AMD_CPPC_REQ,
> > +					&cppc_req_cached);
> > +			if (epp)
> > +				return epp;
> > +		}
> > +		epp = (cppc_req_cached >> 24) & 0xFF;
> > +	} else {
> > +		ret = cppc_get_epp_caps(cpudata->cpu, &perf_caps);
> > +		if (ret < 0) {
> > +			pr_debug("Could not retrieve energy perf value
> (%d)\n", ret);
> > +			return -EIO;
> > +		}
> > +		epp = (s16) perf_caps.energy_perf;
> > +	}
> > +
> > +	return epp;
> > +}
> > +#endif
> > +
> > +static int amd_pstate_get_energy_pref_index(struct amd_cpudata
> > +*cpudata) {
> > +	s16 epp;
> > +	int index = -EINVAL;
> > +
> > +	epp = amd_pstate_get_epp(cpudata, 0);
> > +	if (epp < 0)
> > +		return epp;
> > +
> > +	switch (epp) {
> > +	case HWP_EPP_PERFORMANCE:
> > +		index = EPP_INDEX_PERFORMANCE;
> > +		break;
> > +	case HWP_EPP_BALANCE_PERFORMANCE:
> > +		index = EPP_INDEX_BALANCE_PERFORMANCE;
> > +		break;
> > +	case HWP_EPP_BALANCE_POWERSAVE:
> > +		index = EPP_INDEX_BALANCE_POWERSAVE;
> > +		break;
> > +	case HWP_EPP_POWERSAVE:
> > +		index = EPP_INDEX_POWERSAVE;
> > +		break;
> > +	default:
> > +			break;
> > +	}
> > +
> > +	return index;
> > +}
> > +
> > +#ifdef CONFIG_ACPI_CPPC_LIB
> > +static int amd_pstate_set_epp(struct amd_cpudata *cpudata, u32 epp) {
> > +	int ret;
> > +	struct cppc_perf_ctrls perf_ctrls;
> > +
> > +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > +		u64 value = READ_ONCE(cpudata->cppc_req_cached);
> > +
> > +		value &= ~GENMASK_ULL(31, 24);
> > +		value |= (u64)epp << 24;
> > +		WRITE_ONCE(cpudata->cppc_req_cached, value);
> > +
> > +		ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ,
> value);
> > +		if (!ret)
> > +			cpudata->epp_cached = epp;
> > +	} else {
> > +		perf_ctrls.energy_perf = epp;
> > +		ret = cppc_set_epp_perf(cpudata->cpu, &perf_ctrls, 1);
> > +		if (ret) {
> > +			pr_debug("failed to set energy perf value (%d)\n",
> ret);
> > +			return ret;
> > +		}
> > +		cpudata->epp_cached = epp;
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> > +static int amd_pstate_set_energy_pref_index(struct amd_cpudata
> *cpudata,
> > +		int pref_index)
> > +{
> > +	int epp = -EINVAL;
> > +	int ret;
> > +
> > +	if (!pref_index) {
> > +		pr_debug("EPP pref_index is invalid\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	if (epp == -EINVAL)
> > +		epp = epp_values[pref_index];
> > +
> > +	if (epp > 0 && cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> {
> > +		pr_debug("EPP cannot be set under performance policy\n");
> > +		return -EBUSY;
> > +	}
> > +
> > +	ret = amd_pstate_set_epp(cpudata, epp);
> > +
> > +	return ret;
> > +}
> > +#endif
> > +
> >   static inline int pstate_enable(bool enable)
> >   {
> >   	return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable); @@ -70,11
> +194,21
> > @@ static inline int pstate_enable(bool enable)
> >   static int cppc_enable(bool enable)
> >   {
> >   	int cpu, ret = 0;
> > +	struct cppc_perf_ctrls perf_ctrls;
> >
> >   	for_each_present_cpu(cpu) {
> >   		ret = cppc_set_enable(cpu, enable);
> >   		if (ret)
> >   			return ret;
> > +
> > +		/* Enable autonomous mode for EPP */
> > +		if (!cppc_active) {
> > +			/* Set desired perf as zero to allow EPP firmware
> control */
> > +			perf_ctrls.desired_perf = 0;
> > +			ret = cppc_set_perf(cpu, &perf_ctrls);
> > +			if (ret)
> > +				return ret;
> > +		}
> >   	}
> >
> >   	return ret;
> > @@ -417,7 +551,7 @@ static void amd_pstate_boost_init(struct
> amd_cpudata *cpudata)
> >   		return;
> >
> >   	cpudata->boost_supported = true;
> > -	amd_pstate_driver.boost_enabled = true;
> > +	default_pstate_driver->boost_enabled = true;
> >   }
> >
> >   static void amd_perf_ctl_reset(unsigned int cpu) @@ -591,10 +725,61
> > @@ static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy
> *policy,
> >   	return sprintf(&buf[0], "%u\n", perf);
> >   }
> >
> > +static ssize_t show_energy_performance_available_preferences(
> > +				struct cpufreq_policy *policy, char *buf) {
> > +	int i = 0;
> > +	int ret = 0;
> > +
> > +	while (energy_perf_strings[i] != NULL)
> > +		ret += sysfs_emit(buf, "%s", energy_perf_strings[i++]);
> > +
> > +	ret += sysfs_emit(buf, "\n");
> > +
> > +	return ret;
> > +}
> > +
> > +static ssize_t store_energy_performance_preference(
> > +		struct cpufreq_policy *policy, const char *buf, size_t count) {
> > +	struct amd_cpudata *cpudata = policy->driver_data;
> > +	char str_preference[21];
> > +	ssize_t ret;
> > +
> > +	ret = sscanf(buf, "%20s", str_preference);
> > +	if (ret != 1)
> > +		return -EINVAL;
> > +
> > +	ret = match_string(energy_perf_strings, -1, str_preference);
> > +	if (ret < 0)
> > +		return -EINVAL;
> > +
> > +	mutex_lock(&amd_pstate_limits_lock);
> > +	ret = amd_pstate_set_energy_pref_index(cpudata, ret);
> > +	mutex_unlock(&amd_pstate_limits_lock);
> > +
> > +	return ret ?: count;
> > +}
> > +
> > +static ssize_t show_energy_performance_preference(
> > +				struct cpufreq_policy *policy, char *buf) {
> > +	struct amd_cpudata *cpudata = policy->driver_data;
> > +	int preference;
> > +
> > +	preference = amd_pstate_get_energy_pref_index(cpudata);
> > +	if (preference < 0)
> > +		return preference;
> > +
> > +	return  sysfs_emit(buf, "%s\n", energy_perf_strings[preference]);
> 
> You've got an extra space after "return".

Fixed in V7. 

> 
> > +}
> > +
> >   cpufreq_freq_attr_ro(amd_pstate_max_freq);
> >   cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
> >
> >   cpufreq_freq_attr_ro(amd_pstate_highest_perf);
> > +cpufreq_freq_attr_rw(energy_performance_preference);
> > +cpufreq_freq_attr_ro(energy_performance_available_preferences);
> >
> >   static struct freq_attr *amd_pstate_attr[] = {
> >   	&amd_pstate_max_freq,
> > @@ -603,6 +788,429 @@ static struct freq_attr *amd_pstate_attr[] = {
> >   	NULL,
> >   };
> >
> > +static struct freq_attr *amd_pstate_epp_attr[] = {
> > +	&amd_pstate_max_freq,
> > +	&amd_pstate_lowest_nonlinear_freq,
> > +	&amd_pstate_highest_perf,
> > +	&energy_performance_preference,
> > +	&energy_performance_available_preferences,
> > +	NULL,
> > +};
> > +
> > +static inline void update_boost_state(void) {
> > +	u64 misc_en;
> > +	struct amd_cpudata *cpudata;
> > +
> > +	cpudata = all_cpu_data[0];
> > +	rdmsrl(MSR_K7_HWCR, misc_en);
> > +	global_params.cppc_boost_disabled = misc_en & BIT_ULL(25); }
> > +
> > +static bool amd_pstate_acpi_pm_profile_server(void)
> > +{
> > +	if (acpi_gbl_FADT.preferred_profile == PM_ENTERPRISE_SERVER ||
> > +	    acpi_gbl_FADT.preferred_profile == PM_PERFORMANCE_SERVER)
> > +		return true;
> > +
> > +	return false;
> > +}
> 
> Right now this is only used to control boost, but I wonder if this should instad
> also control the default EPP policy as all?

It should be able to control the EPP profile as well if needed.
Currently the default EPP profile is balance_performance for all CPUs.
If we want to set different profile for server CPU, there some are codes can be added here. 

> 
> > +
> > +static int amd_pstate_init_cpu(unsigned int cpunum) {
> > +	struct amd_cpudata *cpudata;
> > +
> > +	cpudata = all_cpu_data[cpunum];
> > +	if (!cpudata) {
> > +		cpudata = kzalloc(sizeof(*cpudata), GFP_KERNEL);
> > +		if (!cpudata)
> > +			return -ENOMEM;
> > +		WRITE_ONCE(all_cpu_data[cpunum], cpudata);
> > +
> > +		cpudata->cpu = cpunum;
> > +
> > +		if (cppc_active) {
> > +			if (amd_pstate_acpi_pm_profile_server())
> > +				cppc_boost = true;
> > +		}
> > +
> > +	}
> > +	cpudata->epp_powersave = -EINVAL;
> > +	cpudata->epp_policy = 0; > +	pr_debug("controlling: cpu %d\n",
> cpunum);
> > +	return 0;
> > +}
> > +
> > +static int __amd_pstate_cpu_init(struct cpufreq_policy *policy) {
> > +	int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
> > +	struct amd_cpudata *cpudata;
> > +	struct device *dev;
> > +	int rc;
> > +	u64 value;
> > +
> > +	rc = amd_pstate_init_cpu(policy->cpu);
> > +	if (rc)
> > +		return rc;
> > +
> > +	cpudata = all_cpu_data[policy->cpu];
> > +
> > +	dev = get_cpu_device(policy->cpu);
> > +	if (!dev)
> > +		goto free_cpudata1;
> > +
> > +	rc = amd_pstate_init_perf(cpudata);
> > +	if (rc)
> > +		goto free_cpudata1;
> > +
> > +	min_freq = amd_get_min_freq(cpudata);
> > +	max_freq = amd_get_max_freq(cpudata);
> > +	nominal_freq = amd_get_nominal_freq(cpudata);
> > +	lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
> > +	if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
> > +		dev_err(dev, "min_freq(%d) or max_freq(%d) value is
> incorrect\n",
> > +				min_freq, max_freq);
> > +		ret = -EINVAL;
> > +		goto free_cpudata1;
> > +	}
> > +
> > +	policy->min = min_freq;
> > +	policy->max = max_freq;
> > +
> > +	policy->cpuinfo.min_freq = min_freq;
> > +	policy->cpuinfo.max_freq = max_freq;
> > +	/* It will be updated by governor */
> > +	policy->cur = policy->cpuinfo.min_freq;
> > +
> > +	/* Initial processor data capability frequencies */
> > +	cpudata->max_freq = max_freq;
> > +	cpudata->min_freq = min_freq;
> > +	cpudata->nominal_freq = nominal_freq;
> > +	cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
> > +
> > +	policy->driver_data = cpudata;
> > +
> > +	update_boost_state();
> > +	cpudata->epp_cached = amd_pstate_get_epp(cpudata, value);
> > +
> > +	policy->min = policy->cpuinfo.min_freq;
> > +	policy->max = policy->cpuinfo.max_freq;
> > +
> > +	if (boot_cpu_has(X86_FEATURE_CPPC))
> > +		policy->fast_switch_possible = true;
> > +
> > +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > +		ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ,
> &value);
> > +		if (ret)
> > +			return ret;
> > +		WRITE_ONCE(cpudata->cppc_req_cached, value);
> > +
> > +		ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1,
> &value);
> > +		if (ret)
> > +			return ret;
> > +		WRITE_ONCE(cpudata->cppc_cap1_cached, value);
> > +	}
> > +	amd_pstate_boost_init(cpudata);
> > +
> > +	return 0;
> > +
> > +free_cpudata1:
> > +	kfree(cpudata);
> > +	return ret;
> > +}
> > +
> > +static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy) {
> > +	int ret;
> > +
> > +	ret = __amd_pstate_cpu_init(policy);
> > +	if (ret)
> > +		return ret;
> > +	/*
> > +	 * Set the policy to powersave to provide a valid fallback value in case
> > +	 * the default cpufreq governor is neither powersave nor
> performance.
> > +	 */
> > +	policy->policy = CPUFREQ_POLICY_POWERSAVE;
> > +
> > +	return 0;
> > +}
> > +
> > +static int amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy) {
> > +	pr_debug("CPU %d exiting\n", policy->cpu);
> > +	policy->fast_switch_possible = false;
> > +	return 0;
> > +}
> > +
> > +static void amd_pstate_update_max_freq(unsigned int cpu) {
> > +	struct cpufreq_policy *policy = policy = cpufreq_cpu_get(cpu);
> > +
> > +	if (!policy)
> > +		return;
> > +
> > +	refresh_frequency_limits(policy);
> > +	cpufreq_cpu_put(policy);
> > +}
> > +
> > +static void amd_pstate_epp_update_limits(unsigned int cpu) {
> > +	mutex_lock(&amd_pstate_driver_lock);
> > +	update_boost_state();
> > +	if (global_params.cppc_boost_disabled) {
> > +		for_each_possible_cpu(cpu)
> > +			amd_pstate_update_max_freq(cpu);
> > +	} else {
> > +		cpufreq_update_policy(cpu);
> > +	}
> > +	mutex_unlock(&amd_pstate_driver_lock);
> > +}
> > +
> > +static int cppc_boost_hold_time_ns = 3 * NSEC_PER_MSEC;
> > +
> > +static inline void amd_pstate_boost_up(struct amd_cpudata *cpudata) {
> > +	u64 hwp_req = READ_ONCE(cpudata->cppc_req_cached);
> > +	u64 hwp_cap = READ_ONCE(cpudata->cppc_cap1_cached);
> > +	u32 max_limit = (hwp_req & 0xff);
> > +	u32 min_limit = (hwp_req & 0xff00) >> 8;
> > +	u32 boost_level1;
> > +
> > +	/* If max and min are equal or already at max, nothing to boost */
> > +	if (max_limit == min_limit)
> > +		return;
> > +
> > +	/* Set boost max and min to initial value */
> > +	if (!cpudata->cppc_boost_min)
> > +		cpudata->cppc_boost_min = min_limit;
> > +
> > +	boost_level1 = ((AMD_CPPC_NOMINAL_PERF(hwp_cap) +
> min_limit) >> 1);
> > +
> > +	if (cpudata->cppc_boost_min < boost_level1)
> > +		cpudata->cppc_boost_min = boost_level1;
> > +	else if (cpudata->cppc_boost_min <
> AMD_CPPC_NOMINAL_PERF(hwp_cap))
> > +		cpudata->cppc_boost_min =
> AMD_CPPC_NOMINAL_PERF(hwp_cap);
> > +	else if (cpudata->cppc_boost_min ==
> AMD_CPPC_NOMINAL_PERF(hwp_cap))
> > +		cpudata->cppc_boost_min = max_limit;
> > +	else
> > +		return;
> > +
> > +	hwp_req &= ~AMD_CPPC_MIN_PERF(~0L);
> > +	hwp_req |= AMD_CPPC_MIN_PERF(cpudata->cppc_boost_min);
> > +	wrmsrl(MSR_AMD_CPPC_REQ, hwp_req);
> > +	cpudata->last_update = cpudata->sample.time; }
> > +
> > +static inline void amd_pstate_boost_down(struct amd_cpudata *cpudata)
> > +{
> > +	bool expired;
> > +
> > +	if (cpudata->cppc_boost_min) {
> > +		expired = time_after64(cpudata->sample.time, cpudata-
> >last_update +
> > +					cppc_boost_hold_time_ns);
> > +
> > +		if (expired) {
> > +			wrmsrl(MSR_AMD_CPPC_REQ, cpudata-
> >cppc_req_cached);
> > +			cpudata->cppc_boost_min = 0;
> > +		}
> > +	}
> > +
> > +	cpudata->last_update = cpudata->sample.time; }
> > +
> > +static inline void amd_pstate_boost_update_util(struct amd_cpudata
> *cpudata,
> > +						      u64 time)
> > +{
> > +	cpudata->sample.time = time;
> > +	if (smp_processor_id() != cpudata->cpu)
> > +		return;
> > +
> > +	if (cpudata->sched_flags & SCHED_CPUFREQ_IOWAIT) {
> > +		bool do_io = false;
> > +
> > +		cpudata->sched_flags = 0;
> > +		/*
> > +		 * Set iowait_boost flag and update time. Since IO WAIT flag
> > +		 * is set all the time, we can't just conclude that there is
> > +		 * some IO bound activity is scheduled on this CPU with just
> > +		 * one occurrence. If we receive at least two in two
> > +		 * consecutive ticks, then we treat as boost candidate.
> > +		 * This is leveraged from Intel Pstate driver.
> > +		 */
> > +		if (time_before64(time, cpudata->last_io_update + 2 *
> TICK_NSEC))
> > +			do_io = true;
> > +
> > +		cpudata->last_io_update = time;
> > +
> > +		if (do_io)
> > +			amd_pstate_boost_up(cpudata);
> > +
> > +	} else {
> > +		amd_pstate_boost_down(cpudata);
> > +	}
> > +}
> > +
> > +static inline void amd_pstate_cppc_update_hook(struct update_util_data
> *data,
> > +						u64 time, unsigned int flags)
> > +{
> > +	struct amd_cpudata *cpudata = container_of(data,
> > +				struct amd_cpudata, update_util);
> > +
> > +	cpudata->sched_flags |= flags;
> > +
> > +	if (smp_processor_id() == cpudata->cpu)
> > +		amd_pstate_boost_update_util(cpudata, time); }
> > +
> > +static void amd_pstate_clear_update_util_hook(unsigned int cpu) {
> > +	struct amd_cpudata *cpudata = all_cpu_data[cpu];
> > +
> > +	if (!cpudata->update_util_set)
> > +		return;
> > +
> > +	cpufreq_remove_update_util_hook(cpu);
> > +	cpudata->update_util_set = false;
> > +	synchronize_rcu();
> > +}
> > +
> > +static void amd_pstate_set_update_util_hook(unsigned int cpu_num) {
> > +	struct amd_cpudata *cpudata = all_cpu_data[cpu_num];
> > +
> > +	if (!cppc_boost) {
> > +		if (cpudata->update_util_set)
> > +			amd_pstate_clear_update_util_hook(cpudata->cpu);
> > +		return;
> > +	}
> > +
> > +	if (cpudata->update_util_set)
> > +		return;
> > +
> > +	cpudata->sample.time = 0;
> > +	cpufreq_add_update_util_hook(cpu_num, &cpudata->update_util,
> > +
> 	amd_pstate_cppc_update_hook);
> > +	cpudata->update_util_set = true;
> > +}
> > +
> > +static void amd_pstate_epp_init(unsigned int cpu) {
> > +	struct amd_cpudata *cpudata = all_cpu_data[cpu];
> > +	u32 max_perf, min_perf;
> > +	u64 value;
> > +	s16 epp;
> > +	int ret;
> > +
> > +	max_perf = READ_ONCE(cpudata->highest_perf);
> > +	min_perf = READ_ONCE(cpudata->lowest_perf);
> > +
> > +	value = READ_ONCE(cpudata->cppc_req_cached);
> > +
> > +	if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> > +		min_perf = max_perf;
> > +
> > +	/* Initial min/max values for CPPC Performance Controls Register */
> > +	value &= ~AMD_CPPC_MIN_PERF(~0L);
> > +	value |= AMD_CPPC_MIN_PERF(min_perf);
> > +
> > +	value &= ~AMD_CPPC_MAX_PERF(~0L);
> > +	value |= AMD_CPPC_MAX_PERF(max_perf);
> > +
> > +	/* CPPC EPP feature require to set zero to the desire perf bit */
> > +	value &= ~AMD_CPPC_DES_PERF(~0L);
> > +	value |= AMD_CPPC_DES_PERF(0);
> > +
> > +	if (cpudata->epp_policy == cpudata->policy)
> > +		goto skip_epp;
> > +
> > +	cpudata->epp_policy = cpudata->policy;
> > +
> > +	if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> > +		epp = amd_pstate_get_epp(cpudata, value);
> > +		cpudata->epp_powersave = epp;
> > +		if (epp < 0)
> > +			goto skip_epp;
> > +		/* force the epp value to be zero for performance policy */
> > +		epp = 0;
> > +	} else {
> > +		if (cpudata->epp_powersave < 0)
> > +			goto skip_epp;
> > +		/* Get BIOS pre-defined epp value */
> > +		epp = amd_pstate_get_epp(cpudata, value);
> > +		if (epp)
> > +			goto skip_epp;
> > +		epp = cpudata->epp_powersave;
> > +	}
> > +	/* Set initial EPP value */
> > +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > +		value &= ~GENMASK_ULL(31, 24);
> > +		value |= (u64)epp << 24;
> > +	}
> > +
> > +skip_epp:
> > +	WRITE_ONCE(cpudata->cppc_req_cached, value);
> > +	ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
> > +	if (!ret)
> > +		cpudata->epp_cached = epp;
> > +}
> > +
> > +static void amd_pstate_set_max_limits(struct amd_cpudata *cpudata) {
> > +	u64 hwp_cap = READ_ONCE(cpudata->cppc_cap1_cached);
> > +	u64 hwp_req = READ_ONCE(cpudata->cppc_req_cached);
> > +	u32 max_limit = (hwp_cap >> 24) & 0xff;
> > +
> > +	hwp_req &= ~AMD_CPPC_MIN_PERF(~0L);
> > +	hwp_req |= AMD_CPPC_MIN_PERF(max_limit);
> > +	wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, hwp_req); }
> > +
> > +static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy) {
> > +	struct amd_cpudata *cpudata;
> > +
> > +	if (!policy->cpuinfo.max_freq)
> > +		return -ENODEV;
> > +
> > +	pr_debug("set_policy: cpuinfo.max %u policy->max %u\n",
> > +				policy->cpuinfo.max_freq, policy->max);
> > +
> > +	cpudata = all_cpu_data[policy->cpu];
> > +	cpudata->policy = policy->policy;
> > +
> > +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > +		mutex_lock(&amd_pstate_limits_lock);
> > +
> > +		if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> > +			amd_pstate_clear_update_util_hook(policy->cpu);
> > +			amd_pstate_set_max_limits(cpudata);
> > +		} else {
> > +			amd_pstate_set_update_util_hook(policy->cpu);
> > +		}
> > +
> > +		if (boot_cpu_has(X86_FEATURE_CPPC))
> > +			amd_pstate_epp_init(policy->cpu);
> > +
> > +		mutex_unlock(&amd_pstate_limits_lock);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static void amd_pstate_verify_cpu_policy(struct amd_cpudata *cpudata,
> > +					   struct cpufreq_policy_data *policy)
> {
> > +	update_boost_state();
> > +	cpufreq_verify_within_cpu_limits(policy);
> > +}
> > +
> > +static int amd_pstate_epp_verify_policy(struct cpufreq_policy_data
> > +*policy) {
> > +	amd_pstate_verify_cpu_policy(all_cpu_data[policy->cpu], policy);
> > +	pr_debug("policy_max =%d, policy_min=%d\n", policy->max, policy-
> >min);
> > +	return 0;
> > +}
> > +
> >   static struct cpufreq_driver amd_pstate_driver = {
> >   	.flags		= CPUFREQ_CONST_LOOPS |
> CPUFREQ_NEED_UPDATE_LIMITS,
> >   	.verify		= amd_pstate_verify,
> > @@ -616,8 +1224,20 @@ static struct cpufreq_driver amd_pstate_driver =
> {
> >   	.attr		= amd_pstate_attr,
> >   };
> >
> > +static struct cpufreq_driver amd_pstate_epp_driver = {
> > +	.flags		= CPUFREQ_CONST_LOOPS,
> > +	.verify		= amd_pstate_epp_verify_policy,
> > +	.setpolicy	= amd_pstate_epp_set_policy,
> > +	.init		= amd_pstate_epp_cpu_init,
> > +	.exit		= amd_pstate_epp_cpu_exit,
> > +	.update_limits	= amd_pstate_epp_update_limits,
> > +	.name		= "amd_pstate_epp",
> > +	.attr		= amd_pstate_epp_attr,
> > +};
> > +
> >   static int __init amd_pstate_init(void)
> >   {
> > +	static struct amd_cpudata **cpudata;
> >   	int ret;
> >
> >   	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD) @@ -644,7
> +1264,8
> > @@ static int __init amd_pstate_init(void)
> >   	/* capability check */
> >   	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> >   		pr_debug("AMD CPPC MSR based functionality is
> supported\n");
> > -		amd_pstate_driver.adjust_perf = amd_pstate_adjust_perf;
> > +		if (!cppc_active)
> > +			default_pstate_driver->adjust_perf =
> amd_pstate_adjust_perf;
> >   	} else {
> >   		pr_debug("AMD CPPC shared memory based functionality is
> supported\n");
> >   		static_call_update(amd_pstate_enable, cppc_enable); @@ -
> 652,6
> > +1273,10 @@ static int __init amd_pstate_init(void)
> >   		static_call_update(amd_pstate_update_perf,
> cppc_update_perf);
> >   	}
> >
> > +	cpudata = vzalloc(array_size(sizeof(void *), num_possible_cpus()));
> > +	if (!cpudata)
> > +		return -ENOMEM;
> > +	WRITE_ONCE(all_cpu_data, cpudata);
> >   	/* enable amd pstate feature */
> >   	ret = amd_pstate_enable(true);
> >   	if (ret) {
> > @@ -659,9 +1284,9 @@ static int __init amd_pstate_init(void)
> >   		return ret;
> >   	}
> >
> > -	ret = cpufreq_register_driver(&amd_pstate_driver);
> > +	ret = cpufreq_register_driver(default_pstate_driver);
> >   	if (ret)
> > -		pr_err("failed to register amd_pstate_driver with
> return %d\n",
> > +		pr_err("failed to register amd pstate driver with
> return %d\n",
> >   		       ret);
> >
> >   	return ret;
> > @@ -676,8 +1301,14 @@ static int __init amd_pstate_param(char *str)
> >   	if (!strcmp(str, "disable")) {
> >   		cppc_load = 0;
> >   		pr_info("driver is explicitly disabled\n");
> > -	} else if (!strcmp(str, "passive"))
> > +	} else if (!strcmp(str, "passive")) {
> >   		cppc_load = 1;
> > +		default_pstate_driver = &amd_pstate_driver;
> > +	} else if (!strcmp(str, "active")) {
> > +		cppc_active = 1;
> > +		cppc_load = 1;
> > +		default_pstate_driver = &amd_pstate_epp_driver;
> > +	}
> >
> >   	return 0;
> >   }
> > diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
> > index 1c4b8659f171..888af62040f1 100644
> > --- a/include/linux/amd-pstate.h
> > +++ b/include/linux/amd-pstate.h
> > @@ -25,6 +25,7 @@ struct amd_aperf_mperf {
> >   	u64 aperf;
> >   	u64 mperf;
> >   	u64 tsc;
> > +	u64 time;
> >   };
> >
> >   /**
> > @@ -47,6 +48,18 @@ struct amd_aperf_mperf {
> >    * @prev: Last Aperf/Mperf/tsc count value read from register
> >    * @freq: current cpu frequency value
> >    * @boost_supported: check whether the Processor or SBIOS supports
> > boost mode
> > + * @epp_powersave: Last saved CPPC energy performance preference
> > +				when policy switched to performance
> > + * @epp_policy: Last saved policy used to set energy-performance
> > +preference
> > + * @epp_cached: Cached CPPC energy-performance preference value
> > + * @policy: Cpufreq policy value
> > + * @sched_flags: Store scheduler flags for possible cross CPU update
> > + * @update_util_set: CPUFreq utility callback is set
> > + * @last_update: Time stamp of the last performance state update
> > + * @cppc_boost_min: Last CPPC boosted min performance state
> > + * @cppc_cap1_cached: Cached value of the last CPPC Capabilities MSR
> > + * @update_util: Cpufreq utility callback information
> > + * @sample: the stored performance sample
> >    *
> >    * The amd_cpudata is key private data for each CPU thread in AMD P-
> State, and
> >    * represents all the attributes and goals that AMD P-State requests at
> runtime.
> > @@ -72,6 +85,28 @@ struct amd_cpudata {
> >
> >   	u64	freq;
> >   	bool	boost_supported;
> > +
> > +	/* EPP feature related attributes*/
> > +	s16	epp_powersave;
> > +	s16	epp_policy;
> > +	s16	epp_cached;
> > +	u32	policy;
> > +	u32	sched_flags;
> > +	bool	update_util_set;
> > +	u64	last_update;
> > +	u64	last_io_update;
> > +	u32	cppc_boost_min;
> > +	u64	cppc_cap1_cached;
> > +	struct	update_util_data update_util;
> > +	struct	amd_aperf_mperf sample;
> > +};
> > +
> > +/**
> > + * struct amd_pstate_params - global parameters for the performance
> > +control
> > + * @ cppc_boost_disabled wheher the core performance boost disabled
> > +*/ struct amd_pstate_params {
> > +	bool cppc_boost_disabled;
> >   };
> >
> >   #endif /* _LINUX_AMD_PSTATE_H */

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v6 08/11] cpufreq: amd_pstate: add driver working mode status sysfs entry
  2022-12-02 15:57   ` Limonciello, Mario
@ 2022-12-08 15:55     ` Yuan, Perry
  0 siblings, 0 replies; 27+ messages in thread
From: Yuan, Perry @ 2022-12-08 15:55 UTC (permalink / raw)
  To: Limonciello, Mario, rafael.j.wysocki, Huang, Ray, viresh.kumar
  Cc: Sharma, Deepak, Fontenot, Nathan, Deucher, Alexander, Huang,
	Shimmer, Du, Xiaojian, Meng, Li (Jassmine),
	Karny, Wyes, linux-pm, linux-kernel

[AMD Official Use Only - General]

Hi Mario.

> -----Original Message-----
> From: Limonciello, Mario <Mario.Limonciello@amd.com>
> Sent: Friday, December 2, 2022 11:57 PM
> To: Yuan, Perry <Perry.Yuan@amd.com>; rafael.j.wysocki@intel.com; Huang,
> Ray <Ray.Huang@amd.com>; viresh.kumar@linaro.org
> Cc: Sharma, Deepak <Deepak.Sharma@amd.com>; Fontenot, Nathan
> <Nathan.Fontenot@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Huang, Shimmer
> <Shimmer.Huang@amd.com>; Du, Xiaojian <Xiaojian.Du@amd.com>; Meng,
> Li (Jassmine) <Li.Meng@amd.com>; Karny, Wyes <Wyes.Karny@amd.com>;
> linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v6 08/11] cpufreq: amd_pstate: add driver working
> mode status sysfs entry
> 
> On 12/2/2022 01:47, Perry Yuan wrote:
> > From: Perry Yuan <Perry.Yuan@amd.com>
> >
> > While amd-pstate driver was loaded with specific driver mode, it will
> 
> The *driver* doesn't need to check which mode is enabled from the sysfs
> file, but userspace may want to check this.  I think you should reword this
> accordingly in the commit message.
> 
> > need to check which mode is enabled for the pstate driver,add this
> > sysfs entry to show the current status
> >
> > $ cat /sys/devices/system/cpu/amd-pstate/status
> > active
> >
> > Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> > ---
> >   drivers/cpufreq/amd-pstate.c | 44
> ++++++++++++++++++++++++++++++++++++
> >   1 file changed, 44 insertions(+)
> >
> > diff --git a/drivers/cpufreq/amd-pstate.c
> > b/drivers/cpufreq/amd-pstate.c index 6936df6e8642..7f748a579023 100644
> > --- a/drivers/cpufreq/amd-pstate.c
> > +++ b/drivers/cpufreq/amd-pstate.c
> > @@ -66,6 +66,8 @@ static bool cppc_active;
> >   static int cppc_load __initdata;
> >
> >   static struct cpufreq_driver *default_pstate_driver;
> > +static struct cpufreq_driver amd_pstate_epp_driver; static struct
> > +cpufreq_driver amd_pstate_driver;
> >   static struct amd_cpudata **all_cpu_data;
> >   static struct amd_pstate_params global_params;
> >
> > @@ -807,6 +809,46 @@ static ssize_t store_boost(struct kobject *a,
> >   	return count;
> >   }
> >
> > +static ssize_t amd_pstate_show_status(char *buf) {
> > +	if (!default_pstate_driver)
> > +		return sysfs_emit(buf, "off\n");
> > +
> > +	return sysfs_emit(buf, "%s\n", default_pstate_driver ==
> &amd_pstate_epp_driver ?
> > +					"active" : "passive");
> > +}
> > +
> > +static int amd_pstate_update_status(const char *buf, size_t size) {
> > +	/* FIXME! */
> > +	return -EOPNOTSUPP;
> > +}
> 
> Why not just fix this as part of the series?  It should be no different than
> unloading the driver and reloading it in the appropriate mode, right?

There is one kernel hang issue for the amd-pstate driver when unloading it before this. 
I have added one fix patch for this in V7, and added the driver mode switching code as well.
The driver mode can be switched now.

Perry.

> 
> Is this going to be a short term FIXME/TODO or a long term?  If it's long term,
> it might be better to just make the attribute ro and and have just the show
> callback.  Then when you're ready to make it rw you can add the store
> callback, change it from ro to rw and update documentation at that time.
> 

Fixed in V7. It is working well now. 

> > +
> > +static ssize_t show_status(struct kobject *kobj,
> > +			   struct kobj_attribute *attr, char *buf)
> > +{
> > +	ssize_t ret;
> > +
> > +	mutex_lock(&amd_pstate_driver_lock);
> > +	ret = amd_pstate_show_status(buf);
> > +	mutex_unlock(&amd_pstate_driver_lock);
> > +
> > +	return ret;
> > +}
> > +
> > +static ssize_t store_status(struct kobject *a, struct kobj_attribute *b,
> > +			    const char *buf, size_t count)
> > +{
> > +	char *p = memchr(buf, '\n', count);
> > +	int ret;
> > +
> > +	mutex_lock(&amd_pstate_driver_lock);
> > +	ret = amd_pstate_update_status(buf, p ? p - buf : count);
> > +	mutex_unlock(&amd_pstate_driver_lock);
> > +
> > +	return ret < 0 ? ret : count;
> > +}
> > +
> >   cpufreq_freq_attr_ro(amd_pstate_max_freq);
> >   cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
> >
> > @@ -814,6 +856,7 @@ cpufreq_freq_attr_ro(amd_pstate_highest_perf);
> >   cpufreq_freq_attr_rw(energy_performance_preference);
> >   cpufreq_freq_attr_ro(energy_performance_available_preferences);
> >   define_one_global_rw(boost);
> > +define_one_global_rw(status);
> >
> >   static struct freq_attr *amd_pstate_attr[] = {
> >   	&amd_pstate_max_freq,
> > @@ -833,6 +876,7 @@ static struct freq_attr *amd_pstate_epp_attr[] = {
> >
> >   static struct attribute *pstate_global_attributes[] = {
> >   	&boost.attr,
> > +	&status.attr,
> 
> In the series I didn't see a matching Documentation update for the new
> sysfs file, can you please add one?

Added in V7.
Please help to take a look

> 
> >   	NULL
> >   };
> >

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v6 03/11] cpufreq: intel_pstate: use common macro definition for Energy Preference Performance(EPP)
  2022-12-08 14:40         ` Yuan, Perry
@ 2022-12-08 16:46           ` Limonciello, Mario
  0 siblings, 0 replies; 27+ messages in thread
From: Limonciello, Mario @ 2022-12-08 16:46 UTC (permalink / raw)
  To: Yuan, Perry, Huang, Ray
  Cc: rafael.j.wysocki, viresh.kumar, Sharma, Deepak, Fontenot, Nathan,
	Deucher, Alexander, Huang, Shimmer, Du, Xiaojian, Meng,
	Li (Jassmine),
	Karny, Wyes, linux-pm, linux-kernel

[Public]



> -----Original Message-----
> From: Yuan, Perry <Perry.Yuan@amd.com>
> Sent: Thursday, December 8, 2022 08:40
> To: Limonciello, Mario <Mario.Limonciello@amd.com>; Huang, Ray
> <Ray.Huang@amd.com>
> Cc: rafael.j.wysocki@intel.com; viresh.kumar@linaro.org; Sharma, Deepak
> <Deepak.Sharma@amd.com>; Fontenot, Nathan
> <Nathan.Fontenot@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Huang, Shimmer
> <Shimmer.Huang@amd.com>; Du, Xiaojian <Xiaojian.Du@amd.com>; Meng,
> Li (Jassmine) <Li.Meng@amd.com>; Karny, Wyes <Wyes.Karny@amd.com>;
> linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: RE: [PATCH v6 03/11] cpufreq: intel_pstate: use common macro
> definition for Energy Preference Performance(EPP)
> 
> [Public]
> 
> Hi Mario.
> 
> > -----Original Message-----
> > From: Limonciello, Mario <Mario.Limonciello@amd.com>
> > Sent: Tuesday, December 6, 2022 12:45 AM
> > To: Yuan, Perry <Perry.Yuan@amd.com>; Huang, Ray
> <Ray.Huang@amd.com>
> > Cc: rafael.j.wysocki@intel.com; viresh.kumar@linaro.org; Sharma, Deepak
> > <Deepak.Sharma@amd.com>; Fontenot, Nathan
> > <Nathan.Fontenot@amd.com>; Deucher, Alexander
> > <Alexander.Deucher@amd.com>; Huang, Shimmer
> > <Shimmer.Huang@amd.com>; Du, Xiaojian <Xiaojian.Du@amd.com>;
> Meng,
> > Li (Jassmine) <Li.Meng@amd.com>; Karny, Wyes
> <Wyes.Karny@amd.com>;
> > linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org
> > Subject: RE: [PATCH v6 03/11] cpufreq: intel_pstate: use common macro
> > definition for Energy Preference Performance(EPP)
> >
> > [Public]
> >
> >
> >
> > > -----Original Message-----
> > > From: Yuan, Perry <Perry.Yuan@amd.com>
> > > Sent: Monday, December 5, 2022 08:41
> > > To: Huang, Ray <Ray.Huang@amd.com>
> > > Cc: rafael.j.wysocki@intel.com; Limonciello, Mario
> > > <Mario.Limonciello@amd.com>; viresh.kumar@linaro.org; Sharma,
> Deepak
> > > <Deepak.Sharma@amd.com>; Fontenot, Nathan
> > <Nathan.Fontenot@amd.com>;
> > > Deucher, Alexander <Alexander.Deucher@amd.com>; Huang, Shimmer
> > > <Shimmer.Huang@amd.com>; Du, Xiaojian <Xiaojian.Du@amd.com>;
> > Meng, Li
> > > (Jassmine) <Li.Meng@amd.com>; Karny, Wyes
> <Wyes.Karny@amd.com>;
> > > linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org
> > > Subject: RE: [PATCH v6 03/11] cpufreq: intel_pstate: use common macro
> > > definition for Energy Preference Performance(EPP)
> > >
> > > [AMD Official Use Only - General]
> > >
> > > Hi Ray.
> > >
> > > > -----Original Message-----
> > > > From: Huang, Ray <Ray.Huang@amd.com>
> > > > Sent: Monday, December 5, 2022 7:49 PM
> > > > To: Yuan, Perry <Perry.Yuan@amd.com>
> > > > Cc: rafael.j.wysocki@intel.com; Limonciello, Mario
> > > > <Mario.Limonciello@amd.com>; viresh.kumar@linaro.org; Sharma,
> > Deepak
> > > > <Deepak.Sharma@amd.com>; Fontenot, Nathan
> > <Nathan.Fontenot@amd.com>;
> > > > Deucher, Alexander <Alexander.Deucher@amd.com>; Huang,
> Shimmer
> > > > <Shimmer.Huang@amd.com>; Du, Xiaojian <Xiaojian.Du@amd.com>;
> > > Meng,
> > > > Li (Jassmine) <Li.Meng@amd.com>; Karny, Wyes
> > > <Wyes.Karny@amd.com>;
> > > > linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org
> > > > Subject: Re: [PATCH v6 03/11] cpufreq: intel_pstate: use common
> > > > macro definition for Energy Preference Performance(EPP)
> > > >
> > > > On Fri, Dec 02, 2022 at 03:47:11PM +0800, Yuan, Perry wrote:
> > > > > make the energy preference performance strings and profiles using
> > > > > one common header for intel_pstate driver, then the amd_pstate
> epp
> > > > > driver can use the common header as well. This will simpify the
> > > > > intel_pstate and amd_pstate driver.
> > > > >
> > > > > Signed-off-by: Perry Yuan <perry.yuan@amd.com>
> > > > > ---
> > > > >  arch/x86/include/asm/msr-index.h |  4 ---
> > > > >  drivers/cpufreq/intel_pstate.c   | 37 +--------------------
> > > > >  include/linux/cpufreq_common.h   | 56
> > > > ++++++++++++++++++++++++++++++++
> > > >
> > > > I don't find any specific reason why you have to use another common
> > > > cpufreq_common header instead of include/linux/cpufreq.h.
> > > >
> > > > Thanks,
> > > > Ray
> > >
> > > That is fine for me to use the cpufreq.h to store the common vars.
> > > I will move the declaration to that header file.
> > >
> > > Thanks for the review.
> > >
> > > Perry.
> > >
> > > >
> > > > >  3 files changed, 57 insertions(+), 40 deletions(-)  create mode
> > > > > 100644 include/linux/cpufreq_common.h
> > > > >
> > > > > diff --git a/arch/x86/include/asm/msr-index.h
> > > > > b/arch/x86/include/asm/msr-index.h
> > > > > index 4a2af82553e4..3983378cff5b 100644
> > > > > --- a/arch/x86/include/asm/msr-index.h
> > > > > +++ b/arch/x86/include/asm/msr-index.h
> > > > > @@ -472,10 +472,6 @@
> > > > >  #define HWP_MAX_PERF(x) 		((x & 0xff) << 8)
> > > > >  #define HWP_DESIRED_PERF(x)		((x & 0xff) << 16)
> > > > >  #define HWP_ENERGY_PERF_PREFERENCE(x)	(((unsigned long long)
> > > > x & 0xff) << 24)
> > > > > -#define HWP_EPP_PERFORMANCE		0x00
> > > > > -#define HWP_EPP_BALANCE_PERFORMANCE	0x80
> > > > > -#define HWP_EPP_BALANCE_POWERSAVE	0xC0
> > > > > -#define HWP_EPP_POWERSAVE		0xFF
> > > > >  #define HWP_ACTIVITY_WINDOW(x)		((unsigned long
> > > > long)(x & 0xff3) << 32)
> > > > >  #define HWP_PACKAGE_CONTROL(x)		((unsigned long
> > > > long)(x & 0x1) << 42)
> > > > >
> > > > > diff --git a/drivers/cpufreq/intel_pstate.c
> > > > > b/drivers/cpufreq/intel_pstate.c index ad9be31753b6..65036ca21719
> > > > > 100644
> > > > > --- a/drivers/cpufreq/intel_pstate.c
> > > > > +++ b/drivers/cpufreq/intel_pstate.c
> > > > > @@ -25,6 +25,7 @@
> > > > >  #include <linux/acpi.h>
> > > > >  #include <linux/vmalloc.h>
> > > > >  #include <linux/pm_qos.h>
> > > > > +#include <linux/cpufreq_common.h>
> > > > >  #include <trace/events/power.h>
> > > > >
> > > > >  #include <asm/cpu.h>
> > > > > @@ -628,42 +629,6 @@ static int intel_pstate_set_epb(int cpu, s16
> pref)
> > > > >  	return 0;
> > > > >  }
> > > > >
> > > > > -/*
> > > > > - * EPP/EPB display strings corresponding to EPP index in the
> > > > > - * energy_perf_strings[]
> > > > > - *	index		String
> > > > > - *-------------------------------------
> > > > > - *	0		default
> > > > > - *	1		performance
> > > > > - *	2		balance_performance
> > > > > - *	3		balance_power
> > > > > - *	4		power
> > > > > - */
> > > > > -
> > > > > -enum energy_perf_value_index {
> > > > > -	EPP_INDEX_DEFAULT = 0,
> > > > > -	EPP_INDEX_PERFORMANCE,
> > > > > -	EPP_INDEX_BALANCE_PERFORMANCE,
> > > > > -	EPP_INDEX_BALANCE_POWERSAVE,
> > > > > -	EPP_INDEX_POWERSAVE,
> > > > > -};
> > > > > -
> > > > > -static const char * const energy_perf_strings[] = {
> > > > > -	[EPP_INDEX_DEFAULT] = "default",
> > > > > -	[EPP_INDEX_PERFORMANCE] = "performance",
> > > > > -	[EPP_INDEX_BALANCE_PERFORMANCE] =
> "balance_performance",
> > > > > -	[EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
> > > > > -	[EPP_INDEX_POWERSAVE] = "power",
> > > > > -	NULL
> > > > > -};
> > > > > -static unsigned int epp_values[] = {
> > > > > -	[EPP_INDEX_DEFAULT] = 0, /* Unused index */
> > > > > -	[EPP_INDEX_PERFORMANCE] = HWP_EPP_PERFORMANCE,
> > > > > -	[EPP_INDEX_BALANCE_PERFORMANCE] =
> > > > HWP_EPP_BALANCE_PERFORMANCE,
> > > > > -	[EPP_INDEX_BALANCE_POWERSAVE] =
> > > > HWP_EPP_BALANCE_POWERSAVE,
> > > > > -	[EPP_INDEX_POWERSAVE] = HWP_EPP_POWERSAVE,
> > > > > -};
> > > > > -
> > > > >  static int intel_pstate_get_energy_pref_index(struct cpudata
> > > > > *cpu_data, int *raw_epp)  {
> > > > >  	s16 epp;
> > > > > diff --git a/include/linux/cpufreq_common.h
> > > > > b/include/linux/cpufreq_common.h new file mode 100644 index
> > > > > 000000000000..2d14b0b0f55c
> > > > > --- /dev/null
> > > > > +++ b/include/linux/cpufreq_common.h
> > > > > @@ -0,0 +1,56 @@
> > > > > +/* SPDX-License-Identifier: GPL-2.0-only */
> > > > > +/*
> > > > > + * linux/include/linux/cpufreq_common.h
> > > > > + *
> > > > > + * Copyright (C) 2022 Advanced Micro Devices, Inc.
> > > > > + *
> > > > > + * Author: Perry Yuan <Perry.Yuan@amd.com>  */
> > > > > +
> > > > > +#ifndef _LINUX_CPUFREQ_COMMON_H
> > > > > +#define _LINUX_CPUFREQ_COMMON_H
> > > > > +
> > > > > +#include <asm/msr.h>
> > > > > +/*
> > > > > + * EPP/EPB display strings corresponding to EPP index in the
> > > > > + * energy_perf_strings[]
> > > > > + *	index		String
> > > > > + *-------------------------------------
> > > > > + *	0		default
> > > > > + *	1		performance
> > > > > + *	2		balance_performance
> > > > > + *	3		balance_power
> > > > > + *	4		power
> > > > > + */
> > > > > +
> > > > > +#define HWP_EPP_PERFORMANCE		0x00
> > > > > +#define HWP_EPP_BALANCE_PERFORMANCE	0x80
> > > > > +#define HWP_EPP_BALANCE_POWERSAVE	0xC0
> > > > > +#define HWP_EPP_POWERSAVE		0xFF
> > > > > +
> > > > > +enum energy_perf_value_index {
> > > > > +	EPP_INDEX_DEFAULT = 0,
> > > > > +	EPP_INDEX_PERFORMANCE,
> > > > > +	EPP_INDEX_BALANCE_PERFORMANCE,
> > > > > +	EPP_INDEX_BALANCE_POWERSAVE,
> > > > > +	EPP_INDEX_POWERSAVE,
> > > > > +};
> > > > > +
> > > > > +static const char * const energy_perf_strings[] = {
> > > > > +	[EPP_INDEX_DEFAULT] = "default",
> > > > > +	[EPP_INDEX_PERFORMANCE] = "performance",
> > > > > +	[EPP_INDEX_BALANCE_PERFORMANCE] =
> "balance_performance",
> > > > > +	[EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
> > > > > +	[EPP_INDEX_POWERSAVE] = "power",
> > > > > +	NULL
> > > > > +};
> > > > > +
> > > > > +static unsigned int epp_values[] = {
> > > > > +	[EPP_INDEX_DEFAULT] = 0, /* Unused index */
> > > > > +	[EPP_INDEX_PERFORMANCE] = HWP_EPP_PERFORMANCE,
> > > > > +	[EPP_INDEX_BALANCE_PERFORMANCE] =
> > > > HWP_EPP_BALANCE_PERFORMANCE,
> > > > > +	[EPP_INDEX_BALANCE_POWERSAVE] =
> > > > HWP_EPP_BALANCE_POWERSAVE,
> > > > > +	[EPP_INDEX_POWERSAVE] = HWP_EPP_POWERSAVE, };
> > > > > +
> >
> > I don't believe you should be making these static in the header file.
> 
> If I do not make these arrays as static type, it will be reporting lots of build
> errors.
> 
> ld: drivers/scsi/lpfc/lpfc_sli.o:(.data+0xa0): multiple definition of
> `epp_values'; drivers/scsi/lpfc/lpfc_mem.o:(.data+0x0): first defined here
> ld: drivers/scsi/lpfc/lpfc_sli.o:(.rodata+0x120): multiple definition of
> `energy_perf_strings'; drivers/scsi/lpfc/lpfc_mem.o:(.rodata+0x0): first
> defined here
> ld: drivers/scsi/lpfc/lpfc_ct.o:(.data+0x150): multiple definition of
> `epp_values'; drivers/scsi/lpfc/lpfc_mem.o:(.data+0x0): first defined here
> ld: drivers/scsi/lpfc/lpfc_ct.o:(.rodata+0x60): multiple definition of
> `energy_perf_strings'; drivers/scsi/lpfc/lpfc_mem.o:(.rodata+0x0): first
> defined here
> ld: drivers/scsi/lpfc/lpfc_els.o:(.data+0x0): multiple definition of
> `epp_values'; drivers/scsi/lpfc/lpfc_mem.o:(.data+0x0): first defined here
>  ....
> 
> 
> I will check how to get this fixed if we really need set it as static in V8.
> 

Right; What I think you should do is have in the header (.h) something like

extern int epp_values[];

and then declare the actual definitions in a source file used by all the consumers (.c).

> Perry.
> 
> >
> > > > > +#endif /* _LINUX_CPUFREQ_COMMON_H */
> > > > > \ No newline at end of file
> > > > > --
> > > > > 2.34.1
> > > > >

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2022-12-08 16:46 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-02  7:47 [PATCH v6 00/11] Implement AMD Pstate EPP Driver Perry Yuan
2022-12-02  7:47 ` [PATCH v6 01/11] ACPI: CPPC: Add AMD pstate energy performance preference cppc control Perry Yuan
2022-12-02 16:58   ` Limonciello, Mario
2022-12-05 15:17     ` Yuan, Perry
2022-12-02  7:47 ` [PATCH v6 02/11] Documentation: amd-pstate: add EPP profiles introduction Perry Yuan
2022-12-02  7:47 ` [PATCH v6 03/11] cpufreq: intel_pstate: use common macro definition for Energy Preference Performance(EPP) Perry Yuan
2022-12-05 11:48   ` Huang Rui
2022-12-05 14:40     ` Yuan, Perry
2022-12-05 16:44       ` Limonciello, Mario
2022-12-08 14:40         ` Yuan, Perry
2022-12-08 16:46           ` Limonciello, Mario
2022-12-02  7:47 ` [PATCH v6 04/11] cpufreq: amd_pstate: implement Pstate EPP support for the AMD processors Perry Yuan
2022-12-02 17:36   ` Limonciello, Mario
2022-12-08 15:51     ` Yuan, Perry
2022-12-05 12:22   ` Huang Rui
2022-12-08 15:43     ` Yuan, Perry
2022-12-02  7:47 ` [PATCH v6 05/11] cpufreq: amd_pstate: implement amd pstate cpu online and offline callback Perry Yuan
2022-12-02  7:47 ` [PATCH v6 06/11] cpufreq: amd-pstate: implement suspend and resume callbacks Perry Yuan
2022-12-02  7:47 ` [PATCH v6 07/11] cpufreq: amd-pstate: add frequency dynamic boost sysfs control Perry Yuan
2022-12-02 15:59   ` Limonciello, Mario
2022-12-05 15:19     ` Yuan, Perry
2022-12-02  7:47 ` [PATCH v6 08/11] cpufreq: amd_pstate: add driver working mode status sysfs entry Perry Yuan
2022-12-02 15:57   ` Limonciello, Mario
2022-12-08 15:55     ` Yuan, Perry
2022-12-02  7:47 ` [PATCH v6 09/11] Documentation: amd-pstate: add amd pstate driver mode introduction Perry Yuan
2022-12-02  7:47 ` [PATCH v6 10/11] Documentation: introduce amd pstate active mode kernel command line options Perry Yuan
2022-12-02  7:47 ` [PATCH v6 11/11] cpufreq: amd_pstate: convert sprintf with sysfs_emit() Perry Yuan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).