linux-pm.vger.kernel.org archive mirror
* [PATCH 0/7] Implement AMD Pstate EPP Driver
@ 2022-09-09 16:45 Perry Yuan
  2022-09-09 16:45 ` [PATCH 1/7] ACPI: CPPC: Add AMD pstate energy performance preference cppc control Perry Yuan
                   ` (6 more replies)
  0 siblings, 7 replies; 22+ messages in thread
From: Perry Yuan @ 2022-09-09 16:45 UTC (permalink / raw)
  To: rafael.j.wysocki, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Mario.Limonciello, Nathan.Fontenot,
	Alexander.Deucher, Jinzhou.Su, Shimmer.Huang, Xiaojian.Du,
	Li.Meng, linux-pm, linux-kernel, Perry Yuan

Hi all,

This patchset implements a new AMD CPU frequency driver instance,
"amd-pstate-epp", for better performance and power control.
CPPC has a parameter called energy performance preference (EPP).
The EPP is used in the CCLK DPM controller to drive the frequency that a
core will operate at during short periods of activity.
EPP values will be utilized for different OS profiles (balanced,
performance, power savings).

Energy Performance Preference (EPP) provides a hint to the hardware about
whether software wants to bias toward performance (0x0) or energy
efficiency (0xff). The low-level power firmware calculates the runtime
frequency according to the EPP preference value, so the EPP hint impacts
how responsively the CPU core frequency ramps.
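For illustration only: on MSR-capable parts the EPP hint occupies bits 31:24
of MSR_AMD_CPPC_REQ (see patch 5, which manipulates it via
GENMASK_ULL(31, 24)). A minimal sketch of updating just that field; the
helper name here is hypothetical, not driver code:

#include <linux/bits.h>		/* GENMASK_ULL */
#include <linux/types.h>	/* u64, u8 */

/* Sketch only: return a CPPC request value with a new EPP byte. */
static inline u64 cppc_req_with_epp(u64 cppc_req, u8 epp)
{
	cppc_req &= ~GENMASK_ULL(31, 24);	/* clear the old EPP field */
	cppc_req |= (u64)epp << 24;		/* 0x00 = performance, 0xff = power */
	return cppc_req;
}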

We use the RAPL interface with the "perf" tool to get the energy data of the package power.
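For example, package energy for one benchmark run can be collected via the
RAPL event exposed by perf (the benchmark script name below is a
placeholder):

$ sudo perf stat -e power/energy-pkg/ -- ./run_benchmark.sh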
Performance Per Watt (PPW) Calculation:

The PPW calculation method is taken from the paper below:
https://software.intel.com/content/dam/develop/external/us/en/documents/performance-per-what-paper.pdf

The following formula from that paper is used to measure PPW:

(F / t) / P = (F / t) * (t / E) = F / E,

where:
"F" is the number of frames rendered,
"t" is the elapsed time in seconds,
"P" is power measured in watts (P = E / t), and
"E" is energy measured in joules.

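As a sanity check against the gitsource table below, PPW = F / E: for the
acpi-cpufreq:schedutil baseline, 1 / 17074.8 J ~= 5.85658E-05, which matches
the table entry (the gitsource PPW column is consistent with F normalized
to 1 per run).
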
Gitsource Benchmark Data on a ROME Server CPU
+------------------------------+----------------+------------+------------------+
| Kernel Module                | PPW (1/(s*J))  | Energy (J) | Improvement (%)  |
+==============================+================+============+==================+
| acpi-cpufreq:schedutil       | 5.85658E-05    | 17074.8    | base             |
+------------------------------+----------------+------------+------------------+
| acpi-cpufreq:ondemand        | 5.03079E-05    | 19877.6    | -14.10%          |
+------------------------------+----------------+------------+------------------+
| acpi-cpufreq:performance     | 5.88132E-05    | 17003      | 0.42%            |
+------------------------------+----------------+------------+------------------+
| amd-pstate:ondemand          | 4.60295E-05    | 21725.2    | -21.41%          |
+------------------------------+----------------+------------+------------------+
| amd-pstate:schedutil         | 4.70026E-05    | 21275.4    | -19.7%           |
+------------------------------+----------------+------------+------------------+
| amd-pstate:performance       | 5.80094E-05    | 17238.6    | -0.95%           |
+------------------------------+----------------+------------+------------------+
| EPP: performance             | 5.8292E-05     | 17155      | -0.47%           |
+------------------------------+----------------+------------+------------------+
| EPP: balance performance     | 6.71709E-05    | 14887.4    | 14.69%           |
+------------------------------+----------------+------------+------------------+
| EPP: power                   | 6.66951E-05    | 4993.6     | 13.88%           |
+------------------------------+----------------+------------+------------------+

Tbench Benchmark Data on a ROME Server CPU
+-------------------------------------+---------------+-------------------+------------+------------------+
| Kernel Module                       | PPW MB/(s*J)  | Throughput (MB/s) | Energy (J) | Improvement (%)  |
+=====================================+===============+===================+============+==================+
| acpi_cpufreq: schedutil             | 46.39         | 17191             | 37057.3    | base             |
+-------------------------------------+---------------+-------------------+------------+------------------+
| acpi_cpufreq: ondemand              | 51.51         | 19269.5           | 37406.5    | 11.04 %          |
+-------------------------------------+---------------+-------------------+------------+------------------+
| acpi_cpufreq: performance           | 45.96         | 17063.7           | 37123.7    | -0.74 %          |
+-------------------------------------+---------------+-------------------+------------+------------------+
| EPP:powersave: performance(0)       | 54.46         | 20263.1           | 37205      | 17.87 %          |
+-------------------------------------+---------------+-------------------+------------+------------------+
| EPP:powersave: balance performance  | 55.03         | 20481.9           | 37221.5    | 19.14 %          |
+-------------------------------------+---------------+-------------------+------------+------------------+
| EPP:powersave: balance_power        | 54.43         | 20245.9           | 37194.2    | 17.77 %          |
+-------------------------------------+---------------+-------------------+------------+------------------+
| EPP:powersave: power(255)           | 54.26         | 20181.7           | 37197.4    | 17.40 %          |
+-------------------------------------+---------------+-------------------+------------+------------------+
| amd-pstate: schedutil               | 48.22         | 17844.9           | 37006.6    | 3.80 %           |
+-------------------------------------+---------------+-------------------+------------+------------------+
| amd-pstate: ondemand                | 61.30         | 22988             | 37503.4    | 33.72 %          |
+-------------------------------------+---------------+-------------------+------------+------------------+
| amd-pstate: performance             | 54.52         | 20252.6           | 37147.8    | 17.81 %          |
+-------------------------------------+---------------+-------------------+------------+------------------+


Perry Yuan (7):
  ACPI: CPPC: Add AMD pstate energy performance preference cppc control
  cpufreq: amd_pstate: add module parameter to load amd pstate EPP
    driver
  Documentation: amd-pstate: add EPP profiles introduction
  cpufreq: amd_pstate: add AMD pstate EPP support for shared memory type
    processor
  cpufreq: amd_pstate: add AMD Pstate EPP support for the MSR based
    processors
  cpufreq: amd_pstate: implement amd pstate cpu online and offline
    callback
  cpufreq: amd-pstate: implement suspend and resume callbacks

 Documentation/admin-guide/pm/amd-pstate.rst |  19 +
 arch/x86/include/asm/msr-index.h            |   4 +
 drivers/acpi/cppc_acpi.c                    | 128 ++-
 drivers/cpufreq/amd-pstate.c                | 942 +++++++++++++++++++-
 include/acpi/cppc_acpi.h                    |  17 +
 5 files changed, 1103 insertions(+), 7 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/7] ACPI: CPPC: Add AMD pstate energy performance preference cppc control
  2022-09-09 16:45 [PATCH 0/7] Implement AMD Pstate EPP Driver Perry Yuan
@ 2022-09-09 16:45 ` Perry Yuan
  2022-09-09 16:45 ` [PATCH 2/7] cpufreq: amd_pstate: add module parameter to load amd pstate EPP driver Perry Yuan
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 22+ messages in thread
From: Perry Yuan @ 2022-09-09 16:45 UTC (permalink / raw)
  To: rafael.j.wysocki, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Mario.Limonciello, Nathan.Fontenot,
	Alexander.Deucher, Jinzhou.Su, Shimmer.Huang, Xiaojian.Du,
	Li.Meng, linux-pm, linux-kernel, Perry Yuan

Add EPP (Energy Performance Preference) support for the AMD SoCs without
the dedicated CPPC MSR; those SoCs need these CPPC ACPI functions to
update the EPP value and the desired perf value.

To make EPP work, cppc_get_epp_caps() queries the current EPP preference
value and cppc_set_epp_perf() sets a new EPP value.
Before EPP takes effect, the pstate driver first calls cppc_set_auto_epp()
to enable the EPP feature in the firmware.
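
For illustration, a hedged sketch of how a caller might combine the three
interfaces added by this patch (the wrapper function itself is
hypothetical, not part of the patch):

#include <linux/printk.h>
#include <acpi/cppc_acpi.h>

/* Hypothetical wrapper around the interfaces added by this patch. */
static int example_enable_epp(int cpu, u32 epp)
{
	struct cppc_perf_caps caps = {0};
	struct cppc_perf_ctrls ctrls = {0};
	int ret;

	/* Enable autonomous mode so the firmware honors the EPP hint */
	ret = cppc_set_auto_epp(cpu, true);
	if (ret)
		return ret;

	/* Read back the EPP value currently programmed by firmware */
	ret = cppc_get_epp_caps(cpu, &caps);
	if (ret)
		return ret;
	pr_debug("CPU%d current epp: %u\n", cpu, caps.energy_perf);

	/* Program the new energy performance preference */
	ctrls.energy_perf = epp;
	return cppc_set_epp_perf(cpu, &ctrls);
}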

Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
---
 drivers/acpi/cppc_acpi.c | 128 ++++++++++++++++++++++++++++++++++++++-
 include/acpi/cppc_acpi.h |  17 ++++++
 2 files changed, 144 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index 59c0e3410f97..6ebedb40589d 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -1329,6 +1329,132 @@ int cppc_get_perf_ctrs(int cpunum, struct cppc_perf_fb_ctrs *perf_fb_ctrs)
 }
 EXPORT_SYMBOL_GPL(cppc_get_perf_ctrs);
 
+/**
+ * cppc_get_epp_caps - Get the energy preference register value.
+ * @cpunum: CPU from which to get epp preference level.
+ * @perf_caps: Return address.
+ *
+ * Return: 0 on success, a negative error code otherwise.
+ */
+int cppc_get_epp_caps(int cpunum, struct cppc_perf_caps *perf_caps)
+{
+	struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpunum);
+	struct cpc_register_resource *energy_perf_reg;
+	u64 energy_perf;
+
+	if (!cpc_desc) {
+		pr_warn("No CPC descriptor for CPU:%d\n", cpunum);
+		return -ENODEV;
+	}
+
+	energy_perf_reg = &cpc_desc->cpc_regs[ENERGY_PERF];
+
+	if (!CPC_SUPPORTED(energy_perf_reg))
+		pr_warn("energy perf reg update is unsupported!\n");
+
+	if (CPC_IN_PCC(energy_perf_reg)) {
+		int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpunum);
+		struct cppc_pcc_data *pcc_ss_data = NULL;
+		int ret = 0;
+
+		if (pcc_ss_id < 0)
+			return -EIO;
+
+		pcc_ss_data = pcc_data[pcc_ss_id];
+
+		down_write(&pcc_ss_data->pcc_lock);
+
+		if (send_pcc_cmd(pcc_ss_id, CMD_READ) >= 0) {
+			cpc_read(cpunum, energy_perf_reg, &energy_perf);
+			perf_caps->energy_perf = energy_perf;
+		} else {
+			ret = -EIO;
+		}
+
+		up_write(&pcc_ss_data->pcc_lock);
+
+		return ret;
+	}
+
+	/* Registers outside of PCC can be read directly */
+	cpc_read(cpunum, energy_perf_reg, &energy_perf);
+	perf_caps->energy_perf = energy_perf;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cppc_get_epp_caps);
+
+int cppc_set_auto_epp(int cpu, bool enable)
+{
+	int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpu);
+	struct cpc_register_resource *auto_sel_reg;
+	struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpu);
+	struct cppc_pcc_data *pcc_ss_data = NULL;
+	int ret = -EINVAL;
+
+	if (!cpc_desc) {
+		pr_warn("No CPC descriptor for CPU:%d\n", cpu);
+		return -EINVAL;
+	}
+
+	auto_sel_reg = &cpc_desc->cpc_regs[AUTO_SEL_ENABLE];
+
+	if (CPC_IN_PCC(auto_sel_reg)) {
+		if (pcc_ss_id < 0)
+			return -EIO;
+
+		ret = cpc_write(cpu, auto_sel_reg, enable);
+		if (ret)
+			return ret;
+
+		pcc_ss_data = pcc_data[pcc_ss_id];
+
+		down_write(&pcc_ss_data->pcc_lock);
+		/* after writing CPC, transfer the ownership of PCC to platform */
+		ret = send_pcc_cmd(pcc_ss_id, CMD_WRITE);
+		up_write(&pcc_ss_data->pcc_lock);
+		return ret;
+	}
+
+	return cpc_write(cpu, auto_sel_reg, enable);
+}
+EXPORT_SYMBOL_GPL(cppc_set_auto_epp);
+
+/*
+ * Set Energy Performance Preference Register value through
+ * Performance Controls Interface
+ */
+int cppc_set_epp_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls)
+{
+	int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpu);
+	struct cpc_register_resource *epp_set_reg;
+	struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpu);
+	struct cppc_pcc_data *pcc_ss_data = NULL;
+	int ret = -EINVAL;
+
+	if (!cpc_desc) {
+		pr_warn("No CPC descriptor for CPU:%d\n", cpu);
+		return -EINVAL;
+	}
+
+	epp_set_reg = &cpc_desc->cpc_regs[ENERGY_PERF];
+
+	if (CPC_IN_PCC(epp_set_reg)) {
+		if (pcc_ss_id < 0)
+			return -EIO;
+
+		ret = cpc_write(cpu, epp_set_reg, perf_ctrls->energy_perf);
+		if (ret)
+			return ret;
+
+		pcc_ss_data = pcc_data[pcc_ss_id];
+
+		down_write(&pcc_ss_data->pcc_lock);
+		/* after writing CPC, transfer the ownership of PCC to platform */
+		ret = send_pcc_cmd(pcc_ss_id, CMD_WRITE);
+		up_write(&pcc_ss_data->pcc_lock);
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(cppc_set_epp_perf);
+
 /**
  * cppc_set_enable - Set to enable CPPC on the processor by writing the
  * Continuous Performance Control package EnableRegister field.
@@ -1364,7 +1490,7 @@ int cppc_set_enable(int cpu, bool enable)
 		pcc_ss_data = pcc_data[pcc_ss_id];
 
 		down_write(&pcc_ss_data->pcc_lock);
-		/* after writing CPC, transfer the ownership of PCC to platfrom */
+		/* after writing CPC, transfer the ownership of PCC to platform */
 		ret = send_pcc_cmd(pcc_ss_id, CMD_WRITE);
 		up_write(&pcc_ss_data->pcc_lock);
 		return ret;
diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
index d389bab54241..b9f6b11ef072 100644
--- a/include/acpi/cppc_acpi.h
+++ b/include/acpi/cppc_acpi.h
@@ -108,12 +108,14 @@ struct cppc_perf_caps {
 	u32 lowest_nonlinear_perf;
 	u32 lowest_freq;
 	u32 nominal_freq;
+	u32 energy_perf;
 };
 
 struct cppc_perf_ctrls {
 	u32 max_perf;
 	u32 min_perf;
 	u32 desired_perf;
+	u32 energy_perf;
 };
 
 struct cppc_perf_fb_ctrs {
@@ -148,6 +150,9 @@ extern bool cpc_ffh_supported(void);
 extern bool cpc_supported_by_cpu(void);
 extern int cpc_read_ffh(int cpunum, struct cpc_reg *reg, u64 *val);
 extern int cpc_write_ffh(int cpunum, struct cpc_reg *reg, u64 val);
+extern int cppc_set_auto_epp(int cpu, bool enable);
+extern int cppc_get_epp_caps(int cpunum, struct cppc_perf_caps *perf_caps);
+extern int cppc_set_epp_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls);
 #else /* !CONFIG_ACPI_CPPC_LIB */
 static inline int cppc_get_desired_perf(int cpunum, u64 *desired_perf)
 {
@@ -197,6 +202,18 @@ static inline int cpc_write_ffh(int cpunum, struct cpc_reg *reg, u64 val)
 {
 	return -ENOTSUPP;
 }
+static inline int cppc_set_auto_epp(int cpu, bool enable)
+{
+	return -ENOTSUPP;
+}
+static inline int cppc_set_epp_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls)
+{
+	return -ENOTSUPP;
+}
+static inline int cppc_get_epp_caps(int cpunum, struct cppc_perf_caps *perf_caps)
+{
+	return -ENOTSUPP;
+}
 #endif /* !CONFIG_ACPI_CPPC_LIB */
 
 #endif /* _CPPC_ACPI_H*/
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/7] cpufreq: amd_pstate: add module parameter to load amd pstate EPP driver
  2022-09-09 16:45 [PATCH 0/7] Implement AMD Pstate EPP Driver Perry Yuan
  2022-09-09 16:45 ` [PATCH 1/7] ACPI: CPPC: Add AMD pstate energy performance preference cppc control Perry Yuan
@ 2022-09-09 16:45 ` Perry Yuan
  2022-09-09 18:49   ` Limonciello, Mario
  2022-09-09 16:45 ` [PATCH 3/7] Documentation: amd-pstate: add EPP profiles introduction Perry Yuan
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Perry Yuan @ 2022-09-09 16:45 UTC (permalink / raw)
  To: rafael.j.wysocki, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Mario.Limonciello, Nathan.Fontenot,
	Alexander.Deucher, Jinzhou.Su, Shimmer.Huang, Xiaojian.Du,
	Li.Meng, linux-pm, linux-kernel, Perry Yuan

The amd_pstate mode parameter allows the user to select which amd pstate
working mode to boot into: the amd_pstate instance or the amd_pstate_epp
instance.

1) The amd_pstate instance depends on the target operation mode.
2) The amd_pstate_epp instance depends on the set_policy operation mode. It
   is also called active mode: the AMD SMU runs an EPP algorithm to control
   the CPU runtime frequency according to the EPP value and the workload.
   A usage example follows below.
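
For example, with this patch applied, the legacy passive instance can be
selected at module load time or on the kernel command line:

$ sudo modprobe amd-pstate epp_enabled=0

or boot with:

amd_pstate.epp_enabled=0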

Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
---
 drivers/cpufreq/amd-pstate.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index a2463f785322..451295284a26 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -58,6 +58,12 @@ module_param(shared_mem, bool, 0444);
 MODULE_PARM_DESC(shared_mem,
 		 "enable amd-pstate on processors with shared memory solution (false = disabled (default), true = enabled)");
 
+static bool epp_enabled = true;
+module_param(epp_enabled, bool, 0444);
+MODULE_PARM_DESC(epp_enabled,
+                "load amd_pstate or amd_pstate_epp (true = amd_pstate_epp driver instance (default), false = amd_pstate driver instance)");
+
+
 static struct cpufreq_driver amd_pstate_driver;
 
 /**
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 3/7] Documentation: amd-pstate: add EPP profiles introduction
  2022-09-09 16:45 [PATCH 0/7] Implement AMD Pstate EPP Driver Perry Yuan
  2022-09-09 16:45 ` [PATCH 1/7] ACPI: CPPC: Add AMD pstate energy performance preference cppc control Perry Yuan
  2022-09-09 16:45 ` [PATCH 2/7] cpufreq: amd_pstate: add module parameter to load amd pstate EPP driver Perry Yuan
@ 2022-09-09 16:45 ` Perry Yuan
  2022-09-09 16:45 ` [PATCH 4/7] cpufreq: amd_pstate: add AMD pstate EPP support for shared memory type processor Perry Yuan
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 22+ messages in thread
From: Perry Yuan @ 2022-09-09 16:45 UTC (permalink / raw)
  To: rafael.j.wysocki, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Mario.Limonciello, Nathan.Fontenot,
	Alexander.Deucher, Jinzhou.Su, Shimmer.Huang, Xiaojian.Du,
	Li.Meng, linux-pm, linux-kernel, Perry Yuan

This patch adds an introduction of the AMD pstate EPP feature and of the
EPP preferences supported by AMD processors.

Users can read the supported list from the
energy_performance_available_preferences attribute file, or update the
current profile via the energy_performance_preference file.

1) See all EPP profiles
$ sudo cat /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_available_preferences
default performance balance_performance balance_power power

2) Check current EPP profile
$ sudo cat /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference
performance

3) Set new EPP profile
$ sudo bash -c "echo power > /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference"

Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
---
 Documentation/admin-guide/pm/amd-pstate.rst | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
index 83b58eb4ab4d..d0f0e115013b 100644
--- a/Documentation/admin-guide/pm/amd-pstate.rst
+++ b/Documentation/admin-guide/pm/amd-pstate.rst
@@ -261,6 +261,25 @@ lowest non-linear performance in `AMD CPPC Performance Capability
 <perf_cap_>`_.)
 This attribute is read-only.
 
+``energy_performance_available_preferences``
+
+A list of all the supported EPP preferences that can be set via the
+``energy_performance_preference`` attribute. The different profiles
+represent different energy vs efficiency hints provided to the
+low-level firmware; ``default`` means the EPP value is left to be
+set by the platform firmware.
+This attribute is read-only.
+
+``energy_performance_preference``
+
+The current energy performance preference can be read from this attribute,
+and the user can change the current preference according to energy or
+performance needs. The full list of supported profiles can be read from the
+``energy_performance_available_preferences`` attribute. Raw profile values are
+integers between 0 and 255 when the EPP feature is enabled by the platform
+firmware; if the EPP feature is disabled, the driver ignores the written value.
+This attribute is read-write.
+
 Other performance and frequency values can be read back from
 ``/sys/devices/system/cpu/cpuX/acpi_cppc/``, see :ref:`cppc_sysfs`.
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 4/7] cpufreq: amd_pstate: add AMD pstate EPP support for shared memory type processor
  2022-09-09 16:45 [PATCH 0/7] Implement AMD Pstate EPP Driver Perry Yuan
                   ` (2 preceding siblings ...)
  2022-09-09 16:45 ` [PATCH 3/7] Documentation: amd-pstate: add EPP profiles introduction Perry Yuan
@ 2022-09-09 16:45 ` Perry Yuan
  2022-09-09 18:52   ` Limonciello, Mario
  2022-09-15 16:24   ` Nathan Fontenot
  2022-09-09 16:45 ` [PATCH 5/7] cpufreq: amd_pstate: add AMD Pstate EPP support for the MSR based processors Perry Yuan
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 22+ messages in thread
From: Perry Yuan @ 2022-09-09 16:45 UTC (permalink / raw)
  To: rafael.j.wysocki, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Mario.Limonciello, Nathan.Fontenot,
	Alexander.Deucher, Jinzhou.Su, Shimmer.Huang, Xiaojian.Du,
	Li.Meng, linux-pm, linux-kernel, Perry Yuan

Add Energy Performance Preference support for AMD SoCs that only support
the shared memory interface implemented on Zen2 and Zen3 processors.
Because this type of CPU has no dedicated CPPC MSR, the driver uses the
ACPI PCC channel to enable EPP and to reset the desired perf to zero.

Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
---
 drivers/cpufreq/amd-pstate.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 451295284a26..fff298744a8e 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -133,12 +133,24 @@ static inline int pstate_enable(bool enable)
 
 static int cppc_enable(bool enable)
 {
+	struct cppc_perf_ctrls perf_ctrls;
 	int cpu, ret = 0;
 
 	for_each_present_cpu(cpu) {
 		ret = cppc_set_enable(cpu, enable);
 		if (ret)
 			return ret;
+
+		/* Enable active mode for EPP */
+		ret = cppc_set_auto_epp(cpu, enable);
+		if (ret)
+			return ret;
+
+		/* Set zero to desired perf to enable EPP control */
+		perf_ctrls.desired_perf = 0;
+		ret = cppc_set_perf(cpu, &perf_ctrls);
+		if (ret)
+			return ret;
 	}
 
 	return ret;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 5/7] cpufreq: amd_pstate: add AMD Pstate EPP support for the MSR based processors
  2022-09-09 16:45 [PATCH 0/7] Implement AMD Pstate EPP Driver Perry Yuan
                   ` (3 preceding siblings ...)
  2022-09-09 16:45 ` [PATCH 4/7] cpufreq: amd_pstate: add AMD pstate EPP support for shared memory type processor Perry Yuan
@ 2022-09-09 16:45 ` Perry Yuan
  2022-09-15 18:58   ` Nathan Fontenot
  2022-09-09 16:45 ` [PATCH 6/7] cpufreq: amd_pstate: implement amd pstate cpu online and offline callback Perry Yuan
  2022-09-09 16:45 ` [PATCH 7/7] cpufreq: amd-pstate: implement suspend and resume callbacks Perry Yuan
  6 siblings, 1 reply; 22+ messages in thread
From: Perry Yuan @ 2022-09-09 16:45 UTC (permalink / raw)
  To: rafael.j.wysocki, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Mario.Limonciello, Nathan.Fontenot,
	Alexander.Deucher, Jinzhou.Su, Shimmer.Huang, Xiaojian.Du,
	Li.Meng, linux-pm, linux-kernel, Perry Yuan

Add EPP driver support for AMD CPUs with the full MSR feature enabled.
The EPP is used in the DPM controller to drive the frequency that a core
will operate at during short periods of activity.

EPP values will be utilized for different OS profiles (balanced, performance,
power savings). CPPC performance can be controlled through the user-space
sysfs attributes for min and max frequency limits when the pstate driver is
working under the powersave policy.

The EPP scale is 0 - 255: 0 is maximum performance and 255 is the minimum
(most power-efficient) level. balance_performance (0x80) provides the best
balance between performance and power for most systems, while users can
still choose the performance policy when needed.

$ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_available_preferences
default performance balance_performance balance_power power

$ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_preference
balance_performance
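
The store path in this patch also accepts a raw integer in the 0 - 255 range
in place of a profile string, for example:

$ sudo bash -c "echo 128 > /sys/devices/system/cpu/cpufreq/policy0/energy_performance_preference"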

Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
---
 arch/x86/include/asm/msr-index.h |   4 +
 drivers/cpufreq/amd-pstate.c     | 818 ++++++++++++++++++++++++++++++-
 2 files changed, 804 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 43a3d8e4eb9a..4c540badab4e 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -577,6 +577,10 @@
 #define MSR_AMD64_PERF_CNTR_GLOBAL_CTL		0xc0000301
 #define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR	0xc0000302
 
+#define AMD_CPPC_EPP_PERFORMANCE		0x00
+#define AMD_CPPC_EPP_BALANCE_PERFORMANCE	0x80
+#define AMD_CPPC_EPP_BALANCE_POWERSAVE		0xBF
+#define AMD_CPPC_EPP_POWERSAVE			0xFF
 /* Fam 17h MSRs */
 #define MSR_F17H_IRPERF			0xc00000e9
 
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index fff298744a8e..e71b06e20050 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -63,8 +63,8 @@ module_param(epp_enabled, bool, 0444);
 MODULE_PARM_DESC(epp_enabled,
                 "load amd_pstate or amd_pstate_epp (true = amd_pstate_epp driver instance (default), false = amd_pstate driver instance)");
 
-
-static struct cpufreq_driver amd_pstate_driver;
+static struct cpufreq_driver *default_pstate_driver;
+static struct amd_cpudata **all_cpu_data;
 
 /**
  * struct  amd_aperf_mperf
@@ -76,6 +76,7 @@ struct amd_aperf_mperf {
 	u64 aperf;
 	u64 mperf;
 	u64 tsc;
+	u64 time;
 };
 
 /**
@@ -98,7 +99,19 @@ struct amd_aperf_mperf {
  * @prev: Last Aperf/Mperf/tsc count value read from register
  * @freq: current cpu frequency value
  * @boost_supported: check whether the Processor or SBIOS supports boost mode
- *
+ * @epp_powersave: Last saved CPPC energy performance preference
+ *			when policy switched to performance
+ * @epp_policy: Last saved policy used to set energy-performance preference
+ * @epp_cached: Cached CPPC energy-performance preference value
+ * @policy: Cpufreq policy value
+ * @sched_flags: Store scheduler flags for possible cross CPU update
+ * @update_util_set: CPUFreq utility callback is set
+ * @last_update: Time stamp of the last performance state update
+ * @cppc_boost_min: Last CPPC boosted min performance state
+ * @cppc_cap1_cached: Cached value of the last CPPC Capabilities MSR
+ * @update_util: Cpufreq utility callback information
+ * @sample: the stored performance sample
+
  * The amd_cpudata is key private data for each CPU thread in AMD P-State, and
  * represents all the attributes and goals that AMD P-State requests at runtime.
  */
@@ -124,8 +137,195 @@ struct amd_cpudata {
 	u64 	freq;
 	bool	boost_supported;
 	u64 	cppc_hw_conf_cached;
+
+	/* EPP feature related attributes*/
+	s16	epp_powersave;
+	s16	epp_policy;
+	s16	epp_cached;
+	u32	policy;
+	u32	sched_flags;
+	bool	update_util_set;
+	u64	last_update;
+	u64	last_io_update;
+	u32	cppc_boost_min;
+	u64	cppc_cap1_cached;
+	struct	update_util_data update_util;
+	struct	amd_aperf_mperf sample;
+};
+
+/**
+ * struct amd_pstate_params - global parameters for the performance control
+ * @cppc_boost_disabled: Whether or not the core performance boost is disabled
+ */
+struct amd_pstate_params {
+	bool cppc_boost_disabled;
+};
+
+/*
+ * AMD Energy Preference Performance (EPP)
+ * The EPP is used in the CCLK DPM controller to drive
+ * the frequency that a core is going to operate during
+ * short periods of activity. EPP values will be utilized for
+ * different OS profiles (balanced, performance, power savings)
+ * display strings corresponding to EPP index in the
+ * energy_perf_strings[]
+ *	index		String
+ *-------------------------------------
+ *	0		default
+ *	1		performance
+ *	2		balance_performance
+ *	3		balance_power
+ *	4		power
+ */
+enum energy_perf_value_index {
+	EPP_INDEX_DEFAULT = 0,
+	EPP_INDEX_PERFORMANCE,
+	EPP_INDEX_BALANCE_PERFORMANCE,
+	EPP_INDEX_BALANCE_POWERSAVE,
+	EPP_INDEX_POWERSAVE,
+};
+
+static const char * const energy_perf_strings[] = {
+	[EPP_INDEX_DEFAULT] = "default",
+	[EPP_INDEX_PERFORMANCE] = "performance",
+	[EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
+	[EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
+	[EPP_INDEX_POWERSAVE] = "power",
+	NULL
 };
 
+static unsigned int epp_values[] = {
+	[EPP_INDEX_DEFAULT] = 0,
+	[EPP_INDEX_PERFORMANCE] = AMD_CPPC_EPP_PERFORMANCE,
+	[EPP_INDEX_BALANCE_PERFORMANCE] = AMD_CPPC_EPP_BALANCE_PERFORMANCE,
+	[EPP_INDEX_BALANCE_POWERSAVE] = AMD_CPPC_EPP_BALANCE_POWERSAVE,
+	[EPP_INDEX_POWERSAVE] = AMD_CPPC_EPP_POWERSAVE,
+};
+
+static struct amd_pstate_params global;
+
+static DEFINE_MUTEX(amd_pstate_limits_lock);
+static DEFINE_MUTEX(amd_pstate_driver_lock);
+static DEFINE_SPINLOCK(amd_pstate_cpu_lock);
+
+static bool cppc_boost __read_mostly;
+struct kobject *amd_pstate_kobj;
+
+#ifdef CONFIG_ACPI_CPPC_LIB
+static s16 amd_pstate_get_epp(struct amd_cpudata *cpudata, u64 cppc_req_cached)
+{
+	s16 epp;
+	struct cppc_perf_caps perf_caps;
+	int ret;
+
+	if (boot_cpu_has(X86_FEATURE_CPPC)) {
+		if (!cppc_req_cached) {
+			epp = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ,
+					    &cppc_req_cached);
+			if (epp)
+				return epp;
+		}
+		epp = (cppc_req_cached >> 24) & 0xFF;
+	} else {
+		ret = cppc_get_epp_caps(cpudata->cpu, &perf_caps);
+		if (ret < 0) {
+			pr_debug("Could not retrieve energy perf value (%d)\n", ret);
+			return -EIO;
+		}
+		epp = (s16) perf_caps.energy_perf;
+	}
+
+	return epp;
+}
+#endif
+
+static int amd_pstate_get_energy_pref_index(struct amd_cpudata *cpudata, int *raw_epp)
+{
+	s16 epp;
+	int index = -EINVAL;
+
+	*raw_epp = 0;
+	epp = amd_pstate_get_epp(cpudata, 0);
+	if (epp < 0)
+		return epp;
+
+	switch (epp) {
+	case AMD_CPPC_EPP_PERFORMANCE:
+		index = EPP_INDEX_PERFORMANCE;
+		break;
+	case AMD_CPPC_EPP_BALANCE_PERFORMANCE:
+		index = EPP_INDEX_BALANCE_PERFORMANCE;
+		break;
+	case AMD_CPPC_EPP_BALANCE_POWERSAVE:
+		index = EPP_INDEX_BALANCE_POWERSAVE;
+		break;
+	case AMD_CPPC_EPP_POWERSAVE:
+		index = EPP_INDEX_POWERSAVE;
+		break;
+	default:
+		*raw_epp = epp;
+		index = 0;
+	}
+
+	return index;
+}
+
+#ifdef CONFIG_ACPI_CPPC_LIB
+static int amd_pstate_set_epp(struct amd_cpudata *cpudata, u32 epp)
+{
+	int ret;
+	struct cppc_perf_ctrls perf_ctrls;
+
+	if (boot_cpu_has(X86_FEATURE_CPPC)) {
+		u64 value = READ_ONCE(cpudata->cppc_req_cached);
+
+		value &= ~GENMASK_ULL(31, 24);
+		value |= (u64)epp << 24;
+		WRITE_ONCE(cpudata->cppc_req_cached, value);
+		ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
+		if (!ret) {
+			cpudata->epp_cached = epp;
+		}
+	} else {
+		perf_ctrls.energy_perf = epp;
+		ret = cppc_set_epp_perf(cpudata->cpu, &perf_ctrls);
+		if (ret) {
+			pr_debug("failed to set energy perf value (%d)\n", ret);
+			return ret;
+		}
+		cpudata->epp_cached = epp;
+	}
+	return ret;
+}
+
+static int amd_pstate_set_energy_pref_index(struct amd_cpudata *cpudata,
+					      int pref_index, bool use_raw,
+					      u32 raw_epp)
+{
+	int epp = -EINVAL;
+	int ret;
+
+	if (!pref_index) {
+		pr_debug("EPP pref_index is invalid\n");
+		return -EINVAL;
+	}
+
+	if (use_raw)
+		epp = raw_epp;
+	else
+		epp = epp_values[pref_index];
+
+	if (epp > 0 && cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
+		pr_debug("EPP cannot be set under performance policy\n");
+		return -EBUSY;
+	}
+
+	ret = amd_pstate_set_epp(cpudata, epp);
+
+	return ret;
+}
+#endif
+
 static inline int pstate_enable(bool enable)
 {
 	return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable);
@@ -141,16 +341,18 @@ static int cppc_enable(bool enable)
 		if (ret)
 			return ret;
 
-		/* Enable active mode for EPP */
-		ret = cppc_set_auto_epp(cpu, enable);
-		if (ret)
-			return ret;
-
-		/* Set zero to desired perf to enable EPP control */
-		perf_ctrls.desired_perf = 0;
-		ret = cppc_set_perf(cpu, &perf_ctrls);
-		if (ret)
-			return ret;
+		if (epp_enabled) {
+			/* Enable autonomous mode for EPP */
+			ret = cppc_set_auto_epp(cpu, enable);
+			if (ret)
+				return ret;
+
+			/* Set zero to desired perf to allow EPP firmware control */
+			perf_ctrls.desired_perf = 0;
+			ret = cppc_set_perf(cpu, &perf_ctrls);
+			if (ret)
+				return ret;
+		}
 	}
 
 	return ret;
@@ -496,7 +698,7 @@ static void amd_pstate_boost_init(struct amd_cpudata *cpudata)
 		return;
 
 	cpudata->boost_supported = true;
-	amd_pstate_driver.boost_enabled = true;
+	default_pstate_driver->boost_enabled = true;
 }
 
 static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
@@ -671,10 +873,108 @@ static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,
 	return sprintf(&buf[0], "%u\n", perf);
 }
 
+static ssize_t show_energy_performance_available_preferences(
+				struct cpufreq_policy *policy, char *buf)
+{
+	int i = 0;
+	int ret = 0;
+
+	while (energy_perf_strings[i] != NULL)
+		ret += sprintf(&buf[ret], "%s ", energy_perf_strings[i++]);
+
+	ret += sprintf(&buf[ret], "\n");
+
+	return ret;
+}
+
+static ssize_t store_energy_performance_preference(
+		struct cpufreq_policy *policy, const char *buf, size_t count)
+{
+	struct amd_cpudata *cpudata = policy->driver_data;
+	char str_preference[21];
+	bool raw = false;
+	ssize_t ret;
+	u32 epp = 0;
+
+	ret = sscanf(buf, "%20s", str_preference);
+	if (ret != 1)
+		return -EINVAL;
+
+	ret = match_string(energy_perf_strings, -1, str_preference);
+	if (ret < 0) {
+		ret = kstrtouint(buf, 10, &epp);
+		if (ret)
+			return ret;
+
+		if (epp > 255)
+			return -EINVAL;
+
+		raw = true;
+	}
+
+	mutex_lock(&amd_pstate_limits_lock);
+	ret = amd_pstate_set_energy_pref_index(cpudata, ret, raw, epp);
+	mutex_unlock(&amd_pstate_limits_lock);
+
+	return ret ?: count;
+}
+
+static ssize_t show_energy_performance_preference(
+				struct cpufreq_policy *policy, char *buf)
+{
+	struct amd_cpudata *cpudata = policy->driver_data;
+	int preference, raw_epp;
+
+	preference = amd_pstate_get_energy_pref_index(cpudata, &raw_epp);
+	if (preference < 0)
+		return preference;
+
+	if (raw_epp)
+		return sprintf(buf, "%d\n", raw_epp);
+	else
+		return sprintf(buf, "%s\n", energy_perf_strings[preference]);
+}
+
+static void amd_pstate_update_policies(void)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu)
+		cpufreq_update_policy(cpu);
+}
+
+static ssize_t show_pstate_dynamic_boost(struct kobject *kobj,
+				struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%u\n", cppc_boost);
+}
+
+static ssize_t store_pstate_dynamic_boost(struct kobject *a,
+				       struct kobj_attribute *b,
+				       const char *buf, size_t count)
+{
+	unsigned int input;
+	int ret;
+
+	ret = kstrtouint(buf, 10, &input);
+	if (ret)
+		return ret;
+
+	mutex_lock(&amd_pstate_driver_lock);
+	cppc_boost = !!input;
+	amd_pstate_update_policies();
+	mutex_unlock(&amd_pstate_driver_lock);
+
+	return count;
+}
+
 cpufreq_freq_attr_ro(amd_pstate_max_freq);
 cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
 
 cpufreq_freq_attr_ro(amd_pstate_highest_perf);
+cpufreq_freq_attr_rw(energy_performance_preference);
+cpufreq_freq_attr_ro(energy_performance_available_preferences);
+define_one_global_rw(pstate_dynamic_boost);
 
 static struct freq_attr *amd_pstate_attr[] = {
 	&amd_pstate_max_freq,
@@ -683,6 +983,424 @@ static struct freq_attr *amd_pstate_attr[] = {
 	NULL,
 };
 
+static struct freq_attr *amd_pstate_epp_attr[] = {
+	&amd_pstate_max_freq,
+	&amd_pstate_lowest_nonlinear_freq,
+	&amd_pstate_highest_perf,
+	&energy_performance_preference,
+	&energy_performance_available_preferences,
+	NULL,
+};
+
+static struct attribute *pstate_global_attributes[] = {
+	&pstate_dynamic_boost.attr,
+	NULL
+};
+
+static const struct attribute_group amd_pstate_global_attr_group = {
+	.attrs = pstate_global_attributes,
+};
+
+static inline void update_boost_state(void)
+{
+	u64 misc_en;
+	struct amd_cpudata *cpudata;
+
+	cpudata = all_cpu_data[0];
+	rdmsrl(MSR_AMD_CPPC_HW_CTL, misc_en);
+	global.cppc_boost_disabled = misc_en & AMD_CPPC_PRECISION_BOOST_ENABLED;
+}
+
+static int amd_pstate_init_cpu(unsigned int cpunum)
+{
+	struct amd_cpudata *cpudata;
+
+	cpudata = all_cpu_data[cpunum];
+	if (!cpudata) {
+		cpudata = kzalloc(sizeof(*cpudata), GFP_KERNEL);
+		if (!cpudata)
+			return -ENOMEM;
+		WRITE_ONCE(all_cpu_data[cpunum], cpudata);
+
+		cpudata->cpu = cpunum;
+	}
+	cpudata->epp_powersave = -EINVAL;
+	cpudata->epp_policy = 0;
+	pr_debug("controlling: cpu %d\n", cpunum);
+	return 0;
+}
+
+static int __amd_pstate_cpu_init(struct cpufreq_policy *policy)
+{
+	int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
+	struct amd_cpudata *cpudata;
+	struct device *dev;
+	int rc;
+	u64 value;
+
+	rc = amd_pstate_init_cpu(policy->cpu);
+	if (rc)
+		return rc;
+
+	cpudata = all_cpu_data[policy->cpu];
+
+	dev = get_cpu_device(policy->cpu);
+	if (!dev) {
+		ret = -ENODEV;
+		goto free_cpudata1;
+	}
+
+	ret = amd_pstate_init_perf(cpudata);
+	if (ret)
+		goto free_cpudata1;
+
+	min_freq = amd_get_min_freq(cpudata);
+	max_freq = amd_get_max_freq(cpudata);
+	nominal_freq = amd_get_nominal_freq(cpudata);
+	lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
+	if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
+		dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
+			min_freq, max_freq);
+		ret = -EINVAL;
+		goto free_cpudata1;
+	}
+
+	policy->min = min_freq;
+	policy->max = max_freq;
+
+	policy->cpuinfo.min_freq = min_freq;
+	policy->cpuinfo.max_freq = max_freq;
+	/* It will be updated by governor */
+	policy->cur = policy->cpuinfo.min_freq;
+
+	/* Initial processor data capability frequencies */
+	cpudata->max_freq = max_freq;
+	cpudata->min_freq = min_freq;
+	cpudata->nominal_freq = nominal_freq;
+	cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
+
+	policy->driver_data = cpudata;
+
+	update_boost_state();
+	cpudata->epp_cached = amd_pstate_get_epp(cpudata, 0);
+
+	policy->min = policy->cpuinfo.min_freq;
+	policy->max = policy->cpuinfo.max_freq;
+
+	if (boot_cpu_has(X86_FEATURE_CPPC))
+		policy->fast_switch_possible = true;
+
+	if (!shared_mem && boot_cpu_has(X86_FEATURE_CPPC)) {
+		ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, &value);
+		if (ret)
+			return ret;
+		WRITE_ONCE(cpudata->cppc_req_cached, value);
+
+		ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1, &value);
+		if (ret)
+			return ret;
+		WRITE_ONCE(cpudata->cppc_cap1_cached, value);
+	}
+	amd_pstate_boost_init(cpudata);
+
+	return 0;
+
+free_cpudata1:
+	kfree(cpudata);
+	return ret;
+}
+
+static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
+{
+	int ret;
+
+	ret = __amd_pstate_cpu_init(policy);
+	if (ret)
+		return ret;
+	/*
+	 * Set the policy to powersave to provide a valid fallback value in case
+	 * the default cpufreq governor is neither powersave nor performance.
+	 */
+	policy->policy = CPUFREQ_POLICY_POWERSAVE;
+
+	return 0;
+}
+
+static int amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy)
+{
+	pr_debug("amd-pstate: CPU %d exiting\n", policy->cpu);
+	policy->fast_switch_possible = false;
+	return 0;
+}
+
+static void amd_pstate_update_max_freq(unsigned int cpu)
+{
+	struct cpufreq_policy *policy = cpufreq_cpu_acquire(cpu);
+
+	if (!policy)
+		return;
+
+	refresh_frequency_limits(policy);
+	cpufreq_cpu_release(policy);
+}
+
+static void amd_pstate_epp_update_limits(unsigned int cpu)
+{
+	mutex_lock(&amd_pstate_driver_lock);
+	update_boost_state();
+	if (global.cppc_boost_disabled) {
+		for_each_possible_cpu(cpu)
+			amd_pstate_update_max_freq(cpu);
+	} else {
+		cpufreq_update_policy(cpu);
+	}
+	mutex_unlock(&amd_pstate_driver_lock);
+}
+
+static int cppc_boost_hold_time_ns = 3 * NSEC_PER_MSEC;
+
+static inline void amd_pstate_boost_up(struct amd_cpudata *cpudata)
+{
+	u64 hwp_req = READ_ONCE(cpudata->cppc_req_cached);
+	u64 hwp_cap = READ_ONCE(cpudata->cppc_cap1_cached);
+	u32 max_limit = (hwp_req & 0xff);
+	u32 min_limit = (hwp_req & 0xff00) >> 8;
+	u32 boost_level1;
+
+	/* If max and min are equal or already at max, nothing to boost */
+	if (max_limit == min_limit)
+		return;
+
+	/* Set boost max and min to initial value */
+	if (!cpudata->cppc_boost_min)
+		cpudata->cppc_boost_min = min_limit;
+
+	boost_level1 = ((AMD_CPPC_NOMINAL_PERF(hwp_cap) + min_limit) >> 1);
+
+	if (cpudata->cppc_boost_min < boost_level1)
+		cpudata->cppc_boost_min = boost_level1;
+	else if (cpudata->cppc_boost_min < AMD_CPPC_NOMINAL_PERF(hwp_cap))
+		cpudata->cppc_boost_min = AMD_CPPC_NOMINAL_PERF(hwp_cap);
+	else if (cpudata->cppc_boost_min == AMD_CPPC_NOMINAL_PERF(hwp_cap))
+		cpudata->cppc_boost_min = max_limit;
+	else
+		return;
+
+	hwp_req &= ~AMD_CPPC_MIN_PERF(~0L);
+	hwp_req |= AMD_CPPC_MIN_PERF(cpudata->cppc_boost_min);
+	wrmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, hwp_req);
+	cpudata->last_update = cpudata->sample.time;
+}
+
+static inline void amd_pstate_boost_down(struct amd_cpudata *cpudata)
+{
+	bool expired;
+
+	if (cpudata->cppc_boost_min) {
+		expired = time_after64(cpudata->sample.time, cpudata->last_update +
+					cppc_boost_hold_time_ns);
+
+		if (expired) {
+			wrmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, cpudata->cppc_req_cached);
+			cpudata->cppc_boost_min = 0;
+		}
+	}
+
+	cpudata->last_update = cpudata->sample.time;
+}
+
+static inline void amd_pstate_boost_update_util(struct amd_cpudata *cpudata,
+						      u64 time)
+{
+	cpudata->sample.time = time;
+	if (smp_processor_id() != cpudata->cpu)
+		return;
+
+	if (cpudata->sched_flags & SCHED_CPUFREQ_IOWAIT) {
+		bool do_io = false;
+
+		cpudata->sched_flags = 0;
+		/*
+		 * Set iowait_boost flag and update time. Since IO WAIT flag
+		 * is set all the time, we can't just conclude that there is
+		 * some IO bound activity is scheduled on this CPU with just
+		 * one occurrence. If we receive at least two in two
+		 * consecutive ticks, then we treat as boost candidate.
+		 * This is leveraged from Intel Pstate driver.
+		 */
+		if (time_before64(time, cpudata->last_io_update + 2 * TICK_NSEC))
+			do_io = true;
+
+		cpudata->last_io_update = time;
+
+		if (do_io)
+			amd_pstate_boost_up(cpudata);
+
+	} else {
+		amd_pstate_boost_down(cpudata);
+	}
+}
+
+static inline void amd_pstate_cppc_update_hook(struct update_util_data *data,
+						u64 time, unsigned int flags)
+{
+	struct amd_cpudata *cpudata = container_of(data,
+				struct amd_cpudata, update_util);
+
+	cpudata->sched_flags |= flags;
+
+	if (smp_processor_id() == cpudata->cpu)
+		amd_pstate_boost_update_util(cpudata, time);
+}
+
+static void amd_pstate_clear_update_util_hook(unsigned int cpu)
+{
+	struct amd_cpudata *cpudata = all_cpu_data[cpu];
+
+	if (!cpudata->update_util_set)
+		return;
+
+	cpufreq_remove_update_util_hook(cpu);
+	cpudata->update_util_set = false;
+	synchronize_rcu();
+}
+
+static void amd_pstate_set_update_util_hook(unsigned int cpu_num)
+{
+	struct amd_cpudata *cpudata = all_cpu_data[cpu_num];
+
+	if (!cppc_boost) {
+		if (cpudata->update_util_set)
+			amd_pstate_clear_update_util_hook(cpudata->cpu);
+		return;
+	}
+
+	if (cpudata->update_util_set)
+		return;
+
+	cpudata->sample.time = 0;
+	cpufreq_add_update_util_hook(cpu_num, &cpudata->update_util,
+						amd_pstate_cppc_update_hook);
+	cpudata->update_util_set = true;
+}
+
+static void amd_pstate_epp_init(unsigned int cpu)
+{
+	struct amd_cpudata *cpudata = all_cpu_data[cpu];
+	u32 max_perf, min_perf;
+	u64 value;
+	s16 epp;
+	int ret;
+
+	max_perf = READ_ONCE(cpudata->highest_perf);
+	min_perf = READ_ONCE(cpudata->lowest_perf);
+
+	value = READ_ONCE(cpudata->cppc_req_cached);
+
+	if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
+		min_perf = max_perf;
+
+	/* Initial min/max values for CPPC Performance Controls Register */
+	value &= ~AMD_CPPC_MIN_PERF(~0L);
+	value |= AMD_CPPC_MIN_PERF(min_perf);
+
+	value &= ~AMD_CPPC_MAX_PERF(~0L);
+	value |= AMD_CPPC_MAX_PERF(max_perf);
+
+	/* The CPPC EPP feature requires the desired perf field to be zero */
+	value &= ~AMD_CPPC_DES_PERF(~0L);
+	value |= AMD_CPPC_DES_PERF(0);
+
+	if (cpudata->epp_policy == cpudata->policy)
+		goto skip_epp;
+
+	cpudata->epp_policy = cpudata->policy;
+
+	if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
+		epp = amd_pstate_get_epp(cpudata, value);
+		cpudata->epp_powersave = epp;
+		if (epp < 0)
+			goto skip_epp;
+
+		epp = 0;
+	} else {
+		if (cpudata->epp_powersave < 0)
+			goto skip_epp;
+		/* Get BIOS pre-defined epp value */
+		epp = amd_pstate_get_epp(cpudata, value);
+		if (epp)
+			goto skip_epp;
+		epp = cpudata->epp_powersave;
+	}
+	/* Set initial EPP value */
+	if (boot_cpu_has(X86_FEATURE_CPPC)) {
+		value &= ~GENMASK_ULL(31, 24);
+		value |= (u64)epp << 24;
+	}
+
+skip_epp:
+	WRITE_ONCE(cpudata->cppc_req_cached, value);
+	ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
+	if (!ret) {
+		cpudata->epp_cached = epp;
+	}
+}
+
+static void amd_pstate_set_max_limits(struct amd_cpudata *cpudata)
+{
+	u64 hwp_cap = READ_ONCE(cpudata->cppc_cap1_cached);
+	u64 hwp_req = READ_ONCE(cpudata->cppc_req_cached);
+	u32 max_limit = (hwp_cap >> 24) & 0xff;
+
+	hwp_req &= ~AMD_CPPC_MIN_PERF(~0L);
+	hwp_req |= AMD_CPPC_MIN_PERF(max_limit);
+	wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, hwp_req);
+}
+
+static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
+{
+	struct amd_cpudata *cpudata;
+
+	if (!policy->cpuinfo.max_freq)
+		return -ENODEV;
+
+	pr_debug("set_policy: cpuinfo.max %u policy->max %u\n",
+				policy->cpuinfo.max_freq, policy->max);
+
+	cpudata = all_cpu_data[policy->cpu];
+	cpudata->policy = policy->policy;
+
+	if (boot_cpu_has(X86_FEATURE_CPPC)) {
+		mutex_lock(&amd_pstate_limits_lock);
+
+		if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
+			amd_pstate_clear_update_util_hook(policy->cpu);
+			amd_pstate_set_max_limits(cpudata);
+		} else {
+			amd_pstate_set_update_util_hook(policy->cpu);
+		}
+
+		if (boot_cpu_has(X86_FEATURE_CPPC))
+			amd_pstate_epp_init(policy->cpu);
+
+		mutex_unlock(&amd_pstate_limits_lock);
+	}
+
+	return 0;
+}
+
+static void amd_pstate_verify_cpu_policy(struct amd_cpudata *cpudata,
+					   struct cpufreq_policy_data *policy)
+{
+	update_boost_state();
+	cpufreq_verify_within_cpu_limits(policy);
+}
+
+static int amd_pstate_epp_verify_policy(struct cpufreq_policy_data *policy)
+{
+	amd_pstate_verify_cpu_policy(all_cpu_data[policy->cpu], policy);
+	pr_debug("policy_max = %d, policy_min = %d\n", policy->max, policy->min);
+	return 0;
+}
+
 static struct cpufreq_driver amd_pstate_driver = {
 	.flags		= CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
 	.verify		= amd_pstate_verify,
@@ -696,8 +1414,20 @@ static struct cpufreq_driver amd_pstate_driver = {
 	.attr		= amd_pstate_attr,
 };
 
+static struct cpufreq_driver amd_pstate_epp_driver = {
+	.flags		= CPUFREQ_CONST_LOOPS,
+	.verify		= amd_pstate_epp_verify_policy,
+	.setpolicy	= amd_pstate_epp_set_policy,
+	.init		= amd_pstate_epp_cpu_init,
+	.exit		= amd_pstate_epp_cpu_exit,
+	.update_limits	= amd_pstate_epp_update_limits,
+	.name		= "amd_pstate_epp",
+	.attr		= amd_pstate_epp_attr,
+};
+
 static int __init amd_pstate_init(void)
 {
+	static struct amd_cpudata **cpudata;
 	int ret;
 
 	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
@@ -712,10 +1442,25 @@ static int __init amd_pstate_init(void)
 	if (cpufreq_get_current_driver())
 		return -EEXIST;
 
+	cpudata = vzalloc(array_size(sizeof(void *), num_possible_cpus()));
+	if (!cpudata)
+		return -ENOMEM;
+	WRITE_ONCE(all_cpu_data, cpudata);
+
+	if (epp_enabled) {
+		pr_info("AMD CPPC loading with amd_pstate_epp driver instance.\n");
+		default_pstate_driver = &amd_pstate_epp_driver;
+	} else {
+		pr_info("AMD CPPC loading with amd_pstate driver instance.\n");
+		default_pstate_driver = &amd_pstate_driver;
+	}
+
 	/* capability check */
 	if (boot_cpu_has(X86_FEATURE_CPPC)) {
+		if (!epp_enabled) {
+			default_pstate_driver->adjust_perf = amd_pstate_adjust_perf;
+		}
 		pr_debug("AMD CPPC MSR based functionality is supported\n");
-		amd_pstate_driver.adjust_perf = amd_pstate_adjust_perf;
 	} else if (shared_mem) {
 		static_call_update(amd_pstate_enable, cppc_enable);
 		static_call_update(amd_pstate_init_perf, cppc_init_perf);
@@ -732,19 +1477,56 @@ static int __init amd_pstate_init(void)
 		return ret;
 	}
 
-	ret = cpufreq_register_driver(&amd_pstate_driver);
+	ret = cpufreq_register_driver(default_pstate_driver);
 	if (ret)
-		pr_err("failed to register amd_pstate_driver with return %d\n",
+		pr_err("failed to register amd pstate driver with return %d\n",
 		       ret);
 
+	amd_pstate_kobj = kobject_create_and_add("amd-pstate", &cpu_subsys.dev_root->kobj);
+	if (!amd_pstate_kobj) {
+		pr_err("amd-pstate: Global sysfs registration failed.\n");
+		return ret;
+	}
+
+	ret = sysfs_create_group(amd_pstate_kobj, &amd_pstate_global_attr_group);
+	if (ret) {
+		pr_err("amd-pstate: Sysfs attribute export failed with error %d.\n",
+		       ret);
+	}
+
 	return ret;
 }
 
+static inline void amd_pstate_kobj_cleanup(struct kobject *kobj)
+{
+	kobject_del(kobj);
+	kobject_put(kobj);
+}
+
 static void __exit amd_pstate_exit(void)
 {
-	cpufreq_unregister_driver(&amd_pstate_driver);
+	unsigned int cpu;
+
+	cpufreq_unregister_driver(default_pstate_driver);
 
 	amd_pstate_enable(false);
+
+	sysfs_remove_group(amd_pstate_kobj, &amd_pstate_global_attr_group);
+	amd_pstate_kobj_cleanup(amd_pstate_kobj);
+
+	cpus_read_lock();
+	for_each_online_cpu(cpu) {
+		if (all_cpu_data[cpu]) {
+			if (default_pstate_driver == &amd_pstate_epp_driver)
+				amd_pstate_clear_update_util_hook(cpu);
+
+			spin_lock(&amd_pstate_cpu_lock);
+			kfree(all_cpu_data[cpu]);
+			WRITE_ONCE(all_cpu_data[cpu], NULL);
+			spin_unlock(&amd_pstate_cpu_lock);
+		}
+	}
+	cpus_read_unlock();
+
 }
 
 module_init(amd_pstate_init);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 6/7] cpufreq: amd_pstate: implement amd pstate cpu online and offline callback
  2022-09-09 16:45 [PATCH 0/7] Implement AMD Pstate EPP Driver Perry Yuan
                   ` (4 preceding siblings ...)
  2022-09-09 16:45 ` [PATCH 5/7] cpufreq: amd_pstate: add AMD Pstate EPP support for the MSR based processors Perry Yuan
@ 2022-09-09 16:45 ` Perry Yuan
  2022-09-09 19:02   ` Limonciello, Mario
  2022-09-15 19:03   ` Nathan Fontenot
  2022-09-09 16:45 ` [PATCH 7/7] cpufreq: amd-pstate: implement suspend and resume callbacks Perry Yuan
  6 siblings, 2 replies; 22+ messages in thread
From: Perry Yuan @ 2022-09-09 16:45 UTC (permalink / raw)
  To: rafael.j.wysocki, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Mario.Limonciello, Nathan.Fontenot,
	Alexander.Deucher, Jinzhou.Su, Shimmer.Huang, Xiaojian.Du,
	Li.Meng, linux-pm, linux-kernel, Perry Yuan

This patch adds the online and offline driver callbacks to allow CPU cores
to go offline and to restore the previous working state when a core comes
back online later in EPP driver mode.
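
For reference, these callbacks are exercised through the standard CPU
hotplug sysfs interface (not added by this patch), e.g.:

$ sudo bash -c "echo 0 > /sys/devices/system/cpu/cpu2/online"
$ sudo bash -c "echo 1 > /sys/devices/system/cpu/cpu2/online"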

Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
---
 drivers/cpufreq/amd-pstate.c | 93 +++++++++++++++++++++++++++++++++++-
 1 file changed, 92 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index e71b06e20050..e63fed39f90c 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -111,7 +111,8 @@ struct amd_aperf_mperf {
  * @cppc_cap1_cached: Cached value of the last CPPC Capabilities MSR
  * @update_util: Cpufreq utility callback information
  * @sample: the stored performance sample
-
+ * @suspended:	Whether or not the driver has been suspended.
+ *
  * The amd_cpudata is key private data for each CPU thread in AMD P-State, and
  * represents all the attributes and goals that AMD P-State requests at runtime.
  */
@@ -151,6 +152,7 @@ struct amd_cpudata {
 	u64	cppc_cap1_cached;
 	struct	update_util_data update_util;
 	struct	amd_aperf_mperf sample;
+	bool suspended;
 };
 
 /**
@@ -1387,6 +1389,93 @@ static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
 	return 0;
 }
 
+static void amd_pstate_epp_reenable(struct amd_cpudata *cpudata)
+{
+	struct cppc_perf_ctrls perf_ctrls;
+	u64 value, max_perf;
+	int ret;
+
+	ret = amd_pstate_enable(true);
+	if (ret)
+		pr_err("failed to enable amd pstate during resume, return %d\n", ret);
+
+	value = READ_ONCE(cpudata->cppc_req_cached);
+	max_perf = READ_ONCE(cpudata->highest_perf);
+
+	if (boot_cpu_has(X86_FEATURE_CPPC)) {
+		wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
+	} else {
+		perf_ctrls.max_perf = max_perf;
+		perf_ctrls.energy_perf = AMD_CPPC_ENERGY_PERF_PREF(cpudata->epp_cached);
+		cppc_set_perf(cpudata->cpu, &perf_ctrls);
+	}
+}
+
+static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
+{
+	struct amd_cpudata *cpudata = all_cpu_data[policy->cpu];
+
+	pr_debug("AMD CPU Core %d going online\n", cpudata->cpu);
+
+	if (epp_enabled) {
+		amd_pstate_epp_reenable(cpudata);
+		cpudata->suspended = false;
+	}
+
+	return 0;
+}
+
+static void amd_pstate_epp_offline(struct cpufreq_policy *policy)
+{
+	struct amd_cpudata *cpudata = all_cpu_data[policy->cpu];
+	struct cppc_perf_ctrls perf_ctrls;
+	int min_perf;
+	u64 value;
+
+	min_perf = READ_ONCE(cpudata->lowest_perf);
+	value = READ_ONCE(cpudata->cppc_req_cached);
+
+	mutex_lock(&amd_pstate_limits_lock);
+	if (boot_cpu_has(X86_FEATURE_CPPC)) {
+		cpudata->epp_policy = CPUFREQ_POLICY_UNKNOWN;
+
+		/* Set max perf same as min perf */
+		value &= ~AMD_CPPC_MAX_PERF(~0L);
+		value |= AMD_CPPC_MAX_PERF(min_perf);
+		value &= ~AMD_CPPC_MIN_PERF(~0L);
+		value |= AMD_CPPC_MIN_PERF(min_perf);
+		wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
+	} else {
+		perf_ctrls.desired_perf = 0;
+		perf_ctrls.max_perf = min_perf;
+		perf_ctrls.energy_perf = AMD_CPPC_ENERGY_PERF_PREF(AMD_CPPC_EPP_POWERSAVE);
+		cppc_set_perf(cpudata->cpu, &perf_ctrls);
+	}
+	mutex_unlock(&amd_pstate_limits_lock);
+}
+
+static int amd_pstate_cpu_offline(struct cpufreq_policy *policy)
+{
+	struct amd_cpudata *cpudata = all_cpu_data[policy->cpu];
+
+	pr_debug("AMD CPU Core %d going offline\n", cpudata->cpu);
+
+	if (cpudata->suspended)
+		return 0;
+
+	if (epp_enabled)
+		amd_pstate_epp_offline(policy);
+
+	return 0;
+}
+
+static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
+{
+	amd_pstate_clear_update_util_hook(policy->cpu);
+
+	return amd_pstate_cpu_offline(policy);
+}
+
 static void amd_pstate_verify_cpu_policy(struct amd_cpudata *cpudata,
 					   struct cpufreq_policy_data *policy)
 {
@@ -1421,6 +1510,8 @@ static struct cpufreq_driver amd_pstate_epp_driver = {
 	.init		= amd_pstate_epp_cpu_init,
 	.exit		= amd_pstate_epp_cpu_exit,
 	.update_limits	= amd_pstate_epp_update_limits,
+	.offline	= amd_pstate_epp_cpu_offline,
+	.online		= amd_pstate_epp_cpu_online,
 	.name		= "amd_pstate_epp",
 	.attr		= amd_pstate_epp_attr,
 };
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 7/7] cpufreq: amd-pstate: implement suspend and resume callbacks
  2022-09-09 16:45 [PATCH 0/7] Implement AMD Pstate EPP Driver Perry Yuan
                   ` (5 preceding siblings ...)
  2022-09-09 16:45 ` [PATCH 6/7] cpufreq: amd_pstate: implement amd pstate cpu online and offline callback Perry Yuan
@ 2022-09-09 16:45 ` Perry Yuan
  2022-09-09 19:00   ` Limonciello, Mario
  6 siblings, 1 reply; 22+ messages in thread
From: Perry Yuan @ 2022-09-09 16:45 UTC (permalink / raw)
  To: rafael.j.wysocki, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Mario.Limonciello, Nathan.Fontenot,
	Alexander.Deucher, Jinzhou.Su, Shimmer.Huang, Xiaojian.Du,
	Li.Meng, linux-pm, linux-kernel, Perry Yuan

Add suspend and resume support for the AMD processors in the amd_pstate_epp
driver instance.

When CPPC is suspended, the EPP driver sets the EPP profile to 'power'
and sets max/min perf to the lowest perf value.
When resume happens, it restores the MSR registers with the
previously cached values.
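
One way to exercise these callbacks (using the standard rtcwake utility,
not part of this patch) is a short suspend-to-RAM cycle:

$ sudo rtcwake -m mem -s 10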

Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
---
 drivers/cpufreq/amd-pstate.c | 39 ++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index e63fed39f90c..749083d28b05 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -1476,6 +1476,43 @@ static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
 	return amd_pstate_cpu_offline(policy);
 }
 
+static int amd_pstate_epp_suspend(struct cpufreq_policy *policy)
+{
+	struct amd_cpudata *cpudata = all_cpu_data[policy->cpu];
+	int ret;
+
+	pr_debug("AMD CPU Core %d suspending\n", cpudata->cpu);
+
+	cpudata->suspended = true;
+
+	/* disable CPPC in lowlevel firmware */
+	ret = amd_pstate_enable(false);
+	if (ret)
+		pr_err("failed to disable amd pstate during suspend, return %d\n", ret);
+
+	return 0;
+}
+
+static int amd_pstate_epp_resume(struct cpufreq_policy *policy)
+{
+	struct amd_cpudata *cpudata = all_cpu_data[policy->cpu];
+
+	pr_debug("AMD CPU Core %d resuming\n", cpudata->cpu);
+
+	if (cpudata->suspended && epp_enabled) {
+		mutex_lock(&amd_pstate_limits_lock);
+
+		/* enable amd pstate from suspend state*/
+		amd_pstate_epp_reenable(cpudata);
+
+		mutex_unlock(&amd_pstate_limits_lock);
+	}
+
+	cpudata->suspended = false;
+
+	return 0;
+}
+
 static void amd_pstate_verify_cpu_policy(struct amd_cpudata *cpudata,
 					   struct cpufreq_policy_data *policy)
 {
@@ -1512,6 +1549,8 @@ static struct cpufreq_driver amd_pstate_epp_driver = {
 	.update_limits	= amd_pstate_epp_update_limits,
 	.offline	= amd_pstate_epp_cpu_offline,
 	.online		= amd_pstate_epp_cpu_online,
+	.suspend	= amd_pstate_epp_suspend,
+	.resume		= amd_pstate_epp_resume,
 	.name		= "amd_pstate_epp",
 	.attr		= amd_pstate_epp_attr,
 };
-- 
2.34.1



* Re: [PATCH 2/7] cpufreq: amd_pstate: add module parameter to load amd pstate EPP driver
  2022-09-09 16:45 ` [PATCH 2/7] cpufreq: amd_pstate: add module parameter to load amd pstate EPP driver Perry Yuan
@ 2022-09-09 18:49   ` Limonciello, Mario
  2022-09-13 15:14     ` Yuan, Perry
  2022-09-25 16:58     ` Yuan, Perry
  0 siblings, 2 replies; 22+ messages in thread
From: Limonciello, Mario @ 2022-09-09 18:49 UTC (permalink / raw)
  To: Perry Yuan, rafael.j.wysocki, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Nathan.Fontenot, Alexander.Deucher, Jinzhou.Su,
	Shimmer.Huang, Xiaojian.Du, Li.Meng, linux-pm, linux-kernel

On 9/9/2022 11:45, Perry Yuan wrote:
> The amd_pstate mode parameter will allow user to select which amd pstate
> working mode as booting mode, amd_pstate instance or amd_pstate_epp instance.
> 
> 1) amd_pstate instance is depending on the target operation mode.
> 2) amd_pstate_epp instance is depending on the set_policy operation mode.It
>     is also called active mode that AMD SMU has EPP algorithm to control the
>     CPU runtime frequency according to the EPP set value and workload.
> 
> Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> ---
>   drivers/cpufreq/amd-pstate.c | 6 ++++++
>   1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index a2463f785322..451295284a26 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -58,6 +58,12 @@ module_param(shared_mem, bool, 0444);
>   MODULE_PARM_DESC(shared_mem,
>   		 "enable amd-pstate on processors with shared memory solution (false = disabled (default), true = enabled)");
>   
> +static bool epp_enabled = true;
> +module_param(epp_enabled, bool, 0444);
> +MODULE_PARM_DESC(epp_enabled,
> +                "load amd_pstate or amd_pstate_epp (true = amd_pstate_epp driver instance (default), false = amd_pstate driver instance)");
> +

Whether you're operating in EPP mode or not, the kernel module is still
'amd-pstate'.  So to a user I think this is a pretty confusing
designation.

I would propose the following instead:

 > +static bool epp = true;
 > +module_param(epp, bool, 0444);
 > +MODULE_PARM_DESC(epp,
 > +                "Enable energy performance preference (EPP) control");
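
With that, the non-EPP flavor could still be chosen at boot with something
like "amd_pstate.epp=0" on the kernel command line, or "modprobe amd-pstate
epp=0" when built as a module (illustrative; with 0444 permissions the
parameter stays load-time only either way).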

> +
>   static struct cpufreq_driver amd_pstate_driver;
>   
>   /**



* Re: [PATCH 4/7] cpufreq: amd_pstate: add AMD pstate EPP support for shared memory type processor
  2022-09-09 16:45 ` [PATCH 4/7] cpufreq: amd_pstate: add AMD pstate EPP support for shared memory type processor Perry Yuan
@ 2022-09-09 18:52   ` Limonciello, Mario
  2022-09-13 15:20     ` Yuan, Perry
  2022-09-25 17:01     ` Yuan, Perry
  2022-09-15 16:24   ` Nathan Fontenot
  1 sibling, 2 replies; 22+ messages in thread
From: Limonciello, Mario @ 2022-09-09 18:52 UTC (permalink / raw)
  To: Perry Yuan, rafael.j.wysocki, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Nathan.Fontenot, Alexander.Deucher, Jinzhou.Su,
	Shimmer.Huang, Xiaojian.Du, Li.Meng, linux-pm, linux-kernel

On 9/9/2022 11:45, Perry Yuan wrote:
> Add Energy Performance Preference support for AMD SOCs which only
> support the shared memory interface that implemented on Zen2 and Zen3
> processors, because this type CPU has no MSR supported, it will use
> ACPI PCC channel to enable EPP and reset desired perf to be zero.

This reads like all Zen2 and Zen3 processors don't have the MSR, but 
that's not true. How about:

"Add Energy Performance Preference support for AMD SOCs which do not 
contain a designated MSR for CPPC support. A shared memory interface
is used for CPPC on these SOCs and the ACPI PCC channel is used to
enable EPP and reset the desired performance."

> 
> Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> ---
>   drivers/cpufreq/amd-pstate.c | 12 ++++++++++++
>   1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 451295284a26..fff298744a8e 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -133,12 +133,24 @@ static inline int pstate_enable(bool enable)
>   
>   static int cppc_enable(bool enable)
>   {
> +	struct cppc_perf_ctrls perf_ctrls;
>   	int cpu, ret = 0;
>   
>   	for_each_present_cpu(cpu) {
>   		ret = cppc_set_enable(cpu, enable);
>   		if (ret)
>   			return ret;
> +
> +	/* Enable active mode for EPP */
> +	ret = cppc_set_auto_epp(cpu, enable);
> +	if (ret)
> +		return ret;
> +
> +	/* Set zero to desired perf to enable EPP control*/
> +	perf_ctrls.desired_perf = 0;
> +	ret = cppc_set_perf(cpu, &perf_ctrls);
> +	if (ret)
> +		return ret;
>   	}
>   
>   	return ret;



* Re: [PATCH 7/7] cpufreq: amd-pstate: implement suspend and resume callbacks
  2022-09-09 16:45 ` [PATCH 7/7] cpufreq: amd-pstate: implement suspend and resume callbacks Perry Yuan
@ 2022-09-09 19:00   ` Limonciello, Mario
  0 siblings, 0 replies; 22+ messages in thread
From: Limonciello, Mario @ 2022-09-09 19:00 UTC (permalink / raw)
  To: Perry Yuan, rafael.j.wysocki, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Nathan.Fontenot, Alexander.Deucher, Jinzhou.Su,
	Shimmer.Huang, Xiaojian.Du, Li.Meng, linux-pm, linux-kernel

On 9/9/2022 11:45, Perry Yuan wrote:
> Add suspend and resume support for AMD processors via the amd_pstate_epp
> driver instance.
> 
> When CPPC is suspended, the EPP driver will set the EPP profile to the
> 'power' profile and set max/min perf to the lowest perf value.
> On resume, it will restore the MSR registers with the previously
> cached values.
> 
> Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> ---
>   drivers/cpufreq/amd-pstate.c | 39 ++++++++++++++++++++++++++++++++++++
>   1 file changed, 39 insertions(+)
> 
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index e63fed39f90c..749083d28b05 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -1476,6 +1476,43 @@ static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
>   	return amd_pstate_cpu_offline(policy);
>   }
>   
> +static int amd_pstate_epp_suspend(struct cpufreq_policy *policy)
> +{
> +	struct amd_cpudata *cpudata = all_cpu_data[policy->cpu];
> +	int ret;

I don't see an explicit guard in here to only run this code for epp 
mode.  If you want it to be running both for EPP and non-EPP then you 
should update the commit message.  If you only want it running for EPP, 
I would think you need a:

if (!epp_enabled)
     return 0;
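
Folded in with the message/return comments below, the whole callback would
then look roughly like this (untested sketch):

static int amd_pstate_epp_suspend(struct cpufreq_policy *policy)
{
	struct amd_cpudata *cpudata = all_cpu_data[policy->cpu];
	int ret;

	/* nothing was set up in non-EPP mode, so nothing to tear down */
	if (!epp_enabled)
		return 0;

	cpudata->suspended = true;

	/* disable CPPC in lowlevel firmware */
	ret = amd_pstate_enable(false);
	if (ret)
		pr_err("failed to suspend: %d\n", ret);

	return ret;
}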

> +
> +	pr_debug("AMD CPU Core %d suspending\n", cpudata->cpu);

This debug statement seems needlessly noisy to me (even for dyndbg). 
Unless they're for debugging synchronization problems, I would think 
that you can get a similar result using ftrace for function names.

> +
> +	cpudata->suspended = true;
> +
> +	/* disable CPPC in lowlevel firmware */
> +	ret = amd_pstate_enable(false);
> +	if (ret)
> +		pr_err("failed to disable amd pstate during suspend, return %d\n", ret);

amd-pstate uses pr_fmt.  You don't need to mention so much in your error 
message.  Something like this would suffice:

pr_err("failed to suspend: %d\n, ret);

> +
> +	return 0;

Shouldn't you be returning ret here?

> +}
> +
> +static int amd_pstate_epp_resume(struct cpufreq_policy *policy)
> +{
> +	struct amd_cpudata *cpudata = all_cpu_data[policy->cpu];
> +
> +	pr_debug("AMD CPU Core %d resuming\n", cpudata->cpu);

Ditto on above comments.

> +
> +	if (cpudata->suspended && epp_enabled) {

If you end up adopting the suggestion above for checking epp_enabled in 
suspend, I don't believe you also need to check in resume.  Your check 
for cpudata->suspended will make sure this only runs if you did 
something for suspend.

> +		mutex_lock(&amd_pstate_limits_lock);
> +
> +		/* enable amd pstate from suspend state*/
> +		amd_pstate_epp_reenable(cpudata);
> +
> +		mutex_unlock(&amd_pstate_limits_lock);
> +	}
> +
> +	cpudata->suspended = false;

You can move this into the if statement.
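
i.e., together with the epp_enabled check dropped as suggested above,
roughly:

	if (cpudata->suspended) {
		mutex_lock(&amd_pstate_limits_lock);

		/* enable amd pstate from suspend state */
		amd_pstate_epp_reenable(cpudata);

		mutex_unlock(&amd_pstate_limits_lock);

		cpudata->suspended = false;
	}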

> +
> +	return 0;
> +}
> +
>   static void amd_pstate_verify_cpu_policy(struct amd_cpudata *cpudata,
>   					   struct cpufreq_policy_data *policy)
>   {
> @@ -1512,6 +1549,8 @@ static struct cpufreq_driver amd_pstate_epp_driver = {
>   	.update_limits	= amd_pstate_epp_update_limits,
>   	.offline	= amd_pstate_epp_cpu_offline,
>   	.online		= amd_pstate_epp_cpu_online,
> +	.suspend	= amd_pstate_epp_suspend,
> +	.resume		= amd_pstate_epp_resume,
>   	.name		= "amd_pstate_epp",
>   	.attr		= amd_pstate_epp_attr,
>   };



* Re: [PATCH 6/7] cpufreq: amd_pstate: implement amd pstate cpu online and offline callback
  2022-09-09 16:45 ` [PATCH 6/7] cpufreq: amd_pstate: implement amd pstate cpu online and offline callback Perry Yuan
@ 2022-09-09 19:02   ` Limonciello, Mario
  2022-09-15 19:03   ` Nathan Fontenot
  1 sibling, 0 replies; 22+ messages in thread
From: Limonciello, Mario @ 2022-09-09 19:02 UTC (permalink / raw)
  To: Perry Yuan, rafael.j.wysocki, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Nathan.Fontenot, Alexander.Deucher, Jinzhou.Su,
	Shimmer.Huang, Xiaojian.Du, Li.Meng, linux-pm, linux-kernel

On 9/9/2022 11:45, Perry Yuan wrote:
> The patch adds online and offline driver callback to support to allow cpu cores go
> offline and help to restore the previous working states when core goes
> back online later for EPP driver mode.
> 
> Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> ---
>   drivers/cpufreq/amd-pstate.c | 93 +++++++++++++++++++++++++++++++++++-
>   1 file changed, 92 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index e71b06e20050..e63fed39f90c 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -111,7 +111,8 @@ struct amd_aperf_mperf {
>    * @cppc_cap1_cached: Cached value of the last CPPC Capabilities MSR
>    * @update_util: Cpufreq utility callback information
>    * @sample: the stored performance sample
> -
> + * @suspended:	Whether or not the driver has been suspended.
> + *
>    * The amd_cpudata is key private data for each CPU thread in AMD P-State, and
>    * represents all the attributes and goals that AMD P-State requests at runtime.
>    */
> @@ -151,6 +152,7 @@ struct amd_cpudata {
>   	u64	cppc_cap1_cached;
>   	struct	update_util_data update_util;
>   	struct	amd_aperf_mperf sample;
> +	bool suspended;
>   };
>   
>   /**
> @@ -1387,6 +1389,93 @@ static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
>   	return 0;
>   }
>   
> +static void amd_pstate_epp_reenable(struct amd_cpudata * cpudata)
> +{
> +	struct cppc_perf_ctrls perf_ctrls;
> +	u64 value, max_perf;
> +	int ret;
> +
> +	ret = amd_pstate_enable(true);
> +	if (ret)
> +		pr_err("failed to enable amd pstate during resume, return %d\n", ret);

As this driver uses pr_fmt, this doesn't need to be as verbose.  Look at 
my comments in patch 7/7 and you can adopt the same here.

> +
> +	value = READ_ONCE(cpudata->cppc_req_cached);
> +	max_perf = READ_ONCE(cpudata->highest_perf);
> +
> +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> +		wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
> +	} else {
> +		perf_ctrls.max_perf = max_perf;
> +		perf_ctrls.energy_perf = AMD_CPPC_ENERGY_PERF_PREF(cpudata->epp_cached);
> +		cppc_set_perf(cpudata->cpu, &perf_ctrls);
> +	}
> +}
> +
> +static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
> +{
> +	struct amd_cpudata *cpudata = all_cpu_data[policy->cpu];
> +
> +	pr_debug("AMD CPU Core %d going online\n", cpudata->cpu);
> +
> +	if (epp_enabled) {
> +		amd_pstate_epp_reenable(cpudata);
> +		cpudata->suspended = false;
> +	}
> +
> +	return 0;
> +}
> +
> +static void amd_pstate_epp_offline(struct cpufreq_policy *policy)
> +{
> +	struct amd_cpudata *cpudata = all_cpu_data[policy->cpu];
> +	struct cppc_perf_ctrls perf_ctrls;
> +	int min_perf;
> +	u64 value;
> +
> +	min_perf = READ_ONCE(cpudata->lowest_perf);
> +	value = READ_ONCE(cpudata->cppc_req_cached);
> +
> +	mutex_lock(&amd_pstate_limits_lock);
> +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> +		cpudata->epp_policy = CPUFREQ_POLICY_UNKNOWN;
> +
> +		/* Set max perf same as min perf */
> +		value &= ~AMD_CPPC_MAX_PERF(~0L);
> +                value |= AMD_CPPC_MAX_PERF(min_perf);
> +                value &= ~AMD_CPPC_MIN_PERF(~0L);
> +                value |= AMD_CPPC_MIN_PERF(min_perf);
> +		wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
> +	} else {
> +		perf_ctrls.desired_perf = 0;
> +		perf_ctrls.max_perf = min_perf;
> +		perf_ctrls.energy_perf = AMD_CPPC_ENERGY_PERF_PREF(AMD_CPPC_EPP_POWERSAVE);
> +		cppc_set_perf(cpudata->cpu, &perf_ctrls);
> +	}
> +	mutex_unlock(&amd_pstate_limits_lock);
> +}
> +
> +static int amd_pstate_cpu_offline(struct cpufreq_policy *policy)
> +{
> +	struct amd_cpudata *cpudata = all_cpu_data[policy->cpu];
> +
> +	pr_debug("AMD CPU Core %d going offline\n", cpudata->cpu); > +
> +	if (cpudata->suspended)
> +		return 0;
> +
> +	if (epp_enabled)
> +		amd_pstate_epp_offline(policy);
> +
> +	return 0;
> +}
> +
> +static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
> +{
> +	amd_pstate_clear_update_util_hook(policy->cpu);
> +
> +	return amd_pstate_cpu_offline(policy);
> +}
> +
>   static void amd_pstate_verify_cpu_policy(struct amd_cpudata *cpudata,
>   					   struct cpufreq_policy_data *policy)
>   {
> @@ -1421,6 +1510,8 @@ static struct cpufreq_driver amd_pstate_epp_driver = {
>   	.init		= amd_pstate_epp_cpu_init,
>   	.exit		= amd_pstate_epp_cpu_exit,
>   	.update_limits	= amd_pstate_epp_update_limits,
> +	.offline	= amd_pstate_epp_cpu_offline,
> +	.online		= amd_pstate_epp_cpu_online,
>   	.name		= "amd_pstate_epp",
>   	.attr		= amd_pstate_epp_attr,
>   };



* RE: [PATCH 2/7] cpufreq: amd_pstate: add module parameter to load amd pstate EPP driver
  2022-09-09 18:49   ` Limonciello, Mario
@ 2022-09-13 15:14     ` Yuan, Perry
  2022-09-25 16:58     ` Yuan, Perry
  1 sibling, 0 replies; 22+ messages in thread
From: Yuan, Perry @ 2022-09-13 15:14 UTC (permalink / raw)
  To: Limonciello, Mario, rafael.j.wysocki, Huang, Ray, viresh.kumar
  Cc: Sharma, Deepak, Fontenot, Nathan, Deucher, Alexander, Su,
	Jinzhou (Joe), Huang, Shimmer, Du, Xiaojian, Meng, Li (Jassmine),
	linux-pm, linux-kernel

Hi Mario,

> -----Original Message-----
> From: Limonciello, Mario <Mario.Limonciello@amd.com>
> Sent: Saturday, September 10, 2022 2:50 AM
> To: Yuan, Perry <Perry.Yuan@amd.com>; rafael.j.wysocki@intel.com; Huang,
> Ray <Ray.Huang@amd.com>; viresh.kumar@linaro.org
> Cc: Sharma, Deepak <Deepak.Sharma@amd.com>; Fontenot, Nathan
> <Nathan.Fontenot@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Su, Jinzhou (Joe) <Jinzhou.Su@amd.com>;
> Huang, Shimmer <Shimmer.Huang@amd.com>; Du, Xiaojian
> <Xiaojian.Du@amd.com>; Meng, Li (Jassmine) <Li.Meng@amd.com>; linux-
> pm@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 2/7] cpufreq: amd_pstate: add module parameter to
> load amd pstate EPP driver
> 
> On 9/9/2022 11:45, Perry Yuan wrote:
> > The amd_pstate mode parameter will allow user to select which amd
> > pstate working mode as booting mode, amd_pstate instance or
> amd_pstate_epp instance.
> >
> > 1) amd_pstate instance is depending on the target operation mode.
> > 2) amd_pstate_epp instance is depending on the set_policy operation
> mode.It
> >     is also called active mode that AMD SMU has EPP algorithm to control the
> >     CPU runtime frequency according to the EPP set value and workload.
> >
> > Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> > ---
> >   drivers/cpufreq/amd-pstate.c | 6 ++++++
> >   1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/cpufreq/amd-pstate.c
> > b/drivers/cpufreq/amd-pstate.c index a2463f785322..451295284a26 100644
> > --- a/drivers/cpufreq/amd-pstate.c
> > +++ b/drivers/cpufreq/amd-pstate.c
> > @@ -58,6 +58,12 @@ module_param(shared_mem, bool, 0444);
> >   MODULE_PARM_DESC(shared_mem,
> >   		 "enable amd-pstate on processors with shared memory
> solution
> > (false = disabled (default), true = enabled)");
> >
> > +static bool epp_enabled = true;
> > +module_param(epp_enabled, bool, 0444);
> MODULE_PARM_DESC(epp_enabled,
> > +                "load amd_pstate or amd_pstate_epp (true =
> > +amd_pstate_epp driver instance (default), false = amd_pstate driver
> > +instance)");
> > +
> 
> Whether you're operating in EPP mode or not, the kernel module is still 'amd-pstate'.
> So to a user I think this is a pretty confusing designation.
> 
> I would propose the following instead:
> 
>  > +static bool epp = true;
>  > +module_param(epp, bool, 0444);
>  > +MODULE_PARM_DESC(epp,
>  > +                "Enable energy performance preference (EPP) control");
> 
> > +
> >   static struct cpufreq_driver amd_pstate_driver;
> >
> >   /**

Makes sense, will make this change in V2.
Thanks.

Perry. 


* RE: [PATCH 4/7] cpufreq: amd_pstate: add AMD pstate EPP support for shared memory type processor
  2022-09-09 18:52   ` Limonciello, Mario
@ 2022-09-13 15:20     ` Yuan, Perry
  2022-09-25 17:01     ` Yuan, Perry
  1 sibling, 0 replies; 22+ messages in thread
From: Yuan, Perry @ 2022-09-13 15:20 UTC (permalink / raw)
  To: Limonciello, Mario, rafael.j.wysocki, Huang, Ray, viresh.kumar
  Cc: Sharma, Deepak, Fontenot, Nathan, Deucher, Alexander, Su,
	Jinzhou (Joe), Huang, Shimmer, Du, Xiaojian, Meng, Li (Jassmine),
	linux-pm, linux-kernel

Hi Mario. 

> -----Original Message-----
> From: Limonciello, Mario <Mario.Limonciello@amd.com>
> Sent: Saturday, September 10, 2022 2:53 AM
> To: Yuan, Perry <Perry.Yuan@amd.com>; rafael.j.wysocki@intel.com; Huang,
> Ray <Ray.Huang@amd.com>; viresh.kumar@linaro.org
> Cc: Sharma, Deepak <Deepak.Sharma@amd.com>; Fontenot, Nathan
> <Nathan.Fontenot@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Su, Jinzhou (Joe) <Jinzhou.Su@amd.com>;
> Huang, Shimmer <Shimmer.Huang@amd.com>; Du, Xiaojian
> <Xiaojian.Du@amd.com>; Meng, Li (Jassmine) <Li.Meng@amd.com>; linux-
> pm@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 4/7] cpufreq: amd_pstate: add AMD pstate EPP support
> for shared memory type processor
> 
> On 9/9/2022 11:45, Perry Yuan wrote:
> > Add Energy Performance Preference support for AMD SOCs which only
> > support the shared memory interface that implemented on Zen2 and Zen3
> > processors, because this type CPU has no MSR supported, it will use
> > ACPI PCC channel to enable EPP and reset desired perf to be zero.
> 
> This reads like all Zen2 and Zen3 processors don't have the MSR, but that's
> not true. How about:
> 
> "Add Energy Performance Preference support for AMD SOCs which do not
> contain a designated MSR for CPPC support. A shared memory interface is
> used for CPPC on these SOCs and the ACPI PCC channel is used to enable EPP
> and reset the desired performance."
> 

Yes, those new interfaces are added to support the non-MSR processors in EPP mode.
Will update the commit message as you suggested.
Thanks 

Perry. 

> >
> > Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> > ---
> >   drivers/cpufreq/amd-pstate.c | 12 ++++++++++++
> >   1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/cpufreq/amd-pstate.c
> > b/drivers/cpufreq/amd-pstate.c index 451295284a26..fff298744a8e 100644
> > --- a/drivers/cpufreq/amd-pstate.c
> > +++ b/drivers/cpufreq/amd-pstate.c
> > @@ -133,12 +133,24 @@ static inline int pstate_enable(bool enable)
> >
> >   static int cppc_enable(bool enable)
> >   {
> > +	struct cppc_perf_ctrls perf_ctrls;
> >   	int cpu, ret = 0;
> >
> >   	for_each_present_cpu(cpu) {
> >   		ret = cppc_set_enable(cpu, enable);
> >   		if (ret)
> >   			return ret;
> > +
> > +	/* Enable active mode for EPP */
> > +	ret = cppc_set_auto_epp(cpu, enable);
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* Set zero to desired perf to enable EPP control*/
> > +	perf_ctrls.desired_perf = 0;
> > +	ret = cppc_set_perf(cpu, &perf_ctrls);
> > +	if (ret)
> > +		return ret;
> >   	}
> >
> >   	return ret;


* Re: [PATCH 4/7] cpufreq: amd_pstate: add AMD pstate EPP support for shared memory type processor
  2022-09-09 16:45 ` [PATCH 4/7] cpufreq: amd_pstate: add AMD pstate EPP support for shared memory type processor Perry Yuan
  2022-09-09 18:52   ` Limonciello, Mario
@ 2022-09-15 16:24   ` Nathan Fontenot
  2022-09-25 12:23     ` Yuan, Perry
  1 sibling, 1 reply; 22+ messages in thread
From: Nathan Fontenot @ 2022-09-15 16:24 UTC (permalink / raw)
  To: Perry Yuan, rafael.j.wysocki, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Mario.Limonciello, Alexander.Deucher, Jinzhou.Su,
	Shimmer.Huang, Xiaojian.Du, Li.Meng, linux-pm, linux-kernel

On 9/9/22 11:45, Perry Yuan wrote:
> Add Energy Performance Preference support for AMD SOCs which only
> support the shared memory interface that implemented on Zen2 and Zen3
> processors, because this type CPU has no MSR supported, it will use
> ACPI PCC channel to enable EPP and reset desired perf to be zero.
> 
> Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> ---
>  drivers/cpufreq/amd-pstate.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 451295284a26..fff298744a8e 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -133,12 +133,24 @@ static inline int pstate_enable(bool enable)
>  
>  static int cppc_enable(bool enable)
>  {
> +	struct cppc_perf_ctrls perf_ctrls;
>  	int cpu, ret = 0;
>  
>  	for_each_present_cpu(cpu) {
>  		ret = cppc_set_enable(cpu, enable);
>  		if (ret)
>  			return ret;
> +
> +	/* Enable active mode for EPP */
> +	ret = cppc_set_auto_epp(cpu, enable);
> +	if (ret)
> +		return ret;
> +
> +	/* Set zero to desired perf to enable EPP control*/
> +	perf_ctrls.desired_perf = 0;
> +	ret = cppc_set_perf(cpu, &perf_ctrls);
> +	if (ret)
> +		return ret;

Shouldn't this entire block be indented one additional tab over since it's
part of the for_each_present_cpu() loop?
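
Something like (whitespace only, no functional change intended):

	for_each_present_cpu(cpu) {
		ret = cppc_set_enable(cpu, enable);
		if (ret)
			return ret;

		/* Enable active mode for EPP */
		ret = cppc_set_auto_epp(cpu, enable);
		if (ret)
			return ret;

		/* Set zero to desired perf to enable EPP control */
		perf_ctrls.desired_perf = 0;
		ret = cppc_set_perf(cpu, &perf_ctrls);
		if (ret)
			return ret;
	}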

-Nathan

>  	}
>  
>  	return ret;


* Re: [PATCH 5/7] cpufreq: amd_pstate: add AMD Pstate EPP support for the MSR based processors
  2022-09-09 16:45 ` [PATCH 5/7] cpufreq: amd_pstate: add AMD Pstate EPP support for the MSR based processors Perry Yuan
@ 2022-09-15 18:58   ` Nathan Fontenot
  2022-09-25 17:12     ` Yuan, Perry
  0 siblings, 1 reply; 22+ messages in thread
From: Nathan Fontenot @ 2022-09-15 18:58 UTC (permalink / raw)
  To: Perry Yuan, rafael.j.wysocki, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Mario.Limonciello, Alexander.Deucher, Jinzhou.Su,
	Shimmer.Huang, Xiaojian.Du, Li.Meng, linux-pm, linux-kernel

On 9/9/22 11:45, Perry Yuan wrote:
> Add EPP driver support for those AMD CPUs which has full MSR feature
> enabled, The EPP is used in the DPM controller to drive the frequency
> that a core is going to operate during short periods of activity.
> 
> EPP values will be utilized for different OS profiles (balanced, performance,
> power savings). cppc performance can be controlled by the user space interface
> sys attributes for min and max frequency limits, when pstate driver is
> working under power save policy.
> 
> EPP scale is 0 - 255, 0 is the max performance and 255 is min level.
> balance_performance (0x80) can provide best balance performance and watt for
> most of system, meanwhile user can choose performance policy on needs.
> 
> $ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_available_preferences
> default performance balance_performance balance_power power
> 
> $ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_preference
> balance_performance

A lot of what you do in this patch with respect to the sysfs files for
energy_performance_available_preferences and energy_performance_preference mirrors
what is done in the intel_pstate driver for EPP. Would there be any value in
making these pieces common code?

> 
> Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> ---
>  arch/x86/include/asm/msr-index.h |   4 +
>  drivers/cpufreq/amd-pstate.c     | 818 ++++++++++++++++++++++++++++++-
>  2 files changed, 804 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 43a3d8e4eb9a..4c540badab4e 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -577,6 +577,10 @@
>  #define MSR_AMD64_PERF_CNTR_GLOBAL_CTL		0xc0000301
>  #define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR	0xc0000302
>  
> +#define AMD_CPPC_EPP_PERFORMANCE		0x00
> +#define AMD_CPPC_EPP_BALANCE_PERFORMANCE	0x80
> +#define AMD_CPPC_EPP_BALANCE_POWERSAVE		0xBF
> +#define AMD_CPPC_EPP_POWERSAVE			0xFF
>  /* Fam 17h MSRs */
>  #define MSR_F17H_IRPERF			0xc00000e9
>  
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index fff298744a8e..e71b06e20050 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -63,8 +63,8 @@ module_param(epp_enabled, bool, 0444);
>  MODULE_PARM_DESC(epp_enabled,
>                  "load amd_pstate or amd_pstate_epp (true = amd_pstate_epp driver instance (default), false = amd_pstate driver instance)");
>  
> -
> -static struct cpufreq_driver amd_pstate_driver;
> +static struct cpufreq_driver *default_pstate_driver;
> +static struct amd_cpudata **all_cpu_data;
>  
>  /**
>   * struct  amd_aperf_mperf
> @@ -76,6 +76,7 @@ struct amd_aperf_mperf {
>  	u64 aperf;
>  	u64 mperf;
>  	u64 tsc;
> +	u64 time;
>  };
>  
>  /**
> @@ -98,7 +99,19 @@ struct amd_aperf_mperf {
>   * @prev: Last Aperf/Mperf/tsc count value read from register
>   * @freq: current cpu frequency value
>   * @boost_supported: check whether the Processor or SBIOS supports boost mode
> - *
> + * @epp_powersave: Last saved CPPC energy performance preference
> +				when policy switched to performance
> + * @epp_policy: Last saved policy used to set energy-performance preference
> + * @epp_cached: Cached CPPC energy-performance preference value
> + * @policy: Cpufreq policy value
> + * @sched_flags: Store scheduler flags for possible cross CPU update
> + * @update_util_set: CPUFreq utility callback is set
> + * @last_update: Time stamp of the last performance state update
> + * @cppc_boost_min: Last CPPC boosted min performance state
> + * @cppc_cap1_cached: Cached value of the last CPPC Capabilities MSR
> + * @update_util: Cpufreq utility callback information
> + * @sample: the stored performance sample
> +
>   * The amd_cpudata is key private data for each CPU thread in AMD P-State, and
>   * represents all the attributes and goals that AMD P-State requests at runtime.
>   */
> @@ -124,8 +137,195 @@ struct amd_cpudata {
>  	u64 	freq;
>  	bool	boost_supported;
>  	u64 	cppc_hw_conf_cached;
> +
> +	/* EPP feature related attributes*/
> +	s16	epp_powersave;
> +	s16	epp_policy;
> +	s16	epp_cached;
> +	u32	policy;
> +	u32	sched_flags;
> +	bool	update_util_set;
> +	u64	last_update;
> +	u64	last_io_update;
> +	u32	cppc_boost_min;
> +	u64	cppc_cap1_cached;
> +	struct	update_util_data update_util;
> +	struct	amd_aperf_mperf sample;
> +};
> +
> +/**
> + * struct amd_pstate_params - global parameters for the performance control
> + * @ cppc_boost_disabled Wheter or not the core performance boost disabled
> + */
> +struct amd_pstate_params {
> +	bool cppc_boost_disabled;
> +};
> +
> +/*
> + * AMD Energy Preference Performance (EPP)
> + * The EPP is used in the CCLK DPM controller to drive
> + * the frequency that a core is going to operate during
> + * short periods of activity. EPP values will be utilized for
> + * different OS profiles (balanced, performance, power savings)
> + * display strings corresponding to EPP index in the
> + * energy_perf_strings[]
> + *	index		String
> + *-------------------------------------
> + *	0		default
> + *	1		performance
> + *	2		balance_performance
> + *	3		balance_power
> + *	4		power
> + */
> + enum energy_perf_value_index {
> +	EPP_INDEX_DEFAULT = 0,
> +	EPP_INDEX_PERFORMANCE,
> +	EPP_INDEX_BALANCE_PERFORMANCE,
> +	EPP_INDEX_BALANCE_POWERSAVE,
> +	EPP_INDEX_POWERSAVE,
> +};
> +
> +static const char * const energy_perf_strings[] = {
> +	[EPP_INDEX_DEFAULT] = "default",
> +	[EPP_INDEX_PERFORMANCE] = "performance",
> +	[EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
> +	[EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
> +	[EPP_INDEX_POWERSAVE] = "power",
> +	NULL
>  };
>  
> +static unsigned int epp_values[] = {
> +	[EPP_INDEX_DEFAULT] = 0,
> +	[EPP_INDEX_PERFORMANCE] = AMD_CPPC_EPP_PERFORMANCE,
> +	[EPP_INDEX_BALANCE_PERFORMANCE] = AMD_CPPC_EPP_BALANCE_PERFORMANCE,
> +	[EPP_INDEX_BALANCE_POWERSAVE] = AMD_CPPC_EPP_BALANCE_POWERSAVE,
> +	[EPP_INDEX_POWERSAVE] = AMD_CPPC_EPP_POWERSAVE,
> +};
> +
> +static struct amd_pstate_params global;

I think a name more descriptive than 'global' should be used here.
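
Something along the lines of

	static struct amd_pstate_params global_params;

would at least say what it holds (name purely illustrative).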

> +
> +static DEFINE_MUTEX(amd_pstate_limits_lock);
> +static DEFINE_MUTEX(amd_pstate_driver_lock);
> +static DEFINE_SPINLOCK(amd_pstate_cpu_lock);
> +
> +static bool cppc_boost __read_mostly;
> +struct kobject *amd_pstate_kobj;
> +
> +#ifdef CONFIG_ACPI_CPPC_LIB
> +static s16 amd_pstate_get_epp(struct amd_cpudata *cpudata, u64 cppc_req_cached)
> +{
> +	s16 epp;
> +	struct cppc_perf_caps perf_caps;
> +	int ret;
> +
> +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> +		if (!cppc_req_cached) {
> +			epp = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ,
> +					    &cppc_req_cached);
> +			if (epp)
> +				return epp;
> +		}
> +		epp = (cppc_req_cached >> 24) & 0xFF;
> +	} else {
> +		ret = cppc_get_epp_caps(cpudata->cpu, &perf_caps);
> +		if (ret < 0) {
> +			pr_debug("Could not retrieve energy perf value (%d)\n", ret);
> +			return -EIO;
> +		}
> +		epp = (s16) perf_caps.energy_perf;
> +	}
> +
> +	return epp;
> +}
> +#endif
> +
> +static int amd_pstate_get_energy_pref_index(struct amd_cpudata *cpudata, int *raw_epp)
> +{
> +	s16 epp;
> +	int index = -EINVAL;
> +
> +	*raw_epp = 0;
> +	epp = amd_pstate_get_epp(cpudata, 0);
> +	if (epp < 0)
> +		return epp;
> +
> +	switch (epp) {
> +	case AMD_CPPC_EPP_PERFORMANCE:
> +		index = EPP_INDEX_PERFORMANCE;
> +		break;
> +	case AMD_CPPC_EPP_BALANCE_PERFORMANCE:
> +		index = EPP_INDEX_BALANCE_PERFORMANCE;
> +		break;
> +	case AMD_CPPC_EPP_BALANCE_POWERSAVE:
> +		index = EPP_INDEX_BALANCE_POWERSAVE;
> +		break;
> +	case AMD_CPPC_EPP_POWERSAVE:
> +		index = EPP_INDEX_POWERSAVE;
> +		break;
> +	default:
> +		*raw_epp = epp;
> +		index = 0;
> +	}
> +
> +	return index;
> +}
> +
> +#ifdef CONFIG_ACPI_CPPC_LIB
> +static int amd_pstate_set_epp(struct amd_cpudata *cpudata, u32 epp)
> +{
> +	int ret;
> +	struct cppc_perf_ctrls perf_ctrls;
> +
> +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> +		u64 value = READ_ONCE(cpudata->cppc_req_cached);
> +
> +		value &= ~GENMASK_ULL(31, 24);
> +		value |= (u64)epp << 24;
> +		WRITE_ONCE(cpudata->cppc_req_cached, value);
> +		ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
> +		if (!ret) {
> +			cpudata->epp_cached = epp;
> +		}
> +	} else {
> +		perf_ctrls.energy_perf = epp;
> +		ret = cppc_set_epp_perf(cpudata->cpu, &perf_ctrls);
> +		if (ret){
> +			pr_debug("failed to set energy perf value (%d)\n", ret);
> +			return ret;
> +        }

Accidental '}' not indented enough?

> +		cpudata->epp_cached = epp;
> +	}
> +	return ret;
> +}
> +
> +static int amd_pstate_set_energy_pref_index(struct amd_cpudata *cpudata,
> +					      int pref_index, bool use_raw,
> +					      u32 raw_epp)
> +{
> +	int epp = -EINVAL;
> +	int ret;
> +
> +	if (!pref_index) {
> +		pr_debug("EPP pref_index is invaid\n");

s/invaid/invalid

> +		return -EINVAL;
> +	}
> +
> +	if (use_raw)
> +		epp = raw_epp;
> +	else if (epp == -EINVAL)
> +		epp = epp_values[pref_index];
> +
> +	if (epp > 0 && cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> +		pr_debug("EPP cannot be set under performance policy\n");
> +		return -EBUSY;
> +	}
> +
> +	ret = amd_pstate_set_epp(cpudata, epp);
> +
> +	return ret;
> +}
> +#endif
> +
>  static inline int pstate_enable(bool enable)
>  {
>  	return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable);
> @@ -141,16 +341,18 @@ static int cppc_enable(bool enable)
>  		if (ret)
>  			return ret;
>  
> -	/* Enable active mode for EPP */
> -	ret = cppc_set_auto_epp(cpu, enable);
> -	if (ret)
> -		return ret;
> -
> -	/* Set zero to desired perf to enable EPP control*/
> -	perf_ctrls.desired_perf = 0;
> -	ret = cppc_set_perf(cpu, &perf_ctrls);
> -	if (ret)
> -		return ret;
> +		if (epp_enabled) {
> +			/* Enable autonomous mode for EPP */
> +			ret = cppc_set_auto_epp(cpu, enable);
> +			if (ret)
> +				return ret;
> +
> +			/* Set zero to desired perf to allow EPP firmware control*/
> +			perf_ctrls.desired_perf = 0;
> +			ret = cppc_set_perf(cpu, &perf_ctrls);
> +			if (ret)
> +				return ret;
> +		}

This update to cppc_enable() should be part of Patch 4/7.

>  	}
>  
>  	return ret;
> @@ -496,7 +698,7 @@ static void amd_pstate_boost_init(struct amd_cpudata *cpudata)
>  		return;
>  
>  	cpudata->boost_supported = true;
> -	amd_pstate_driver.boost_enabled = true;
> +	default_pstate_driver->boost_enabled = true;
>  }
>  
>  static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> @@ -671,10 +873,108 @@ static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,
>  	return sprintf(&buf[0], "%u\n", perf);
>  }
>  
> +static ssize_t show_energy_performance_available_preferences(
> +				struct cpufreq_policy *policy, char *buf)
> +{
> +	int i = 0;
> +	int ret = 0;
> +
> +	while (energy_perf_strings[i] != NULL)
> +		ret += sprintf(&buf[ret], "%s ", energy_perf_strings[i++]);
> +
> +	ret += sprintf(&buf[ret], "\n");
> +
> +	return ret;
> +}
> +
> +static ssize_t store_energy_performance_preference(
> +		struct cpufreq_policy *policy, const char *buf, size_t count)
> +{
> +	struct amd_cpudata *cpudata = policy->driver_data;
> +	char str_preference[21];
> +	bool raw = false;
> +	ssize_t ret;
> +	u32 epp = 0;
> +
> +	ret = sscanf(buf, "%20s", str_preference);
> +	if (ret != 1)
> +		return -EINVAL;
> +
> +	ret = match_string(energy_perf_strings, -1, str_preference);
> +	if (ret < 0) {
> +		ret = kstrtouint(buf, 10, &epp);
> +		if (ret)
> +			return ret;
> +
> +		if ((epp > 255) || (epp < 0))
> +			return -EINVAL;
> +
> +		raw = true;
> +	}
> +
> +	mutex_lock(&amd_pstate_limits_lock);
> +	ret = amd_pstate_set_energy_pref_index(cpudata, ret, raw, epp);
> +	mutex_unlock(&amd_pstate_limits_lock);
> +
> +	return ret ?: count;
> +}
> +
> +static ssize_t show_energy_performance_preference(
> +				struct cpufreq_policy *policy, char *buf)
> +{
> +	struct amd_cpudata *cpudata = policy->driver_data;
> +	int preference, raw_epp;
> +
> +	preference = amd_pstate_get_energy_pref_index(cpudata, &raw_epp);
> +	if (preference < 0)
> +		return preference;
> +
> +	if (raw_epp)
> +		return  sprintf(buf, "%d\n", raw_epp);
> +	else
> +		return  sprintf(buf, "%s\n", energy_perf_strings[preference]);
> +}
> +
> +static void amd_pstate_update_policies(void)
> +{
> +	int cpu;
> +
> +	for_each_possible_cpu(cpu)
> +		cpufreq_update_policy(cpu);
> +}
> +
> +static ssize_t show_pstate_dynamic_boost(struct kobject *kobj,
> +				struct kobj_attribute *attr, char *buf)
> +{
> +	return sprintf(buf, "%u\n", cppc_boost);
> +}
> +
> +static ssize_t store_pstate_dynamic_boost(struct kobject *a,
> +				       struct kobj_attribute *b,
> +				       const char *buf, size_t count)
> +{
> +	unsigned int input;
> +	int ret;
> +
> +	ret = kstrtouint(buf, 10, &input);
> +	if (ret)
> +		return ret;
> +
> +	mutex_lock(&amd_pstate_driver_lock);
> +	cppc_boost = !!input;
> +	amd_pstate_update_policies();
> +	mutex_unlock(&amd_pstate_driver_lock);
> +
> +	return count;
> +}
> +
>  cpufreq_freq_attr_ro(amd_pstate_max_freq);
>  cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
>  
>  cpufreq_freq_attr_ro(amd_pstate_highest_perf);
> +cpufreq_freq_attr_rw(energy_performance_preference);
> +cpufreq_freq_attr_ro(energy_performance_available_preferences);
> +define_one_global_rw(pstate_dynamic_boost);
>  
>  static struct freq_attr *amd_pstate_attr[] = {
>  	&amd_pstate_max_freq,
> @@ -683,6 +983,424 @@ static struct freq_attr *amd_pstate_attr[] = {
>  	NULL,
>  };
>  
> +static struct freq_attr *amd_pstate_epp_attr[] = {
> +	&amd_pstate_max_freq,
> +	&amd_pstate_lowest_nonlinear_freq,
> +	&amd_pstate_highest_perf,
> +	&energy_performance_preference,
> +	&energy_performance_available_preferences,
> +	NULL,
> +};
> +
> +static struct attribute *pstate_global_attributes[] = {
> +	&pstate_dynamic_boost.attr,
> +	NULL
> +};
> +
> +static const struct attribute_group amd_pstate_global_attr_group = {
> +	.attrs = pstate_global_attributes,
> +};
> +
> +static inline void update_boost_state(void)
> +{
> +	u64 misc_en;
> +	struct amd_cpudata *cpudata;
> +
> +	cpudata = all_cpu_data[0];
> +	rdmsrl(MSR_AMD_CPPC_HW_CTL, misc_en);
> +	global.cppc_boost_disabled = misc_en & AMD_CPPC_PRECISION_BOOST_ENABLED;
> +}
> +
> +static int amd_pstate_init_cpu(unsigned int cpunum)
> +{
> +	struct amd_cpudata *cpudata;
> +
> +	cpudata = all_cpu_data[cpunum];
> +	if (!cpudata) {
> +		cpudata = kzalloc(sizeof(*cpudata), GFP_KERNEL);
> +		if (!cpudata)
> +			return -ENOMEM;
> +		WRITE_ONCE(all_cpu_data[cpunum], cpudata);
> +
> +		cpudata->cpu = cpunum;
> +	}
> +	cpudata->epp_powersave = -EINVAL;
> +	cpudata->epp_policy = 0;
> +	pr_debug("controlling: cpu %d\n", cpunum);
> +	return 0;
> +}
> +
> +static int __amd_pstate_cpu_init(struct cpufreq_policy *policy)
> +{
> +	int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
> +	struct amd_cpudata *cpudata;
> +	struct device *dev;
> +	int rc;
> +	u64 value;
> +
> +	rc = amd_pstate_init_cpu(policy->cpu);
> +	if (rc)
> +		return rc;
> +
> +	cpudata = all_cpu_data[policy->cpu];
> +
> +	dev = get_cpu_device(policy->cpu);
> +	if (!dev)
> +		goto free_cpudata1;
> +
> +	rc = amd_pstate_init_perf(cpudata);
> +	if (rc)
> +		goto free_cpudata1;
> +
> +	min_freq = amd_get_min_freq(cpudata);
> +	max_freq = amd_get_max_freq(cpudata);
> +	nominal_freq = amd_get_nominal_freq(cpudata);
> +	lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
> +		if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
> +		dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
> +			min_freq, max_freq);
> +		ret = -EINVAL;
> +		goto free_cpudata1;
> +	}
> +
> +	policy->min = min_freq;
> +	policy->max = max_freq;
> +
> +	policy->cpuinfo.min_freq = min_freq;
> +	policy->cpuinfo.max_freq = max_freq;
> +	/* It will be updated by governor */
> +	policy->cur = policy->cpuinfo.min_freq;
> +
> +	/* Initial processor data capability frequencies */
> +	cpudata->max_freq = max_freq;
> +	cpudata->min_freq = min_freq;
> +	cpudata->nominal_freq = nominal_freq;
> +	cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
> +
> +	policy->driver_data = cpudata;
> +
> +	update_boost_state();
> +	cpudata->epp_cached = amd_pstate_get_epp(cpudata, value);
> +
> +	policy->min = policy->cpuinfo.min_freq;
> +	policy->max = policy->cpuinfo.max_freq;
> +
> +	if (boot_cpu_has(X86_FEATURE_CPPC))
> +		policy->fast_switch_possible = true;
> +
> +	if (!shared_mem && boot_cpu_has(X86_FEATURE_CPPC)) {
> +		ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, &value);
> +		if (ret)
> +			return ret;
> +		WRITE_ONCE(cpudata->cppc_req_cached, value);
> +
> +		ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1, &value);
> +		if (ret)
> +			return ret;
> +		WRITE_ONCE(cpudata->cppc_cap1_cached, value);
> +	}
> +	amd_pstate_boost_init(cpudata);
> +
> +	return 0;
> +
> +free_cpudata1:
> +	kfree(cpudata);
> +	return ret;
> +}
> +
> +static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> +{
> +	int ret;
> +
> +	ret = __amd_pstate_cpu_init(policy);
> +	if (ret)
> +		return ret;
> +	/*
> +	 * Set the policy to powersave to provide a valid fallback value in case
> +	 * the default cpufreq governor is neither powersave nor performance.
> +	 */
> +	policy->policy = CPUFREQ_POLICY_POWERSAVE;
> +
> +	return 0;
> +}
> +
> +static int amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy)
> +{
> +	pr_debug("amd-pstate: CPU %d exiting\n", policy->cpu);
> +	policy->fast_switch_possible = false;
> +	return 0;
> +}
> +
> +static void amd_pstate_update_max_freq(unsigned int cpu)
> +{
> +	struct cpufreq_policy *policy = cpufreq_cpu_acquire(cpu);
> +
> +	if (!policy)
> +		return;
> +
> +	refresh_frequency_limits(policy);
> +	cpufreq_cpu_release(policy);
> +}
> +
> +static void amd_pstate_epp_update_limits(unsigned int cpu)
> +{
> +	mutex_lock(&amd_pstate_driver_lock);
> +	update_boost_state();
> +	if (global.cppc_boost_disabled) {
> +		for_each_possible_cpu(cpu)
> +			amd_pstate_update_max_freq(cpu);
> +	} else {
> +		cpufreq_update_policy(cpu);
> +	}
> +	mutex_unlock(&amd_pstate_driver_lock);
> +}
> +
> +static int cppc_boost_hold_time_ns = 3 * NSEC_PER_MSEC;
> +
> +static inline void amd_pstate_boost_up(struct amd_cpudata *cpudata)
> +{
> +	u64 hwp_req = READ_ONCE(cpudata->cppc_req_cached);
> +	u64 hwp_cap = READ_ONCE(cpudata->cppc_cap1_cached);
> +	u32 max_limit = (hwp_req & 0xff);
> +	u32 min_limit = (hwp_req & 0xff00) >> 8;
> +	u32 boost_level1;
> +
> +	/* If max and min are equal or already at max, nothing to boost */
> +	if (max_limit == min_limit)
> +		return;
> +
> +	/* Set boost max and min to initial value */
> +	if (!cpudata->cppc_boost_min)
> +		cpudata->cppc_boost_min = min_limit;
> +
> +	boost_level1 = ((AMD_CPPC_NOMINAL_PERF(hwp_cap) + min_limit) >> 1);
> +
> +	if (cpudata->cppc_boost_min < boost_level1)
> +		cpudata->cppc_boost_min = boost_level1;
> +	else if (cpudata->cppc_boost_min < AMD_CPPC_NOMINAL_PERF(hwp_cap))
> +		cpudata->cppc_boost_min = AMD_CPPC_NOMINAL_PERF(hwp_cap);
> +	else if (cpudata->cppc_boost_min == AMD_CPPC_NOMINAL_PERF(hwp_cap))
> +		cpudata->cppc_boost_min = max_limit;
> +	else
> +		return;
> +
> +	hwp_req &= ~AMD_CPPC_MIN_PERF(~0L);
> +	hwp_req |= AMD_CPPC_MIN_PERF(cpudata->cppc_boost_min);
> +	wrmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, hwp_req);
> +	cpudata->last_update = cpudata->sample.time;
> +}
> +
> +static inline void amd_pstate_boost_down(struct amd_cpudata *cpudata)
> +{
> +	bool expired;
> +
> +	if (cpudata->cppc_boost_min) {
> +		expired = time_after64(cpudata->sample.time, cpudata->last_update +
> +					cppc_boost_hold_time_ns);
> +
> +		if (expired) {
> +			wrmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, cpudata->cppc_req_cached);
> +			cpudata->cppc_boost_min = 0;
> +		}
> +	}
> +
> +	cpudata->last_update = cpudata->sample.time;
> +}
> +
> +static inline void amd_pstate_boost_update_util(struct amd_cpudata *cpudata,
> +						      u64 time)
> +{
> +	cpudata->sample.time = time;
> +	if (smp_processor_id() != cpudata->cpu)
> +		return;
> +
> +	if (cpudata->sched_flags & SCHED_CPUFREQ_IOWAIT) {
> +		bool do_io = false;
> +
> +		cpudata->sched_flags = 0;
> +		/*
> +		 * Set iowait_boost flag and update time. Since IO WAIT flag
> +		 * is set all the time, we can't just conclude that there is
> +		 * some IO bound activity is scheduled on this CPU with just
> +		 * one occurrence. If we receive at least two in two
> +		 * consecutive ticks, then we treat as boost candidate.
> +		 * This is leveraged from Intel Pstate driver.
> +		 */
> +		if (time_before64(time, cpudata->last_io_update + 2 * TICK_NSEC))
> +			do_io = true;
> +
> +		cpudata->last_io_update = time;
> +
> +		if (do_io)
> +			amd_pstate_boost_up(cpudata);
> +
> +	} else {
> +		amd_pstate_boost_down(cpudata);
> +	}
> +}
> +
> +static inline void amd_pstate_cppc_update_hook(struct update_util_data *data,
> +						u64 time, unsigned int flags)
> +{
> +	struct amd_cpudata *cpudata = container_of(data,
> +				struct amd_cpudata, update_util);
> +
> +	cpudata->sched_flags |= flags;
> +
> +	if (smp_processor_id() == cpudata->cpu)
> +		amd_pstate_boost_update_util(cpudata, time);
> +}
> +
> +static void amd_pstate_clear_update_util_hook(unsigned int cpu)
> +{
> +	struct amd_cpudata *cpudata = all_cpu_data[cpu];
> +
> +	if (!cpudata->update_util_set)
> +		return;
> +
> +	cpufreq_remove_update_util_hook(cpu);
> +	cpudata->update_util_set = false;
> +	synchronize_rcu();
> +}
> +
> +static void amd_pstate_set_update_util_hook(unsigned int cpu_num)
> +{
> +	struct amd_cpudata *cpudata = all_cpu_data[cpu_num];
> +
> +	if (!cppc_boost) {
> +		if (cpudata->update_util_set)
> +			amd_pstate_clear_update_util_hook(cpudata->cpu);
> +		return;
> +	}
> +
> +	if (cpudata->update_util_set)
> +		return;
> +
> +	cpudata->sample.time = 0;
> +	cpufreq_add_update_util_hook(cpu_num, &cpudata->update_util,
> +						amd_pstate_cppc_update_hook);

Here, and elsewhere, the param on the second line should line up under the first param
on the previous line.
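
i.e.:

	cpufreq_add_update_util_hook(cpu_num, &cpudata->update_util,
				     amd_pstate_cppc_update_hook);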

> +	cpudata->update_util_set = true;
> +}
> +
> +static void amd_pstate_epp_init(unsigned int cpu)
> +{
> +	struct amd_cpudata *cpudata = all_cpu_data[cpu];
> +	u32 max_perf, min_perf;
> +	u64 value;
> +	s16 epp;
> +	int ret;
> +
> +	max_perf = READ_ONCE(cpudata->highest_perf);
> +	min_perf = READ_ONCE(cpudata->lowest_perf);
> +
> +	value = READ_ONCE(cpudata->cppc_req_cached);
> +
> +	if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> +		min_perf = max_perf;
> +
> +	/* Initial min/max values for CPPC Performance Controls Register */
> +	value &= ~AMD_CPPC_MIN_PERF(~0L);
> +	value |= AMD_CPPC_MIN_PERF(min_perf);
> +
> +	value &= ~AMD_CPPC_MAX_PERF(~0L);
> +	value |= AMD_CPPC_MAX_PERF(max_perf);
> +
> +	/* CPPC EPP feature require to set zero to the desire perf bit */
> +	value &= ~AMD_CPPC_DES_PERF(~0L);
> +	value |= AMD_CPPC_DES_PERF(0);
> +
> +	if (cpudata->epp_policy == cpudata->policy)
> +		goto skip_epp;
> +
> +	cpudata->epp_policy = cpudata->policy;
> +
> +	if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> +		epp = amd_pstate_get_epp(cpudata, value);
> +		cpudata->epp_powersave = epp;
> +		if (epp < 0)
> +			goto skip_epp;

This looks wrong. The epp value is an s16, but reading the MSR can return an EPP setting
between 0x00 and 0xFF. The epp value returned from amd_pstate_get_epp() can be both
negative and a valid value.

Were your intentions to skip any EPP settings that tend toward POWERSAVE values? If so
I think a comment explaining this or re-working the code to make that clear would be good.
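
If the intent is only to bail out when the read itself fails, one way to
make that explicit (untested sketch) is to do the error check before
caching the value:

	epp = amd_pstate_get_epp(cpudata, value);
	if (epp < 0)	/* read failure (e.g. -EIO), not a real EPP value */
		goto skip_epp;
	cpudata->epp_powersave = epp;
	epp = 0;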

> +
> +		epp = 0;
> +	} else {
> +		if (cpudata->epp_powersave < 0)
> +			goto skip_epp;
> +		/* Get BIOS pre-defined epp value */
> +		epp = amd_pstate_get_epp(cpudata, value);
> +		if (epp)
> +			goto skip_epp;
> +		epp = cpudata->epp_powersave;
> +	}
> +	/* Set initial EPP value */
> +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> +		value &= ~GENMASK_ULL(31, 24);
> +		value |= (u64)epp << 24;
> +	}
> +
> +skip_epp:
> +	WRITE_ONCE(cpudata->cppc_req_cached, value);
> +	ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
> +	if (!ret) {
> +		cpudata->epp_cached = epp;
> +	}

No need for curly braces here.
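
i.e.:

	ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
	if (!ret)
		cpudata->epp_cached = epp;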

-Nathan

> +}
> +
> +static void amd_pstate_set_max_limits(struct amd_cpudata *cpudata)
> +{
> +	u64 hwp_cap = READ_ONCE(cpudata->cppc_cap1_cached);
> +	u64 hwp_req = READ_ONCE(cpudata->cppc_req_cached);
> +	u32 max_limit = (hwp_cap >> 24) & 0xff;
> +
> +	hwp_req &= ~AMD_CPPC_MIN_PERF(~0L);
> +	hwp_req |= AMD_CPPC_MIN_PERF(max_limit);
> +	wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, hwp_req);
> +}
> +
> +static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
> +{
> +	struct amd_cpudata *cpudata;
> +
> +	if (!policy->cpuinfo.max_freq)
> +		return -ENODEV;
> +
> +	pr_debug("set_policy: cpuinfo.max %u policy->max %u\n",
> +				policy->cpuinfo.max_freq, policy->max);
> +
> +	cpudata = all_cpu_data[policy->cpu];
> +	cpudata->policy = policy->policy;
> +
> +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> +		mutex_lock(&amd_pstate_limits_lock);
> +
> +		if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> +			amd_pstate_clear_update_util_hook(policy->cpu);
> +			amd_pstate_set_max_limits(cpudata);
> +		} else {
> +			amd_pstate_set_update_util_hook(policy->cpu);
> +		}
> +
> +		if (boot_cpu_has(X86_FEATURE_CPPC))
> +			amd_pstate_epp_init(policy->cpu);
> +
> +		mutex_unlock(&amd_pstate_limits_lock);
> +	}
> +
> +	return 0;
> +}
> +
> +static void amd_pstate_verify_cpu_policy(struct amd_cpudata *cpudata,
> +					   struct cpufreq_policy_data *policy)
> +{
> +	update_boost_state();
> +	cpufreq_verify_within_cpu_limits(policy);
> +}
> +
> +static int amd_pstate_epp_verify_policy(struct cpufreq_policy_data *policy)
> +{
> +	amd_pstate_verify_cpu_policy(all_cpu_data[policy->cpu], policy);
> +	pr_debug("policy_max =%d, policy_min=%d\n", policy->max, policy->min );
> +	return 0;
> +}
> +
>  static struct cpufreq_driver amd_pstate_driver = {
>  	.flags		= CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
>  	.verify		= amd_pstate_verify,
> @@ -696,8 +1414,20 @@ static struct cpufreq_driver amd_pstate_driver = {
>  	.attr		= amd_pstate_attr,
>  };
>  
> +static struct cpufreq_driver amd_pstate_epp_driver = {
> +	.flags		= CPUFREQ_CONST_LOOPS,
> +	.verify		= amd_pstate_epp_verify_policy,
> +	.setpolicy	= amd_pstate_epp_set_policy,
> +	.init		= amd_pstate_epp_cpu_init,
> +	.exit		= amd_pstate_epp_cpu_exit,
> +	.update_limits	= amd_pstate_epp_update_limits,
> +	.name		= "amd_pstate_epp",
> +	.attr		= amd_pstate_epp_attr,
> +};
> +
>  static int __init amd_pstate_init(void)
>  {
> +	static struct amd_cpudata **cpudata;
>  	int ret;
>  
>  	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
> @@ -712,10 +1442,25 @@ static int __init amd_pstate_init(void)
>  	if (cpufreq_get_current_driver())
>  		return -EEXIST;
>  
> +	cpudata = vzalloc(array_size(sizeof(void *), num_possible_cpus()));
> +	if (!cpudata)
> +		return -ENOMEM;
> +	WRITE_ONCE(all_cpu_data, cpudata);
> +
> +	if (epp_enabled) {
> +		pr_info("AMD CPPC loading with amd_pstate_epp driver instance.\n");
> +		default_pstate_driver = &amd_pstate_epp_driver;
> +	} else {
> +		pr_info("AMD CPPC loading with amd_pstate driver instance.\n");
> +		default_pstate_driver = &amd_pstate_driver;
> +	}
> +
>  	/* capability check */
>  	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> +		if (!epp_enabled) {
> +			default_pstate_driver->adjust_perf = amd_pstate_adjust_perf;
> +		}
>  		pr_debug("AMD CPPC MSR based functionality is supported\n");
> -		amd_pstate_driver.adjust_perf = amd_pstate_adjust_perf;
>  	} else if (shared_mem) {
>  		static_call_update(amd_pstate_enable, cppc_enable);
>  		static_call_update(amd_pstate_init_perf, cppc_init_perf);
> @@ -732,19 +1477,56 @@ static int __init amd_pstate_init(void)
>  		return ret;
>  	}
>  
> -	ret = cpufreq_register_driver(&amd_pstate_driver);
> +	ret = cpufreq_register_driver(default_pstate_driver);
>  	if (ret)
> -		pr_err("failed to register amd_pstate_driver with return %d\n",
> +		pr_err("failed to register amd pstate driver with return %d\n",
>  		       ret);
>  
> +	amd_pstate_kobj = kobject_create_and_add("amd-pstate", &cpu_subsys.dev_root->kobj);
> +	if (!amd_pstate_kobj) {
> +		pr_err("amd-pstate: Global sysfs registration failed.\n");
> +	}
> +
> +	ret = sysfs_create_group(amd_pstate_kobj, &amd_pstate_global_attr_group);
> +	if (ret) {
> +		pr_err("amd-pstate: Sysfs attribute export failed with error %d.\n",
> +		       ret);
> +	}
> +
>  	return ret;
>  }
>  
> +static inline void amd_pstate_kobj_cleanup(struct kobject *kobj)
> +{
> +	kobject_del(kobj);
> +	kobject_put(kobj);
> +}
> +
>  static void __exit amd_pstate_exit(void)
>  {
> -	cpufreq_unregister_driver(&amd_pstate_driver);
> +	unsigned int cpu;
> +
> +	cpufreq_unregister_driver(default_pstate_driver);
>  
>  	amd_pstate_enable(false);
> +
> +	sysfs_remove_group(amd_pstate_kobj, &amd_pstate_global_attr_group);
> +	amd_pstate_kobj_cleanup(amd_pstate_kobj);
> +
> +	cpus_read_lock();
> +	for_each_online_cpu(cpu) {
> +		if (all_cpu_data[cpu]) {
> +			if (default_pstate_driver == &amd_pstate_epp_driver)
> +				amd_pstate_clear_update_util_hook(cpu);
> +
> +			spin_lock(&amd_pstate_cpu_lock);
> +			kfree(all_cpu_data[cpu]);
> +			WRITE_ONCE(all_cpu_data[cpu], NULL);
> +			spin_unlock(&amd_pstate_cpu_lock);
> +		}
> +	}
> +	cpus_read_unlock();
> +
>  }
>  
>  module_init(amd_pstate_init);


* Re: [PATCH 6/7] cpufreq: amd_pstate: implement amd pstate cpu online and offline callback
  2022-09-09 16:45 ` [PATCH 6/7] cpufreq: amd_pstate: implement amd pstate cpu online and offline callback Perry Yuan
  2022-09-09 19:02   ` Limonciello, Mario
@ 2022-09-15 19:03   ` Nathan Fontenot
  1 sibling, 0 replies; 22+ messages in thread
From: Nathan Fontenot @ 2022-09-15 19:03 UTC (permalink / raw)
  To: Perry Yuan, rafael.j.wysocki, ray.huang, viresh.kumar
  Cc: Deepak.Sharma, Mario.Limonciello, Alexander.Deucher, Jinzhou.Su,
	Shimmer.Huang, Xiaojian.Du, Li.Meng, linux-pm, linux-kernel

On 9/9/22 11:45, Perry Yuan wrote:
> The patch adds online and offline driver callback to support to allow cpu cores go
> offline and help to restore the previous working states when core goes
> back online later for EPP driver mode.
> 
> Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> ---
>  drivers/cpufreq/amd-pstate.c | 93 +++++++++++++++++++++++++++++++++++-
>  1 file changed, 92 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index e71b06e20050..e63fed39f90c 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -111,7 +111,8 @@ struct amd_aperf_mperf {
>   * @cppc_cap1_cached: Cached value of the last CPPC Capabilities MSR
>   * @update_util: Cpufreq utility callback information
>   * @sample: the stored performance sample
> -
> + * @suspended:	Whether or not the driver has been suspended.
> + *
>   * The amd_cpudata is key private data for each CPU thread in AMD P-State, and
>   * represents all the attributes and goals that AMD P-State requests at runtime.
>   */
> @@ -151,6 +152,7 @@ struct amd_cpudata {
>  	u64	cppc_cap1_cached;
>  	struct	update_util_data update_util;
>  	struct	amd_aperf_mperf sample;
> +	bool suspended;
>  };
>  
>  /**
> @@ -1387,6 +1389,93 @@ static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
>  	return 0;
>  }
>  
> +static void amd_pstate_epp_reenable(struct amd_cpudata * cpudata)
> +{
> +	struct cppc_perf_ctrls perf_ctrls;
> +	u64 value, max_perf;
> +	int ret;
> +
> +	ret = amd_pstate_enable(true);
> +	if (ret)
> +		pr_err("failed to enable amd pstate during resume, return %d\n", ret);
> +
> +	value = READ_ONCE(cpudata->cppc_req_cached);
> +	max_perf = READ_ONCE(cpudata->highest_perf);
> +
> +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> +		wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
> +	} else {
> +		perf_ctrls.max_perf = max_perf;
> +		perf_ctrls.energy_perf = AMD_CPPC_ENERGY_PERF_PREF(cpudata->epp_cached);
> +		cppc_set_perf(cpudata->cpu, &perf_ctrls);
> +	}
> +}
> +
> +static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
> +{
> +	struct amd_cpudata *cpudata = all_cpu_data[policy->cpu];
> +
> +	pr_debug("AMD CPU Core %d going online\n", cpudata->cpu);
> +
> +	if (epp_enabled) {
> +		amd_pstate_epp_reenable(cpudata);
> +		cpudata->suspended = false;

You've added the suspended flag to indicate when a cpu is online/offline but I don't see
any place in the offline code where you set suspended to true.
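
(As far as I can tell the only place this series sets it to true is
amd_pstate_epp_suspend() in patch 7/7, so within this patch the early
return on cpudata->suspended below can never trigger.)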

> +	}
> +
> +	return 0;
> +}
> +
> +static void amd_pstate_epp_offline(struct cpufreq_policy *policy)
> +{
> +	struct amd_cpudata *cpudata = all_cpu_data[policy->cpu];
> +	struct cppc_perf_ctrls perf_ctrls;
> +	int min_perf;
> +	u64 value;
> +
> +	min_perf = READ_ONCE(cpudata->lowest_perf);
> +	value = READ_ONCE(cpudata->cppc_req_cached);
> +
> +	mutex_lock(&amd_pstate_limits_lock);
> +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> +		cpudata->epp_policy = CPUFREQ_POLICY_UNKNOWN;
> +
> +		/* Set max perf same as min perf */
> +		value &= ~AMD_CPPC_MAX_PERF(~0L);
> +                value |= AMD_CPPC_MAX_PERF(min_perf);
> +                value &= ~AMD_CPPC_MIN_PERF(~0L);
> +                value |= AMD_CPPC_MIN_PERF(min_perf);
> +		wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);

Some odd indentation here, looks like you may have a mix of tabs and spaces. Did you run
checkpatch?

-Nathan

> +	} else {
> +		perf_ctrls.desired_perf = 0;
> +		perf_ctrls.max_perf = min_perf;
> +		perf_ctrls.energy_perf = AMD_CPPC_ENERGY_PERF_PREF(AMD_CPPC_EPP_POWERSAVE);
> +		cppc_set_perf(cpudata->cpu, &perf_ctrls);
> +	}
> +	mutex_unlock(&amd_pstate_limits_lock);
> +}
> +
> +static int amd_pstate_cpu_offline(struct cpufreq_policy *policy)
> +{
> +	struct amd_cpudata *cpudata = all_cpu_data[policy->cpu];
> +
> +	pr_debug("AMD CPU Core %d going offline\n", cpudata->cpu);
> +
> +	if (cpudata->suspended)
> +		return 0;
> +
> +	if (epp_enabled)
> +		amd_pstate_epp_offline(policy);
> +
> +	return 0;
> +}
> +
> +static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
> +{
> +	amd_pstate_clear_update_util_hook(policy->cpu);
> +
> +	return amd_pstate_cpu_offline(policy);
> +}
> +
>  static void amd_pstate_verify_cpu_policy(struct amd_cpudata *cpudata,
>  					   struct cpufreq_policy_data *policy)
>  {
> @@ -1421,6 +1510,8 @@ static struct cpufreq_driver amd_pstate_epp_driver = {
>  	.init		= amd_pstate_epp_cpu_init,
>  	.exit		= amd_pstate_epp_cpu_exit,
>  	.update_limits	= amd_pstate_epp_update_limits,
> +	.offline	= amd_pstate_epp_cpu_offline,
> +	.online		= amd_pstate_epp_cpu_online,
>  	.name		= "amd_pstate_epp",
>  	.attr		= amd_pstate_epp_attr,
>  };

* RE: [PATCH 4/7] cpufreq: amd_pstate: add AMD pstate EPP support for shared memory type processor
  2022-09-15 16:24   ` Nathan Fontenot
@ 2022-09-25 12:23     ` Yuan, Perry
  2022-09-29 14:08       ` Nathan Fontenot
  0 siblings, 1 reply; 22+ messages in thread
From: Yuan, Perry @ 2022-09-25 12:23 UTC (permalink / raw)
  To: Fontenot, Nathan, rafael.j.wysocki, Huang, Ray, viresh.kumar
  Cc: Sharma, Deepak, Limonciello, Mario, Deucher, Alexander, Su,
	Jinzhou (Joe), Huang, Shimmer, Du, Xiaojian, Meng, Li (Jassmine),
	linux-pm, linux-kernel

Hi Nathan.


> -----Original Message-----
> From: Fontenot, Nathan <Nathan.Fontenot@amd.com>
> Sent: Friday, September 16, 2022 12:24 AM
> To: Yuan, Perry <Perry.Yuan@amd.com>; rafael.j.wysocki@intel.com; Huang,
> Ray <Ray.Huang@amd.com>; viresh.kumar@linaro.org
> Cc: Sharma, Deepak <Deepak.Sharma@amd.com>; Limonciello, Mario
> <Mario.Limonciello@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Su, Jinzhou (Joe) <Jinzhou.Su@amd.com>;
> Huang, Shimmer <Shimmer.Huang@amd.com>; Du, Xiaojian
> <Xiaojian.Du@amd.com>; Meng, Li (Jassmine) <Li.Meng@amd.com>; linux-
> pm@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 4/7] cpufreq: amd_pstate: add AMD pstate EPP support
> for shared memory type processor
> 
> On 9/9/22 11:45, Perry Yuan wrote:
> > Add Energy Performance Preference support for AMD SOCs which only
> > support the shared memory interface that implemented on Zen2 and Zen3
> > processors, because this type CPU has no MSR supported, it will use
> > ACPI PCC channel to enable EPP and reset desired perf to be zero.
> >
> > Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> > ---
> >  drivers/cpufreq/amd-pstate.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/cpufreq/amd-pstate.c
> > b/drivers/cpufreq/amd-pstate.c index 451295284a26..fff298744a8e 100644
> > --- a/drivers/cpufreq/amd-pstate.c
> > +++ b/drivers/cpufreq/amd-pstate.c
> > @@ -133,12 +133,24 @@ static inline int pstate_enable(bool enable)
> >
> >  static int cppc_enable(bool enable)
> >  {
> > +	struct cppc_perf_ctrls perf_ctrls;
> >  	int cpu, ret = 0;
> >
> >  	for_each_present_cpu(cpu) {
> >  		ret = cppc_set_enable(cpu, enable);
> >  		if (ret)
> >  			return ret;
> > +
> > +	/* Enable active mode for EPP */
> > +	ret = cppc_set_auto_epp(cpu, enable);
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* Set zero to desired perf to enable EPP control*/
> > +	perf_ctrls.desired_perf = 0;
> > +	ret = cppc_set_perf(cpu, &perf_ctrls);
> > +	if (ret)
> > +		return ret;
> 
> Shouldn't this entire block be indented one additional tab over since its part
> of the for_each_present_cpu() loop?
> 
> -Nathan

Yes, I can see the indentation in my local VS Code editor; I do not know why the patch didn't show it.
Is the below correct?

	for_each_present_cpu(cpu) {
		ret = cppc_set_enable(cpu, enable);
		if (ret)
			return ret;

		if (epp_enabled) {
			/* Enable autonomous mode for EPP */
			ret = cppc_set_auto_epp(cpu, enable);
			if (ret)
				return ret;

			/* Set desired perf to zero to allow EPP firmware control */
			perf_ctrls.desired_perf = 0;
			ret = cppc_set_perf(cpu, &perf_ctrls);
			if (ret)
				return ret;
		}
	}

Perry. 

> 
> >  	}
> >
> >  	return ret;
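Indentation slips like the one discussed above are exactly what the kernel's checkpatch script is meant to catch before a patch is posted; a typical run from the top of the kernel tree looks like this (the patch filename here is illustrative):

	$ ./scripts/checkpatch.pl --strict 0004-cpufreq-amd_pstate-add-EPP-shared-mem.patch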

* RE: [PATCH 2/7] cpufreq: amd_pstate: add module parameter to load amd pstate EPP driver
  2022-09-09 18:49   ` Limonciello, Mario
  2022-09-13 15:14     ` Yuan, Perry
@ 2022-09-25 16:58     ` Yuan, Perry
  1 sibling, 0 replies; 22+ messages in thread
From: Yuan, Perry @ 2022-09-25 16:58 UTC (permalink / raw)
  To: Limonciello, Mario, rafael.j.wysocki, Huang, Ray, viresh.kumar
  Cc: Sharma, Deepak, Fontenot, Nathan, Deucher, Alexander, Su,
	Jinzhou (Joe), Huang, Shimmer, Du, Xiaojian, Meng, Li (Jassmine),
	linux-pm, linux-kernel

Hi Mario. 

> -----Original Message-----
> From: Limonciello, Mario <Mario.Limonciello@amd.com>
> Sent: Saturday, September 10, 2022 2:50 AM
> To: Yuan, Perry <Perry.Yuan@amd.com>; rafael.j.wysocki@intel.com; Huang,
> Ray <Ray.Huang@amd.com>; viresh.kumar@linaro.org
> Cc: Sharma, Deepak <Deepak.Sharma@amd.com>; Fontenot, Nathan
> <Nathan.Fontenot@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Su, Jinzhou (Joe) <Jinzhou.Su@amd.com>;
> Huang, Shimmer <Shimmer.Huang@amd.com>; Du, Xiaojian
> <Xiaojian.Du@amd.com>; Meng, Li (Jassmine) <Li.Meng@amd.com>; linux-
> pm@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 2/7] cpufreq: amd_pstate: add module parameter to
> load amd pstate EPP driver
> 
> On 9/9/2022 11:45, Perry Yuan wrote:
> > The amd_pstate mode parameter allows the user to select which amd
> > pstate working mode to boot with: the amd_pstate instance or the
> > amd_pstate_epp instance.
> >
> > 1) The amd_pstate instance depends on the target operation mode.
> > 2) The amd_pstate_epp instance depends on the set_policy operation
> >    mode. It is also called active mode: the AMD SMU EPP algorithm
> >    controls the CPU runtime frequency according to the EPP value and
> >    the workload.
> >
> > Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> > ---
> >   drivers/cpufreq/amd-pstate.c | 6 ++++++
> >   1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/cpufreq/amd-pstate.c
> > b/drivers/cpufreq/amd-pstate.c index a2463f785322..451295284a26 100644
> > --- a/drivers/cpufreq/amd-pstate.c
> > +++ b/drivers/cpufreq/amd-pstate.c
> > @@ -58,6 +58,12 @@ module_param(shared_mem, bool, 0444);
> >   MODULE_PARM_DESC(shared_mem,
> >   		 "enable amd-pstate on processors with shared memory
> solution
> > (false = disabled (default), true = enabled)");
> >
> > +static bool epp_enabled = true;
> > +module_param(epp_enabled, bool, 0444);
> > +MODULE_PARM_DESC(epp_enabled,
> > +                "load amd_pstate or amd_pstate_epp (true =
> > +amd_pstate_epp driver instance (default), false = amd_pstate driver
> > +instance)");
> > +
> 
> Whether you're operating in EPP mode or not, the kernel module is still 'amd-pstate'.
> So to a user I think this is a pretty confusing designation.
> 
> I would propose the following instead:
> 
>  > +static bool epp = true;
>  > +module_param(epp, bool, 0444);
>  > +MODULE_PARM_DESC(epp,
>  > +                "Enable energy performance preference (EPP) control");
> 

Makes sense to me; I'll get this changed in V2 as you suggested.
Thanks.

> > +
> >   static struct cpufreq_driver amd_pstate_driver;
> >
> >   /**
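For context, a boolean module parameter like this is exercised in the usual ways; shown here with the shorter 'epp' name Mario proposes, assuming that rename lands in V2:

	# load the driver without EPP control (plain amd_pstate instance)
	$ modprobe amd_pstate epp=0

	# or select it at boot on the kernel command line
	amd_pstate.epp=0

	# the 0444 permissions expose the current value read-only in sysfs
	$ cat /sys/module/amd_pstate/parameters/epp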

* RE: [PATCH 4/7] cpufreq: amd_pstate: add AMD pstate EPP support for shared memory type processor
  2022-09-09 18:52   ` Limonciello, Mario
  2022-09-13 15:20     ` Yuan, Perry
@ 2022-09-25 17:01     ` Yuan, Perry
  1 sibling, 0 replies; 22+ messages in thread
From: Yuan, Perry @ 2022-09-25 17:01 UTC (permalink / raw)
  To: Limonciello, Mario, rafael.j.wysocki, Huang, Ray, viresh.kumar
  Cc: Sharma, Deepak, Fontenot, Nathan, Deucher, Alexander, Su,
	Jinzhou (Joe), Huang, Shimmer, Du, Xiaojian, Meng, Li (Jassmine),
	linux-pm, linux-kernel

Hi Mario.

> -----Original Message-----
> From: Limonciello, Mario <Mario.Limonciello@amd.com>
> Sent: Saturday, September 10, 2022 2:53 AM
> To: Yuan, Perry <Perry.Yuan@amd.com>; rafael.j.wysocki@intel.com; Huang,
> Ray <Ray.Huang@amd.com>; viresh.kumar@linaro.org
> Cc: Sharma, Deepak <Deepak.Sharma@amd.com>; Fontenot, Nathan
> <Nathan.Fontenot@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Su, Jinzhou (Joe) <Jinzhou.Su@amd.com>;
> Huang, Shimmer <Shimmer.Huang@amd.com>; Du, Xiaojian
> <Xiaojian.Du@amd.com>; Meng, Li (Jassmine) <Li.Meng@amd.com>; linux-
> pm@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 4/7] cpufreq: amd_pstate: add AMD pstate EPP support
> for shared memory type processor
> 
> On 9/9/2022 11:45, Perry Yuan wrote:
> > Add Energy Performance Preference support for AMD SOCs which only
> > support the shared memory interface that implemented on Zen2 and Zen3
> > processors, because this type CPU has no MSR supported, it will use
> > ACPI PCC channel to enable EPP and reset desired perf to be zero.
> 
> This reads like all Zen2 and Zen3 processors don't have the MSR, but that's
> not true. How about:
> 
> "Add Energy Performance Preference support for AMD SOCs which do not
> contain a designated MSR for CPPC support. A shared memory interface is
> used for CPPC on these SOCs and the ACPI PCC channel is used to enable EPP
> and reset the desired performance."
> 

This describes the details more accurately; I'll take it into V2.
Thank you!

Perry.  

> >
> > Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> > ---
> >   drivers/cpufreq/amd-pstate.c | 12 ++++++++++++
> >   1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/cpufreq/amd-pstate.c
> > b/drivers/cpufreq/amd-pstate.c index 451295284a26..fff298744a8e 100644
> > --- a/drivers/cpufreq/amd-pstate.c
> > +++ b/drivers/cpufreq/amd-pstate.c
> > @@ -133,12 +133,24 @@ static inline int pstate_enable(bool enable)
> >
> >   static int cppc_enable(bool enable)
> >   {
> > +	struct cppc_perf_ctrls perf_ctrls;
> >   	int cpu, ret = 0;
> >
> >   	for_each_present_cpu(cpu) {
> >   		ret = cppc_set_enable(cpu, enable);
> >   		if (ret)
> >   			return ret;
> > +
> > +	/* Enable active mode for EPP */
> > +	ret = cppc_set_auto_epp(cpu, enable);
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* Set zero to desired perf to enable EPP control*/
> > +	perf_ctrls.desired_perf = 0;
> > +	ret = cppc_set_perf(cpu, &perf_ctrls);
> > +	if (ret)
> > +		return ret;
> >   	}
> >
> >   	return ret;
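On these shared-memory parts the EPP value travels through the ACPI CPPC library (and ultimately the PCC mailbox) instead of an MSR write. A minimal sketch of both directions of that path, using the cppc_* EPP helpers this series adds in patch 1/7 and following the usage in the quoted patch 5/7 code ('cpu' stands for a CPU number, as in the loop above):

	struct cppc_perf_ctrls perf_ctrls;
	struct cppc_perf_caps perf_caps;
	s16 epp;
	int ret;

	/* write an EPP hint via the PCC channel */
	perf_ctrls.energy_perf = AMD_CPPC_EPP_BALANCE_PERFORMANCE;
	ret = cppc_set_epp_perf(cpu, &perf_ctrls);
	if (ret)
		pr_debug("failed to set energy perf value (%d)\n", ret);

	/* read the current EPP back through the capabilities structure */
	ret = cppc_get_epp_caps(cpu, &perf_caps);
	if (!ret)
		epp = (s16)perf_caps.energy_perf;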

* RE: [PATCH 5/7] cpufreq: amd_pstate: add AMD Pstate EPP support for the MSR based processors
  2022-09-15 18:58   ` Nathan Fontenot
@ 2022-09-25 17:12     ` Yuan, Perry
  0 siblings, 0 replies; 22+ messages in thread
From: Yuan, Perry @ 2022-09-25 17:12 UTC (permalink / raw)
  To: Fontenot, Nathan, rafael.j.wysocki, Huang, Ray, viresh.kumar
  Cc: Sharma, Deepak, Limonciello, Mario, Deucher, Alexander, Su,
	Jinzhou (Joe), Huang, Shimmer, Du, Xiaojian, Meng, Li (Jassmine),
	linux-pm, linux-kernel

Hi Nathan,

> -----Original Message-----
> From: Fontenot, Nathan <Nathan.Fontenot@amd.com>
> Sent: Friday, September 16, 2022 2:59 AM
> To: Yuan, Perry <Perry.Yuan@amd.com>; rafael.j.wysocki@intel.com; Huang,
> Ray <Ray.Huang@amd.com>; viresh.kumar@linaro.org
> Cc: Sharma, Deepak <Deepak.Sharma@amd.com>; Limonciello, Mario
> <Mario.Limonciello@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Su, Jinzhou (Joe) <Jinzhou.Su@amd.com>;
> Huang, Shimmer <Shimmer.Huang@amd.com>; Du, Xiaojian
> <Xiaojian.Du@amd.com>; Meng, Li (Jassmine) <Li.Meng@amd.com>; linux-
> pm@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 5/7] cpufreq: amd_pstate: add AMD Pstate EPP support
> for the MSR based processors
> 	
> On 9/9/22 11:45, Perry Yuan wrote:
> > Add EPP driver support for those AMD CPUs which have the full MSR
> > feature enabled. The EPP is used in the DPM controller to drive the
> > frequency that a core is going to operate at during short periods of
> > activity.
> >
> > EPP values will be utilized for different OS profiles (balanced,
> > performance, power savings). CPPC performance can be controlled by the
> > user space interface sysfs attributes for min and max frequency
> > limits, when the pstate driver is working under the power save policy.
> >
> > The EPP scale is 0 - 255: 0 is the maximum performance level and 255
> > is the minimum. balance_performance (0x80) provides the best balance
> > between performance and power for most systems, while users can choose
> > the performance policy as needed.
> >
> > $ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_available_preferences
> > default performance balance_performance balance_power power
> >
> > $ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_preference
> > balance_performance
> 
> A lot of what you do in this patch with respect to the sysfs files for
> energy_performance_available_preferences and
> energy_performance_preference mirrors what is done in the intel_pstate
> driver for EPP. Would there be any value in making these pieces common
> code?

So far the amd-pstate EPP driver is still under review, and we need to add some other attribute files.
If we factor the sysfs files common with the intel_pstate driver into shared code now, it will make the series more complex to review.

Perry.

> 
> >
> > Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> > ---
> >  arch/x86/include/asm/msr-index.h |   4 +
> >  drivers/cpufreq/amd-pstate.c     | 818
> ++++++++++++++++++++++++++++++-
> >  2 files changed, 804 insertions(+), 18 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/msr-index.h
> > b/arch/x86/include/asm/msr-index.h
> > index 43a3d8e4eb9a..4c540badab4e 100644
> > --- a/arch/x86/include/asm/msr-index.h
> > +++ b/arch/x86/include/asm/msr-index.h
> > @@ -577,6 +577,10 @@
> >  #define MSR_AMD64_PERF_CNTR_GLOBAL_CTL		0xc0000301
> >  #define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR	0xc0000302
> >
> > +#define AMD_CPPC_EPP_PERFORMANCE		0x00
> > +#define AMD_CPPC_EPP_BALANCE_PERFORMANCE	0x80
> > +#define AMD_CPPC_EPP_BALANCE_POWERSAVE		0xBF
> > +#define AMD_CPPC_EPP_POWERSAVE			0xFF
> >  /* Fam 17h MSRs */
> >  #define MSR_F17H_IRPERF			0xc00000e9
> >
> > diff --git a/drivers/cpufreq/amd-pstate.c
> > b/drivers/cpufreq/amd-pstate.c index fff298744a8e..e71b06e20050 100644
> > --- a/drivers/cpufreq/amd-pstate.c
> > +++ b/drivers/cpufreq/amd-pstate.c
> > @@ -63,8 +63,8 @@ module_param(epp_enabled, bool, 0444);
> > MODULE_PARM_DESC(epp_enabled,
> >                  "load amd_pstate or amd_pstate_epp (true =
> > amd_pstate_epp driver instance (default), false = amd_pstate driver
> > instance)");
> >
> > -
> > -static struct cpufreq_driver amd_pstate_driver;
> > +static struct cpufreq_driver *default_pstate_driver; static struct
> > +amd_cpudata **all_cpu_data;
> >
> >  /**
> >   * struct  amd_aperf_mperf
> > @@ -76,6 +76,7 @@ struct amd_aperf_mperf {
> >  	u64 aperf;
> >  	u64 mperf;
> >  	u64 tsc;
> > +	u64 time;
> >  };
> >
> >  /**
> > @@ -98,7 +99,19 @@ struct amd_aperf_mperf {
> >   * @prev: Last Aperf/Mperf/tsc count value read from register
> >   * @freq: current cpu frequency value
> >   * @boost_supported: check whether the Processor or SBIOS supports
> > boost mode
> > - *
> > + * @epp_powersave: Last saved CPPC energy performance preference
> > +				when policy switched to performance
> > + * @epp_policy: Last saved policy used to set energy-performance
> > +preference
> > + * @epp_cached: Cached CPPC energy-performance preference value
> > + * @policy: Cpufreq policy value
> > + * @sched_flags: Store scheduler flags for possible cross CPU update
> > + * @update_util_set: CPUFreq utility callback is set
> > + * @last_update: Time stamp of the last performance state update
> > + * @cppc_boost_min: Last CPPC boosted min performance state
> > + * @cppc_cap1_cached: Cached value of the last CPPC Capabilities MSR
> > + * @update_util: Cpufreq utility callback information
> > + * @sample: the stored performance sample
> > +
> >   * The amd_cpudata is key private data for each CPU thread in AMD P-
> State, and
> >   * represents all the attributes and goals that AMD P-State requests at
> runtime.
> >   */
> > @@ -124,8 +137,195 @@ struct amd_cpudata {
> >  	u64 	freq;
> >  	bool	boost_supported;
> >  	u64 	cppc_hw_conf_cached;
> > +
> > +	/* EPP feature related attributes*/
> > +	s16	epp_powersave;
> > +	s16	epp_policy;
> > +	s16	epp_cached;
> > +	u32	policy;
> > +	u32	sched_flags;
> > +	bool	update_util_set;
> > +	u64	last_update;
> > +	u64	last_io_update;
> > +	u32	cppc_boost_min;
> > +	u64	cppc_cap1_cached;
> > +	struct	update_util_data update_util;
> > +	struct	amd_aperf_mperf sample;
> > +};
> > +
> > +/**
> > + * struct amd_pstate_params - global parameters for the performance
> > +control
> > + * @cppc_boost_disabled: Whether or not the core performance boost
> > +disabled  */ struct amd_pstate_params {
> > +	bool cppc_boost_disabled;
> > +};
> > +
> > +/*
> > + * AMD Energy Preference Performance (EPP)
> > + * The EPP is used in the CCLK DPM controller to drive
> > + * the frequency that a core is going to operate during
> > + * short periods of activity. EPP values will be utilized for
> > + * different OS profiles (balanced, performance, power savings)
> > + * display strings corresponding to EPP index in the
> > + * energy_perf_strings[]
> > + *	index		String
> > + *-------------------------------------
> > + *	0		default
> > + *	1		performance
> > + *	2		balance_performance
> > + *	3		balance_power
> > + *	4		power
> > + */
> > + enum energy_perf_value_index {
> > +	EPP_INDEX_DEFAULT = 0,
> > +	EPP_INDEX_PERFORMANCE,
> > +	EPP_INDEX_BALANCE_PERFORMANCE,
> > +	EPP_INDEX_BALANCE_POWERSAVE,
> > +	EPP_INDEX_POWERSAVE,
> > +};
> > +
> > +static const char * const energy_perf_strings[] = {
> > +	[EPP_INDEX_DEFAULT] = "default",
> > +	[EPP_INDEX_PERFORMANCE] = "performance",
> > +	[EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
> > +	[EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
> > +	[EPP_INDEX_POWERSAVE] = "power",
> > +	NULL
> >  };
> >
> > +static unsigned int epp_values[] = {
> > +	[EPP_INDEX_DEFAULT] = 0,
> > +	[EPP_INDEX_PERFORMANCE] = AMD_CPPC_EPP_PERFORMANCE,
> > +	[EPP_INDEX_BALANCE_PERFORMANCE] =
> AMD_CPPC_EPP_BALANCE_PERFORMANCE,
> > +	[EPP_INDEX_BALANCE_POWERSAVE] =
> AMD_CPPC_EPP_BALANCE_POWERSAVE,
> > +	[EPP_INDEX_POWERSAVE] = AMD_CPPC_EPP_POWERSAVE, };
> > +
> > +static struct amd_pstate_params global;
> 
> I think a name more descriptive than 'global' should be used here.

Changed to "global_params".

> 
> > +
> > +static DEFINE_MUTEX(amd_pstate_limits_lock);
> > +static DEFINE_MUTEX(amd_pstate_driver_lock);
> > +static DEFINE_SPINLOCK(amd_pstate_cpu_lock);
> > +
> > +static bool cppc_boost __read_mostly; struct kobject
> > +*amd_pstate_kobj;
> > +
> > +#ifdef CONFIG_ACPI_CPPC_LIB
> > +static s16 amd_pstate_get_epp(struct amd_cpudata *cpudata, u64
> > +cppc_req_cached) {
> > +	s16 epp;
> > +	struct cppc_perf_caps perf_caps;
> > +	int ret;
> > +
> > +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > +		if (!cppc_req_cached) {
> > +			epp = rdmsrl_on_cpu(cpudata->cpu,
> MSR_AMD_CPPC_REQ,
> > +					    &cppc_req_cached);
> > +			if (epp)
> > +				return epp;
> > +		}
> > +		epp = (cppc_req_cached >> 24) & 0xFF;
> > +	} else {
> > +		ret = cppc_get_epp_caps(cpudata->cpu, &perf_caps);
> > +		if (ret < 0) {
> > +			pr_debug("Could not retrieve energy perf value
> (%d)\n", ret);
> > +			return -EIO;
> > +		}
> > +		epp = (s16) perf_caps.energy_perf;
> > +	}
> > +
> > +	return epp;
> > +}
> > +#endif
> > +
> > +static int amd_pstate_get_energy_pref_index(struct amd_cpudata
> > +*cpudata, int *raw_epp) {
> > +	s16 epp;
> > +	int index = -EINVAL;
> > +
> > +	*raw_epp = 0;
> > +	epp = amd_pstate_get_epp(cpudata, 0);
> > +	if (epp < 0)
> > +		return epp;
> > +
> > +	switch (epp) {
> > +	case AMD_CPPC_EPP_PERFORMANCE:
> > +		index = EPP_INDEX_PERFORMANCE;
> > +		break;
> > +	case AMD_CPPC_EPP_BALANCE_PERFORMANCE:
> > +		index = EPP_INDEX_BALANCE_PERFORMANCE;
> > +		break;
> > +	case AMD_CPPC_EPP_BALANCE_POWERSAVE:
> > +		index = EPP_INDEX_BALANCE_POWERSAVE;
> > +		break;
> > +	case AMD_CPPC_EPP_POWERSAVE:
> > +		index = EPP_INDEX_POWERSAVE;
> > +		break;
> > +	default:
> > +		*raw_epp = epp;
> > +		index = 0;
> > +	}
> > +
> > +	return index;
> > +}
> > +
> > +#ifdef CONFIG_ACPI_CPPC_LIB
> > +static int amd_pstate_set_epp(struct amd_cpudata *cpudata, u32 epp) {
> > +	int ret;
> > +	struct cppc_perf_ctrls perf_ctrls;
> > +
> > +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > +		u64 value = READ_ONCE(cpudata->cppc_req_cached);
> > +
> > +		value &= ~GENMASK_ULL(31, 24);
> > +		value |= (u64)epp << 24;
> > +		WRITE_ONCE(cpudata->cppc_req_cached, value);
> > +		ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ,
> value);
> > +		if (!ret) {
> > +			cpudata->epp_cached = epp;
> > +		}
> > +	} else {
> > +		perf_ctrls.energy_perf = epp;
> > +		ret = cppc_set_epp_perf(cpudata->cpu, &perf_ctrls);
> > +		if (ret){
> > +			pr_debug("failed to set energy perf value (%d)\n",
> ret);
> > +			return ret;
> > +        }
> 
> Accidental '}' not indented enough?


Fixed in V2.

> 
> > +		cpudata->epp_cached = epp;
> > +	}
> > +	return ret;
> > +}
> > +
> > +static int amd_pstate_set_energy_pref_index(struct amd_cpudata
> *cpudata,
> > +					      int pref_index, bool use_raw,
> > +					      u32 raw_epp)
> > +{
> > +	int epp = -EINVAL;
> > +	int ret;
> > +
> > +	if (!pref_index) {
> > +		pr_debug("EPP pref_index is invaid\n");
> 
> s/invaid/invalid

Fixed in V2.

> 
> > +		return -EINVAL;
> > +	}
> > +
> > +	if (use_raw)
> > +		epp = raw_epp;
> > +	else if (epp == -EINVAL)
> > +		epp = epp_values[pref_index];
> > +
> > +	if (epp > 0 && cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> {
> > +		pr_debug("EPP cannot be set under performance policy\n");
> > +		return -EBUSY;
> > +	}
> > +
> > +	ret = amd_pstate_set_epp(cpudata, epp);
> > +
> > +	return ret;
> > +}
> > +#endif
> > +
> >  static inline int pstate_enable(bool enable)  {
> >  	return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable); @@ -141,16
> +341,18
> > @@ static int cppc_enable(bool enable)
> >  		if (ret)
> >  			return ret;
> >
> > -	/* Enable active mode for EPP */
> > -	ret = cppc_set_auto_epp(cpu, enable);
> > -	if (ret)
> > -		return ret;
> > -
> > -	/* Set zero to desired perf to enable EPP control*/
> > -	perf_ctrls.desired_perf = 0;
> > -	ret = cppc_set_perf(cpu, &perf_ctrls);
> > -	if (ret)
> > -		return ret;
> > +		if (epp_enabled) {
> > +			/* Enable autonomous mode for EPP */
> > +			ret = cppc_set_auto_epp(cpu, enable);
> > +			if (ret)
> > +				return ret;
> > +
> > +			/* Set zero to desired perf to allow EPP firmware
> control*/
> > +			perf_ctrls.desired_perf = 0;
> > +			ret = cppc_set_perf(cpu, &perf_ctrls);
> > +			if (ret)
> > +				return ret;
> > +		}
> 
> This update to cppc_enable() should be part of Patch 4/7.


Correct, fixed in V2.


> 
> >  	}
> >
> >  	return ret;
> > @@ -496,7 +698,7 @@ static void amd_pstate_boost_init(struct
> amd_cpudata *cpudata)
> >  		return;
> >
> >  	cpudata->boost_supported = true;
> > -	amd_pstate_driver.boost_enabled = true;
> > +	default_pstate_driver->boost_enabled = true;
> >  }
> >
> >  static int amd_pstate_cpu_init(struct cpufreq_policy *policy) @@
> > -671,10 +873,108 @@ static ssize_t show_amd_pstate_highest_perf(struct
> cpufreq_policy *policy,
> >  	return sprintf(&buf[0], "%u\n", perf);  }
> >
> > +static ssize_t show_energy_performance_available_preferences(
> > +				struct cpufreq_policy *policy, char *buf) {
> > +	int i = 0;
> > +	int ret = 0;
> > +
> > +	while (energy_perf_strings[i] != NULL)
> > +		ret += sprintf(&buf[ret], "%s ", energy_perf_strings[i++]);
> > +
> > +	ret += sprintf(&buf[ret], "\n");
> > +
> > +	return ret;
> > +}
> > +
> > +static ssize_t store_energy_performance_preference(
> > +		struct cpufreq_policy *policy, const char *buf, size_t count) {
> > +	struct amd_cpudata *cpudata = policy->driver_data;
> > +	char str_preference[21];
> > +	bool raw = false;
> > +	ssize_t ret;
> > +	u32 epp = 0;
> > +
> > +	ret = sscanf(buf, "%20s", str_preference);
> > +	if (ret != 1)
> > +		return -EINVAL;
> > +
> > +	ret = match_string(energy_perf_strings, -1, str_preference);
> > +	if (ret < 0) {
> > +		ret = kstrtouint(buf, 10, &epp);
> > +		if (ret)
> > +			return ret;
> > +
> > +		if ((epp > 255) || (epp < 0))
> > +			return -EINVAL;
> > +
> > +		raw = true;
> > +	}
> > +
> > +	mutex_lock(&amd_pstate_limits_lock);
> > +	ret = amd_pstate_set_energy_pref_index(cpudata, ret, raw, epp);
> > +	mutex_unlock(&amd_pstate_limits_lock);
> > +
> > +	return ret ?: count;
> > +}
> > +
> > +static ssize_t show_energy_performance_preference(
> > +				struct cpufreq_policy *policy, char *buf) {
> > +	struct amd_cpudata *cpudata = policy->driver_data;
> > +	int preference, raw_epp;
> > +
> > +	preference = amd_pstate_get_energy_pref_index(cpudata,
> &raw_epp);
> > +	if (preference < 0)
> > +		return preference;
> > +
> > +	if (raw_epp)
> > +		return  sprintf(buf, "%d\n", raw_epp);
> > +	else
> > +		return  sprintf(buf, "%s\n",
> energy_perf_strings[preference]); }
> > +
> > +static void amd_pstate_update_policies(void) {
> > +	int cpu;
> > +
> > +	for_each_possible_cpu(cpu)
> > +		cpufreq_update_policy(cpu);
> > +}
> > +
> > +static ssize_t show_pstate_dynamic_boost(struct kobject *kobj,
> > +				struct kobj_attribute *attr, char *buf) {
> > +	return sprintf(buf, "%u\n", cppc_boost); }
> > +
> > +static ssize_t store_pstate_dynamic_boost(struct kobject *a,
> > +				       struct kobj_attribute *b,
> > +				       const char *buf, size_t count) {
> > +	unsigned int input;
> > +	int ret;
> > +
> > +	ret = kstrtouint(buf, 10, &input);
> > +	if (ret)
> > +		return ret;
> > +
> > +	mutex_lock(&amd_pstate_driver_lock);
> > +	cppc_boost = !!input;
> > +	amd_pstate_update_policies();
> > +	mutex_unlock(&amd_pstate_driver_lock);
> > +
> > +	return count;
> > +}
> > +
> >  cpufreq_freq_attr_ro(amd_pstate_max_freq);
> >  cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
> >
> >  cpufreq_freq_attr_ro(amd_pstate_highest_perf);
> > +cpufreq_freq_attr_rw(energy_performance_preference);
> > +cpufreq_freq_attr_ro(energy_performance_available_preferences);
> > +define_one_global_rw(pstate_dynamic_boost);
> >
> >  static struct freq_attr *amd_pstate_attr[] = {
> >  	&amd_pstate_max_freq,
> > @@ -683,6 +983,424 @@ static struct freq_attr *amd_pstate_attr[] = {
> >  	NULL,
> >  };
> >
> > +static struct freq_attr *amd_pstate_epp_attr[] = {
> > +	&amd_pstate_max_freq,
> > +	&amd_pstate_lowest_nonlinear_freq,
> > +	&amd_pstate_highest_perf,
> > +	&energy_performance_preference,
> > +	&energy_performance_available_preferences,
> > +	NULL,
> > +};
> > +
> > +static struct attribute *pstate_global_attributes[] = {
> > +	&pstate_dynamic_boost.attr,
> > +	NULL
> > +};
> > +
> > +static const struct attribute_group amd_pstate_global_attr_group = {
> > +	.attrs = pstate_global_attributes,
> > +};
> > +
> > +static inline void update_boost_state(void) {
> > +	u64 misc_en;
> > +	struct amd_cpudata *cpudata;
> > +
> > +	cpudata = all_cpu_data[0];
> > +	rdmsrl(MSR_AMD_CPPC_HW_CTL, misc_en);
> > +	global.cppc_boost_disabled = misc_en &
> > +AMD_CPPC_PRECISION_BOOST_ENABLED; }
> > +
> > +static int amd_pstate_init_cpu(unsigned int cpunum) {
> > +	struct amd_cpudata *cpudata;
> > +
> > +	cpudata = all_cpu_data[cpunum];
> > +	if (!cpudata) {
> > +		cpudata = kzalloc(sizeof(*cpudata), GFP_KERNEL);
> > +		if (!cpudata)
> > +			return -ENOMEM;
> > +		WRITE_ONCE(all_cpu_data[cpunum], cpudata);
> > +
> > +		cpudata->cpu = cpunum;
> > +	}
> > +	cpudata->epp_powersave = -EINVAL;
> > +	cpudata->epp_policy = 0;
> > +	pr_debug("controlling: cpu %d\n", cpunum);
> > +	return 0;
> > +}
> > +
> > +static int __amd_pstate_cpu_init(struct cpufreq_policy *policy) {
> > +	int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
> > +	struct amd_cpudata *cpudata;
> > +	struct device *dev;
> > +	int rc;
> > +	u64 value;
> > +
> > +	rc = amd_pstate_init_cpu(policy->cpu);
> > +	if (rc)
> > +		return rc;
> > +
> > +	cpudata = all_cpu_data[policy->cpu];
> > +
> > +	dev = get_cpu_device(policy->cpu);
> > +	if (!dev)
> > +		goto free_cpudata1;
> > +
> > +	rc = amd_pstate_init_perf(cpudata);
> > +	if (rc)
> > +		goto free_cpudata1;
> > +
> > +	min_freq = amd_get_min_freq(cpudata);
> > +	max_freq = amd_get_max_freq(cpudata);
> > +	nominal_freq = amd_get_nominal_freq(cpudata);
> > +	lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
> > +		if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
> > +		dev_err(dev, "min_freq(%d) or max_freq(%d) value is
> incorrect\n",
> > +			min_freq, max_freq);
> > +		ret = -EINVAL;
> > +		goto free_cpudata1;
> > +	}
> > +
> > +	policy->min = min_freq;
> > +	policy->max = max_freq;
> > +
> > +	policy->cpuinfo.min_freq = min_freq;
> > +	policy->cpuinfo.max_freq = max_freq;
> > +	/* It will be updated by governor */
> > +	policy->cur = policy->cpuinfo.min_freq;
> > +
> > +	/* Initial processor data capability frequencies */
> > +	cpudata->max_freq = max_freq;
> > +	cpudata->min_freq = min_freq;
> > +	cpudata->nominal_freq = nominal_freq;
> > +	cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
> > +
> > +	policy->driver_data = cpudata;
> > +
> > +	update_boost_state();
> > +	cpudata->epp_cached = amd_pstate_get_epp(cpudata, value);
> > +
> > +	policy->min = policy->cpuinfo.min_freq;
> > +	policy->max = policy->cpuinfo.max_freq;
> > +
> > +	if (boot_cpu_has(X86_FEATURE_CPPC))
> > +		policy->fast_switch_possible = true;
> > +
> > +	if (!shared_mem && boot_cpu_has(X86_FEATURE_CPPC)) {
> > +		ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ,
> &value);
> > +		if (ret)
> > +			return ret;
> > +		WRITE_ONCE(cpudata->cppc_req_cached, value);
> > +
> > +		ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1,
> &value);
> > +		if (ret)
> > +			return ret;
> > +		WRITE_ONCE(cpudata->cppc_cap1_cached, value);
> > +	}
> > +	amd_pstate_boost_init(cpudata);
> > +
> > +	return 0;
> > +
> > +free_cpudata1:
> > +	kfree(cpudata);
> > +	return ret;
> > +}
> > +
> > +static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy) {
> > +	int ret;
> > +
> > +	ret = __amd_pstate_cpu_init(policy);
> > +	if (ret)
> > +		return ret;
> > +	/*
> > +	 * Set the policy to powersave to provide a valid fallback value in case
> > +	 * the default cpufreq governor is neither powersave nor
> performance.
> > +	 */
> > +	policy->policy = CPUFREQ_POLICY_POWERSAVE;
> > +
> > +	return 0;
> > +}
> > +
> > +static int amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy) {
> > +	pr_debug("amd-pstate: CPU %d exiting\n", policy->cpu);
> > +	policy->fast_switch_possible = false;
> > +	return 0;
> > +}
> > +
> > +static void amd_pstate_update_max_freq(unsigned int cpu) {
> > +	struct cpufreq_policy *policy = cpufreq_cpu_acquire(cpu);
> > +
> > +	if (!policy)
> > +		return;
> > +
> > +	refresh_frequency_limits(policy);
> > +	cpufreq_cpu_release(policy);
> > +}
> > +
> > +static void amd_pstate_epp_update_limits(unsigned int cpu) {
> > +	mutex_lock(&amd_pstate_driver_lock);
> > +	update_boost_state();
> > +	if (global.cppc_boost_disabled) {
> > +		for_each_possible_cpu(cpu)
> > +			amd_pstate_update_max_freq(cpu);
> > +	} else {
> > +		cpufreq_update_policy(cpu);
> > +	}
> > +	mutex_unlock(&amd_pstate_driver_lock);
> > +}
> > +
> > +static int cppc_boost_hold_time_ns = 3 * NSEC_PER_MSEC;
> > +
> > +static inline void amd_pstate_boost_up(struct amd_cpudata *cpudata) {
> > +	u64 hwp_req = READ_ONCE(cpudata->cppc_req_cached);
> > +	u64 hwp_cap = READ_ONCE(cpudata->cppc_cap1_cached);
> > +	u32 max_limit = (hwp_req & 0xff);
> > +	u32 min_limit = (hwp_req & 0xff00) >> 8;
> > +	u32 boost_level1;
> > +
> > +	/* If max and min are equal or already at max, nothing to boost */
> > +	if (max_limit == min_limit)
> > +		return;
> > +
> > +	/* Set boost max and min to initial value */
> > +	if (!cpudata->cppc_boost_min)
> > +		cpudata->cppc_boost_min = min_limit;
> > +
> > +	boost_level1 = ((AMD_CPPC_NOMINAL_PERF(hwp_cap) +
> min_limit) >> 1);
> > +
> > +	if (cpudata->cppc_boost_min < boost_level1)
> > +		cpudata->cppc_boost_min = boost_level1;
> > +	else if (cpudata->cppc_boost_min <
> AMD_CPPC_NOMINAL_PERF(hwp_cap))
> > +		cpudata->cppc_boost_min =
> AMD_CPPC_NOMINAL_PERF(hwp_cap);
> > +	else if (cpudata->cppc_boost_min ==
> AMD_CPPC_NOMINAL_PERF(hwp_cap))
> > +		cpudata->cppc_boost_min = max_limit;
> > +	else
> > +		return;
> > +
> > +	hwp_req &= ~AMD_CPPC_MIN_PERF(~0L);
> > +	hwp_req |= AMD_CPPC_MIN_PERF(cpudata->cppc_boost_min);
> > +	wrmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ,
> hwp_req);
> > +	cpudata->last_update = cpudata->sample.time; }
> > +
> > +static inline void amd_pstate_boost_down(struct amd_cpudata *cpudata)
> > +{
> > +	bool expired;
> > +
> > +	if (cpudata->cppc_boost_min) {
> > +		expired = time_after64(cpudata->sample.time, cpudata-
> >last_update +
> > +					cppc_boost_hold_time_ns);
> > +
> > +		if (expired) {
> > +			wrmsrl_safe_on_cpu(cpudata->cpu,
> MSR_AMD_CPPC_REQ, cpudata->cppc_req_cached);
> > +			cpudata->cppc_boost_min = 0;
> > +		}
> > +	}
> > +
> > +	cpudata->last_update = cpudata->sample.time; }
> > +
> > +static inline void amd_pstate_boost_update_util(struct amd_cpudata
> *cpudata,
> > +						      u64 time)
> > +{
> > +	cpudata->sample.time = time;
> > +	if (smp_processor_id() != cpudata->cpu)
> > +		return;
> > +
> > +	if (cpudata->sched_flags & SCHED_CPUFREQ_IOWAIT) {
> > +		bool do_io = false;
> > +
> > +		cpudata->sched_flags = 0;
> > +		/*
> > +		 * Set iowait_boost flag and update time. Since IO WAIT flag
> > +		 * is set all the time, we can't just conclude that there is
> > +		 * some IO bound activity is scheduled on this CPU with just
> > +		 * one occurrence. If we receive at least two in two
> > +		 * consecutive ticks, then we treat as boost candidate.
> > +		 * This is leveraged from Intel Pstate driver.
> > +		 */
> > +		if (time_before64(time, cpudata->last_io_update + 2 *
> TICK_NSEC))
> > +			do_io = true;
> > +
> > +		cpudata->last_io_update = time;
> > +
> > +		if (do_io)
> > +			amd_pstate_boost_up(cpudata);
> > +
> > +	} else {
> > +		amd_pstate_boost_down(cpudata);
> > +	}
> > +}
> > +
> > +static inline void amd_pstate_cppc_update_hook(struct update_util_data
> *data,
> > +						u64 time, unsigned int flags)
> > +{
> > +	struct amd_cpudata *cpudata = container_of(data,
> > +				struct amd_cpudata, update_util);
> > +
> > +	cpudata->sched_flags |= flags;
> > +
> > +	if (smp_processor_id() == cpudata->cpu)
> > +		amd_pstate_boost_update_util(cpudata, time); }
> > +
> > +static void amd_pstate_clear_update_util_hook(unsigned int cpu) {
> > +	struct amd_cpudata *cpudata = all_cpu_data[cpu];
> > +
> > +	if (!cpudata->update_util_set)
> > +		return;
> > +
> > +	cpufreq_remove_update_util_hook(cpu);
> > +	cpudata->update_util_set = false;
> > +	synchronize_rcu();
> > +}
> > +
> > +static void amd_pstate_set_update_util_hook(unsigned int cpu_num) {
> > +	struct amd_cpudata *cpudata = all_cpu_data[cpu_num];
> > +
> > +	if (!cppc_boost) {
> > +		if (cpudata->update_util_set)
> > +			amd_pstate_clear_update_util_hook(cpudata->cpu);
> > +		return;
> > +	}
> > +
> > +	if (cpudata->update_util_set)
> > +		return;
> > +
> > +	cpudata->sample.time = 0;
> > +	cpufreq_add_update_util_hook(cpu_num, &cpudata->update_util,
> > +
> 	amd_pstate_cppc_update_hook);
> 
> Here, and elsewhere, the param on the second line should line up under the
> first param on the previous line.

Fixed in V2.
Thanks.

> 
> > +	cpudata->update_util_set = true;
> > +}
> > +
> > +static void amd_pstate_epp_init(unsigned int cpu) {
> > +	struct amd_cpudata *cpudata = all_cpu_data[cpu];
> > +	u32 max_perf, min_perf;
> > +	u64 value;
> > +	s16 epp;
> > +	int ret;
> > +
> > +	max_perf = READ_ONCE(cpudata->highest_perf);
> > +	min_perf = READ_ONCE(cpudata->lowest_perf);
> > +
> > +	value = READ_ONCE(cpudata->cppc_req_cached);
> > +
> > +	if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> > +		min_perf = max_perf;
> > +
> > +	/* Initial min/max values for CPPC Performance Controls Register */
> > +	value &= ~AMD_CPPC_MIN_PERF(~0L);
> > +	value |= AMD_CPPC_MIN_PERF(min_perf);
> > +
> > +	value &= ~AMD_CPPC_MAX_PERF(~0L);
> > +	value |= AMD_CPPC_MAX_PERF(max_perf);
> > +
> > +	/* CPPC EPP feature require to set zero to the desire perf bit */
> > +	value &= ~AMD_CPPC_DES_PERF(~0L);
> > +	value |= AMD_CPPC_DES_PERF(0);
> > +
> > +	if (cpudata->epp_policy == cpudata->policy)
> > +		goto skip_epp;
> > +
> > +	cpudata->epp_policy = cpudata->policy;
> > +
> > +	if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> > +		epp = amd_pstate_get_epp(cpudata, value);
> > +		cpudata->epp_powersave = epp;
> > +		if (epp < 0)
> > +			goto skip_epp;
> 
> This looks wrong. The epp value is a s16, but reading the MSR can return an
> EPP setting between 0x00 - 0xFF. The epp value returned from
> amd_pstate_get_epp() can be both negative and a valid value.
> 
> Were your intentions to skip any EPP settings that tend toward POWERSAVE
> values? If so I think a comment explaining this or re-working the code to
> make that clear would be good.
> 
> > +
> > +		epp = 0;
> > +	} else {
> > +		if (cpudata->epp_powersave < 0)
> > +			goto skip_epp;
> > +		/* Get BIOS pre-defined epp value */
> > +		epp = amd_pstate_get_epp(cpudata, value);
> > +		if (epp)
> > +			goto skip_epp;
> > +		epp = cpudata->epp_powersave;
> > +	}
> > +	/* Set initial EPP value */
> > +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > +		value &= ~GENMASK_ULL(31, 24);
> > +		value |= (u64)epp << 24;
> > +	}
> > +
> > +skip_epp:
> > +	WRITE_ONCE(cpudata->cppc_req_cached, value);
> > +	ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
> > +	if (!ret) {
> > +		cpudata->epp_cached = epp;
> > +	}
> 
> No need for curly braces here.
> 
> -Nathan

Fixed in V2, thanks. 

> 
> > +}
> > +
> > +static void amd_pstate_set_max_limits(struct amd_cpudata *cpudata) {
> > +	u64 hwp_cap = READ_ONCE(cpudata->cppc_cap1_cached);
> > +	u64 hwp_req = READ_ONCE(cpudata->cppc_req_cached);
> > +	u32 max_limit = (hwp_cap >> 24) & 0xff;
> > +
> > +	hwp_req &= ~AMD_CPPC_MIN_PERF(~0L);
> > +	hwp_req |= AMD_CPPC_MIN_PERF(max_limit);
> > +	wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, hwp_req); }
> > +
> > +static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy) {
> > +	struct amd_cpudata *cpudata;
> > +
> > +	if (!policy->cpuinfo.max_freq)
> > +		return -ENODEV;
> > +
> > +	pr_debug("set_policy: cpuinfo.max %u policy->max %u\n",
> > +				policy->cpuinfo.max_freq, policy->max);
> > +
> > +	cpudata = all_cpu_data[policy->cpu];
> > +	cpudata->policy = policy->policy;
> > +
> > +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > +		mutex_lock(&amd_pstate_limits_lock);
> > +
> > +		if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> > +			amd_pstate_clear_update_util_hook(policy->cpu);
> > +			amd_pstate_set_max_limits(cpudata);
> > +		} else {
> > +			amd_pstate_set_update_util_hook(policy->cpu);
> > +		}
> > +
> > +		if (boot_cpu_has(X86_FEATURE_CPPC))
> > +			amd_pstate_epp_init(policy->cpu);
> > +
> > +		mutex_unlock(&amd_pstate_limits_lock);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static void amd_pstate_verify_cpu_policy(struct amd_cpudata *cpudata,
> > +					   struct cpufreq_policy_data *policy)
> {
> > +	update_boost_state();
> > +	cpufreq_verify_within_cpu_limits(policy);
> > +}
> > +
> > +static int amd_pstate_epp_verify_policy(struct cpufreq_policy_data
> > +*policy) {
> > +	amd_pstate_verify_cpu_policy(all_cpu_data[policy->cpu], policy);
> > +	pr_debug("policy_max =%d, policy_min=%d\n", policy->max, policy-
> >min );
> > +	return 0;
> > +}
> > +
> >  static struct cpufreq_driver amd_pstate_driver = {
> >  	.flags		= CPUFREQ_CONST_LOOPS |
> CPUFREQ_NEED_UPDATE_LIMITS,
> >  	.verify		= amd_pstate_verify,
> > @@ -696,8 +1414,20 @@ static struct cpufreq_driver amd_pstate_driver =
> {
> >  	.attr		= amd_pstate_attr,
> >  };
> >
> > +static struct cpufreq_driver amd_pstate_epp_driver = {
> > +	.flags		= CPUFREQ_CONST_LOOPS,
> > +	.verify		= amd_pstate_epp_verify_policy,
> > +	.setpolicy	= amd_pstate_epp_set_policy,
> > +	.init		= amd_pstate_epp_cpu_init,
> > +	.exit		= amd_pstate_epp_cpu_exit,
> > +	.update_limits	= amd_pstate_epp_update_limits,
> > +	.name		= "amd_pstate_epp",
> > +	.attr		= amd_pstate_epp_attr,
> > +};
> > +
> >  static int __init amd_pstate_init(void)  {
> > +	static struct amd_cpudata **cpudata;
> >  	int ret;
> >
> >  	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD) @@ -712,10
> +1442,25
> > @@ static int __init amd_pstate_init(void)
> >  	if (cpufreq_get_current_driver())
> >  		return -EEXIST;
> >
> > +	cpudata = vzalloc(array_size(sizeof(void *), num_possible_cpus()));
> > +	if (!cpudata)
> > +		return -ENOMEM;
> > +	WRITE_ONCE(all_cpu_data, cpudata);
> > +
> > +	if (epp_enabled) {
> > +		pr_info("AMD CPPC loading with amd_pstate_epp driver
> instance.\n");
> > +		default_pstate_driver = &amd_pstate_epp_driver;
> > +	} else {
> > +		pr_info("AMD CPPC loading with amd_pstate driver
> instance.\n");
> > +		default_pstate_driver = &amd_pstate_driver;
> > +	}
> > +
> >  	/* capability check */
> >  	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > +		if (!epp_enabled) {
> > +			default_pstate_driver->adjust_perf =
> amd_pstate_adjust_perf;
> > +		}
> >  		pr_debug("AMD CPPC MSR based functionality is
> supported\n");
> > -		amd_pstate_driver.adjust_perf = amd_pstate_adjust_perf;
> >  	} else if (shared_mem) {
> >  		static_call_update(amd_pstate_enable, cppc_enable);
> >  		static_call_update(amd_pstate_init_perf, cppc_init_perf);
> @@
> > -732,19 +1477,56 @@ static int __init amd_pstate_init(void)
> >  		return ret;
> >  	}
> >
> > -	ret = cpufreq_register_driver(&amd_pstate_driver);
> > +	ret = cpufreq_register_driver(default_pstate_driver);
> >  	if (ret)
> > -		pr_err("failed to register amd_pstate_driver with
> return %d\n",
> > +		pr_err("failed to register amd pstate driver with
> return %d\n",
> >  		       ret);
> >
> > +	amd_pstate_kobj = kobject_create_and_add("amd-pstate",
> &cpu_subsys.dev_root->kobj);
> > +	if (!amd_pstate_kobj) {
> > +		pr_err("amd-pstate: Global sysfs registration failed.\n");
> > +	}
> > +
> > +	ret = sysfs_create_group(amd_pstate_kobj,
> &amd_pstate_global_attr_group);
> > +	if (ret) {
> > +		pr_err("amd-pstate: Sysfs attribute export failed with
> error %d.\n",
> > +		       ret);
> > +	}
> > +
> >  	return ret;
> >  }
> >
> > +static inline void amd_pstate_kobj_cleanup(struct kobject *kobj) {
> > +	kobject_del(kobj);
> > +	kobject_put(kobj);
> > +}
> > +
> >  static void __exit amd_pstate_exit(void)  {
> > -	cpufreq_unregister_driver(&amd_pstate_driver);
> > +	unsigned int cpu;
> > +
> > +	cpufreq_unregister_driver(default_pstate_driver);
> >
> >  	amd_pstate_enable(false);
> > +
> > +	sysfs_remove_group(amd_pstate_kobj,
> &amd_pstate_global_attr_group);
> > +	amd_pstate_kobj_cleanup(amd_pstate_kobj);
> > +
> > +	cpus_read_lock();
> > +	for_each_online_cpu(cpu) {
> > +		if (all_cpu_data[cpu]) {
> > +			if (default_pstate_driver ==
> &amd_pstate_epp_driver)
> > +				amd_pstate_clear_update_util_hook(cpu);
> > +
> > +			spin_lock(&amd_pstate_cpu_lock);
> > +			kfree(all_cpu_data[cpu]);
> > +			WRITE_ONCE(all_cpu_data[cpu], NULL);
> > +			spin_unlock(&amd_pstate_cpu_lock);
> > +		}
> > +	}
> > +	cpus_read_unlock();
> > +
> >  }
> >
> >  module_init(amd_pstate_init);
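Several of the review points above come down to how the EPP field is packed into MSR_AMD_CPPC_REQ: EPP lives in bits 31:24, next to the min/max/desired perf fields. A minimal sketch of the field update the patch performs, factored into a helper for clarity (the helper name is illustrative; the mask and shift match the quoted code):

	/* replace only the EPP field (bits 31:24) of a cached
	 * MSR_AMD_CPPC_REQ value, leaving min/max/desired perf intact
	 */
	static u64 cppc_req_set_epp(u64 cached, u8 epp)
	{
		cached &= ~GENMASK_ULL(31, 24);	/* clear the old EPP */
		cached |= (u64)epp << 24;	/* insert the new EPP */
		return cached;
	}

which would be used as, for example:

	value = cppc_req_set_epp(READ_ONCE(cpudata->cppc_req_cached),
				 AMD_CPPC_EPP_BALANCE_PERFORMANCE);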

* Re: [PATCH 4/7] cpufreq: amd_pstate: add AMD pstate EPP support for shared memory type processor
  2022-09-25 12:23     ` Yuan, Perry
@ 2022-09-29 14:08       ` Nathan Fontenot
  0 siblings, 0 replies; 22+ messages in thread
From: Nathan Fontenot @ 2022-09-29 14:08 UTC (permalink / raw)
  To: Yuan, Perry, rafael.j.wysocki, Huang, Ray, viresh.kumar
  Cc: Sharma, Deepak, Limonciello, Mario, Deucher, Alexander, Su,
	Jinzhou (Joe), Huang, Shimmer, Du, Xiaojian, Meng, Li (Jassmine),
	linux-pm, linux-kernel

On 9/25/22 07:23, Yuan, Perry wrote:
> Hi Nathan.
> 
> 
>> -----Original Message-----
>> From: Fontenot, Nathan <Nathan.Fontenot@amd.com>
>> Sent: Friday, September 16, 2022 12:24 AM
>> To: Yuan, Perry <Perry.Yuan@amd.com>; rafael.j.wysocki@intel.com; Huang,
>> Ray <Ray.Huang@amd.com>; viresh.kumar@linaro.org
>> Cc: Sharma, Deepak <Deepak.Sharma@amd.com>; Limonciello, Mario
>> <Mario.Limonciello@amd.com>; Deucher, Alexander
>> <Alexander.Deucher@amd.com>; Su, Jinzhou (Joe) <Jinzhou.Su@amd.com>;
>> Huang, Shimmer <Shimmer.Huang@amd.com>; Du, Xiaojian
>> <Xiaojian.Du@amd.com>; Meng, Li (Jassmine) <Li.Meng@amd.com>; linux-
>> pm@vger.kernel.org; linux-kernel@vger.kernel.org
>> Subject: Re: [PATCH 4/7] cpufreq: amd_pstate: add AMD pstate EPP support
>> for shared memory type processor
>>
>> On 9/9/22 11:45, Perry Yuan wrote:
>>> Add Energy Performance Preference support for AMD SOCs which only
>>> support the shared memory interface that implemented on Zen2 and Zen3
>>> processors, because this type CPU has no MSR supported, it will use
>>> ACPI PCC channel to enable EPP and reset desired perf to be zero.
>>>
>>> Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
>>> ---
>>>  drivers/cpufreq/amd-pstate.c | 12 ++++++++++++
>>>  1 file changed, 12 insertions(+)
>>>
>>> diff --git a/drivers/cpufreq/amd-pstate.c
>>> b/drivers/cpufreq/amd-pstate.c index 451295284a26..fff298744a8e 100644
>>> --- a/drivers/cpufreq/amd-pstate.c
>>> +++ b/drivers/cpufreq/amd-pstate.c
>>> @@ -133,12 +133,24 @@ static inline int pstate_enable(bool enable)
>>>
>>>  static int cppc_enable(bool enable)
>>>  {
>>> +	struct cppc_perf_ctrls perf_ctrls;
>>>  	int cpu, ret = 0;
>>>
>>>  	for_each_present_cpu(cpu) {
>>>  		ret = cppc_set_enable(cpu, enable);
>>>  		if (ret)
>>>  			return ret;
>>> +
>>> +	/* Enable active mode for EPP */
>>> +	ret = cppc_set_auto_epp(cpu, enable);
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	/* Set zero to desired perf to enable EPP control*/
>>> +	perf_ctrls.desired_perf = 0;
>>> +	ret = cppc_set_perf(cpu, &perf_ctrls);
>>> +	if (ret)
>>> +		return ret;
>>
>> Shouldn't this entire block be indented one additional tab over since its part
>> of the for_each_present_cpu() loop?
>>
>> -Nathan
> 
> Yes, I can see the indentation in my local VS Code editor; I do not know why the patch didn't show it.
> Is the below correct?
> 
> 	for_each_present_cpu(cpu) {
> 		ret = cppc_set_enable(cpu, enable);
> 		if (ret)
> 			return ret;
> 
> 		if (epp_enabled) {
> 			/* Enable autonomous mode for EPP */
> 			ret = cppc_set_auto_epp(cpu, enable);
> 			if (ret)
> 				return ret;
> 
> 			/* Set desired perf to zero to allow EPP firmware control */
> 			perf_ctrls.desired_perf = 0;
> 			ret = cppc_set_perf(cpu, &perf_ctrls);
> 			if (ret)
> 				return ret;
> 		}
> 	}
> 
> Perry. 

That looks much better.

-Nathan

> 
>>
>>>  	}
>>>
>>>  	return ret;

end of thread, other threads:[~2022-09-29 14:08 UTC | newest]

Thread overview: 22+ messages
2022-09-09 16:45 [PATCH 0/7] Implement AMD Pstate EPP Driver Perry Yuan
2022-09-09 16:45 ` [PATCH 1/7] ACPI: CPPC: Add AMD pstate energy performance preference cppc control Perry Yuan
2022-09-09 16:45 ` [PATCH 2/7] cpufreq: amd_pstate: add module parameter to load amd pstate EPP driver Perry Yuan
2022-09-09 18:49   ` Limonciello, Mario
2022-09-13 15:14     ` Yuan, Perry
2022-09-25 16:58     ` Yuan, Perry
2022-09-09 16:45 ` [PATCH 3/7] Documentation: amd-pstate: add EPP profiles introduction Perry Yuan
2022-09-09 16:45 ` [PATCH 4/7] cpufreq: amd_pstate: add AMD pstate EPP support for shared memory type processor Perry Yuan
2022-09-09 18:52   ` Limonciello, Mario
2022-09-13 15:20     ` Yuan, Perry
2022-09-25 17:01     ` Yuan, Perry
2022-09-15 16:24   ` Nathan Fontenot
2022-09-25 12:23     ` Yuan, Perry
2022-09-29 14:08       ` Nathan Fontenot
2022-09-09 16:45 ` [PATCH 5/7] cpufreq: amd_pstate: add AMD Pstate EPP support for the MSR based processors Perry Yuan
2022-09-15 18:58   ` Nathan Fontenot
2022-09-25 17:12     ` Yuan, Perry
2022-09-09 16:45 ` [PATCH 6/7] cpufreq: amd_pstate: implement amd pstate cpu online and offline callback Perry Yuan
2022-09-09 19:02   ` Limonciello, Mario
2022-09-15 19:03   ` Nathan Fontenot
2022-09-09 16:45 ` [PATCH 7/7] cpufreq: amd-pstate: implement suspend and resume callbacks Perry Yuan
2022-09-09 19:00   ` Limonciello, Mario
