All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/2] Disable FIE on machines with slow counters
@ 2022-08-18 21:16 ` Jeremy Linton
  0 siblings, 0 replies; 22+ messages in thread
From: Jeremy Linton @ 2022-08-18 21:16 UTC (permalink / raw)
  To: linux-acpi
  Cc: rafael, lenb, viresh.kumar, robert.moore, punit.agrawal,
	lukasz.luba, ionela.voinescu, pierre.gondois, linux-kernel,
	devel, linux-pm, Jeremy Linton

FIE assumes the delivered/relative perf registers are fast to read so
it goes ahead and hits them quite frequently. On a couple Arm
platforms though they end up in PCC regions which require mailbox
handshaking with other parts of the platform.

This results in a lot of overhead in the cppc_fie task. As such lets
runtime disable FIE if we detect it enabled on one of those platforms.
Also allow the user to manually enable/disable it via a module parameter.

v1->v2:
	Apply Rafael's review comments.
	Move the MODULE_PARAM into the ifdef
	Fix compiler warning when ACPI_CPPC_LIB is disabled.
v2->v3:
	Tristate the module param so FIE can be forced on/off
	Bump pr_debug to pr_info if FIE is disabled due to PCC regions
	Switch ACPI_CPPC_CPUFREQ_FIE off by default

Jeremy Linton (2):
  ACPI: CPPC: Disable FIE if registers in PCC regions
  cpufreq: CPPC: Change FIE default

 drivers/acpi/cppc_acpi.c       | 41 ++++++++++++++++++++++++++++++++++
 drivers/cpufreq/Kconfig.arm    |  2 +-
 drivers/cpufreq/cppc_cpufreq.c | 31 +++++++++++++++++++++----
 include/acpi/cppc_acpi.h       |  5 +++++
 4 files changed, 74 insertions(+), 5 deletions(-)

-- 
2.37.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Devel] [PATCH v3 0/2] Disable FIE on machines with slow counters
@ 2022-08-18 21:16 ` Jeremy Linton
  0 siblings, 0 replies; 22+ messages in thread
From: Jeremy Linton @ 2022-08-18 21:16 UTC (permalink / raw)
  To: devel

[-- Attachment #1: Type: text/plain, Size: 1187 bytes --]

FIE assumes the delivered/relative perf registers are fast to read so
it goes ahead and hits them quite frequently. On a couple Arm
platforms though they end up in PCC regions which require mailbox
handshaking with other parts of the platform.

This results in a lot of overhead in the cppc_fie task. As such lets
runtime disable FIE if we detect it enabled on one of those platforms.
Also allow the user to manually enable/disable it via a module parameter.

v1->v2:
	Apply Rafael's review comments.
	Move the MODULE_PARAM into the ifdef
	Fix compiler warning when ACPI_CPPC_LIB is disabled.
v2->v3:
	Tristate the module param so FIE can be forced on/off
	Bump pr_debug to pr_info if FIE is disabled due to PCC regions
	Switch ACPI_CPPC_CPUFREQ_FIE off by default

Jeremy Linton (2):
  ACPI: CPPC: Disable FIE if registers in PCC regions
  cpufreq: CPPC: Change FIE default

 drivers/acpi/cppc_acpi.c       | 41 ++++++++++++++++++++++++++++++++++
 drivers/cpufreq/Kconfig.arm    |  2 +-
 drivers/cpufreq/cppc_cpufreq.c | 31 +++++++++++++++++++++----
 include/acpi/cppc_acpi.h       |  5 +++++
 4 files changed, 74 insertions(+), 5 deletions(-)

-- 
2.37.1

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v3 1/2] ACPI: CPPC: Disable FIE if registers in PCC regions
@ 2022-08-18 21:16   ` Jeremy Linton
  0 siblings, 0 replies; 22+ messages in thread
From: Jeremy Linton @ 2022-08-18 21:16 UTC (permalink / raw)
  To: linux-acpi
  Cc: rafael, lenb, viresh.kumar, robert.moore, punit.agrawal,
	lukasz.luba, ionela.voinescu, pierre.gondois, linux-kernel,
	devel, linux-pm, Jeremy Linton

PCC regions utilize a mailbox to set/retrieve register values used by
the CPPC code. This is fine as long as the operations are
infrequent. With the FIE code enabled though the overhead can range
from 2-11% of system CPU overhead (ex: as measured by top) on Arm
based machines.

So, before enabling FIE assure none of the registers used by
cppc_get_perf_ctrs() are in the PCC region. Furthermore lets also
enable a module parameter which can also disable it at boot or module
reload.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/acpi/cppc_acpi.c       | 41 ++++++++++++++++++++++++++++++++++
 drivers/cpufreq/cppc_cpufreq.c | 31 +++++++++++++++++++++----
 include/acpi/cppc_acpi.h       |  5 +++++
 3 files changed, 73 insertions(+), 4 deletions(-)

diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index 1e15a9f25ae9..c840bf606b30 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -1240,6 +1240,47 @@ int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps)
 }
 EXPORT_SYMBOL_GPL(cppc_get_perf_caps);
 
+/**
+ * cppc_perf_ctrs_in_pcc - Check if any perf counters are in a PCC region.
+ *
+ * CPPC has flexibility about how counters describing CPU perf are delivered.
+ * One of the choices is PCC regions, which can have a high access latency. This
+ * routine allows callers of cppc_get_perf_ctrs() to know this ahead of time.
+ *
+ * Return: true if any of the counters are in PCC regions, false otherwise
+ */
+bool cppc_perf_ctrs_in_pcc(void)
+{
+	int cpu;
+
+	for_each_present_cpu(cpu) {
+		struct cpc_register_resource *ref_perf_reg;
+		struct cpc_desc *cpc_desc;
+
+		cpc_desc = per_cpu(cpc_desc_ptr, cpu);
+
+		if (CPC_IN_PCC(&cpc_desc->cpc_regs[DELIVERED_CTR]) ||
+		    CPC_IN_PCC(&cpc_desc->cpc_regs[REFERENCE_CTR]) ||
+		    CPC_IN_PCC(&cpc_desc->cpc_regs[CTR_WRAP_TIME]))
+			return true;
+
+
+		ref_perf_reg = &cpc_desc->cpc_regs[REFERENCE_PERF];
+
+		/*
+		 * If reference perf register is not supported then we should
+		 * use the nominal perf value
+		 */
+		if (!CPC_SUPPORTED(ref_perf_reg))
+			ref_perf_reg = &cpc_desc->cpc_regs[NOMINAL_PERF];
+
+		if (CPC_IN_PCC(ref_perf_reg))
+			return true;
+	}
+	return false;
+}
+EXPORT_SYMBOL_GPL(cppc_perf_ctrs_in_pcc);
+
 /**
  * cppc_get_perf_ctrs - Read a CPU's performance feedback counters.
  * @cpunum: CPU from which to read counters.
diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 24eaf0ec344d..32fcb0bf74a4 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -63,7 +63,15 @@ static struct cppc_workaround_oem_info wa_info[] = {
 
 static struct cpufreq_driver cppc_cpufreq_driver;
 
+static enum {
+	FIE_UNSET = -1,
+	FIE_ENABLED,
+	FIE_DISABLED
+} fie_disabled = FIE_UNSET;
+
 #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
+module_param(fie_disabled, int, 0444);
+MODULE_PARM_DESC(fie_disabled, "Disable Frequency Invariance Engine (FIE)");
 
 /* Frequency invariance support */
 struct cppc_freq_invariance {
@@ -158,7 +166,7 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
 	struct cppc_freq_invariance *cppc_fi;
 	int cpu, ret;
 
-	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
+	if (fie_disabled)
 		return;
 
 	for_each_cpu(cpu, policy->cpus) {
@@ -199,7 +207,7 @@ static void cppc_cpufreq_cpu_fie_exit(struct cpufreq_policy *policy)
 	struct cppc_freq_invariance *cppc_fi;
 	int cpu;
 
-	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
+	if (fie_disabled)
 		return;
 
 	/* policy->cpus will be empty here, use related_cpus instead */
@@ -229,7 +237,21 @@ static void __init cppc_freq_invariance_init(void)
 	};
 	int ret;
 
-	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
+	switch (fie_disabled) {
+	/* honor user request */
+	case FIE_DISABLED:
+	case FIE_ENABLED:
+		break;
+	case FIE_UNSET:
+	default:
+		fie_disabled = FIE_ENABLED;
+		if (cppc_perf_ctrs_in_pcc()) {
+			pr_info("FIE not enabled on systems with registers in PCC\n");
+			fie_disabled = FIE_DISABLED;
+		}
+		break;
+	}
+	if (fie_disabled)
 		return;
 
 	kworker_fie = kthread_create_worker(0, "cppc_fie");
@@ -247,7 +269,7 @@ static void __init cppc_freq_invariance_init(void)
 
 static void cppc_freq_invariance_exit(void)
 {
-	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
+	if (fie_disabled)
 		return;
 
 	kthread_destroy_worker(kworker_fie);
@@ -936,6 +958,7 @@ static void cppc_check_hisi_workaround(void)
 		    wa_info[i].oem_revision == tbl->oem_revision) {
 			/* Overwrite the get() callback */
 			cppc_cpufreq_driver.get = hisi_cppc_cpufreq_get_rate;
+			fie_disabled = FIE_DISABLED;
 			break;
 		}
 	}
diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
index f73d357ecdf5..c5614444031f 100644
--- a/include/acpi/cppc_acpi.h
+++ b/include/acpi/cppc_acpi.h
@@ -140,6 +140,7 @@ extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs);
 extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls);
 extern int cppc_set_enable(int cpu, bool enable);
 extern int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps);
+extern bool cppc_perf_ctrs_in_pcc(void);
 extern bool acpi_cpc_valid(void);
 extern bool cppc_allow_fast_switch(void);
 extern int acpi_get_psd_map(unsigned int cpu, struct cppc_cpudata *cpu_data);
@@ -173,6 +174,10 @@ static inline int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps)
 {
 	return -ENOTSUPP;
 }
+static inline bool cppc_perf_ctrs_in_pcc(void)
+{
+	return false;
+}
 static inline bool acpi_cpc_valid(void)
 {
 	return false;
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Devel] [PATCH v3 1/2] ACPI: CPPC: Disable FIE if registers in PCC regions
@ 2022-08-18 21:16   ` Jeremy Linton
  0 siblings, 0 replies; 22+ messages in thread
From: Jeremy Linton @ 2022-08-18 21:16 UTC (permalink / raw)
  To: devel

[-- Attachment #1: Type: text/plain, Size: 5769 bytes --]

PCC regions utilize a mailbox to set/retrieve register values used by
the CPPC code. This is fine as long as the operations are
infrequent. With the FIE code enabled though the overhead can range
from 2-11% of system CPU overhead (ex: as measured by top) on Arm
based machines.

So, before enabling FIE assure none of the registers used by
cppc_get_perf_ctrs() are in the PCC region. Furthermore lets also
enable a module parameter which can also disable it at boot or module
reload.

Signed-off-by: Jeremy Linton <jeremy.linton(a)arm.com>
---
 drivers/acpi/cppc_acpi.c       | 41 ++++++++++++++++++++++++++++++++++
 drivers/cpufreq/cppc_cpufreq.c | 31 +++++++++++++++++++++----
 include/acpi/cppc_acpi.h       |  5 +++++
 3 files changed, 73 insertions(+), 4 deletions(-)

diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index 1e15a9f25ae9..c840bf606b30 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -1240,6 +1240,47 @@ int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps)
 }
 EXPORT_SYMBOL_GPL(cppc_get_perf_caps);
 
+/**
+ * cppc_perf_ctrs_in_pcc - Check if any perf counters are in a PCC region.
+ *
+ * CPPC has flexibility about how counters describing CPU perf are delivered.
+ * One of the choices is PCC regions, which can have a high access latency. This
+ * routine allows callers of cppc_get_perf_ctrs() to know this ahead of time.
+ *
+ * Return: true if any of the counters are in PCC regions, false otherwise
+ */
+bool cppc_perf_ctrs_in_pcc(void)
+{
+	int cpu;
+
+	for_each_present_cpu(cpu) {
+		struct cpc_register_resource *ref_perf_reg;
+		struct cpc_desc *cpc_desc;
+
+		cpc_desc = per_cpu(cpc_desc_ptr, cpu);
+
+		if (CPC_IN_PCC(&cpc_desc->cpc_regs[DELIVERED_CTR]) ||
+		    CPC_IN_PCC(&cpc_desc->cpc_regs[REFERENCE_CTR]) ||
+		    CPC_IN_PCC(&cpc_desc->cpc_regs[CTR_WRAP_TIME]))
+			return true;
+
+
+		ref_perf_reg = &cpc_desc->cpc_regs[REFERENCE_PERF];
+
+		/*
+		 * If reference perf register is not supported then we should
+		 * use the nominal perf value
+		 */
+		if (!CPC_SUPPORTED(ref_perf_reg))
+			ref_perf_reg = &cpc_desc->cpc_regs[NOMINAL_PERF];
+
+		if (CPC_IN_PCC(ref_perf_reg))
+			return true;
+	}
+	return false;
+}
+EXPORT_SYMBOL_GPL(cppc_perf_ctrs_in_pcc);
+
 /**
  * cppc_get_perf_ctrs - Read a CPU's performance feedback counters.
  * @cpunum: CPU from which to read counters.
diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 24eaf0ec344d..32fcb0bf74a4 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -63,7 +63,15 @@ static struct cppc_workaround_oem_info wa_info[] = {
 
 static struct cpufreq_driver cppc_cpufreq_driver;
 
+static enum {
+	FIE_UNSET = -1,
+	FIE_ENABLED,
+	FIE_DISABLED
+} fie_disabled = FIE_UNSET;
+
 #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
+module_param(fie_disabled, int, 0444);
+MODULE_PARM_DESC(fie_disabled, "Disable Frequency Invariance Engine (FIE)");
 
 /* Frequency invariance support */
 struct cppc_freq_invariance {
@@ -158,7 +166,7 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
 	struct cppc_freq_invariance *cppc_fi;
 	int cpu, ret;
 
-	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
+	if (fie_disabled)
 		return;
 
 	for_each_cpu(cpu, policy->cpus) {
@@ -199,7 +207,7 @@ static void cppc_cpufreq_cpu_fie_exit(struct cpufreq_policy *policy)
 	struct cppc_freq_invariance *cppc_fi;
 	int cpu;
 
-	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
+	if (fie_disabled)
 		return;
 
 	/* policy->cpus will be empty here, use related_cpus instead */
@@ -229,7 +237,21 @@ static void __init cppc_freq_invariance_init(void)
 	};
 	int ret;
 
-	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
+	switch (fie_disabled) {
+	/* honor user request */
+	case FIE_DISABLED:
+	case FIE_ENABLED:
+		break;
+	case FIE_UNSET:
+	default:
+		fie_disabled = FIE_ENABLED;
+		if (cppc_perf_ctrs_in_pcc()) {
+			pr_info("FIE not enabled on systems with registers in PCC\n");
+			fie_disabled = FIE_DISABLED;
+		}
+		break;
+	}
+	if (fie_disabled)
 		return;
 
 	kworker_fie = kthread_create_worker(0, "cppc_fie");
@@ -247,7 +269,7 @@ static void __init cppc_freq_invariance_init(void)
 
 static void cppc_freq_invariance_exit(void)
 {
-	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
+	if (fie_disabled)
 		return;
 
 	kthread_destroy_worker(kworker_fie);
@@ -936,6 +958,7 @@ static void cppc_check_hisi_workaround(void)
 		    wa_info[i].oem_revision == tbl->oem_revision) {
 			/* Overwrite the get() callback */
 			cppc_cpufreq_driver.get = hisi_cppc_cpufreq_get_rate;
+			fie_disabled = FIE_DISABLED;
 			break;
 		}
 	}
diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
index f73d357ecdf5..c5614444031f 100644
--- a/include/acpi/cppc_acpi.h
+++ b/include/acpi/cppc_acpi.h
@@ -140,6 +140,7 @@ extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs);
 extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls);
 extern int cppc_set_enable(int cpu, bool enable);
 extern int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps);
+extern bool cppc_perf_ctrs_in_pcc(void);
 extern bool acpi_cpc_valid(void);
 extern bool cppc_allow_fast_switch(void);
 extern int acpi_get_psd_map(unsigned int cpu, struct cppc_cpudata *cpu_data);
@@ -173,6 +174,10 @@ static inline int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps)
 {
 	return -ENOTSUPP;
 }
+static inline bool cppc_perf_ctrs_in_pcc(void)
+{
+	return false;
+}
 static inline bool acpi_cpc_valid(void)
 {
 	return false;
-- 
2.37.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 2/2] cpufreq: CPPC: Change FIE default
@ 2022-08-18 21:16   ` Jeremy Linton
  0 siblings, 0 replies; 22+ messages in thread
From: Jeremy Linton @ 2022-08-18 21:16 UTC (permalink / raw)
  To: linux-acpi
  Cc: rafael, lenb, viresh.kumar, robert.moore, punit.agrawal,
	lukasz.luba, ionela.voinescu, pierre.gondois, linux-kernel,
	devel, linux-pm, Jeremy Linton

FIE is mostly implemented as PCC mailboxes on arm machines.  This was
enabled by default without any data suggesting that it does anything
but hurt system performance. Lets change the default to 'n' until
hardware appears which clearly benefits.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/cpufreq/Kconfig.arm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
index 954749afb5fe..ad66d8f15db0 100644
--- a/drivers/cpufreq/Kconfig.arm
+++ b/drivers/cpufreq/Kconfig.arm
@@ -22,7 +22,7 @@ config ACPI_CPPC_CPUFREQ
 config ACPI_CPPC_CPUFREQ_FIE
 	bool "Frequency Invariance support for CPPC cpufreq driver"
 	depends on ACPI_CPPC_CPUFREQ && GENERIC_ARCH_TOPOLOGY
-	default y
+	default n
 	help
 	  This extends frequency invariance support in the CPPC cpufreq driver,
 	  by using CPPC delivered and reference performance counters.
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Devel] [PATCH v3 2/2] cpufreq: CPPC: Change FIE default
@ 2022-08-18 21:16   ` Jeremy Linton
  0 siblings, 0 replies; 22+ messages in thread
From: Jeremy Linton @ 2022-08-18 21:16 UTC (permalink / raw)
  To: devel

[-- Attachment #1: Type: text/plain, Size: 965 bytes --]

FIE is mostly implemented as PCC mailboxes on arm machines.  This was
enabled by default without any data suggesting that it does anything
but hurt system performance. Lets change the default to 'n' until
hardware appears which clearly benefits.

Signed-off-by: Jeremy Linton <jeremy.linton(a)arm.com>
---
 drivers/cpufreq/Kconfig.arm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
index 954749afb5fe..ad66d8f15db0 100644
--- a/drivers/cpufreq/Kconfig.arm
+++ b/drivers/cpufreq/Kconfig.arm
@@ -22,7 +22,7 @@ config ACPI_CPPC_CPUFREQ
 config ACPI_CPPC_CPUFREQ_FIE
 	bool "Frequency Invariance support for CPPC cpufreq driver"
 	depends on ACPI_CPPC_CPUFREQ && GENERIC_ARCH_TOPOLOGY
-	default y
+	default n
 	help
 	  This extends frequency invariance support in the CPPC cpufreq driver,
 	  by using CPPC delivered and reference performance counters.
-- 
2.37.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/2] ACPI: CPPC: Disable FIE if registers in PCC regions
@ 2022-08-23 17:10     ` Rafael J. Wysocki
  0 siblings, 0 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2022-08-23 17:10 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: ACPI Devel Maling List, Rafael J. Wysocki, Len Brown,
	Viresh Kumar, Robert Moore, Punit Agrawal, Lukasz Luba,
	Ionela Voinescu, Pierre Gondois, Linux Kernel Mailing List,
	open list:ACPI COMPONENT ARCHITECTURE (ACPICA),
	Linux PM

On Thu, Aug 18, 2022 at 11:24 PM Jeremy Linton <jeremy.linton@arm.com> wrote:
>
> PCC regions utilize a mailbox to set/retrieve register values used by
> the CPPC code. This is fine as long as the operations are
> infrequent. With the FIE code enabled though the overhead can range
> from 2-11% of system CPU overhead (ex: as measured by top) on Arm
> based machines.
>
> So, before enabling FIE assure none of the registers used by
> cppc_get_perf_ctrs() are in the PCC region. Furthermore lets also
> enable a module parameter which can also disable it at boot or module
> reload.
>
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  drivers/acpi/cppc_acpi.c       | 41 ++++++++++++++++++++++++++++++++++
>  drivers/cpufreq/cppc_cpufreq.c | 31 +++++++++++++++++++++----
>  include/acpi/cppc_acpi.h       |  5 +++++
>  3 files changed, 73 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
> index 1e15a9f25ae9..c840bf606b30 100644
> --- a/drivers/acpi/cppc_acpi.c
> +++ b/drivers/acpi/cppc_acpi.c
> @@ -1240,6 +1240,47 @@ int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps)
>  }
>  EXPORT_SYMBOL_GPL(cppc_get_perf_caps);
>
> +/**
> + * cppc_perf_ctrs_in_pcc - Check if any perf counters are in a PCC region.
> + *
> + * CPPC has flexibility about how counters describing CPU perf are delivered.

"CPU performance counters are accessed"


> + * One of the choices is PCC regions, which can have a high access latency. This
> + * routine allows callers of cppc_get_perf_ctrs() to know this ahead of time.
> + *
> + * Return: true if any of the counters are in PCC regions, false otherwise
> + */
> +bool cppc_perf_ctrs_in_pcc(void)
> +{
> +       int cpu;
> +
> +       for_each_present_cpu(cpu) {
> +               struct cpc_register_resource *ref_perf_reg;
> +               struct cpc_desc *cpc_desc;
> +
> +               cpc_desc = per_cpu(cpc_desc_ptr, cpu);
> +
> +               if (CPC_IN_PCC(&cpc_desc->cpc_regs[DELIVERED_CTR]) ||
> +                   CPC_IN_PCC(&cpc_desc->cpc_regs[REFERENCE_CTR]) ||
> +                   CPC_IN_PCC(&cpc_desc->cpc_regs[CTR_WRAP_TIME]))
> +                       return true;
> +
> +
> +               ref_perf_reg = &cpc_desc->cpc_regs[REFERENCE_PERF];
> +
> +               /*
> +                * If reference perf register is not supported then we should
> +                * use the nominal perf value
> +                */
> +               if (!CPC_SUPPORTED(ref_perf_reg))
> +                       ref_perf_reg = &cpc_desc->cpc_regs[NOMINAL_PERF];
> +
> +               if (CPC_IN_PCC(ref_perf_reg))
> +                       return true;
> +       }
> +       return false;
> +}
> +EXPORT_SYMBOL_GPL(cppc_perf_ctrs_in_pcc);
> +
>  /**
>   * cppc_get_perf_ctrs - Read a CPU's performance feedback counters.
>   * @cpunum: CPU from which to read counters.
> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> index 24eaf0ec344d..32fcb0bf74a4 100644
> --- a/drivers/cpufreq/cppc_cpufreq.c
> +++ b/drivers/cpufreq/cppc_cpufreq.c
> @@ -63,7 +63,15 @@ static struct cppc_workaround_oem_info wa_info[] = {
>
>  static struct cpufreq_driver cppc_cpufreq_driver;
>
> +static enum {
> +       FIE_UNSET = -1,
> +       FIE_ENABLED,
> +       FIE_DISABLED
> +} fie_disabled = FIE_UNSET;
> +
>  #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
> +module_param(fie_disabled, int, 0444);
> +MODULE_PARM_DESC(fie_disabled, "Disable Frequency Invariance Engine (FIE)");
>
>  /* Frequency invariance support */
>  struct cppc_freq_invariance {
> @@ -158,7 +166,7 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
>         struct cppc_freq_invariance *cppc_fi;
>         int cpu, ret;
>
> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> +       if (fie_disabled)
>                 return;
>
>         for_each_cpu(cpu, policy->cpus) {
> @@ -199,7 +207,7 @@ static void cppc_cpufreq_cpu_fie_exit(struct cpufreq_policy *policy)
>         struct cppc_freq_invariance *cppc_fi;
>         int cpu;
>
> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> +       if (fie_disabled)
>                 return;
>
>         /* policy->cpus will be empty here, use related_cpus instead */
> @@ -229,7 +237,21 @@ static void __init cppc_freq_invariance_init(void)
>         };
>         int ret;
>
> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> +       switch (fie_disabled) {
> +       /* honor user request */
> +       case FIE_DISABLED:
> +       case FIE_ENABLED:
> +               break;
> +       case FIE_UNSET:
> +       default:

Would be more straightforward to do

if (fie_disabled == FIE_UNSET) {

here.

> +               fie_disabled = FIE_ENABLED;
> +               if (cppc_perf_ctrs_in_pcc()) {
> +                       pr_info("FIE not enabled on systems with registers in PCC\n");
> +                       fie_disabled = FIE_DISABLED;
> +               }
> +               break;
> +       }
> +       if (fie_disabled)
>                 return;
>
>         kworker_fie = kthread_create_worker(0, "cppc_fie");
> @@ -247,7 +269,7 @@ static void __init cppc_freq_invariance_init(void)
>
>  static void cppc_freq_invariance_exit(void)
>  {
> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> +       if (fie_disabled)
>                 return;
>
>         kthread_destroy_worker(kworker_fie);
> @@ -936,6 +958,7 @@ static void cppc_check_hisi_workaround(void)
>                     wa_info[i].oem_revision == tbl->oem_revision) {
>                         /* Overwrite the get() callback */
>                         cppc_cpufreq_driver.get = hisi_cppc_cpufreq_get_rate;
> +                       fie_disabled = FIE_DISABLED;
>                         break;
>                 }
>         }
> diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
> index f73d357ecdf5..c5614444031f 100644
> --- a/include/acpi/cppc_acpi.h
> +++ b/include/acpi/cppc_acpi.h
> @@ -140,6 +140,7 @@ extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs);
>  extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls);
>  extern int cppc_set_enable(int cpu, bool enable);
>  extern int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps);
> +extern bool cppc_perf_ctrs_in_pcc(void);
>  extern bool acpi_cpc_valid(void);
>  extern bool cppc_allow_fast_switch(void);
>  extern int acpi_get_psd_map(unsigned int cpu, struct cppc_cpudata *cpu_data);
> @@ -173,6 +174,10 @@ static inline int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps)
>  {
>         return -ENOTSUPP;
>  }
> +static inline bool cppc_perf_ctrs_in_pcc(void)
> +{
> +       return false;
> +}
>  static inline bool acpi_cpc_valid(void)
>  {
>         return false;
> --

Apart from the above it looks fine to me, but I would like to get an
ACK from Viresh on the second patch.

Thanks!

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Devel] Re: [PATCH v3 1/2] ACPI: CPPC: Disable FIE if registers in PCC regions
@ 2022-08-23 17:10     ` Rafael J. Wysocki
  0 siblings, 0 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2022-08-23 17:10 UTC (permalink / raw)
  To: devel

[-- Attachment #1: Type: text/plain, Size: 7121 bytes --]

On Thu, Aug 18, 2022 at 11:24 PM Jeremy Linton <jeremy.linton(a)arm.com> wrote:
>
> PCC regions utilize a mailbox to set/retrieve register values used by
> the CPPC code. This is fine as long as the operations are
> infrequent. With the FIE code enabled though the overhead can range
> from 2-11% of system CPU overhead (ex: as measured by top) on Arm
> based machines.
>
> So, before enabling FIE assure none of the registers used by
> cppc_get_perf_ctrs() are in the PCC region. Furthermore lets also
> enable a module parameter which can also disable it at boot or module
> reload.
>
> Signed-off-by: Jeremy Linton <jeremy.linton(a)arm.com>
> ---
>  drivers/acpi/cppc_acpi.c       | 41 ++++++++++++++++++++++++++++++++++
>  drivers/cpufreq/cppc_cpufreq.c | 31 +++++++++++++++++++++----
>  include/acpi/cppc_acpi.h       |  5 +++++
>  3 files changed, 73 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
> index 1e15a9f25ae9..c840bf606b30 100644
> --- a/drivers/acpi/cppc_acpi.c
> +++ b/drivers/acpi/cppc_acpi.c
> @@ -1240,6 +1240,47 @@ int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps)
>  }
>  EXPORT_SYMBOL_GPL(cppc_get_perf_caps);
>
> +/**
> + * cppc_perf_ctrs_in_pcc - Check if any perf counters are in a PCC region.
> + *
> + * CPPC has flexibility about how counters describing CPU perf are delivered.

"CPU performance counters are accessed"


> + * One of the choices is PCC regions, which can have a high access latency. This
> + * routine allows callers of cppc_get_perf_ctrs() to know this ahead of time.
> + *
> + * Return: true if any of the counters are in PCC regions, false otherwise
> + */
> +bool cppc_perf_ctrs_in_pcc(void)
> +{
> +       int cpu;
> +
> +       for_each_present_cpu(cpu) {
> +               struct cpc_register_resource *ref_perf_reg;
> +               struct cpc_desc *cpc_desc;
> +
> +               cpc_desc = per_cpu(cpc_desc_ptr, cpu);
> +
> +               if (CPC_IN_PCC(&cpc_desc->cpc_regs[DELIVERED_CTR]) ||
> +                   CPC_IN_PCC(&cpc_desc->cpc_regs[REFERENCE_CTR]) ||
> +                   CPC_IN_PCC(&cpc_desc->cpc_regs[CTR_WRAP_TIME]))
> +                       return true;
> +
> +
> +               ref_perf_reg = &cpc_desc->cpc_regs[REFERENCE_PERF];
> +
> +               /*
> +                * If reference perf register is not supported then we should
> +                * use the nominal perf value
> +                */
> +               if (!CPC_SUPPORTED(ref_perf_reg))
> +                       ref_perf_reg = &cpc_desc->cpc_regs[NOMINAL_PERF];
> +
> +               if (CPC_IN_PCC(ref_perf_reg))
> +                       return true;
> +       }
> +       return false;
> +}
> +EXPORT_SYMBOL_GPL(cppc_perf_ctrs_in_pcc);
> +
>  /**
>   * cppc_get_perf_ctrs - Read a CPU's performance feedback counters.
>   * @cpunum: CPU from which to read counters.
> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> index 24eaf0ec344d..32fcb0bf74a4 100644
> --- a/drivers/cpufreq/cppc_cpufreq.c
> +++ b/drivers/cpufreq/cppc_cpufreq.c
> @@ -63,7 +63,15 @@ static struct cppc_workaround_oem_info wa_info[] = {
>
>  static struct cpufreq_driver cppc_cpufreq_driver;
>
> +static enum {
> +       FIE_UNSET = -1,
> +       FIE_ENABLED,
> +       FIE_DISABLED
> +} fie_disabled = FIE_UNSET;
> +
>  #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
> +module_param(fie_disabled, int, 0444);
> +MODULE_PARM_DESC(fie_disabled, "Disable Frequency Invariance Engine (FIE)");
>
>  /* Frequency invariance support */
>  struct cppc_freq_invariance {
> @@ -158,7 +166,7 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
>         struct cppc_freq_invariance *cppc_fi;
>         int cpu, ret;
>
> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> +       if (fie_disabled)
>                 return;
>
>         for_each_cpu(cpu, policy->cpus) {
> @@ -199,7 +207,7 @@ static void cppc_cpufreq_cpu_fie_exit(struct cpufreq_policy *policy)
>         struct cppc_freq_invariance *cppc_fi;
>         int cpu;
>
> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> +       if (fie_disabled)
>                 return;
>
>         /* policy->cpus will be empty here, use related_cpus instead */
> @@ -229,7 +237,21 @@ static void __init cppc_freq_invariance_init(void)
>         };
>         int ret;
>
> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> +       switch (fie_disabled) {
> +       /* honor user request */
> +       case FIE_DISABLED:
> +       case FIE_ENABLED:
> +               break;
> +       case FIE_UNSET:
> +       default:

Would be more straightforward to do

if (fie_disabled == FIE_UNSET) {

here.

> +               fie_disabled = FIE_ENABLED;
> +               if (cppc_perf_ctrs_in_pcc()) {
> +                       pr_info("FIE not enabled on systems with registers in PCC\n");
> +                       fie_disabled = FIE_DISABLED;
> +               }
> +               break;
> +       }
> +       if (fie_disabled)
>                 return;
>
>         kworker_fie = kthread_create_worker(0, "cppc_fie");
> @@ -247,7 +269,7 @@ static void __init cppc_freq_invariance_init(void)
>
>  static void cppc_freq_invariance_exit(void)
>  {
> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> +       if (fie_disabled)
>                 return;
>
>         kthread_destroy_worker(kworker_fie);
> @@ -936,6 +958,7 @@ static void cppc_check_hisi_workaround(void)
>                     wa_info[i].oem_revision == tbl->oem_revision) {
>                         /* Overwrite the get() callback */
>                         cppc_cpufreq_driver.get = hisi_cppc_cpufreq_get_rate;
> +                       fie_disabled = FIE_DISABLED;
>                         break;
>                 }
>         }
> diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
> index f73d357ecdf5..c5614444031f 100644
> --- a/include/acpi/cppc_acpi.h
> +++ b/include/acpi/cppc_acpi.h
> @@ -140,6 +140,7 @@ extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs);
>  extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls);
>  extern int cppc_set_enable(int cpu, bool enable);
>  extern int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps);
> +extern bool cppc_perf_ctrs_in_pcc(void);
>  extern bool acpi_cpc_valid(void);
>  extern bool cppc_allow_fast_switch(void);
>  extern int acpi_get_psd_map(unsigned int cpu, struct cppc_cpudata *cpu_data);
> @@ -173,6 +174,10 @@ static inline int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps)
>  {
>         return -ENOTSUPP;
>  }
> +static inline bool cppc_perf_ctrs_in_pcc(void)
> +{
> +       return false;
> +}
>  static inline bool acpi_cpc_valid(void)
>  {
>         return false;
> --

Apart from the above it looks fine to me, but I would like to get an
ACK from Viresh on the second patch.

Thanks!

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/2] ACPI: CPPC: Disable FIE if registers in PCC regions
@ 2022-08-23 18:46       ` Jeremy Linton
  0 siblings, 0 replies; 22+ messages in thread
From: Jeremy Linton @ 2022-08-23 18:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: ACPI Devel Maling List, Rafael J. Wysocki, Len Brown,
	Viresh Kumar, Robert Moore, Punit Agrawal, Lukasz Luba,
	Ionela Voinescu, Pierre Gondois, Linux Kernel Mailing List,
	open list:ACPI COMPONENT ARCHITECTURE (ACPICA),
	Linux PM

Hi,

On 8/23/22 12:10, Rafael J. Wysocki wrote:
> On Thu, Aug 18, 2022 at 11:24 PM Jeremy Linton <jeremy.linton@arm.com> wrote:
>>
>> PCC regions utilize a mailbox to set/retrieve register values used by
>> the CPPC code. This is fine as long as the operations are
>> infrequent. With the FIE code enabled though the overhead can range
>> from 2-11% of system CPU overhead (ex: as measured by top) on Arm
>> based machines.
>>
>> So, before enabling FIE assure none of the registers used by
>> cppc_get_perf_ctrs() are in the PCC region. Furthermore lets also
>> enable a module parameter which can also disable it at boot or module
>> reload.
>>
>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>> ---
>>   drivers/acpi/cppc_acpi.c       | 41 ++++++++++++++++++++++++++++++++++
>>   drivers/cpufreq/cppc_cpufreq.c | 31 +++++++++++++++++++++----
>>   include/acpi/cppc_acpi.h       |  5 +++++
>>   3 files changed, 73 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
>> index 1e15a9f25ae9..c840bf606b30 100644
>> --- a/drivers/acpi/cppc_acpi.c
>> +++ b/drivers/acpi/cppc_acpi.c
>> @@ -1240,6 +1240,47 @@ int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps)
>>   }
>>   EXPORT_SYMBOL_GPL(cppc_get_perf_caps);
>>
>> +/**
>> + * cppc_perf_ctrs_in_pcc - Check if any perf counters are in a PCC region.
>> + *
>> + * CPPC has flexibility about how counters describing CPU perf are delivered.
> 
> "CPU performance counters are accessed"

Sure,

> 
> 
>> + * One of the choices is PCC regions, which can have a high access latency. This
>> + * routine allows callers of cppc_get_perf_ctrs() to know this ahead of time.
>> + *
>> + * Return: true if any of the counters are in PCC regions, false otherwise
>> + */
>> +bool cppc_perf_ctrs_in_pcc(void)
>> +{
>> +       int cpu;
>> +
>> +       for_each_present_cpu(cpu) {
>> +               struct cpc_register_resource *ref_perf_reg;
>> +               struct cpc_desc *cpc_desc;
>> +
>> +               cpc_desc = per_cpu(cpc_desc_ptr, cpu);
>> +
>> +               if (CPC_IN_PCC(&cpc_desc->cpc_regs[DELIVERED_CTR]) ||
>> +                   CPC_IN_PCC(&cpc_desc->cpc_regs[REFERENCE_CTR]) ||
>> +                   CPC_IN_PCC(&cpc_desc->cpc_regs[CTR_WRAP_TIME]))
>> +                       return true;
>> +
>> +
>> +               ref_perf_reg = &cpc_desc->cpc_regs[REFERENCE_PERF];
>> +
>> +               /*
>> +                * If reference perf register is not supported then we should
>> +                * use the nominal perf value
>> +                */
>> +               if (!CPC_SUPPORTED(ref_perf_reg))
>> +                       ref_perf_reg = &cpc_desc->cpc_regs[NOMINAL_PERF];
>> +
>> +               if (CPC_IN_PCC(ref_perf_reg))
>> +                       return true;
>> +       }
>> +       return false;
>> +}
>> +EXPORT_SYMBOL_GPL(cppc_perf_ctrs_in_pcc);
>> +
>>   /**
>>    * cppc_get_perf_ctrs - Read a CPU's performance feedback counters.
>>    * @cpunum: CPU from which to read counters.
>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>> index 24eaf0ec344d..32fcb0bf74a4 100644
>> --- a/drivers/cpufreq/cppc_cpufreq.c
>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>> @@ -63,7 +63,15 @@ static struct cppc_workaround_oem_info wa_info[] = {
>>
>>   static struct cpufreq_driver cppc_cpufreq_driver;
>>
>> +static enum {
>> +       FIE_UNSET = -1,
>> +       FIE_ENABLED,
>> +       FIE_DISABLED
>> +} fie_disabled = FIE_UNSET;
>> +
>>   #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
>> +module_param(fie_disabled, int, 0444);
>> +MODULE_PARM_DESC(fie_disabled, "Disable Frequency Invariance Engine (FIE)");
>>
>>   /* Frequency invariance support */
>>   struct cppc_freq_invariance {
>> @@ -158,7 +166,7 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
>>          struct cppc_freq_invariance *cppc_fi;
>>          int cpu, ret;
>>
>> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +       if (fie_disabled)
>>                  return;
>>
>>          for_each_cpu(cpu, policy->cpus) {
>> @@ -199,7 +207,7 @@ static void cppc_cpufreq_cpu_fie_exit(struct cpufreq_policy *policy)
>>          struct cppc_freq_invariance *cppc_fi;
>>          int cpu;
>>
>> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +       if (fie_disabled)
>>                  return;
>>
>>          /* policy->cpus will be empty here, use related_cpus instead */
>> @@ -229,7 +237,21 @@ static void __init cppc_freq_invariance_init(void)
>>          };
>>          int ret;
>>
>> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +       switch (fie_disabled) {
>> +       /* honor user request */
>> +       case FIE_DISABLED:
>> +       case FIE_ENABLED:
>> +               break;
>> +       case FIE_UNSET:
>> +       default:
> 
> Would be more straightforward to do
> 
> if (fie_disabled == FIE_UNSET) {
> 
> here.

Right, but then it wouldn't catch the other billion+ values that are the 
result of not being able to export a limit (AFAIK) on the module 
parameter. I could use an if:

if !((fie_disabled == FIE_DISABLE) || (fie_disabled == FIE_ENABLED)) {

}


if that is preferable. I thought the case with the explict default: 
though made it clearer that it was treating all those other values as unset.

> 
>> +               fie_disabled = FIE_ENABLED;
>> +               if (cppc_perf_ctrs_in_pcc()) {
>> +                       pr_info("FIE not enabled on systems with registers in PCC\n");
>> +                       fie_disabled = FIE_DISABLED;
>> +               }
>> +               break;
>> +       }
>> +       if (fie_disabled)
>>                  return;
>>
>>          kworker_fie = kthread_create_worker(0, "cppc_fie");
>> @@ -247,7 +269,7 @@ static void __init cppc_freq_invariance_init(void)
>>
>>   static void cppc_freq_invariance_exit(void)
>>   {
>> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +       if (fie_disabled)
>>                  return;
>>
>>          kthread_destroy_worker(kworker_fie);
>> @@ -936,6 +958,7 @@ static void cppc_check_hisi_workaround(void)
>>                      wa_info[i].oem_revision == tbl->oem_revision) {
>>                          /* Overwrite the get() callback */
>>                          cppc_cpufreq_driver.get = hisi_cppc_cpufreq_get_rate;
>> +                       fie_disabled = FIE_DISABLED;
>>                          break;
>>                  }
>>          }
>> diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
>> index f73d357ecdf5..c5614444031f 100644
>> --- a/include/acpi/cppc_acpi.h
>> +++ b/include/acpi/cppc_acpi.h
>> @@ -140,6 +140,7 @@ extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs);
>>   extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls);
>>   extern int cppc_set_enable(int cpu, bool enable);
>>   extern int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps);
>> +extern bool cppc_perf_ctrs_in_pcc(void);
>>   extern bool acpi_cpc_valid(void);
>>   extern bool cppc_allow_fast_switch(void);
>>   extern int acpi_get_psd_map(unsigned int cpu, struct cppc_cpudata *cpu_data);
>> @@ -173,6 +174,10 @@ static inline int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps)
>>   {
>>          return -ENOTSUPP;
>>   }
>> +static inline bool cppc_perf_ctrs_in_pcc(void)
>> +{
>> +       return false;
>> +}
>>   static inline bool acpi_cpc_valid(void)
>>   {
>>          return false;
>> --
> 
> Apart from the above it looks fine to me, but I would like to get an
> ACK from Viresh on the second patch.
> 
> Thanks!

Thanks for looking at this.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Devel] Re: [PATCH v3 1/2] ACPI: CPPC: Disable FIE if registers in PCC regions
@ 2022-08-23 18:46       ` Jeremy Linton
  0 siblings, 0 replies; 22+ messages in thread
From: Jeremy Linton @ 2022-08-23 18:46 UTC (permalink / raw)
  To: devel

[-- Attachment #1: Type: text/plain, Size: 7880 bytes --]

Hi,

On 8/23/22 12:10, Rafael J. Wysocki wrote:
> On Thu, Aug 18, 2022 at 11:24 PM Jeremy Linton <jeremy.linton(a)arm.com> wrote:
>>
>> PCC regions utilize a mailbox to set/retrieve register values used by
>> the CPPC code. This is fine as long as the operations are
>> infrequent. With the FIE code enabled though the overhead can range
>> from 2-11% of system CPU overhead (ex: as measured by top) on Arm
>> based machines.
>>
>> So, before enabling FIE assure none of the registers used by
>> cppc_get_perf_ctrs() are in the PCC region. Furthermore lets also
>> enable a module parameter which can also disable it at boot or module
>> reload.
>>
>> Signed-off-by: Jeremy Linton <jeremy.linton(a)arm.com>
>> ---
>>   drivers/acpi/cppc_acpi.c       | 41 ++++++++++++++++++++++++++++++++++
>>   drivers/cpufreq/cppc_cpufreq.c | 31 +++++++++++++++++++++----
>>   include/acpi/cppc_acpi.h       |  5 +++++
>>   3 files changed, 73 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
>> index 1e15a9f25ae9..c840bf606b30 100644
>> --- a/drivers/acpi/cppc_acpi.c
>> +++ b/drivers/acpi/cppc_acpi.c
>> @@ -1240,6 +1240,47 @@ int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps)
>>   }
>>   EXPORT_SYMBOL_GPL(cppc_get_perf_caps);
>>
>> +/**
>> + * cppc_perf_ctrs_in_pcc - Check if any perf counters are in a PCC region.
>> + *
>> + * CPPC has flexibility about how counters describing CPU perf are delivered.
> 
> "CPU performance counters are accessed"

Sure,

> 
> 
>> + * One of the choices is PCC regions, which can have a high access latency. This
>> + * routine allows callers of cppc_get_perf_ctrs() to know this ahead of time.
>> + *
>> + * Return: true if any of the counters are in PCC regions, false otherwise
>> + */
>> +bool cppc_perf_ctrs_in_pcc(void)
>> +{
>> +       int cpu;
>> +
>> +       for_each_present_cpu(cpu) {
>> +               struct cpc_register_resource *ref_perf_reg;
>> +               struct cpc_desc *cpc_desc;
>> +
>> +               cpc_desc = per_cpu(cpc_desc_ptr, cpu);
>> +
>> +               if (CPC_IN_PCC(&cpc_desc->cpc_regs[DELIVERED_CTR]) ||
>> +                   CPC_IN_PCC(&cpc_desc->cpc_regs[REFERENCE_CTR]) ||
>> +                   CPC_IN_PCC(&cpc_desc->cpc_regs[CTR_WRAP_TIME]))
>> +                       return true;
>> +
>> +
>> +               ref_perf_reg = &cpc_desc->cpc_regs[REFERENCE_PERF];
>> +
>> +               /*
>> +                * If reference perf register is not supported then we should
>> +                * use the nominal perf value
>> +                */
>> +               if (!CPC_SUPPORTED(ref_perf_reg))
>> +                       ref_perf_reg = &cpc_desc->cpc_regs[NOMINAL_PERF];
>> +
>> +               if (CPC_IN_PCC(ref_perf_reg))
>> +                       return true;
>> +       }
>> +       return false;
>> +}
>> +EXPORT_SYMBOL_GPL(cppc_perf_ctrs_in_pcc);
>> +
>>   /**
>>    * cppc_get_perf_ctrs - Read a CPU's performance feedback counters.
>>    * @cpunum: CPU from which to read counters.
>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>> index 24eaf0ec344d..32fcb0bf74a4 100644
>> --- a/drivers/cpufreq/cppc_cpufreq.c
>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>> @@ -63,7 +63,15 @@ static struct cppc_workaround_oem_info wa_info[] = {
>>
>>   static struct cpufreq_driver cppc_cpufreq_driver;
>>
>> +static enum {
>> +       FIE_UNSET = -1,
>> +       FIE_ENABLED,
>> +       FIE_DISABLED
>> +} fie_disabled = FIE_UNSET;
>> +
>>   #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
>> +module_param(fie_disabled, int, 0444);
>> +MODULE_PARM_DESC(fie_disabled, "Disable Frequency Invariance Engine (FIE)");
>>
>>   /* Frequency invariance support */
>>   struct cppc_freq_invariance {
>> @@ -158,7 +166,7 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
>>          struct cppc_freq_invariance *cppc_fi;
>>          int cpu, ret;
>>
>> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +       if (fie_disabled)
>>                  return;
>>
>>          for_each_cpu(cpu, policy->cpus) {
>> @@ -199,7 +207,7 @@ static void cppc_cpufreq_cpu_fie_exit(struct cpufreq_policy *policy)
>>          struct cppc_freq_invariance *cppc_fi;
>>          int cpu;
>>
>> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +       if (fie_disabled)
>>                  return;
>>
>>          /* policy->cpus will be empty here, use related_cpus instead */
>> @@ -229,7 +237,21 @@ static void __init cppc_freq_invariance_init(void)
>>          };
>>          int ret;
>>
>> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +       switch (fie_disabled) {
>> +       /* honor user request */
>> +       case FIE_DISABLED:
>> +       case FIE_ENABLED:
>> +               break;
>> +       case FIE_UNSET:
>> +       default:
> 
> Would be more straightforward to do
> 
> if (fie_disabled == FIE_UNSET) {
> 
> here.

Right, but then it wouldn't catch the other billion+ values that are the 
result of not being able to export a limit (AFAIK) on the module 
parameter. I could use an if:

if !((fie_disabled == FIE_DISABLE) || (fie_disabled == FIE_ENABLED)) {

}


if that is preferable. I thought the case with the explict default: 
though made it clearer that it was treating all those other values as unset.

> 
>> +               fie_disabled = FIE_ENABLED;
>> +               if (cppc_perf_ctrs_in_pcc()) {
>> +                       pr_info("FIE not enabled on systems with registers in PCC\n");
>> +                       fie_disabled = FIE_DISABLED;
>> +               }
>> +               break;
>> +       }
>> +       if (fie_disabled)
>>                  return;
>>
>>          kworker_fie = kthread_create_worker(0, "cppc_fie");
>> @@ -247,7 +269,7 @@ static void __init cppc_freq_invariance_init(void)
>>
>>   static void cppc_freq_invariance_exit(void)
>>   {
>> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +       if (fie_disabled)
>>                  return;
>>
>>          kthread_destroy_worker(kworker_fie);
>> @@ -936,6 +958,7 @@ static void cppc_check_hisi_workaround(void)
>>                      wa_info[i].oem_revision == tbl->oem_revision) {
>>                          /* Overwrite the get() callback */
>>                          cppc_cpufreq_driver.get = hisi_cppc_cpufreq_get_rate;
>> +                       fie_disabled = FIE_DISABLED;
>>                          break;
>>                  }
>>          }
>> diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
>> index f73d357ecdf5..c5614444031f 100644
>> --- a/include/acpi/cppc_acpi.h
>> +++ b/include/acpi/cppc_acpi.h
>> @@ -140,6 +140,7 @@ extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs);
>>   extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls);
>>   extern int cppc_set_enable(int cpu, bool enable);
>>   extern int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps);
>> +extern bool cppc_perf_ctrs_in_pcc(void);
>>   extern bool acpi_cpc_valid(void);
>>   extern bool cppc_allow_fast_switch(void);
>>   extern int acpi_get_psd_map(unsigned int cpu, struct cppc_cpudata *cpu_data);
>> @@ -173,6 +174,10 @@ static inline int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps)
>>   {
>>          return -ENOTSUPP;
>>   }
>> +static inline bool cppc_perf_ctrs_in_pcc(void)
>> +{
>> +       return false;
>> +}
>>   static inline bool acpi_cpc_valid(void)
>>   {
>>          return false;
>> --
> 
> Apart from the above it looks fine to me, but I would like to get an
> ACK from Viresh on the second patch.
> 
> Thanks!

Thanks for looking at this.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/2] ACPI: CPPC: Disable FIE if registers in PCC regions
  2022-08-18 21:16   ` [Devel] " Jeremy Linton
  (?)
  (?)
@ 2022-08-24  6:13   ` Viresh Kumar
  2022-08-24 16:14       ` [Devel] " Jeremy Linton
  -1 siblings, 1 reply; 22+ messages in thread
From: Viresh Kumar @ 2022-08-24  6:13 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: linux-acpi, rafael, lenb, robert.moore, punit.agrawal,
	lukasz.luba, ionela.voinescu, pierre.gondois, linux-kernel,
	devel, linux-pm

On 18-08-22, 16:16, Jeremy Linton wrote:
> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
> +bool cppc_perf_ctrs_in_pcc(void)
> +{
> +	int cpu;
> +
> +	for_each_present_cpu(cpu) {
> +		struct cpc_register_resource *ref_perf_reg;
> +		struct cpc_desc *cpc_desc;
> +
> +		cpc_desc = per_cpu(cpc_desc_ptr, cpu);
> +
> +		if (CPC_IN_PCC(&cpc_desc->cpc_regs[DELIVERED_CTR]) ||
> +		    CPC_IN_PCC(&cpc_desc->cpc_regs[REFERENCE_CTR]) ||
> +		    CPC_IN_PCC(&cpc_desc->cpc_regs[CTR_WRAP_TIME]))
> +			return true;
> +
> +
> +		ref_perf_reg = &cpc_desc->cpc_regs[REFERENCE_PERF];
> +
> +		/*
> +		 * If reference perf register is not supported then we should
> +		 * use the nominal perf value
> +		 */
> +		if (!CPC_SUPPORTED(ref_perf_reg))
> +			ref_perf_reg = &cpc_desc->cpc_regs[NOMINAL_PERF];
> +
> +		if (CPC_IN_PCC(ref_perf_reg))
> +			return true;
> +	}

Add a blank line here please.

> +	return false;
> +}
> +EXPORT_SYMBOL_GPL(cppc_perf_ctrs_in_pcc);
> +
>  /**
>   * cppc_get_perf_ctrs - Read a CPU's performance feedback counters.
>   * @cpunum: CPU from which to read counters.
> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> index 24eaf0ec344d..32fcb0bf74a4 100644
> --- a/drivers/cpufreq/cppc_cpufreq.c
> +++ b/drivers/cpufreq/cppc_cpufreq.c
> @@ -63,7 +63,15 @@ static struct cppc_workaround_oem_info wa_info[] = {
>  
>  static struct cpufreq_driver cppc_cpufreq_driver;
>  
> +static enum {
> +	FIE_UNSET = -1,
> +	FIE_ENABLED,
> +	FIE_DISABLED
> +} fie_disabled = FIE_UNSET;
> +
>  #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
> +module_param(fie_disabled, int, 0444);
> +MODULE_PARM_DESC(fie_disabled, "Disable Frequency Invariance Engine (FIE)");
>  
>  /* Frequency invariance support */
>  struct cppc_freq_invariance {
> @@ -158,7 +166,7 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
>  	struct cppc_freq_invariance *cppc_fi;
>  	int cpu, ret;
>  
> -	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> +	if (fie_disabled)
>  		return;
>  
>  	for_each_cpu(cpu, policy->cpus) {
> @@ -199,7 +207,7 @@ static void cppc_cpufreq_cpu_fie_exit(struct cpufreq_policy *policy)
>  	struct cppc_freq_invariance *cppc_fi;
>  	int cpu;
>  
> -	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> +	if (fie_disabled)
>  		return;
>  
>  	/* policy->cpus will be empty here, use related_cpus instead */
> @@ -229,7 +237,21 @@ static void __init cppc_freq_invariance_init(void)
>  	};
>  	int ret;
>  
> -	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> +	switch (fie_disabled) {
> +	/* honor user request */
> +	case FIE_DISABLED:
> +	case FIE_ENABLED:
> +		break;
> +	case FIE_UNSET:
> +	default:
> +		fie_disabled = FIE_ENABLED;
> +		if (cppc_perf_ctrs_in_pcc()) {
> +			pr_info("FIE not enabled on systems with registers in PCC\n");
> +			fie_disabled = FIE_DISABLED;
> +		}
> +		break;
> +	}

here too.

> +	if (fie_disabled)
>  		return;
>  
>  	kworker_fie = kthread_create_worker(0, "cppc_fie");
> @@ -247,7 +269,7 @@ static void __init cppc_freq_invariance_init(void)
>  
>  static void cppc_freq_invariance_exit(void)
>  {
> -	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> +	if (fie_disabled)
>  		return;
>  
>  	kthread_destroy_worker(kworker_fie);
> @@ -936,6 +958,7 @@ static void cppc_check_hisi_workaround(void)
>  		    wa_info[i].oem_revision == tbl->oem_revision) {
>  			/* Overwrite the get() callback */
>  			cppc_cpufreq_driver.get = hisi_cppc_cpufreq_get_rate;
> +			fie_disabled = FIE_DISABLED;
>  			break;
>  		}
>  	}
> diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
> index f73d357ecdf5..c5614444031f 100644
> --- a/include/acpi/cppc_acpi.h
> +++ b/include/acpi/cppc_acpi.h
> @@ -140,6 +140,7 @@ extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs);
>  extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls);
>  extern int cppc_set_enable(int cpu, bool enable);
>  extern int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps);
> +extern bool cppc_perf_ctrs_in_pcc(void);
>  extern bool acpi_cpc_valid(void);
>  extern bool cppc_allow_fast_switch(void);
>  extern int acpi_get_psd_map(unsigned int cpu, struct cppc_cpudata *cpu_data);
> @@ -173,6 +174,10 @@ static inline int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps)
>  {
>  	return -ENOTSUPP;
>  }
> +static inline bool cppc_perf_ctrs_in_pcc(void)
> +{
> +	return false;
> +}
>  static inline bool acpi_cpc_valid(void)
>  {
>  	return false;

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 2/2] cpufreq: CPPC: Change FIE default
  2022-08-18 21:16   ` [Devel] " Jeremy Linton
  (?)
@ 2022-08-24  6:14   ` Viresh Kumar
  2022-08-24 14:04     ` Lukasz Luba
  -1 siblings, 1 reply; 22+ messages in thread
From: Viresh Kumar @ 2022-08-24  6:14 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: linux-acpi, rafael, lenb, robert.moore, punit.agrawal,
	lukasz.luba, ionela.voinescu, pierre.gondois, linux-kernel,
	devel, linux-pm

On 18-08-22, 16:16, Jeremy Linton wrote:
> FIE is mostly implemented as PCC mailboxes on arm machines.  This was
> enabled by default without any data suggesting that it does anything
> but hurt system performance. Lets change the default to 'n' until
> hardware appears which clearly benefits.
> 
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  drivers/cpufreq/Kconfig.arm | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
> index 954749afb5fe..ad66d8f15db0 100644
> --- a/drivers/cpufreq/Kconfig.arm
> +++ b/drivers/cpufreq/Kconfig.arm
> @@ -22,7 +22,7 @@ config ACPI_CPPC_CPUFREQ
>  config ACPI_CPPC_CPUFREQ_FIE
>  	bool "Frequency Invariance support for CPPC cpufreq driver"
>  	depends on ACPI_CPPC_CPUFREQ && GENERIC_ARCH_TOPOLOGY
> -	default y
> +	default n
>  	help
>  	  This extends frequency invariance support in the CPPC cpufreq driver,
>  	  by using CPPC delivered and reference performance counters.

Why is this required after we have the first patch in ?

-- 
viresh

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/2] ACPI: CPPC: Disable FIE if registers in PCC regions
@ 2022-08-24 13:22         ` Rafael J. Wysocki
  0 siblings, 0 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2022-08-24 13:22 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: ACPI Devel Maling List, Len Brown, Viresh Kumar, Robert Moore,
	Punit Agrawal, Lukasz Luba, Ionela Voinescu, Pierre Gondois,
	Linux Kernel Mailing List,
	open list:ACPI COMPONENT ARCHITECTURE (ACPICA),
	Linux PM

On Tue, Aug 23, 2022 at 8:46 PM Jeremy Linton <jeremy.linton@arm.com> wrote:
>
> Hi,
>
> On 8/23/22 12:10, Rafael J. Wysocki wrote:
> > On Thu, Aug 18, 2022 at 11:24 PM Jeremy Linton <jeremy.linton@arm.com> wrote:
> >>
> >> PCC regions utilize a mailbox to set/retrieve register values used by
> >> the CPPC code. This is fine as long as the operations are
> >> infrequent. With the FIE code enabled though the overhead can range
> >> from 2-11% of system CPU overhead (ex: as measured by top) on Arm
> >> based machines.
> >>
> >> So, before enabling FIE assure none of the registers used by
> >> cppc_get_perf_ctrs() are in the PCC region. Furthermore lets also
> >> enable a module parameter which can also disable it at boot or module
> >> reload.
> >>
> >> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> >> ---
> >>   drivers/acpi/cppc_acpi.c       | 41 ++++++++++++++++++++++++++++++++++
> >>   drivers/cpufreq/cppc_cpufreq.c | 31 +++++++++++++++++++++----
> >>   include/acpi/cppc_acpi.h       |  5 +++++
> >>   3 files changed, 73 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
> >> index 1e15a9f25ae9..c840bf606b30 100644
> >> --- a/drivers/acpi/cppc_acpi.c
> >> +++ b/drivers/acpi/cppc_acpi.c
> >> @@ -1240,6 +1240,47 @@ int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps)
> >>   }
> >>   EXPORT_SYMBOL_GPL(cppc_get_perf_caps);
> >>
> >> +/**
> >> + * cppc_perf_ctrs_in_pcc - Check if any perf counters are in a PCC region.
> >> + *
> >> + * CPPC has flexibility about how counters describing CPU perf are delivered.
> >
> > "CPU performance counters are accessed"
>
> Sure,
>
> >
> >
> >> + * One of the choices is PCC regions, which can have a high access latency. This
> >> + * routine allows callers of cppc_get_perf_ctrs() to know this ahead of time.
> >> + *
> >> + * Return: true if any of the counters are in PCC regions, false otherwise
> >> + */
> >> +bool cppc_perf_ctrs_in_pcc(void)
> >> +{
> >> +       int cpu;
> >> +
> >> +       for_each_present_cpu(cpu) {
> >> +               struct cpc_register_resource *ref_perf_reg;
> >> +               struct cpc_desc *cpc_desc;
> >> +
> >> +               cpc_desc = per_cpu(cpc_desc_ptr, cpu);
> >> +
> >> +               if (CPC_IN_PCC(&cpc_desc->cpc_regs[DELIVERED_CTR]) ||
> >> +                   CPC_IN_PCC(&cpc_desc->cpc_regs[REFERENCE_CTR]) ||
> >> +                   CPC_IN_PCC(&cpc_desc->cpc_regs[CTR_WRAP_TIME]))
> >> +                       return true;
> >> +
> >> +
> >> +               ref_perf_reg = &cpc_desc->cpc_regs[REFERENCE_PERF];
> >> +
> >> +               /*
> >> +                * If reference perf register is not supported then we should
> >> +                * use the nominal perf value
> >> +                */
> >> +               if (!CPC_SUPPORTED(ref_perf_reg))
> >> +                       ref_perf_reg = &cpc_desc->cpc_regs[NOMINAL_PERF];
> >> +
> >> +               if (CPC_IN_PCC(ref_perf_reg))
> >> +                       return true;
> >> +       }
> >> +       return false;
> >> +}
> >> +EXPORT_SYMBOL_GPL(cppc_perf_ctrs_in_pcc);
> >> +
> >>   /**
> >>    * cppc_get_perf_ctrs - Read a CPU's performance feedback counters.
> >>    * @cpunum: CPU from which to read counters.
> >> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> >> index 24eaf0ec344d..32fcb0bf74a4 100644
> >> --- a/drivers/cpufreq/cppc_cpufreq.c
> >> +++ b/drivers/cpufreq/cppc_cpufreq.c
> >> @@ -63,7 +63,15 @@ static struct cppc_workaround_oem_info wa_info[] = {
> >>
> >>   static struct cpufreq_driver cppc_cpufreq_driver;
> >>
> >> +static enum {
> >> +       FIE_UNSET = -1,
> >> +       FIE_ENABLED,
> >> +       FIE_DISABLED
> >> +} fie_disabled = FIE_UNSET;
> >> +
> >>   #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
> >> +module_param(fie_disabled, int, 0444);
> >> +MODULE_PARM_DESC(fie_disabled, "Disable Frequency Invariance Engine (FIE)");
> >>
> >>   /* Frequency invariance support */
> >>   struct cppc_freq_invariance {
> >> @@ -158,7 +166,7 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
> >>          struct cppc_freq_invariance *cppc_fi;
> >>          int cpu, ret;
> >>
> >> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> >> +       if (fie_disabled)
> >>                  return;
> >>
> >>          for_each_cpu(cpu, policy->cpus) {
> >> @@ -199,7 +207,7 @@ static void cppc_cpufreq_cpu_fie_exit(struct cpufreq_policy *policy)
> >>          struct cppc_freq_invariance *cppc_fi;
> >>          int cpu;
> >>
> >> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> >> +       if (fie_disabled)
> >>                  return;
> >>
> >>          /* policy->cpus will be empty here, use related_cpus instead */
> >> @@ -229,7 +237,21 @@ static void __init cppc_freq_invariance_init(void)
> >>          };
> >>          int ret;
> >>
> >> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> >> +       switch (fie_disabled) {
> >> +       /* honor user request */
> >> +       case FIE_DISABLED:
> >> +       case FIE_ENABLED:
> >> +               break;
> >> +       case FIE_UNSET:
> >> +       default:
> >
> > Would be more straightforward to do
> >
> > if (fie_disabled == FIE_UNSET) {
> >
> > here.
>
> Right, but then it wouldn't catch the other billion+ values that are the
> result of not being able to export a limit (AFAIK) on the module
> parameter. I could use an if:

Hmm.

I've missed the module_param() part.

It doesn't even make sense to use enum for the variable type in that case.

Also you can always do

if (fie_disabled < 0) {
...
}

> if !((fie_disabled == FIE_DISABLE) || (fie_disabled == FIE_ENABLED)) {
>
> }
>
>
> if that is preferable. I thought the case with the explict default:
> though made it clearer that it was treating all those other values as unset.
>
> >
> >> +               fie_disabled = FIE_ENABLED;
> >> +               if (cppc_perf_ctrs_in_pcc()) {
> >> +                       pr_info("FIE not enabled on systems with registers in PCC\n");
> >> +                       fie_disabled = FIE_DISABLED;
> >> +               }
> >> +               break;
> >> +       }
> >> +       if (fie_disabled)
> >>                  return;
> >>
> >>          kworker_fie = kthread_create_worker(0, "cppc_fie");
> >> @@ -247,7 +269,7 @@ static void __init cppc_freq_invariance_init(void)
> >>
> >>   static void cppc_freq_invariance_exit(void)
> >>   {
> >> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> >> +       if (fie_disabled)
> >>                  return;
> >>
> >>          kthread_destroy_worker(kworker_fie);
> >> @@ -936,6 +958,7 @@ static void cppc_check_hisi_workaround(void)
> >>                      wa_info[i].oem_revision == tbl->oem_revision) {
> >>                          /* Overwrite the get() callback */
> >>                          cppc_cpufreq_driver.get = hisi_cppc_cpufreq_get_rate;
> >> +                       fie_disabled = FIE_DISABLED;
> >>                          break;
> >>                  }
> >>          }
> >> diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
> >> index f73d357ecdf5..c5614444031f 100644
> >> --- a/include/acpi/cppc_acpi.h
> >> +++ b/include/acpi/cppc_acpi.h
> >> @@ -140,6 +140,7 @@ extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs);
> >>   extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls);
> >>   extern int cppc_set_enable(int cpu, bool enable);
> >>   extern int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps);
> >> +extern bool cppc_perf_ctrs_in_pcc(void);
> >>   extern bool acpi_cpc_valid(void);
> >>   extern bool cppc_allow_fast_switch(void);
> >>   extern int acpi_get_psd_map(unsigned int cpu, struct cppc_cpudata *cpu_data);
> >> @@ -173,6 +174,10 @@ static inline int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps)
> >>   {
> >>          return -ENOTSUPP;
> >>   }
> >> +static inline bool cppc_perf_ctrs_in_pcc(void)
> >> +{
> >> +       return false;
> >> +}
> >>   static inline bool acpi_cpc_valid(void)
> >>   {
> >>          return false;
> >> --
> >
> > Apart from the above it looks fine to me, but I would like to get an
> > ACK from Viresh on the second patch.
> >
> > Thanks!
>
> Thanks for looking at this.
>
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Devel] Re: [PATCH v3 1/2] ACPI: CPPC: Disable FIE if registers in PCC regions
@ 2022-08-24 13:22         ` Rafael J. Wysocki
  0 siblings, 0 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2022-08-24 13:22 UTC (permalink / raw)
  To: devel

[-- Attachment #1: Type: text/plain, Size: 8551 bytes --]

On Tue, Aug 23, 2022 at 8:46 PM Jeremy Linton <jeremy.linton(a)arm.com> wrote:
>
> Hi,
>
> On 8/23/22 12:10, Rafael J. Wysocki wrote:
> > On Thu, Aug 18, 2022 at 11:24 PM Jeremy Linton <jeremy.linton(a)arm.com> wrote:
> >>
> >> PCC regions utilize a mailbox to set/retrieve register values used by
> >> the CPPC code. This is fine as long as the operations are
> >> infrequent. With the FIE code enabled though the overhead can range
> >> from 2-11% of system CPU overhead (ex: as measured by top) on Arm
> >> based machines.
> >>
> >> So, before enabling FIE assure none of the registers used by
> >> cppc_get_perf_ctrs() are in the PCC region. Furthermore lets also
> >> enable a module parameter which can also disable it at boot or module
> >> reload.
> >>
> >> Signed-off-by: Jeremy Linton <jeremy.linton(a)arm.com>
> >> ---
> >>   drivers/acpi/cppc_acpi.c       | 41 ++++++++++++++++++++++++++++++++++
> >>   drivers/cpufreq/cppc_cpufreq.c | 31 +++++++++++++++++++++----
> >>   include/acpi/cppc_acpi.h       |  5 +++++
> >>   3 files changed, 73 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
> >> index 1e15a9f25ae9..c840bf606b30 100644
> >> --- a/drivers/acpi/cppc_acpi.c
> >> +++ b/drivers/acpi/cppc_acpi.c
> >> @@ -1240,6 +1240,47 @@ int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps)
> >>   }
> >>   EXPORT_SYMBOL_GPL(cppc_get_perf_caps);
> >>
> >> +/**
> >> + * cppc_perf_ctrs_in_pcc - Check if any perf counters are in a PCC region.
> >> + *
> >> + * CPPC has flexibility about how counters describing CPU perf are delivered.
> >
> > "CPU performance counters are accessed"
>
> Sure,
>
> >
> >
> >> + * One of the choices is PCC regions, which can have a high access latency. This
> >> + * routine allows callers of cppc_get_perf_ctrs() to know this ahead of time.
> >> + *
> >> + * Return: true if any of the counters are in PCC regions, false otherwise
> >> + */
> >> +bool cppc_perf_ctrs_in_pcc(void)
> >> +{
> >> +       int cpu;
> >> +
> >> +       for_each_present_cpu(cpu) {
> >> +               struct cpc_register_resource *ref_perf_reg;
> >> +               struct cpc_desc *cpc_desc;
> >> +
> >> +               cpc_desc = per_cpu(cpc_desc_ptr, cpu);
> >> +
> >> +               if (CPC_IN_PCC(&cpc_desc->cpc_regs[DELIVERED_CTR]) ||
> >> +                   CPC_IN_PCC(&cpc_desc->cpc_regs[REFERENCE_CTR]) ||
> >> +                   CPC_IN_PCC(&cpc_desc->cpc_regs[CTR_WRAP_TIME]))
> >> +                       return true;
> >> +
> >> +
> >> +               ref_perf_reg = &cpc_desc->cpc_regs[REFERENCE_PERF];
> >> +
> >> +               /*
> >> +                * If reference perf register is not supported then we should
> >> +                * use the nominal perf value
> >> +                */
> >> +               if (!CPC_SUPPORTED(ref_perf_reg))
> >> +                       ref_perf_reg = &cpc_desc->cpc_regs[NOMINAL_PERF];
> >> +
> >> +               if (CPC_IN_PCC(ref_perf_reg))
> >> +                       return true;
> >> +       }
> >> +       return false;
> >> +}
> >> +EXPORT_SYMBOL_GPL(cppc_perf_ctrs_in_pcc);
> >> +
> >>   /**
> >>    * cppc_get_perf_ctrs - Read a CPU's performance feedback counters.
> >>    * @cpunum: CPU from which to read counters.
> >> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> >> index 24eaf0ec344d..32fcb0bf74a4 100644
> >> --- a/drivers/cpufreq/cppc_cpufreq.c
> >> +++ b/drivers/cpufreq/cppc_cpufreq.c
> >> @@ -63,7 +63,15 @@ static struct cppc_workaround_oem_info wa_info[] = {
> >>
> >>   static struct cpufreq_driver cppc_cpufreq_driver;
> >>
> >> +static enum {
> >> +       FIE_UNSET = -1,
> >> +       FIE_ENABLED,
> >> +       FIE_DISABLED
> >> +} fie_disabled = FIE_UNSET;
> >> +
> >>   #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
> >> +module_param(fie_disabled, int, 0444);
> >> +MODULE_PARM_DESC(fie_disabled, "Disable Frequency Invariance Engine (FIE)");
> >>
> >>   /* Frequency invariance support */
> >>   struct cppc_freq_invariance {
> >> @@ -158,7 +166,7 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
> >>          struct cppc_freq_invariance *cppc_fi;
> >>          int cpu, ret;
> >>
> >> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> >> +       if (fie_disabled)
> >>                  return;
> >>
> >>          for_each_cpu(cpu, policy->cpus) {
> >> @@ -199,7 +207,7 @@ static void cppc_cpufreq_cpu_fie_exit(struct cpufreq_policy *policy)
> >>          struct cppc_freq_invariance *cppc_fi;
> >>          int cpu;
> >>
> >> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> >> +       if (fie_disabled)
> >>                  return;
> >>
> >>          /* policy->cpus will be empty here, use related_cpus instead */
> >> @@ -229,7 +237,21 @@ static void __init cppc_freq_invariance_init(void)
> >>          };
> >>          int ret;
> >>
> >> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> >> +       switch (fie_disabled) {
> >> +       /* honor user request */
> >> +       case FIE_DISABLED:
> >> +       case FIE_ENABLED:
> >> +               break;
> >> +       case FIE_UNSET:
> >> +       default:
> >
> > Would be more straightforward to do
> >
> > if (fie_disabled == FIE_UNSET) {
> >
> > here.
>
> Right, but then it wouldn't catch the other billion+ values that are the
> result of not being able to export a limit (AFAIK) on the module
> parameter. I could use an if:

Hmm.

I've missed the module_param() part.

It doesn't even make sense to use enum for the variable type in that case.

Also you can always do

if (fie_disabled < 0) {
...
}

> if !((fie_disabled == FIE_DISABLE) || (fie_disabled == FIE_ENABLED)) {
>
> }
>
>
> if that is preferable. I thought the case with the explict default:
> though made it clearer that it was treating all those other values as unset.
>
> >
> >> +               fie_disabled = FIE_ENABLED;
> >> +               if (cppc_perf_ctrs_in_pcc()) {
> >> +                       pr_info("FIE not enabled on systems with registers in PCC\n");
> >> +                       fie_disabled = FIE_DISABLED;
> >> +               }
> >> +               break;
> >> +       }
> >> +       if (fie_disabled)
> >>                  return;
> >>
> >>          kworker_fie = kthread_create_worker(0, "cppc_fie");
> >> @@ -247,7 +269,7 @@ static void __init cppc_freq_invariance_init(void)
> >>
> >>   static void cppc_freq_invariance_exit(void)
> >>   {
> >> -       if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> >> +       if (fie_disabled)
> >>                  return;
> >>
> >>          kthread_destroy_worker(kworker_fie);
> >> @@ -936,6 +958,7 @@ static void cppc_check_hisi_workaround(void)
> >>                      wa_info[i].oem_revision == tbl->oem_revision) {
> >>                          /* Overwrite the get() callback */
> >>                          cppc_cpufreq_driver.get = hisi_cppc_cpufreq_get_rate;
> >> +                       fie_disabled = FIE_DISABLED;
> >>                          break;
> >>                  }
> >>          }
> >> diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
> >> index f73d357ecdf5..c5614444031f 100644
> >> --- a/include/acpi/cppc_acpi.h
> >> +++ b/include/acpi/cppc_acpi.h
> >> @@ -140,6 +140,7 @@ extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs);
> >>   extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls);
> >>   extern int cppc_set_enable(int cpu, bool enable);
> >>   extern int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps);
> >> +extern bool cppc_perf_ctrs_in_pcc(void);
> >>   extern bool acpi_cpc_valid(void);
> >>   extern bool cppc_allow_fast_switch(void);
> >>   extern int acpi_get_psd_map(unsigned int cpu, struct cppc_cpudata *cpu_data);
> >> @@ -173,6 +174,10 @@ static inline int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps)
> >>   {
> >>          return -ENOTSUPP;
> >>   }
> >> +static inline bool cppc_perf_ctrs_in_pcc(void)
> >> +{
> >> +       return false;
> >> +}
> >>   static inline bool acpi_cpc_valid(void)
> >>   {
> >>          return false;
> >> --
> >
> > Apart from the above it looks fine to me, but I would like to get an
> > ACK from Viresh on the second patch.
> >
> > Thanks!
>
> Thanks for looking at this.
>
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 2/2] cpufreq: CPPC: Change FIE default
  2022-08-24  6:14   ` Viresh Kumar
@ 2022-08-24 14:04     ` Lukasz Luba
  2022-08-30  5:44       ` Viresh Kumar
  0 siblings, 1 reply; 22+ messages in thread
From: Lukasz Luba @ 2022-08-24 14:04 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-acpi, rafael, lenb, robert.moore, punit.agrawal,
	ionela.voinescu, pierre.gondois, linux-kernel, devel, linux-pm,
	Dietmar Eggemann, Morten Rasmussen, Souvik Chakravarty,
	Jeremy Linton

Hi Viresh,

+CC Dietmar, Morten and Souvik

On 8/24/22 07:14, Viresh Kumar wrote:
> On 18-08-22, 16:16, Jeremy Linton wrote:
>> FIE is mostly implemented as PCC mailboxes on arm machines.  This was
>> enabled by default without any data suggesting that it does anything
>> but hurt system performance. Lets change the default to 'n' until
>> hardware appears which clearly benefits.
>>
>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>> ---
>>   drivers/cpufreq/Kconfig.arm | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
>> index 954749afb5fe..ad66d8f15db0 100644
>> --- a/drivers/cpufreq/Kconfig.arm
>> +++ b/drivers/cpufreq/Kconfig.arm
>> @@ -22,7 +22,7 @@ config ACPI_CPPC_CPUFREQ
>>   config ACPI_CPPC_CPUFREQ_FIE
>>   	bool "Frequency Invariance support for CPPC cpufreq driver"
>>   	depends on ACPI_CPPC_CPUFREQ && GENERIC_ARCH_TOPOLOGY
>> -	default y
>> +	default n
>>   	help
>>   	  This extends frequency invariance support in the CPPC cpufreq driver,
>>   	  by using CPPC delivered and reference performance counters.
> 
> Why is this required after we have the first patch in ?
> 

There are a few issues with this ACPI_CPPC_CPUFREQ_FIE solution:
1. The design is very heavy and that kernel thread can be ~512 util
    (that's what we have been told by one of our partners from servers'
    world)
2. The HW & FW design is not suited for this task. Newer HW will just
    have AMU counters (on Arm64) for FIE
3. The patches haven't been tested in terms of performance overhead
    AFAIK. Although, it affects existing Arm64 servers with their
    workloads.
4. AFAIK non of our server partners wasn't complaining about issues with
    old FIE mechanism.

In our team we are not allowed to send code that we cannot prove in many
ways.

I would just not compile this at all (or even revert this feature).
If someone compiled in this by accident - make sure we disable it
after checks like in the patch 1/2. I'll add also some comments
to that patch.

Regards,
Lukasz




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/2] ACPI: CPPC: Disable FIE if registers in PCC regions
  2022-08-18 21:16   ` [Devel] " Jeremy Linton
                     ` (2 preceding siblings ...)
  (?)
@ 2022-08-24 14:41   ` Lukasz Luba
  2022-08-24 16:11       ` [Devel] " Jeremy Linton
  2022-08-30  3:27     ` Viresh Kumar
  -1 siblings, 2 replies; 22+ messages in thread
From: Lukasz Luba @ 2022-08-24 14:41 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: rafael, lenb, viresh.kumar, robert.moore, punit.agrawal,
	ionela.voinescu, pierre.gondois, linux-kernel, devel, linux-pm,
	linux-acpi, Dietmar Eggemann, Morten Rasmussen,
	Souvik Chakravarty

Hi Jeremy,

+CC Dietmar, Morten and Souvik

On 8/18/22 22:16, Jeremy Linton wrote:
> PCC regions utilize a mailbox to set/retrieve register values used by
> the CPPC code. This is fine as long as the operations are
> infrequent. With the FIE code enabled though the overhead can range
> from 2-11% of system CPU overhead (ex: as measured by top) on Arm
> based machines.
> 
> So, before enabling FIE assure none of the registers used by
> cppc_get_perf_ctrs() are in the PCC region. Furthermore lets also
> enable a module parameter which can also disable it at boot or module
> reload.
> 
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>   drivers/acpi/cppc_acpi.c       | 41 ++++++++++++++++++++++++++++++++++
>   drivers/cpufreq/cppc_cpufreq.c | 31 +++++++++++++++++++++----
>   include/acpi/cppc_acpi.h       |  5 +++++
>   3 files changed, 73 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
> index 1e15a9f25ae9..c840bf606b30 100644
> --- a/drivers/acpi/cppc_acpi.c
> +++ b/drivers/acpi/cppc_acpi.c
> @@ -1240,6 +1240,47 @@ int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps)
>   }
>   EXPORT_SYMBOL_GPL(cppc_get_perf_caps);
>   
> +/**
> + * cppc_perf_ctrs_in_pcc - Check if any perf counters are in a PCC region.
> + *
> + * CPPC has flexibility about how counters describing CPU perf are delivered.
> + * One of the choices is PCC regions, which can have a high access latency. This
> + * routine allows callers of cppc_get_perf_ctrs() to know this ahead of time.
> + *
> + * Return: true if any of the counters are in PCC regions, false otherwise
> + */
> +bool cppc_perf_ctrs_in_pcc(void)
> +{
> +	int cpu;
> +
> +	for_each_present_cpu(cpu) {
> +		struct cpc_register_resource *ref_perf_reg;
> +		struct cpc_desc *cpc_desc;
> +
> +		cpc_desc = per_cpu(cpc_desc_ptr, cpu);
> +
> +		if (CPC_IN_PCC(&cpc_desc->cpc_regs[DELIVERED_CTR]) ||
> +		    CPC_IN_PCC(&cpc_desc->cpc_regs[REFERENCE_CTR]) ||
> +		    CPC_IN_PCC(&cpc_desc->cpc_regs[CTR_WRAP_TIME]))
> +			return true;
> +
> +
> +		ref_perf_reg = &cpc_desc->cpc_regs[REFERENCE_PERF];
> +
> +		/*
> +		 * If reference perf register is not supported then we should
> +		 * use the nominal perf value
> +		 */
> +		if (!CPC_SUPPORTED(ref_perf_reg))
> +			ref_perf_reg = &cpc_desc->cpc_regs[NOMINAL_PERF];
> +
> +		if (CPC_IN_PCC(ref_perf_reg))
> +			return true;
> +	}

Do we have a platform which returns false here?

> +	return false;
> +}
> +EXPORT_SYMBOL_GPL(cppc_perf_ctrs_in_pcc);
> +
>   /**
>    * cppc_get_perf_ctrs - Read a CPU's performance feedback counters.
>    * @cpunum: CPU from which to read counters.
> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> index 24eaf0ec344d..32fcb0bf74a4 100644
> --- a/drivers/cpufreq/cppc_cpufreq.c
> +++ b/drivers/cpufreq/cppc_cpufreq.c
> @@ -63,7 +63,15 @@ static struct cppc_workaround_oem_info wa_info[] = {
>   
>   static struct cpufreq_driver cppc_cpufreq_driver;
>   
> +static enum {
> +	FIE_UNSET = -1,
> +	FIE_ENABLED,
> +	FIE_DISABLED
> +} fie_disabled = FIE_UNSET;
> +
>   #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
> +module_param(fie_disabled, int, 0444);
> +MODULE_PARM_DESC(fie_disabled, "Disable Frequency Invariance Engine (FIE)");

Why we need the modules support?
I would drop this, since the fie_disabled would be set properly when
needed. The code would be cleaner (more below).

>   
>   /* Frequency invariance support */
>   struct cppc_freq_invariance {
> @@ -158,7 +166,7 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
>   	struct cppc_freq_invariance *cppc_fi;
>   	int cpu, ret;
>   
> -	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> +	if (fie_disabled)
>   		return;
>   
>   	for_each_cpu(cpu, policy->cpus) {
> @@ -199,7 +207,7 @@ static void cppc_cpufreq_cpu_fie_exit(struct cpufreq_policy *policy)
>   	struct cppc_freq_invariance *cppc_fi;
>   	int cpu;
>   
> -	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> +	if (fie_disabled)
>   		return;
>   
>   	/* policy->cpus will be empty here, use related_cpus instead */
> @@ -229,7 +237,21 @@ static void __init cppc_freq_invariance_init(void)
>   	};
>   	int ret;
>   
> -	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> +	switch (fie_disabled) {
> +	/* honor user request */
> +	case FIE_DISABLED:
> +	case FIE_ENABLED:

This module's over-write doesn't look 'clean'.
Is it OK to allow a user to go with the poor performing
system (likely on many platforms)? Or we assume that there are
platforms which has a bit faster mailboxes and they already
have the FIE issue impacting task's utilization measurements.

It looks like we are not sure about the solution. On one hand
we implement those checks in the cppc_perf_ctrs_in_pcc()
which could set the flag, but on the other hand we allow user
to decide. IMO this creates diversity that we are not able to control.
It creates another tunable knob in the kernel, which then is forgotten
to check.

I still haven't seen information that the old FIE was an issue on those
servers and had impact on task utilization measurements. This should be
a main requirement for this new feature. This would be after we proved
that the utilization problem was due to the FIE and not something else 
(like uArch variation or workload variation).

IMO let's revert the ACPI_CPPC_CPUFREQ_FIE. When we get data that
FIE is an issue on those servers we can come back to this topic.

Regards,
Lukasz

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/2] ACPI: CPPC: Disable FIE if registers in PCC regions
@ 2022-08-24 16:11       ` Jeremy Linton
  0 siblings, 0 replies; 22+ messages in thread
From: Jeremy Linton @ 2022-08-24 16:11 UTC (permalink / raw)
  To: Lukasz Luba
  Cc: rafael, lenb, viresh.kumar, robert.moore, punit.agrawal,
	ionela.voinescu, pierre.gondois, linux-kernel, devel, linux-pm,
	linux-acpi, Dietmar Eggemann, Morten Rasmussen,
	Souvik Chakravarty

Hi,

On 8/24/22 09:41, Lukasz Luba wrote:
> Hi Jeremy,
> 
> +CC Dietmar, Morten and Souvik
> 
> On 8/18/22 22:16, Jeremy Linton wrote:
>> PCC regions utilize a mailbox to set/retrieve register values used by
>> the CPPC code. This is fine as long as the operations are
>> infrequent. With the FIE code enabled though the overhead can range
>> from 2-11% of system CPU overhead (ex: as measured by top) on Arm
>> based machines.
>>
>> So, before enabling FIE assure none of the registers used by
>> cppc_get_perf_ctrs() are in the PCC region. Furthermore lets also
>> enable a module parameter which can also disable it at boot or module
>> reload.
>>
>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>> ---
>>   drivers/acpi/cppc_acpi.c       | 41 ++++++++++++++++++++++++++++++++++
>>   drivers/cpufreq/cppc_cpufreq.c | 31 +++++++++++++++++++++----
>>   include/acpi/cppc_acpi.h       |  5 +++++
>>   3 files changed, 73 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
>> index 1e15a9f25ae9..c840bf606b30 100644
>> --- a/drivers/acpi/cppc_acpi.c
>> +++ b/drivers/acpi/cppc_acpi.c
>> @@ -1240,6 +1240,47 @@ int cppc_get_perf_caps(int cpunum, struct 
>> cppc_perf_caps *perf_caps)
>>   }
>>   EXPORT_SYMBOL_GPL(cppc_get_perf_caps);
>> +/**
>> + * cppc_perf_ctrs_in_pcc - Check if any perf counters are in a PCC 
>> region.
>> + *
>> + * CPPC has flexibility about how counters describing CPU perf are 
>> delivered.
>> + * One of the choices is PCC regions, which can have a high access 
>> latency. This
>> + * routine allows callers of cppc_get_perf_ctrs() to know this ahead 
>> of time.
>> + *
>> + * Return: true if any of the counters are in PCC regions, false 
>> otherwise
>> + */
>> +bool cppc_perf_ctrs_in_pcc(void)
>> +{
>> +    int cpu;
>> +
>> +    for_each_present_cpu(cpu) {
>> +        struct cpc_register_resource *ref_perf_reg;
>> +        struct cpc_desc *cpc_desc;
>> +
>> +        cpc_desc = per_cpu(cpc_desc_ptr, cpu);
>> +
>> +        if (CPC_IN_PCC(&cpc_desc->cpc_regs[DELIVERED_CTR]) ||
>> +            CPC_IN_PCC(&cpc_desc->cpc_regs[REFERENCE_CTR]) ||
>> +            CPC_IN_PCC(&cpc_desc->cpc_regs[CTR_WRAP_TIME]))
>> +            return true;
>> +
>> +
>> +        ref_perf_reg = &cpc_desc->cpc_regs[REFERENCE_PERF];
>> +
>> +        /*
>> +         * If reference perf register is not supported then we should
>> +         * use the nominal perf value
>> +         */
>> +        if (!CPC_SUPPORTED(ref_perf_reg))
>> +            ref_perf_reg = &cpc_desc->cpc_regs[NOMINAL_PERF];
>> +
>> +        if (CPC_IN_PCC(ref_perf_reg))
>> +            return true;
>> +    }
> 
> Do we have a platform which returns false here?

I'm not aware of one, but I don't have access to every bit of HW either.


> 
>> +    return false;
>> +}
>> +EXPORT_SYMBOL_GPL(cppc_perf_ctrs_in_pcc);
>> +
>>   /**
>>    * cppc_get_perf_ctrs - Read a CPU's performance feedback counters.
>>    * @cpunum: CPU from which to read counters.
>> diff --git a/drivers/cpufreq/cppc_cpufreq.c 
>> b/drivers/cpufreq/cppc_cpufreq.c
>> index 24eaf0ec344d..32fcb0bf74a4 100644
>> --- a/drivers/cpufreq/cppc_cpufreq.c
>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>> @@ -63,7 +63,15 @@ static struct cppc_workaround_oem_info wa_info[] = {
>>   static struct cpufreq_driver cppc_cpufreq_driver;
>> +static enum {
>> +    FIE_UNSET = -1,
>> +    FIE_ENABLED,
>> +    FIE_DISABLED
>> +} fie_disabled = FIE_UNSET;
>> +
>>   #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
>> +module_param(fie_disabled, int, 0444);
>> +MODULE_PARM_DESC(fie_disabled, "Disable Frequency Invariance Engine 
>> (FIE)");
> 
> Why we need the modules support?
> I would drop this, since the fie_disabled would be set properly when
> needed. The code would be cleaner (more below).

Well the original version was simpler, but I tend to agree with Ionela 
who proposed this version in a previous review. The module param at this 
point is a debugging/testing statment since it allows the user to force 
FIE on or off independent of the PCC decision. Until we have a clear 
statment about how/when/where this feature is useful, having the ability 
to make the choice dynamically at runtime is quite useful and less 
intrusive than having multiple kernels/modules on the machine with the 
config option flipped, and requiring a reboot.

> 
>>   /* Frequency invariance support */
>>   struct cppc_freq_invariance {
>> @@ -158,7 +166,7 @@ static void cppc_cpufreq_cpu_fie_init(struct 
>> cpufreq_policy *policy)
>>       struct cppc_freq_invariance *cppc_fi;
>>       int cpu, ret;
>> -    if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +    if (fie_disabled)
>>           return;
>>       for_each_cpu(cpu, policy->cpus) {
>> @@ -199,7 +207,7 @@ static void cppc_cpufreq_cpu_fie_exit(struct 
>> cpufreq_policy *policy)
>>       struct cppc_freq_invariance *cppc_fi;
>>       int cpu;
>> -    if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +    if (fie_disabled)
>>           return;
>>       /* policy->cpus will be empty here, use related_cpus instead */
>> @@ -229,7 +237,21 @@ static void __init cppc_freq_invariance_init(void)
>>       };
>>       int ret;
>> -    if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +    switch (fie_disabled) {
>> +    /* honor user request */
>> +    case FIE_DISABLED:
>> +    case FIE_ENABLED:
> 
> This module's over-write doesn't look 'clean'.
> Is it OK to allow a user to go with the poor performing
> system (likely on many platforms)? Or we assume that there are
> platforms which has a bit faster mailboxes and they already
> have the FIE issue impacting task's utilization measurements.

I think with this patch applied we aren't any worse than before, but 
that is based on the fact that I've not seen a machine that has actual 
CPPC hardware registers (rather than mailboxes).

So I think your suggesting that we will then have to revisit the code 
(to maybe avoid all the cppc_fie task/etc overhead) if a machine appears 
with hardware registers. And I tend to sorta agree, but that is what the 
second patch is for :) which will likely be what most distro's end up 
applying on generic kernels.

> 
> It looks like we are not sure about the solution. On one hand
> we implement those checks in the cppc_perf_ctrs_in_pcc()
> which could set the flag, but on the other hand we allow user
> to decide. IMO this creates diversity that we are not able to control.
> It creates another tunable knob in the kernel, which then is forgotten
> to check.

Your average user will never turn this knob, and if they do, its likely 
to solve a problem, or test for performace. The fact that we aren't 100% 
sure of where/when this feature is useful is the argument for making it 
a tunable.


> 
> I still haven't seen information that the old FIE was an issue on those
> servers and had impact on task utilization measurements. This should be
> a main requirement for this new feature. This would be after we proved
> that the utilization problem was due to the FIE and not something else 
> (like uArch variation or workload variation).
> 
> IMO let's revert the ACPI_CPPC_CPUFREQ_FIE. When we get data that
> FIE is an issue on those servers we can come back to this topic.

I don't really have an opinion about this, maybe someone else can 
comment :)

Although, with both of these patches applied we can kick the decision 
down the road and revisit it in a couple years, and maybe have a clearer 
view.





^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Devel] Re: [PATCH v3 1/2] ACPI: CPPC: Disable FIE if registers in PCC regions
@ 2022-08-24 16:11       ` Jeremy Linton
  0 siblings, 0 replies; 22+ messages in thread
From: Jeremy Linton @ 2022-08-24 16:11 UTC (permalink / raw)
  To: devel

[-- Attachment #1: Type: text/plain, Size: 7957 bytes --]

Hi,

On 8/24/22 09:41, Lukasz Luba wrote:
> Hi Jeremy,
> 
> +CC Dietmar, Morten and Souvik
> 
> On 8/18/22 22:16, Jeremy Linton wrote:
>> PCC regions utilize a mailbox to set/retrieve register values used by
>> the CPPC code. This is fine as long as the operations are
>> infrequent. With the FIE code enabled though the overhead can range
>> from 2-11% of system CPU overhead (ex: as measured by top) on Arm
>> based machines.
>>
>> So, before enabling FIE assure none of the registers used by
>> cppc_get_perf_ctrs() are in the PCC region. Furthermore lets also
>> enable a module parameter which can also disable it at boot or module
>> reload.
>>
>> Signed-off-by: Jeremy Linton <jeremy.linton(a)arm.com>
>> ---
>>   drivers/acpi/cppc_acpi.c       | 41 ++++++++++++++++++++++++++++++++++
>>   drivers/cpufreq/cppc_cpufreq.c | 31 +++++++++++++++++++++----
>>   include/acpi/cppc_acpi.h       |  5 +++++
>>   3 files changed, 73 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
>> index 1e15a9f25ae9..c840bf606b30 100644
>> --- a/drivers/acpi/cppc_acpi.c
>> +++ b/drivers/acpi/cppc_acpi.c
>> @@ -1240,6 +1240,47 @@ int cppc_get_perf_caps(int cpunum, struct 
>> cppc_perf_caps *perf_caps)
>>   }
>>   EXPORT_SYMBOL_GPL(cppc_get_perf_caps);
>> +/**
>> + * cppc_perf_ctrs_in_pcc - Check if any perf counters are in a PCC 
>> region.
>> + *
>> + * CPPC has flexibility about how counters describing CPU perf are 
>> delivered.
>> + * One of the choices is PCC regions, which can have a high access 
>> latency. This
>> + * routine allows callers of cppc_get_perf_ctrs() to know this ahead 
>> of time.
>> + *
>> + * Return: true if any of the counters are in PCC regions, false 
>> otherwise
>> + */
>> +bool cppc_perf_ctrs_in_pcc(void)
>> +{
>> +    int cpu;
>> +
>> +    for_each_present_cpu(cpu) {
>> +        struct cpc_register_resource *ref_perf_reg;
>> +        struct cpc_desc *cpc_desc;
>> +
>> +        cpc_desc = per_cpu(cpc_desc_ptr, cpu);
>> +
>> +        if (CPC_IN_PCC(&cpc_desc->cpc_regs[DELIVERED_CTR]) ||
>> +            CPC_IN_PCC(&cpc_desc->cpc_regs[REFERENCE_CTR]) ||
>> +            CPC_IN_PCC(&cpc_desc->cpc_regs[CTR_WRAP_TIME]))
>> +            return true;
>> +
>> +
>> +        ref_perf_reg = &cpc_desc->cpc_regs[REFERENCE_PERF];
>> +
>> +        /*
>> +         * If reference perf register is not supported then we should
>> +         * use the nominal perf value
>> +         */
>> +        if (!CPC_SUPPORTED(ref_perf_reg))
>> +            ref_perf_reg = &cpc_desc->cpc_regs[NOMINAL_PERF];
>> +
>> +        if (CPC_IN_PCC(ref_perf_reg))
>> +            return true;
>> +    }
> 
> Do we have a platform which returns false here?

I'm not aware of one, but I don't have access to every bit of HW either.


> 
>> +    return false;
>> +}
>> +EXPORT_SYMBOL_GPL(cppc_perf_ctrs_in_pcc);
>> +
>>   /**
>>    * cppc_get_perf_ctrs - Read a CPU's performance feedback counters.
>>    * @cpunum: CPU from which to read counters.
>> diff --git a/drivers/cpufreq/cppc_cpufreq.c 
>> b/drivers/cpufreq/cppc_cpufreq.c
>> index 24eaf0ec344d..32fcb0bf74a4 100644
>> --- a/drivers/cpufreq/cppc_cpufreq.c
>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>> @@ -63,7 +63,15 @@ static struct cppc_workaround_oem_info wa_info[] = {
>>   static struct cpufreq_driver cppc_cpufreq_driver;
>> +static enum {
>> +    FIE_UNSET = -1,
>> +    FIE_ENABLED,
>> +    FIE_DISABLED
>> +} fie_disabled = FIE_UNSET;
>> +
>>   #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
>> +module_param(fie_disabled, int, 0444);
>> +MODULE_PARM_DESC(fie_disabled, "Disable Frequency Invariance Engine 
>> (FIE)");
> 
> Why we need the modules support?
> I would drop this, since the fie_disabled would be set properly when
> needed. The code would be cleaner (more below).

Well the original version was simpler, but I tend to agree with Ionela 
who proposed this version in a previous review. The module param at this 
point is a debugging/testing statment since it allows the user to force 
FIE on or off independent of the PCC decision. Until we have a clear 
statment about how/when/where this feature is useful, having the ability 
to make the choice dynamically at runtime is quite useful and less 
intrusive than having multiple kernels/modules on the machine with the 
config option flipped, and requiring a reboot.

> 
>>   /* Frequency invariance support */
>>   struct cppc_freq_invariance {
>> @@ -158,7 +166,7 @@ static void cppc_cpufreq_cpu_fie_init(struct 
>> cpufreq_policy *policy)
>>       struct cppc_freq_invariance *cppc_fi;
>>       int cpu, ret;
>> -    if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +    if (fie_disabled)
>>           return;
>>       for_each_cpu(cpu, policy->cpus) {
>> @@ -199,7 +207,7 @@ static void cppc_cpufreq_cpu_fie_exit(struct 
>> cpufreq_policy *policy)
>>       struct cppc_freq_invariance *cppc_fi;
>>       int cpu;
>> -    if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +    if (fie_disabled)
>>           return;
>>       /* policy->cpus will be empty here, use related_cpus instead */
>> @@ -229,7 +237,21 @@ static void __init cppc_freq_invariance_init(void)
>>       };
>>       int ret;
>> -    if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +    switch (fie_disabled) {
>> +    /* honor user request */
>> +    case FIE_DISABLED:
>> +    case FIE_ENABLED:
> 
> This module's over-write doesn't look 'clean'.
> Is it OK to allow a user to go with the poor performing
> system (likely on many platforms)? Or we assume that there are
> platforms which has a bit faster mailboxes and they already
> have the FIE issue impacting task's utilization measurements.

I think with this patch applied we aren't any worse than before, but 
that is based on the fact that I've not seen a machine that has actual 
CPPC hardware registers (rather than mailboxes).

So I think your suggesting that we will then have to revisit the code 
(to maybe avoid all the cppc_fie task/etc overhead) if a machine appears 
with hardware registers. And I tend to sorta agree, but that is what the 
second patch is for :) which will likely be what most distro's end up 
applying on generic kernels.

> 
> It looks like we are not sure about the solution. On one hand
> we implement those checks in the cppc_perf_ctrs_in_pcc()
> which could set the flag, but on the other hand we allow user
> to decide. IMO this creates diversity that we are not able to control.
> It creates another tunable knob in the kernel, which then is forgotten
> to check.

Your average user will never turn this knob, and if they do, its likely 
to solve a problem, or test for performace. The fact that we aren't 100% 
sure of where/when this feature is useful is the argument for making it 
a tunable.


> 
> I still haven't seen information that the old FIE was an issue on those
> servers and had impact on task utilization measurements. This should be
> a main requirement for this new feature. This would be after we proved
> that the utilization problem was due to the FIE and not something else 
> (like uArch variation or workload variation).
> 
> IMO let's revert the ACPI_CPPC_CPUFREQ_FIE. When we get data that
> FIE is an issue on those servers we can come back to this topic.

I don't really have an opinion about this, maybe someone else can 
comment :)

Although, with both of these patches applied we can kick the decision 
down the road and revisit it in a couple years, and maybe have a clearer 
view.




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/2] ACPI: CPPC: Disable FIE if registers in PCC regions
@ 2022-08-24 16:14       ` Jeremy Linton
  0 siblings, 0 replies; 22+ messages in thread
From: Jeremy Linton @ 2022-08-24 16:14 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-acpi, rafael, lenb, robert.moore, punit.agrawal,
	lukasz.luba, ionela.voinescu, pierre.gondois, linux-kernel,
	devel, linux-pm

Hi,

On 8/24/22 01:13, Viresh Kumar wrote:
> On 18-08-22, 16:16, Jeremy Linton wrote:
>> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
>> +bool cppc_perf_ctrs_in_pcc(void)
>> +{
>> +	int cpu;
>> +
>> +	for_each_present_cpu(cpu) {
>> +		struct cpc_register_resource *ref_perf_reg;
>> +		struct cpc_desc *cpc_desc;
>> +
>> +		cpc_desc = per_cpu(cpc_desc_ptr, cpu);
>> +
>> +		if (CPC_IN_PCC(&cpc_desc->cpc_regs[DELIVERED_CTR]) ||
>> +		    CPC_IN_PCC(&cpc_desc->cpc_regs[REFERENCE_CTR]) ||
>> +		    CPC_IN_PCC(&cpc_desc->cpc_regs[CTR_WRAP_TIME]))
>> +			return true;
>> +
>> +
>> +		ref_perf_reg = &cpc_desc->cpc_regs[REFERENCE_PERF];
>> +
>> +		/*
>> +		 * If reference perf register is not supported then we should
>> +		 * use the nominal perf value
>> +		 */
>> +		if (!CPC_SUPPORTED(ref_perf_reg))
>> +			ref_perf_reg = &cpc_desc->cpc_regs[NOMINAL_PERF];
>> +
>> +		if (CPC_IN_PCC(ref_perf_reg))
>> +			return true;
>> +	}
> 
> Add a blank line here please.

Sure,

> 
>> +	return false;
>> +}
>> +EXPORT_SYMBOL_GPL(cppc_perf_ctrs_in_pcc);
>> +
>>   /**
>>    * cppc_get_perf_ctrs - Read a CPU's performance feedback counters.
>>    * @cpunum: CPU from which to read counters.
>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>> index 24eaf0ec344d..32fcb0bf74a4 100644
>> --- a/drivers/cpufreq/cppc_cpufreq.c
>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>> @@ -63,7 +63,15 @@ static struct cppc_workaround_oem_info wa_info[] = {
>>   
>>   static struct cpufreq_driver cppc_cpufreq_driver;
>>   
>> +static enum {
>> +	FIE_UNSET = -1,
>> +	FIE_ENABLED,
>> +	FIE_DISABLED
>> +} fie_disabled = FIE_UNSET;
>> +
>>   #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
>> +module_param(fie_disabled, int, 0444);
>> +MODULE_PARM_DESC(fie_disabled, "Disable Frequency Invariance Engine (FIE)");
>>   
>>   /* Frequency invariance support */
>>   struct cppc_freq_invariance {
>> @@ -158,7 +166,7 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
>>   	struct cppc_freq_invariance *cppc_fi;
>>   	int cpu, ret;
>>   
>> -	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +	if (fie_disabled)
>>   		return;
>>   
>>   	for_each_cpu(cpu, policy->cpus) {
>> @@ -199,7 +207,7 @@ static void cppc_cpufreq_cpu_fie_exit(struct cpufreq_policy *policy)
>>   	struct cppc_freq_invariance *cppc_fi;
>>   	int cpu;
>>   
>> -	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +	if (fie_disabled)
>>   		return;
>>   
>>   	/* policy->cpus will be empty here, use related_cpus instead */
>> @@ -229,7 +237,21 @@ static void __init cppc_freq_invariance_init(void)
>>   	};
>>   	int ret;
>>   
>> -	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +	switch (fie_disabled) {
>> +	/* honor user request */
>> +	case FIE_DISABLED:
>> +	case FIE_ENABLED:
>> +		break;
>> +	case FIE_UNSET:
>> +	default:
>> +		fie_disabled = FIE_ENABLED;
>> +		if (cppc_perf_ctrs_in_pcc()) {
>> +			pr_info("FIE not enabled on systems with registers in PCC\n");
>> +			fie_disabled = FIE_DISABLED;
>> +		}
>> +		break;
>> +	}
> 
> here too.

Sure,

> 
>> +	if (fie_disabled)
>>   		return;
>>   
>>   	kworker_fie = kthread_create_worker(0, "cppc_fie");
>> @@ -247,7 +269,7 @@ static void __init cppc_freq_invariance_init(void)
>>   
>>   static void cppc_freq_invariance_exit(void)
>>   {
>> -	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +	if (fie_disabled)
>>   		return;
>>   
>>   	kthread_destroy_worker(kworker_fie);
>> @@ -936,6 +958,7 @@ static void cppc_check_hisi_workaround(void)
>>   		    wa_info[i].oem_revision == tbl->oem_revision) {
>>   			/* Overwrite the get() callback */
>>   			cppc_cpufreq_driver.get = hisi_cppc_cpufreq_get_rate;
>> +			fie_disabled = FIE_DISABLED;
>>   			break;
>>   		}
>>   	}
>> diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
>> index f73d357ecdf5..c5614444031f 100644
>> --- a/include/acpi/cppc_acpi.h
>> +++ b/include/acpi/cppc_acpi.h
>> @@ -140,6 +140,7 @@ extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs);
>>   extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls);
>>   extern int cppc_set_enable(int cpu, bool enable);
>>   extern int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps);
>> +extern bool cppc_perf_ctrs_in_pcc(void);
>>   extern bool acpi_cpc_valid(void);
>>   extern bool cppc_allow_fast_switch(void);
>>   extern int acpi_get_psd_map(unsigned int cpu, struct cppc_cpudata *cpu_data);
>> @@ -173,6 +174,10 @@ static inline int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps)
>>   {
>>   	return -ENOTSUPP;
>>   }
>> +static inline bool cppc_perf_ctrs_in_pcc(void)
>> +{
>> +	return false;
>> +}
>>   static inline bool acpi_cpc_valid(void)
>>   {
>>   	return false;
> 
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> 


Thanks for looking at this.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Devel] Re: [PATCH v3 1/2] ACPI: CPPC: Disable FIE if registers in PCC regions
@ 2022-08-24 16:14       ` Jeremy Linton
  0 siblings, 0 replies; 22+ messages in thread
From: Jeremy Linton @ 2022-08-24 16:14 UTC (permalink / raw)
  To: devel

[-- Attachment #1: Type: text/plain, Size: 5029 bytes --]

Hi,

On 8/24/22 01:13, Viresh Kumar wrote:
> On 18-08-22, 16:16, Jeremy Linton wrote:
>> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
>> +bool cppc_perf_ctrs_in_pcc(void)
>> +{
>> +	int cpu;
>> +
>> +	for_each_present_cpu(cpu) {
>> +		struct cpc_register_resource *ref_perf_reg;
>> +		struct cpc_desc *cpc_desc;
>> +
>> +		cpc_desc = per_cpu(cpc_desc_ptr, cpu);
>> +
>> +		if (CPC_IN_PCC(&cpc_desc->cpc_regs[DELIVERED_CTR]) ||
>> +		    CPC_IN_PCC(&cpc_desc->cpc_regs[REFERENCE_CTR]) ||
>> +		    CPC_IN_PCC(&cpc_desc->cpc_regs[CTR_WRAP_TIME]))
>> +			return true;
>> +
>> +
>> +		ref_perf_reg = &cpc_desc->cpc_regs[REFERENCE_PERF];
>> +
>> +		/*
>> +		 * If reference perf register is not supported then we should
>> +		 * use the nominal perf value
>> +		 */
>> +		if (!CPC_SUPPORTED(ref_perf_reg))
>> +			ref_perf_reg = &cpc_desc->cpc_regs[NOMINAL_PERF];
>> +
>> +		if (CPC_IN_PCC(ref_perf_reg))
>> +			return true;
>> +	}
> 
> Add a blank line here please.

Sure,

> 
>> +	return false;
>> +}
>> +EXPORT_SYMBOL_GPL(cppc_perf_ctrs_in_pcc);
>> +
>>   /**
>>    * cppc_get_perf_ctrs - Read a CPU's performance feedback counters.
>>    * @cpunum: CPU from which to read counters.
>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>> index 24eaf0ec344d..32fcb0bf74a4 100644
>> --- a/drivers/cpufreq/cppc_cpufreq.c
>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>> @@ -63,7 +63,15 @@ static struct cppc_workaround_oem_info wa_info[] = {
>>   
>>   static struct cpufreq_driver cppc_cpufreq_driver;
>>   
>> +static enum {
>> +	FIE_UNSET = -1,
>> +	FIE_ENABLED,
>> +	FIE_DISABLED
>> +} fie_disabled = FIE_UNSET;
>> +
>>   #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
>> +module_param(fie_disabled, int, 0444);
>> +MODULE_PARM_DESC(fie_disabled, "Disable Frequency Invariance Engine (FIE)");
>>   
>>   /* Frequency invariance support */
>>   struct cppc_freq_invariance {
>> @@ -158,7 +166,7 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
>>   	struct cppc_freq_invariance *cppc_fi;
>>   	int cpu, ret;
>>   
>> -	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +	if (fie_disabled)
>>   		return;
>>   
>>   	for_each_cpu(cpu, policy->cpus) {
>> @@ -199,7 +207,7 @@ static void cppc_cpufreq_cpu_fie_exit(struct cpufreq_policy *policy)
>>   	struct cppc_freq_invariance *cppc_fi;
>>   	int cpu;
>>   
>> -	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +	if (fie_disabled)
>>   		return;
>>   
>>   	/* policy->cpus will be empty here, use related_cpus instead */
>> @@ -229,7 +237,21 @@ static void __init cppc_freq_invariance_init(void)
>>   	};
>>   	int ret;
>>   
>> -	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +	switch (fie_disabled) {
>> +	/* honor user request */
>> +	case FIE_DISABLED:
>> +	case FIE_ENABLED:
>> +		break;
>> +	case FIE_UNSET:
>> +	default:
>> +		fie_disabled = FIE_ENABLED;
>> +		if (cppc_perf_ctrs_in_pcc()) {
>> +			pr_info("FIE not enabled on systems with registers in PCC\n");
>> +			fie_disabled = FIE_DISABLED;
>> +		}
>> +		break;
>> +	}
> 
> here too.

Sure,

> 
>> +	if (fie_disabled)
>>   		return;
>>   
>>   	kworker_fie = kthread_create_worker(0, "cppc_fie");
>> @@ -247,7 +269,7 @@ static void __init cppc_freq_invariance_init(void)
>>   
>>   static void cppc_freq_invariance_exit(void)
>>   {
>> -	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
>> +	if (fie_disabled)
>>   		return;
>>   
>>   	kthread_destroy_worker(kworker_fie);
>> @@ -936,6 +958,7 @@ static void cppc_check_hisi_workaround(void)
>>   		    wa_info[i].oem_revision == tbl->oem_revision) {
>>   			/* Overwrite the get() callback */
>>   			cppc_cpufreq_driver.get = hisi_cppc_cpufreq_get_rate;
>> +			fie_disabled = FIE_DISABLED;
>>   			break;
>>   		}
>>   	}
>> diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
>> index f73d357ecdf5..c5614444031f 100644
>> --- a/include/acpi/cppc_acpi.h
>> +++ b/include/acpi/cppc_acpi.h
>> @@ -140,6 +140,7 @@ extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs);
>>   extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls);
>>   extern int cppc_set_enable(int cpu, bool enable);
>>   extern int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps);
>> +extern bool cppc_perf_ctrs_in_pcc(void);
>>   extern bool acpi_cpc_valid(void);
>>   extern bool cppc_allow_fast_switch(void);
>>   extern int acpi_get_psd_map(unsigned int cpu, struct cppc_cpudata *cpu_data);
>> @@ -173,6 +174,10 @@ static inline int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps)
>>   {
>>   	return -ENOTSUPP;
>>   }
>> +static inline bool cppc_perf_ctrs_in_pcc(void)
>> +{
>> +	return false;
>> +}
>>   static inline bool acpi_cpc_valid(void)
>>   {
>>   	return false;
> 
> Acked-by: Viresh Kumar <viresh.kumar(a)linaro.org>
> 


Thanks for looking at this.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 1/2] ACPI: CPPC: Disable FIE if registers in PCC regions
  2022-08-24 14:41   ` Lukasz Luba
  2022-08-24 16:11       ` [Devel] " Jeremy Linton
@ 2022-08-30  3:27     ` Viresh Kumar
  1 sibling, 0 replies; 22+ messages in thread
From: Viresh Kumar @ 2022-08-30  3:27 UTC (permalink / raw)
  To: Lukasz Luba, vincent.guittot
  Cc: Jeremy Linton, rafael, lenb, robert.moore, punit.agrawal,
	ionela.voinescu, pierre.gondois, linux-kernel, devel, linux-pm,
	linux-acpi, Dietmar Eggemann, Morten Rasmussen,
	Souvik Chakravarty

+Vincent.

On 24-08-22, 15:41, Lukasz Luba wrote:
> Hi Jeremy,
> 
> +CC Dietmar, Morten and Souvik
> 
> On 8/18/22 22:16, Jeremy Linton wrote:
> > PCC regions utilize a mailbox to set/retrieve register values used by
> > the CPPC code. This is fine as long as the operations are
> > infrequent. With the FIE code enabled though the overhead can range
> > from 2-11% of system CPU overhead (ex: as measured by top) on Arm
> > based machines.
> > 
> > So, before enabling FIE assure none of the registers used by
> > cppc_get_perf_ctrs() are in the PCC region. Furthermore lets also
> > enable a module parameter which can also disable it at boot or module
> > reload.
> > 
> > Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> > ---
> >   drivers/acpi/cppc_acpi.c       | 41 ++++++++++++++++++++++++++++++++++
> >   drivers/cpufreq/cppc_cpufreq.c | 31 +++++++++++++++++++++----
> >   include/acpi/cppc_acpi.h       |  5 +++++
> >   3 files changed, 73 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
> > index 1e15a9f25ae9..c840bf606b30 100644
> > --- a/drivers/acpi/cppc_acpi.c
> > +++ b/drivers/acpi/cppc_acpi.c
> > @@ -1240,6 +1240,47 @@ int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps)
> >   }
> >   EXPORT_SYMBOL_GPL(cppc_get_perf_caps);
> > +/**
> > + * cppc_perf_ctrs_in_pcc - Check if any perf counters are in a PCC region.
> > + *
> > + * CPPC has flexibility about how counters describing CPU perf are delivered.
> > + * One of the choices is PCC regions, which can have a high access latency. This
> > + * routine allows callers of cppc_get_perf_ctrs() to know this ahead of time.
> > + *
> > + * Return: true if any of the counters are in PCC regions, false otherwise
> > + */
> > +bool cppc_perf_ctrs_in_pcc(void)
> > +{
> > +	int cpu;
> > +
> > +	for_each_present_cpu(cpu) {
> > +		struct cpc_register_resource *ref_perf_reg;
> > +		struct cpc_desc *cpc_desc;
> > +
> > +		cpc_desc = per_cpu(cpc_desc_ptr, cpu);
> > +
> > +		if (CPC_IN_PCC(&cpc_desc->cpc_regs[DELIVERED_CTR]) ||
> > +		    CPC_IN_PCC(&cpc_desc->cpc_regs[REFERENCE_CTR]) ||
> > +		    CPC_IN_PCC(&cpc_desc->cpc_regs[CTR_WRAP_TIME]))
> > +			return true;
> > +
> > +
> > +		ref_perf_reg = &cpc_desc->cpc_regs[REFERENCE_PERF];
> > +
> > +		/*
> > +		 * If reference perf register is not supported then we should
> > +		 * use the nominal perf value
> > +		 */
> > +		if (!CPC_SUPPORTED(ref_perf_reg))
> > +			ref_perf_reg = &cpc_desc->cpc_regs[NOMINAL_PERF];
> > +
> > +		if (CPC_IN_PCC(ref_perf_reg))
> > +			return true;
> > +	}
> 
> Do we have a platform which returns false here?
> 
> > +	return false;
> > +}
> > +EXPORT_SYMBOL_GPL(cppc_perf_ctrs_in_pcc);
> > +
> >   /**
> >    * cppc_get_perf_ctrs - Read a CPU's performance feedback counters.
> >    * @cpunum: CPU from which to read counters.
> > diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> > index 24eaf0ec344d..32fcb0bf74a4 100644
> > --- a/drivers/cpufreq/cppc_cpufreq.c
> > +++ b/drivers/cpufreq/cppc_cpufreq.c
> > @@ -63,7 +63,15 @@ static struct cppc_workaround_oem_info wa_info[] = {
> >   static struct cpufreq_driver cppc_cpufreq_driver;
> > +static enum {
> > +	FIE_UNSET = -1,
> > +	FIE_ENABLED,
> > +	FIE_DISABLED
> > +} fie_disabled = FIE_UNSET;
> > +
> >   #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
> > +module_param(fie_disabled, int, 0444);
> > +MODULE_PARM_DESC(fie_disabled, "Disable Frequency Invariance Engine (FIE)");
> 
> Why we need the modules support?
> I would drop this, since the fie_disabled would be set properly when
> needed. The code would be cleaner (more below).
> 
> >   /* Frequency invariance support */
> >   struct cppc_freq_invariance {
> > @@ -158,7 +166,7 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
> >   	struct cppc_freq_invariance *cppc_fi;
> >   	int cpu, ret;
> > -	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> > +	if (fie_disabled)
> >   		return;
> >   	for_each_cpu(cpu, policy->cpus) {
> > @@ -199,7 +207,7 @@ static void cppc_cpufreq_cpu_fie_exit(struct cpufreq_policy *policy)
> >   	struct cppc_freq_invariance *cppc_fi;
> >   	int cpu;
> > -	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> > +	if (fie_disabled)
> >   		return;
> >   	/* policy->cpus will be empty here, use related_cpus instead */
> > @@ -229,7 +237,21 @@ static void __init cppc_freq_invariance_init(void)
> >   	};
> >   	int ret;
> > -	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
> > +	switch (fie_disabled) {
> > +	/* honor user request */
> > +	case FIE_DISABLED:
> > +	case FIE_ENABLED:
> 
> This module's over-write doesn't look 'clean'.
> Is it OK to allow a user to go with the poor performing
> system (likely on many platforms)? Or we assume that there are
> platforms which has a bit faster mailboxes and they already
> have the FIE issue impacting task's utilization measurements.
> 
> It looks like we are not sure about the solution. On one hand
> we implement those checks in the cppc_perf_ctrs_in_pcc()
> which could set the flag, but on the other hand we allow user
> to decide. IMO this creates diversity that we are not able to control.
> It creates another tunable knob in the kernel, which then is forgotten
> to check.
> 
> I still haven't seen information that the old FIE was an issue on those
> servers and had impact on task utilization measurements. This should be
> a main requirement for this new feature. This would be after we proved
> that the utilization problem was due to the FIE and not something else (like
> uArch variation or workload variation).
> 
> IMO let's revert the ACPI_CPPC_CPUFREQ_FIE. When we get data that
> FIE is an issue on those servers we can come back to this topic.
> 
> Regards,
> Lukasz

-- 
viresh

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 2/2] cpufreq: CPPC: Change FIE default
  2022-08-24 14:04     ` Lukasz Luba
@ 2022-08-30  5:44       ` Viresh Kumar
  0 siblings, 0 replies; 22+ messages in thread
From: Viresh Kumar @ 2022-08-30  5:44 UTC (permalink / raw)
  To: Lukasz Luba
  Cc: linux-acpi, rafael, lenb, robert.moore, punit.agrawal,
	ionela.voinescu, pierre.gondois, linux-kernel, devel, linux-pm,
	Dietmar Eggemann, Morten Rasmussen, Souvik Chakravarty,
	Jeremy Linton, vincent.guittot

On 24-08-22, 15:04, Lukasz Luba wrote:
> On 8/24/22 07:14, Viresh Kumar wrote:
> > On 18-08-22, 16:16, Jeremy Linton wrote:
> > > FIE is mostly implemented as PCC mailboxes on arm machines.  This was
> > > enabled by default without any data suggesting that it does anything
> > > but hurt system performance. Lets change the default to 'n' until
> > > hardware appears which clearly benefits.
> > > 
> > > Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> > > ---
> > >   drivers/cpufreq/Kconfig.arm | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
> > > index 954749afb5fe..ad66d8f15db0 100644
> > > --- a/drivers/cpufreq/Kconfig.arm
> > > +++ b/drivers/cpufreq/Kconfig.arm
> > > @@ -22,7 +22,7 @@ config ACPI_CPPC_CPUFREQ
> > >   config ACPI_CPPC_CPUFREQ_FIE
> > >   	bool "Frequency Invariance support for CPPC cpufreq driver"
> > >   	depends on ACPI_CPPC_CPUFREQ && GENERIC_ARCH_TOPOLOGY
> > > -	default y
> > > +	default n
> > >   	help
> > >   	  This extends frequency invariance support in the CPPC cpufreq driver,
> > >   	  by using CPPC delivered and reference performance counters.
> > 
> > Why is this required after we have the first patch in ?
> > 

Well, my question was for the way the patches were added. If we are
disabling the feature based on platform, then there is no need to
disable the feature by default.

> There are a few issues with this ACPI_CPPC_CPUFREQ_FIE solution:
> 1. The design is very heavy and that kernel thread can be ~512 util
>    (that's what we have been told by one of our partners from servers'
>    world)
> 2. The HW & FW design is not suited for this task. Newer HW will just
>    have AMU counters (on Arm64) for FIE
> 3. The patches haven't been tested in terms of performance overhead
>    AFAIK. Although, it affects existing Arm64 servers with their
>    workloads.
> 4. AFAIK non of our server partners wasn't complaining about issues with
>    old FIE mechanism.
> 
> In our team we are not allowed to send code that we cannot prove in many
> ways.
> 
> I would just not compile this at all (or even revert this feature).
> If someone compiled in this by accident - make sure we disable it
> after checks like in the patch 1/2. I'll add also some comments
> to that patch.

If we don't really want the feature, which is open for discussion
(added Vincent to have a look as well), then we better mark it BROKEN
in Kconfig.

-- 
viresh

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2022-08-30  5:45 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-18 21:16 [PATCH v3 0/2] Disable FIE on machines with slow counters Jeremy Linton
2022-08-18 21:16 ` [Devel] " Jeremy Linton
2022-08-18 21:16 ` [PATCH v3 1/2] ACPI: CPPC: Disable FIE if registers in PCC regions Jeremy Linton
2022-08-18 21:16   ` [Devel] " Jeremy Linton
2022-08-23 17:10   ` Rafael J. Wysocki
2022-08-23 17:10     ` [Devel] " Rafael J. Wysocki
2022-08-23 18:46     ` Jeremy Linton
2022-08-23 18:46       ` [Devel] " Jeremy Linton
2022-08-24 13:22       ` Rafael J. Wysocki
2022-08-24 13:22         ` [Devel] " Rafael J. Wysocki
2022-08-24  6:13   ` Viresh Kumar
2022-08-24 16:14     ` Jeremy Linton
2022-08-24 16:14       ` [Devel] " Jeremy Linton
2022-08-24 14:41   ` Lukasz Luba
2022-08-24 16:11     ` Jeremy Linton
2022-08-24 16:11       ` [Devel] " Jeremy Linton
2022-08-30  3:27     ` Viresh Kumar
2022-08-18 21:16 ` [PATCH v3 2/2] cpufreq: CPPC: Change FIE default Jeremy Linton
2022-08-18 21:16   ` [Devel] " Jeremy Linton
2022-08-24  6:14   ` Viresh Kumar
2022-08-24 14:04     ` Lukasz Luba
2022-08-30  5:44       ` Viresh Kumar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.