From: Sumit Gupta <sumitg@nvidia.com>
To: <viresh.kumar@linaro.org>, <rafael@kernel.org>,
	<ionela.voinescu@arm.com>, <mark.rutland@arm.com>,
	<sudeep.holla@arm.com>, <lpieralisi@kernel.org>,
	<catalin.marinas@arm.com>, <will@kernel.org>
Cc: <linux-pm@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-kernel@vger.kernel.org>, <linux-tegra@vger.kernel.org>,
	<treding@nvidia.com>, <jonathanh@nvidia.com>, <vsethi@nvidia.com>,
	<sdonthineni@nvidia.com>, <sanjayc@nvidia.com>,
	<ksitaraman@nvidia.com>, <bbasu@nvidia.com>, <sumitg@nvidia.com>
Subject: [Patch 6/6] cpufreq: CPPC: use wq to read amu counters on target cpu
Date: Tue, 18 Apr 2023 17:04:59 +0530
Message-ID: <20230418113459.12860-7-sumitg@nvidia.com>
In-Reply-To: <20230418113459.12860-1-sumitg@nvidia.com>

ARM cores which implement the Activity Monitor Unit (AMU) use
Functional Fixed Hardware (FFH) to map AMU counters to the
Delivered_Counter and Reference_Counter registers. Each sysreg is
read separately with an smp_call_function_single() call, so four
IPIs are used in total, one per register read. As a result, the
AMU's core counter and constant counter can be sampled at
inconsistent time intervals if an IPI is handled late. This
sometimes results in an unstable frequency value from the
"cpuinfo_cur_freq" node.

To fix this, queue work on the target CPU to read all counters
synchronously in sequence. This removes the inter-IPI latency and
makes sure that both counters are sampled within a close time
interval.

Without this change, we observed that the CPU frequency
reconstructed from the AMU counters sometimes deviates by ~25%
because the counters are read at non-deterministic times.

For now, the change is kept specific to Tegra241. It can be applied
to other SoCs with AMU if the same issue is observed.

Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
---
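To illustrate the failure mode described in the commit message, here
is a minimal userspace sketch of the arithmetic. It is not driver
code: the names ctr_sample, sample_at() and est_khz(), the 2 GHz core
clock and the 1 GHz reference clock are all hypothetical and chosen
only to make the numbers easy to follow. The estimate mirrors the
shape of cppc_perf_from_fbctrs() (reference * delta_delivered /
delta_reference), but works directly in kHz rather than in CPPC
performance units.

```c
#include <stdint.h>

/* Hypothetical snapshot of the two feedback counters. */
struct ctr_sample {
	uint64_t delivered;	/* core cycle counter, ticks at the CPU clock */
	uint64_t reference;	/* constant counter, ticks at a fixed clock */
};

/*
 * Model one sample taken at time t_ns. The delivered counter is read
 * skew_ns later than the reference counter, which stands in for a
 * late IPI between the two register reads.
 * cycles = kHz * ns / 1e6.
 */
static struct ctr_sample sample_at(uint64_t t_ns, uint64_t skew_ns,
				   uint64_t cpu_khz, uint64_t ref_khz)
{
	struct ctr_sample s;

	s.reference = t_ns * ref_khz / 1000000;
	s.delivered = (t_ns + skew_ns) * cpu_khz / 1000000;
	return s;
}

/* Estimate the CPU frequency in kHz from two samples. */
static uint64_t est_khz(struct ctr_sample t0, struct ctr_sample t1,
			uint64_t ref_khz)
{
	uint64_t delta_delivered = t1.delivered - t0.delivered;
	uint64_t delta_reference = t1.reference - t0.reference;

	if (!delta_reference)
		return 0;	/* avoid dividing by zero on a stalled counter */

	return ref_khz * delta_delivered / delta_reference;
}
```

Under these assumed clocks and a 25 us window, two skew-free samples
give back 2000000 kHz. If the delivered counter of the second sample
is read 6.25 us late, the same call returns 2500000 kHz, a 25% error,
which is the order of deviation reported above. Reading all counters
back to back on the target CPU keeps that skew small.
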
 drivers/cpufreq/cppc_cpufreq.c | 53 +++++++++++++++++++++++++++-------
 1 file changed, 43 insertions(+), 10 deletions(-)

diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 5e6a132a525e..52b93ac6225e 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -46,6 +46,8 @@ static bool boost_supported;
 /* default 2usec delay between sampling */
 static unsigned int sampling_delay_us = 2;
 
+static bool get_rate_use_wq;
+
 static void cppc_check_hisi_workaround(void);
 static void cppc_nvidia_workaround(void);
 
@@ -99,6 +101,12 @@ struct cppc_freq_invariance {
 static DEFINE_PER_CPU(struct cppc_freq_invariance, cppc_freq_inv);
 static struct kthread_worker *kworker_fie;
 
+struct feedback_ctrs {
+	u32 cpu;
+	struct cppc_perf_fb_ctrs fb_ctrs_t0;
+	struct cppc_perf_fb_ctrs fb_ctrs_t1;
+};
+
 static unsigned int hisi_cppc_cpufreq_get_rate(unsigned int cpu);
 static int cppc_perf_from_fbctrs(struct cppc_cpudata *cpu_data,
 				 struct cppc_perf_fb_ctrs *fb_ctrs_t0,
@@ -851,28 +859,44 @@ static int cppc_perf_from_fbctrs(struct cppc_cpudata *cpu_data,
 	return (reference_perf * delta_delivered) / delta_reference;
 }
 
+static int cppc_get_perf_ctrs_sync(void *fb_ctrs)
+{
+	struct feedback_ctrs *ctrs = fb_ctrs;
+	int ret;
+
+	ret = cppc_get_perf_ctrs(ctrs->cpu, &(ctrs->fb_ctrs_t0));
+	if (ret)
+		return ret;
+
+	udelay(sampling_delay_us);
+
+	ret = cppc_get_perf_ctrs(ctrs->cpu, &(ctrs->fb_ctrs_t1));
+	if (ret)
+		return ret;
+
+	return ret;
+}
+
 static unsigned int cppc_cpufreq_get_rate(unsigned int cpu)
 {
-	struct cppc_perf_fb_ctrs fb_ctrs_t0 = {0}, fb_ctrs_t1 = {0};
 	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
 	struct cppc_cpudata *cpu_data = policy->driver_data;
+	struct feedback_ctrs fb_ctrs = {0};
 	u64 delivered_perf;
 	int ret;
 
 	cpufreq_cpu_put(policy);
+	fb_ctrs.cpu = cpu;
 
-	ret = cppc_get_perf_ctrs(cpu, &fb_ctrs_t0);
-	if (ret)
-		return ret;
-
-	udelay(sampling_delay_us);
-
-	ret = cppc_get_perf_ctrs(cpu, &fb_ctrs_t1);
+	if (get_rate_use_wq)
+		ret = smp_call_on_cpu(cpu, cppc_get_perf_ctrs_sync, &fb_ctrs, false);
+	else
+		ret = cppc_get_perf_ctrs_sync(&fb_ctrs);
 	if (ret)
 		return ret;
 
-	delivered_perf = cppc_perf_from_fbctrs(cpu_data, &fb_ctrs_t0,
-					       &fb_ctrs_t1);
+	delivered_perf = cppc_perf_from_fbctrs(cpu_data, &(fb_ctrs.fb_ctrs_t0),
+					       &(fb_ctrs.fb_ctrs_t1));
 
 	return cppc_cpufreq_perf_to_khz(cpu_data, delivered_perf);
 }
@@ -953,7 +977,16 @@ static unsigned int hisi_cppc_cpufreq_get_rate(unsigned int cpu)
 
 static void cppc_nvidia_workaround(void)
 {
+	int cpu;
+
 	sampling_delay_us = 25;
+
+#ifdef CONFIG_ARM64_AMU_EXTN
+	cpu = get_cpu_with_amu_feat();
+
+	if (cpu < nr_cpu_ids)
+		get_rate_use_wq = true;
+#endif
 }
 
 static void cppc_check_hisi_workaround(void)
-- 
2.17.1

