All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/4] cpufreq: qcom-hw: Fixes for cpu hotplug support
@ 2022-03-09 22:39 Dmitry Baryshkov
  2022-03-09 22:39 ` [PATCH v2 1/4] cpufreq: qcom-hw: drop affinity hint before freeing the IRQ Dmitry Baryshkov
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Dmitry Baryshkov @ 2022-03-09 22:39 UTC (permalink / raw)
  To: Andy Gross, Bjorn Andersson, Rafael J. Wysocki, Viresh Kumar
  Cc: linux-arm-msm, linux-pm, Thara Gopinath

This patchseries aims at fixing and improving CPU hotplug support on
Qualcomm platforms. First three patches are the fixes for the LMH
support in the cpufreq driver. The last patch adds support for
lightweight offline() and online() callbacks which are used instead of
exit() and init() each time the CPU is put offline or back online.

Changes since v1:
- Update commit message for the first patch to describe why dropping
  affinity hint is required (before calling free_irq()),
- Fixed commit message for the second patch to include messages
  generated using the mainline kernel w/o additional patches,
- Changed third patch to use dev_warn_ratelimited(),
- Reworked last patch to move request_irq/free_irq to online()/offline()
  to make sure that the IRQ isn't left enabled after the CPU has been
  put offline.

Dmitry Baryshkov (4):
  cpufreq: qcom-hw: drop affinity hint before freeing the IRQ
  cpufreq: qcom-hw: fix the race between LMH worker and cpuhp
  cpufreq: qcom-hw: fix the opp entries refcounting
  cpufreq: qcom-hw: provide online/offline operations

 drivers/cpufreq/qcom-cpufreq-hw.c | 78 +++++++++++++++++++++++++------
 1 file changed, 64 insertions(+), 14 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2 1/4] cpufreq: qcom-hw: drop affinity hint before freeing the IRQ
  2022-03-09 22:39 [PATCH v2 0/4] cpufreq: qcom-hw: Fixes for cpu hotplug support Dmitry Baryshkov
@ 2022-03-09 22:39 ` Dmitry Baryshkov
  2022-03-11  8:03   ` Vladimir Zapolskiy
                     ` (2 more replies)
  2022-03-09 22:39 ` [PATCH v2 2/4] cpufreq: qcom-hw: fix the race between LMH worker and cpuhp Dmitry Baryshkov
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 16+ messages in thread
From: Dmitry Baryshkov @ 2022-03-09 22:39 UTC (permalink / raw)
  To: Andy Gross, Bjorn Andersson, Rafael J. Wysocki, Viresh Kumar
  Cc: linux-arm-msm, linux-pm, Thara Gopinath

Drop affinity hint before freeing the throttling IRQ to fix the
following trace. One is not allowed to call free_irq() with an affinity
hint in place (which was set by qcom_cpufreq_hw_lmh_init()).

[  185.114773] ------------[ cut here ]------------
[  185.119517] WARNING: CPU: 7 PID: 43 at kernel/irq/manage.c:1887 free_irq+0x3a4/0x3dc
[  185.127474] Modules linked in:
[  185.130618] CPU: 7 PID: 43 Comm: cpuhp/7 Tainted: G S      W         5.17.0-rc6-00386-g67382a5b705d-dirty #690
[  185.147125] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  185.154269] pc : free_irq+0x3a4/0x3dc
[  185.158031] lr : free_irq+0x33c/0x3dc
[  185.161792] sp : ffff80000841bc90
[  185.165195] x29: ffff80000841bc90 x28: ffffa6edc5c3d000 x27: ffff6d93729e5908
[  185.172515] x26: 0000000000000000 x25: ffff6d910109fc00 x24: ffff6d91011490e0
[  185.179838] x23: ffff6d9101149218 x22: 0000000000000080 x21: 0000000000000000
[  185.187163] x20: ffff6d9101149000 x19: ffff6d910ab61500 x18: ffffffffffffffff
[  185.194487] x17: 2e35202020202020 x16: 2020202020202020 x15: ffff80008841b9a7
[  185.201805] x14: 00000000000003c9 x13: 0000000000000001 x12: 0000000000000040
[  185.209135] x11: ffff6d91005aab58 x10: ffff6d91005aab5a x9 : ffffc6a5ad1c5408
[  185.216455] x8 : ffff6d91005adb88 x7 : 0000000000000000 x6 : ffffc6a5ab5a91f4
[  185.223776] x5 : 0000000000000000 x4 : ffff6d91011490a8 x3 : ffffc6a5ad266108
[  185.231098] x2 : 0000000013033204 x1 : ffff6d9101149000 x0 : ffff6d910a9cc000
[  185.238421] Call trace:
[  185.240932]  free_irq+0x3a4/0x3dc
[  185.244334]  qcom_cpufreq_hw_cpu_exit+0x78/0xcc
[  185.248985]  cpufreq_offline.isra.0+0x228/0x270
[  185.253639]  cpuhp_cpufreq_offline+0x10/0x20
[  185.258027]  cpuhp_invoke_callback+0x16c/0x2b0
[  185.262592]  cpuhp_thread_fun+0x190/0x250
[  185.266710]  smpboot_thread_fn+0x12c/0x230
[  185.270914]  kthread+0xfc/0x100
[  185.274145]  ret_from_fork+0x10/0x20
[  185.277820] irq event stamp: 212
[  185.281136] hardirqs last  enabled at (211): [<ffffc6a5ac57973c>] _raw_spin_unlock_irqrestore+0x8c/0xa0
[  185.290775] hardirqs last disabled at (212): [<ffffc6a5ac572100>] __schedule+0x710/0xa10
[  185.299081] softirqs last  enabled at (0): [<ffffc6a5ab50f7b0>] copy_process+0x7d0/0x1a14
[  185.307475] softirqs last disabled at (0): [<0000000000000000>] 0x0

Fixes: 275157b367f4 ("cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
---
 drivers/cpufreq/qcom-cpufreq-hw.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
index 618e436018c0..44d46e52baea 100644
--- a/drivers/cpufreq/qcom-cpufreq-hw.c
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c
@@ -427,6 +427,7 @@ static void qcom_cpufreq_hw_lmh_exit(struct qcom_cpufreq_data *data)
 	mutex_unlock(&data->throttle_lock);
 
 	cancel_delayed_work_sync(&data->throttle_work);
+	irq_set_affinity_hint(data->throttle_irq, NULL);
 	free_irq(data->throttle_irq, data);
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 2/4] cpufreq: qcom-hw: fix the race between LMH worker and cpuhp
  2022-03-09 22:39 [PATCH v2 0/4] cpufreq: qcom-hw: Fixes for cpu hotplug support Dmitry Baryshkov
  2022-03-09 22:39 ` [PATCH v2 1/4] cpufreq: qcom-hw: drop affinity hint before freeing the IRQ Dmitry Baryshkov
@ 2022-03-09 22:39 ` Dmitry Baryshkov
  2022-03-17 23:10   ` Vladimir Zapolskiy
  2022-03-24 16:00   ` Bjorn Andersson
  2022-03-09 22:39 ` [PATCH v2 3/4] cpufreq: qcom-hw: fix the opp entries refcounting Dmitry Baryshkov
  2022-03-09 22:39 ` [PATCH v2 4/4] cpufreq: qcom-hw: provide online/offline operations Dmitry Baryshkov
  3 siblings, 2 replies; 16+ messages in thread
From: Dmitry Baryshkov @ 2022-03-09 22:39 UTC (permalink / raw)
  To: Andy Gross, Bjorn Andersson, Rafael J. Wysocki, Viresh Kumar
  Cc: linux-arm-msm, linux-pm, Thara Gopinath

qcom_lmh_dcvs_poll() can be running when the cpu is being put offline.
This results in the following warnings and an oops. The driver would
disable the worker, but it happens closer to the end of
cpufreq_offline(). Change the locking in the qcom_lmh_dcvs_poll(), so
that the worker can not run in parallel with cpufreq_offline() call.

[   55.650435] (NULL device *): dev_pm_opp_find_freq_floor: Invalid argument freq=00000000709ccbf9
[   55.659420] (NULL device *): Can't find the OPP for throttling: -EINVAL!
[   55.666329] Unable to handle kernel paging request at virtual address ffffadfba4bb6d81
[   55.674491] Mem abort info:
[   55.677363]   ESR = 0x96000004
[   55.680527]   EC = 0x25: DABT (current EL), IL = 32 bits
[   55.686001]   SET = 0, FnV = 0
[   55.689164]   EA = 0, S1PTW = 0
[   55.692418]   FSC = 0x04: level 0 translation fault
[   55.697449] Data abort info:
[   55.700426]   ISV = 0, ISS = 0x00000004
[   55.704383]   CM = 0, WnR = 0
[   55.707455] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000a98e9000
[   55.714354] [ffffadfba4bb6d81] pgd=0000000000000000, p4d=0000000000000000
[   55.721388] Internal error: Oops: 96000004 [#1] SMP
[   55.726397] Modules linked in:
[   55.729542] CPU: 7 PID: 162 Comm: kworker/7:1H Tainted: G S      W         5.17.0-rc6-00100-g04890a1d9672 #724
[   55.746066] Workqueue: events_highpri qcom_lmh_dcvs_poll
[   55.751527] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   55.758669] pc : cpufreq_cpu_get_raw+0x20/0x44
[   55.763233] lr : qcom_cpufreq_hw_get+0x10/0x64
[   55.767799] sp : ffff800009983d10
[   55.771207] x29: ffff800009983d10 x28: ffffaa13a4f2b000 x27: ffff7b31329f9305
[   55.778530] x26: ffffaa13a4f30af8 x25: ffffaa13a4f4e4c8 x24: ffff7b2ec2eda000
[   55.785851] x23: ffffffffffffffea x22: ffff7b2ec2e9fc18 x21: ffff7b2ec2e9fc00
[   55.793170] x20: 0000000000000100 x19: ffff7b2ec2e9fcc0 x18: ffffffffffffffff
[   55.800490] x17: 726620746e656d75 x16: 6772612064696c61 x15: ffff8000899839c7
[   55.807812] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[   55.815140] x11: ffff7b2ec2e9fc80 x10: ffffaa13a59a1a70 x9 : 0000000000000000
[   55.822468] x8 : ffff7b2eca6917c0 x7 : ffffaa13a528b000 x6 : 0000000000000001
[   55.829788] x5 : 0000000000040000 x4 : 000000000000024f x3 : 0000000000000000
[   55.837116] x2 : 0000000000000100 x1 : ffffaa13a4bb6d80 x0 : 000003e800000001
[   55.844439] Call trace:
[   55.846951]  cpufreq_cpu_get_raw+0x20/0x44
[   55.851155]  qcom_lmh_dcvs_poll+0x104/0x160
[   55.855452]  process_one_work+0x288/0x69c
[   55.859574]  worker_thread+0x74/0x470
[   55.863336]  kthread+0xfc/0x100
[   55.866568]  ret_from_fork+0x10/0x20
[   55.870246] Code: b00065c1 91360021 d503233f f8625800 (f8616800)
[   55.876501] ---[ end trace 0000000000000000 ]---

Fixes: 275157b367f4 ("cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
---
 drivers/cpufreq/qcom-cpufreq-hw.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
index 44d46e52baea..7c1bb002e1c3 100644
--- a/drivers/cpufreq/qcom-cpufreq-hw.c
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c
@@ -296,6 +296,23 @@ static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
 	struct dev_pm_opp *opp;
 	unsigned int freq;
 
+	/*
+	 * Synchronize against CPU going offline.
+	 * cpufreq_offline() will get the write lock on policy->rwsem.
+	 */
+retry:
+	if (unlikely(!down_read_trylock(&policy->rwsem))) {
+		mutex_lock(&data->throttle_lock);
+		if (data->cancel_throttle) {
+			mutex_unlock(&data->throttle_lock);
+			return;
+		}
+
+		mutex_unlock(&data->throttle_lock);
+
+		schedule();
+		goto retry;
+	}
 	/*
 	 * Get the h/w throttled frequency, normalize it using the
 	 * registered opp table and use it to calculate thermal pressure.
@@ -314,9 +331,10 @@ static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
 
 	/*
 	 * In the unlikely case policy is unregistered do not enable
-	 * polling or h/w interrupt
+	 * polling or h/w interrupt.
+	 * If we are here, we have the policy->rwsem read lock,
+	 * cancel_throttle can be toggled only with the write lock.
 	 */
-	mutex_lock(&data->throttle_lock);
 	if (data->cancel_throttle)
 		goto out;
 
@@ -331,7 +349,7 @@ static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
 				 msecs_to_jiffies(10));
 
 out:
-	mutex_unlock(&data->throttle_lock);
+	up_read(&policy->rwsem);
 }
 
 static void qcom_lmh_dcvs_poll(struct work_struct *work)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 3/4] cpufreq: qcom-hw: fix the opp entries refcounting
  2022-03-09 22:39 [PATCH v2 0/4] cpufreq: qcom-hw: Fixes for cpu hotplug support Dmitry Baryshkov
  2022-03-09 22:39 ` [PATCH v2 1/4] cpufreq: qcom-hw: drop affinity hint before freeing the IRQ Dmitry Baryshkov
  2022-03-09 22:39 ` [PATCH v2 2/4] cpufreq: qcom-hw: fix the race between LMH worker and cpuhp Dmitry Baryshkov
@ 2022-03-09 22:39 ` Dmitry Baryshkov
  2022-03-17 23:04   ` Vladimir Zapolskiy
  2022-03-24 16:06   ` Bjorn Andersson
  2022-03-09 22:39 ` [PATCH v2 4/4] cpufreq: qcom-hw: provide online/offline operations Dmitry Baryshkov
  3 siblings, 2 replies; 16+ messages in thread
From: Dmitry Baryshkov @ 2022-03-09 22:39 UTC (permalink / raw)
  To: Andy Gross, Bjorn Andersson, Rafael J. Wysocki, Viresh Kumar
  Cc: linux-arm-msm, linux-pm, Thara Gopinath

The qcom_lmh_dcvs_notify() will get the dev_pm_opp instance for
throttling, but will not put it, ending up with leaking a reference
count and the following backtrace when putting the CPU offline.

Correctly put the reference count of the returned opp instance.

[   84.418025] ------------[ cut here ]------------
[   84.422770] WARNING: CPU: 7 PID: 43 at drivers/opp/core.c:1396 _opp_table_kref_release+0x188/0x190
[   84.431966] Modules linked in:
[   84.435106] CPU: 7 PID: 43 Comm: cpuhp/7 Tainted: G S                5.17.0-rc6-00388-g7cf3c0d89c44-dirty #721
[   84.451631] pstate: 82400005 (Nzcv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
[   84.458781] pc : _opp_table_kref_release+0x188/0x190
[   84.463878] lr : _opp_table_kref_release+0x78/0x190
[   84.468885] sp : ffff80000841bc70
[   84.472294] x29: ffff80000841bc70 x28: ffff6664afe3d000 x27: ffff1db6729e5908
[   84.479621] x26: 0000000000000000 x25: 0000000000000000 x24: ffff1db6729e58e0
[   84.486946] x23: ffff8000080a5000 x22: ffff1db40aad80e0 x21: ffff1db4002fec80
[   84.494277] x20: ffff1db40aad8000 x19: ffffb751c3186300 x18: ffffffffffffffff
[   84.501603] x17: 5300326563697665 x16: 645f676e696c6f6f x15: 00001186c1df5448
[   84.508928] x14: 00000000000002e9 x13: 0000000000000000 x12: 0000000000000000
[   84.516256] x11: ffffb751c3186368 x10: ffffb751c39a2a70 x9 : 0000000000000000
[   84.523585] x8 : ffff1db4008edf00 x7 : ffffb751c328c000 x6 : 0000000000000001
[   84.530916] x5 : 0000000000040000 x4 : 0000000000000001 x3 : ffff1db4008edf00
[   84.538247] x2 : 0000000000000000 x1 : ffff1db400aa6100 x0 : ffff1db40aad80d0
[   84.545579] Call trace:
[   84.548101]  _opp_table_kref_release+0x188/0x190
[   84.552842]  dev_pm_opp_remove_all_dynamic+0x8c/0xc0
[   84.557949]  qcom_cpufreq_hw_cpu_exit+0x30/0xdc
[   84.562608]  cpufreq_offline.isra.0+0x1b4/0x1d8
[   84.567270]  cpuhp_cpufreq_offline+0x10/0x6c
[   84.571663]  cpuhp_invoke_callback+0x16c/0x2b0
[   84.576231]  cpuhp_thread_fun+0x190/0x250
[   84.580353]  smpboot_thread_fn+0x12c/0x230
[   84.584568]  kthread+0xfc/0x100
[   84.587810]  ret_from_fork+0x10/0x20
[   84.591490] irq event stamp: 3482
[   84.594901] hardirqs last  enabled at (3481): [<ffffb751c13c3db0>] call_rcu+0x39c/0x50c
[   84.603119] hardirqs last disabled at (3482): [<ffffb751c236b518>] el1_dbg+0x24/0x8c
[   84.611074] softirqs last  enabled at (310): [<ffffb751c1290410>] _stext+0x410/0x588
[   84.619028] softirqs last disabled at (305): [<ffffb751c131bf68>] __irq_exit_rcu+0x158/0x174
[   84.627691] ---[ end trace 0000000000000000 ]---

Fixes: 275157b367f4 ("cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
---
 drivers/cpufreq/qcom-cpufreq-hw.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
index 7c1bb002e1c3..fe638e141003 100644
--- a/drivers/cpufreq/qcom-cpufreq-hw.c
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c
@@ -322,12 +322,18 @@ static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
 
 	opp = dev_pm_opp_find_freq_floor(dev, &freq_hz);
 	if (IS_ERR(opp) && PTR_ERR(opp) == -ERANGE)
-		dev_pm_opp_find_freq_ceil(dev, &freq_hz);
+		opp = dev_pm_opp_find_freq_ceil(dev, &freq_hz);
 
-	throttled_freq = freq_hz / HZ_PER_KHZ;
+	if (IS_ERR(opp)) {
+		dev_warn_ratelimited(dev, "Can't find the OPP for throttling: %pe!\n", opp);
+	} else {
+		throttled_freq = freq_hz / HZ_PER_KHZ;
+
+		/* Update thermal pressure (the boost frequencies are accepted) */
+		arch_update_thermal_pressure(policy->related_cpus, throttled_freq);
 
-	/* Update thermal pressure (the boost frequencies are accepted) */
-	arch_update_thermal_pressure(policy->related_cpus, throttled_freq);
+		dev_pm_opp_put(opp);
+	}
 
 	/*
 	 * In the unlikely case policy is unregistered do not enable
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 4/4] cpufreq: qcom-hw: provide online/offline operations
  2022-03-09 22:39 [PATCH v2 0/4] cpufreq: qcom-hw: Fixes for cpu hotplug support Dmitry Baryshkov
                   ` (2 preceding siblings ...)
  2022-03-09 22:39 ` [PATCH v2 3/4] cpufreq: qcom-hw: fix the opp entries refcounting Dmitry Baryshkov
@ 2022-03-09 22:39 ` Dmitry Baryshkov
  2022-03-17 23:05   ` Vladimir Zapolskiy
  2022-03-24 16:08   ` Bjorn Andersson
  3 siblings, 2 replies; 16+ messages in thread
From: Dmitry Baryshkov @ 2022-03-09 22:39 UTC (permalink / raw)
  To: Andy Gross, Bjorn Andersson, Rafael J. Wysocki, Viresh Kumar
  Cc: linux-arm-msm, linux-pm, Thara Gopinath

Provide lightweight online and offline operations. This saves us from
parsing all the resources each time the CPU is put online.

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
---
 drivers/cpufreq/qcom-cpufreq-hw.c | 39 +++++++++++++++++++++++++------
 1 file changed, 32 insertions(+), 7 deletions(-)

diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
index fe638e141003..d38b1552ec13 100644
--- a/drivers/cpufreq/qcom-cpufreq-hw.c
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c
@@ -403,11 +403,12 @@ static const struct of_device_id qcom_cpufreq_hw_match[] = {
 };
 MODULE_DEVICE_TABLE(of, qcom_cpufreq_hw_match);
 
+static int qcom_cpufreq_hw_lmh_online(struct cpufreq_policy *policy);
+
 static int qcom_cpufreq_hw_lmh_init(struct cpufreq_policy *policy, int index)
 {
 	struct qcom_cpufreq_data *data = policy->driver_data;
 	struct platform_device *pdev = cpufreq_get_driver_data();
-	int ret;
 
 	/*
 	 * Look for LMh interrupt. If no interrupt line is specified /
@@ -419,12 +420,21 @@ static int qcom_cpufreq_hw_lmh_init(struct cpufreq_policy *policy, int index)
 	if (data->throttle_irq < 0)
 		return data->throttle_irq;
 
-	data->cancel_throttle = false;
-	data->policy = policy;
-
 	mutex_init(&data->throttle_lock);
 	INIT_DEFERRABLE_WORK(&data->throttle_work, qcom_lmh_dcvs_poll);
 
+	return qcom_cpufreq_hw_lmh_online(policy);
+}
+
+static int qcom_cpufreq_hw_lmh_online(struct cpufreq_policy *policy)
+{
+	struct qcom_cpufreq_data *data = policy->driver_data;
+	struct platform_device *pdev = cpufreq_get_driver_data();
+	int ret;
+
+	data->cancel_throttle = false;
+	data->policy = policy;
+
 	snprintf(data->irq_name, sizeof(data->irq_name), "dcvsh-irq-%u", policy->cpu);
 	ret = request_threaded_irq(data->throttle_irq, NULL, qcom_lmh_dcvs_handle_irq,
 				   IRQF_ONESHOT | IRQF_NO_AUTOEN, data->irq_name, data);
@@ -441,10 +451,12 @@ static int qcom_cpufreq_hw_lmh_init(struct cpufreq_policy *policy, int index)
 	return 0;
 }
 
-static void qcom_cpufreq_hw_lmh_exit(struct qcom_cpufreq_data *data)
+static int qcom_cpufreq_hw_lmh_offline(struct cpufreq_policy *policy)
 {
+	struct qcom_cpufreq_data *data = policy->driver_data;
+
 	if (data->throttle_irq <= 0)
-		return;
+		return 0;
 
 	mutex_lock(&data->throttle_lock);
 	data->cancel_throttle = true;
@@ -453,6 +465,8 @@ static void qcom_cpufreq_hw_lmh_exit(struct qcom_cpufreq_data *data)
 	cancel_delayed_work_sync(&data->throttle_work);
 	irq_set_affinity_hint(data->throttle_irq, NULL);
 	free_irq(data->throttle_irq, data);
+
+	return 0;
 }
 
 static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
@@ -567,6 +581,16 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
 	return ret;
 }
 
+static int qcom_cpufreq_hw_cpu_online(struct cpufreq_policy *policy)
+{
+	return qcom_cpufreq_hw_lmh_online(policy);
+}
+
+static int qcom_cpufreq_hw_cpu_offline(struct cpufreq_policy *policy)
+{
+	return qcom_cpufreq_hw_lmh_offline(policy);
+}
+
 static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy)
 {
 	struct device *cpu_dev = get_cpu_device(policy->cpu);
@@ -576,7 +600,6 @@ static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy)
 
 	dev_pm_opp_remove_all_dynamic(cpu_dev);
 	dev_pm_opp_of_cpumask_remove_table(policy->related_cpus);
-	qcom_cpufreq_hw_lmh_exit(data);
 	kfree(policy->freq_table);
 	kfree(data);
 	iounmap(base);
@@ -608,6 +631,8 @@ static struct cpufreq_driver cpufreq_qcom_hw_driver = {
 	.get		= qcom_cpufreq_hw_get,
 	.init		= qcom_cpufreq_hw_cpu_init,
 	.exit		= qcom_cpufreq_hw_cpu_exit,
+	.online		= qcom_cpufreq_hw_cpu_online,
+	.offline	= qcom_cpufreq_hw_cpu_offline,
 	.register_em	= cpufreq_register_em_with_opp,
 	.fast_switch    = qcom_cpufreq_hw_fast_switch,
 	.name		= "qcom-cpufreq-hw",
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/4] cpufreq: qcom-hw: drop affinity hint before freeing the IRQ
  2022-03-09 22:39 ` [PATCH v2 1/4] cpufreq: qcom-hw: drop affinity hint before freeing the IRQ Dmitry Baryshkov
@ 2022-03-11  8:03   ` Vladimir Zapolskiy
  2022-03-17 23:03   ` Vladimir Zapolskiy
  2022-03-24 16:05   ` Bjorn Andersson
  2 siblings, 0 replies; 16+ messages in thread
From: Vladimir Zapolskiy @ 2022-03-11  8:03 UTC (permalink / raw)
  To: Dmitry Baryshkov, Andy Gross, Bjorn Andersson, Rafael J. Wysocki,
	Viresh Kumar
  Cc: linux-arm-msm, linux-pm, Thara Gopinath

Hi Dmitry,

On 3/10/22 12:39 AM, Dmitry Baryshkov wrote:
> Drop affinity hint before freeing the throttling IRQ to fix the
> following trace. One is not allowed to call free_irq() with an affinity
> hint in place (which was set by qcom_cpufreq_hw_lmh_init()).
> 
> [  185.114773] ------------[ cut here ]------------
> [  185.119517] WARNING: CPU: 7 PID: 43 at kernel/irq/manage.c:1887 free_irq+0x3a4/0x3dc
> [  185.127474] Modules linked in:
> [  185.130618] CPU: 7 PID: 43 Comm: cpuhp/7 Tainted: G S      W         5.17.0-rc6-00386-g67382a5b705d-dirty #690
> [  185.147125] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [  185.154269] pc : free_irq+0x3a4/0x3dc
> [  185.158031] lr : free_irq+0x33c/0x3dc
> [  185.161792] sp : ffff80000841bc90
> [  185.165195] x29: ffff80000841bc90 x28: ffffa6edc5c3d000 x27: ffff6d93729e5908
> [  185.172515] x26: 0000000000000000 x25: ffff6d910109fc00 x24: ffff6d91011490e0
> [  185.179838] x23: ffff6d9101149218 x22: 0000000000000080 x21: 0000000000000000
> [  185.187163] x20: ffff6d9101149000 x19: ffff6d910ab61500 x18: ffffffffffffffff
> [  185.194487] x17: 2e35202020202020 x16: 2020202020202020 x15: ffff80008841b9a7
> [  185.201805] x14: 00000000000003c9 x13: 0000000000000001 x12: 0000000000000040
> [  185.209135] x11: ffff6d91005aab58 x10: ffff6d91005aab5a x9 : ffffc6a5ad1c5408
> [  185.216455] x8 : ffff6d91005adb88 x7 : 0000000000000000 x6 : ffffc6a5ab5a91f4
> [  185.223776] x5 : 0000000000000000 x4 : ffff6d91011490a8 x3 : ffffc6a5ad266108
> [  185.231098] x2 : 0000000013033204 x1 : ffff6d9101149000 x0 : ffff6d910a9cc000
> [  185.238421] Call trace:
> [  185.240932]  free_irq+0x3a4/0x3dc
> [  185.244334]  qcom_cpufreq_hw_cpu_exit+0x78/0xcc
> [  185.248985]  cpufreq_offline.isra.0+0x228/0x270
> [  185.253639]  cpuhp_cpufreq_offline+0x10/0x20
> [  185.258027]  cpuhp_invoke_callback+0x16c/0x2b0
> [  185.262592]  cpuhp_thread_fun+0x190/0x250
> [  185.266710]  smpboot_thread_fn+0x12c/0x230
> [  185.270914]  kthread+0xfc/0x100
> [  185.274145]  ret_from_fork+0x10/0x20
> [  185.277820] irq event stamp: 212
> [  185.281136] hardirqs last  enabled at (211): [<ffffc6a5ac57973c>] _raw_spin_unlock_irqrestore+0x8c/0xa0
> [  185.290775] hardirqs last disabled at (212): [<ffffc6a5ac572100>] __schedule+0x710/0xa10
> [  185.299081] softirqs last  enabled at (0): [<ffffc6a5ab50f7b0>] copy_process+0x7d0/0x1a14
> [  185.307475] softirqs last disabled at (0): [<0000000000000000>] 0x0
> 
> Fixes: 275157b367f4 ("cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support")

I suppose the fixed commit is 3ed6dfbd3bb98 ("cpufreq: qcom-hw: Set CPU affinity of dcvsh interrupts")

Otherwise looks correct.

> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> ---
>   drivers/cpufreq/qcom-cpufreq-hw.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> index 618e436018c0..44d46e52baea 100644
> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> @@ -427,6 +427,7 @@ static void qcom_cpufreq_hw_lmh_exit(struct qcom_cpufreq_data *data)
>   	mutex_unlock(&data->throttle_lock);
>   
>   	cancel_delayed_work_sync(&data->throttle_work);
> +	irq_set_affinity_hint(data->throttle_irq, NULL);
>   	free_irq(data->throttle_irq, data);
>   }
>   
> 

--
Best wishes,
Vladimir

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/4] cpufreq: qcom-hw: drop affinity hint before freeing the IRQ
  2022-03-09 22:39 ` [PATCH v2 1/4] cpufreq: qcom-hw: drop affinity hint before freeing the IRQ Dmitry Baryshkov
  2022-03-11  8:03   ` Vladimir Zapolskiy
@ 2022-03-17 23:03   ` Vladimir Zapolskiy
  2022-03-24 16:05   ` Bjorn Andersson
  2 siblings, 0 replies; 16+ messages in thread
From: Vladimir Zapolskiy @ 2022-03-17 23:03 UTC (permalink / raw)
  To: Dmitry Baryshkov, Andy Gross, Bjorn Andersson, Rafael J. Wysocki,
	Viresh Kumar
  Cc: linux-arm-msm, linux-pm, Thara Gopinath

Hi Dmitry,

On 3/10/22 12:39 AM, Dmitry Baryshkov wrote:
> Drop affinity hint before freeing the throttling IRQ to fix the
> following trace. One is not allowed to call free_irq() with an affinity
> hint in place (which was set by qcom_cpufreq_hw_lmh_init()).
> 
> [  185.114773] ------------[ cut here ]------------
> [  185.119517] WARNING: CPU: 7 PID: 43 at kernel/irq/manage.c:1887 free_irq+0x3a4/0x3dc
> [  185.127474] Modules linked in:
> [  185.130618] CPU: 7 PID: 43 Comm: cpuhp/7 Tainted: G S      W         5.17.0-rc6-00386-g67382a5b705d-dirty #690
> [  185.147125] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [  185.154269] pc : free_irq+0x3a4/0x3dc
> [  185.158031] lr : free_irq+0x33c/0x3dc
> [  185.161792] sp : ffff80000841bc90
> [  185.165195] x29: ffff80000841bc90 x28: ffffa6edc5c3d000 x27: ffff6d93729e5908
> [  185.172515] x26: 0000000000000000 x25: ffff6d910109fc00 x24: ffff6d91011490e0
> [  185.179838] x23: ffff6d9101149218 x22: 0000000000000080 x21: 0000000000000000
> [  185.187163] x20: ffff6d9101149000 x19: ffff6d910ab61500 x18: ffffffffffffffff
> [  185.194487] x17: 2e35202020202020 x16: 2020202020202020 x15: ffff80008841b9a7
> [  185.201805] x14: 00000000000003c9 x13: 0000000000000001 x12: 0000000000000040
> [  185.209135] x11: ffff6d91005aab58 x10: ffff6d91005aab5a x9 : ffffc6a5ad1c5408
> [  185.216455] x8 : ffff6d91005adb88 x7 : 0000000000000000 x6 : ffffc6a5ab5a91f4
> [  185.223776] x5 : 0000000000000000 x4 : ffff6d91011490a8 x3 : ffffc6a5ad266108
> [  185.231098] x2 : 0000000013033204 x1 : ffff6d9101149000 x0 : ffff6d910a9cc000
> [  185.238421] Call trace:
> [  185.240932]  free_irq+0x3a4/0x3dc
> [  185.244334]  qcom_cpufreq_hw_cpu_exit+0x78/0xcc
> [  185.248985]  cpufreq_offline.isra.0+0x228/0x270
> [  185.253639]  cpuhp_cpufreq_offline+0x10/0x20
> [  185.258027]  cpuhp_invoke_callback+0x16c/0x2b0
> [  185.262592]  cpuhp_thread_fun+0x190/0x250
> [  185.266710]  smpboot_thread_fn+0x12c/0x230
> [  185.270914]  kthread+0xfc/0x100
> [  185.274145]  ret_from_fork+0x10/0x20
> [  185.277820] irq event stamp: 212
> [  185.281136] hardirqs last  enabled at (211): [<ffffc6a5ac57973c>] _raw_spin_unlock_irqrestore+0x8c/0xa0
> [  185.290775] hardirqs last disabled at (212): [<ffffc6a5ac572100>] __schedule+0x710/0xa10
> [  185.299081] softirqs last  enabled at (0): [<ffffc6a5ab50f7b0>] copy_process+0x7d0/0x1a14
> [  185.307475] softirqs last disabled at (0): [<0000000000000000>] 0x0
> 
> Fixes: 275157b367f4 ("cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support")
> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> ---
>   drivers/cpufreq/qcom-cpufreq-hw.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> index 618e436018c0..44d46e52baea 100644
> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> @@ -427,6 +427,7 @@ static void qcom_cpufreq_hw_lmh_exit(struct qcom_cpufreq_data *data)
>   	mutex_unlock(&data->throttle_lock);
>   
>   	cancel_delayed_work_sync(&data->throttle_work);
> +	irq_set_affinity_hint(data->throttle_irq, NULL);
>   	free_irq(data->throttle_irq, data);
>   }
>   
> 

Tested-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Reviewed-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>

--
Best wishes,
Vladimir

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 3/4] cpufreq: qcom-hw: fix the opp entries refcounting
  2022-03-09 22:39 ` [PATCH v2 3/4] cpufreq: qcom-hw: fix the opp entries refcounting Dmitry Baryshkov
@ 2022-03-17 23:04   ` Vladimir Zapolskiy
  2022-03-24 16:06   ` Bjorn Andersson
  1 sibling, 0 replies; 16+ messages in thread
From: Vladimir Zapolskiy @ 2022-03-17 23:04 UTC (permalink / raw)
  To: Dmitry Baryshkov, Andy Gross, Bjorn Andersson, Rafael J. Wysocki,
	Viresh Kumar
  Cc: linux-arm-msm, linux-pm, Thara Gopinath

Hi Dmitry,

On 3/10/22 12:39 AM, Dmitry Baryshkov wrote:
> The qcom_lmh_dcvs_notify() will get the dev_pm_opp instance for
> throttling, but will not put it, ending up with leaking a reference
> count and the following backtrace when putting the CPU offline.
> 
> Correctly put the reference count of the returned opp instance.
> 
> [   84.418025] ------------[ cut here ]------------
> [   84.422770] WARNING: CPU: 7 PID: 43 at drivers/opp/core.c:1396 _opp_table_kref_release+0x188/0x190
> [   84.431966] Modules linked in:
> [   84.435106] CPU: 7 PID: 43 Comm: cpuhp/7 Tainted: G S                5.17.0-rc6-00388-g7cf3c0d89c44-dirty #721
> [   84.451631] pstate: 82400005 (Nzcv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
> [   84.458781] pc : _opp_table_kref_release+0x188/0x190
> [   84.463878] lr : _opp_table_kref_release+0x78/0x190
> [   84.468885] sp : ffff80000841bc70
> [   84.472294] x29: ffff80000841bc70 x28: ffff6664afe3d000 x27: ffff1db6729e5908
> [   84.479621] x26: 0000000000000000 x25: 0000000000000000 x24: ffff1db6729e58e0
> [   84.486946] x23: ffff8000080a5000 x22: ffff1db40aad80e0 x21: ffff1db4002fec80
> [   84.494277] x20: ffff1db40aad8000 x19: ffffb751c3186300 x18: ffffffffffffffff
> [   84.501603] x17: 5300326563697665 x16: 645f676e696c6f6f x15: 00001186c1df5448
> [   84.508928] x14: 00000000000002e9 x13: 0000000000000000 x12: 0000000000000000
> [   84.516256] x11: ffffb751c3186368 x10: ffffb751c39a2a70 x9 : 0000000000000000
> [   84.523585] x8 : ffff1db4008edf00 x7 : ffffb751c328c000 x6 : 0000000000000001
> [   84.530916] x5 : 0000000000040000 x4 : 0000000000000001 x3 : ffff1db4008edf00
> [   84.538247] x2 : 0000000000000000 x1 : ffff1db400aa6100 x0 : ffff1db40aad80d0
> [   84.545579] Call trace:
> [   84.548101]  _opp_table_kref_release+0x188/0x190
> [   84.552842]  dev_pm_opp_remove_all_dynamic+0x8c/0xc0
> [   84.557949]  qcom_cpufreq_hw_cpu_exit+0x30/0xdc
> [   84.562608]  cpufreq_offline.isra.0+0x1b4/0x1d8
> [   84.567270]  cpuhp_cpufreq_offline+0x10/0x6c
> [   84.571663]  cpuhp_invoke_callback+0x16c/0x2b0
> [   84.576231]  cpuhp_thread_fun+0x190/0x250
> [   84.580353]  smpboot_thread_fn+0x12c/0x230
> [   84.584568]  kthread+0xfc/0x100
> [   84.587810]  ret_from_fork+0x10/0x20
> [   84.591490] irq event stamp: 3482
> [   84.594901] hardirqs last  enabled at (3481): [<ffffb751c13c3db0>] call_rcu+0x39c/0x50c
> [   84.603119] hardirqs last disabled at (3482): [<ffffb751c236b518>] el1_dbg+0x24/0x8c
> [   84.611074] softirqs last  enabled at (310): [<ffffb751c1290410>] _stext+0x410/0x588
> [   84.619028] softirqs last disabled at (305): [<ffffb751c131bf68>] __irq_exit_rcu+0x158/0x174
> [   84.627691] ---[ end trace 0000000000000000 ]---
> 
> Fixes: 275157b367f4 ("cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support")
> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> ---
>   drivers/cpufreq/qcom-cpufreq-hw.c | 14 ++++++++++----
>   1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> index 7c1bb002e1c3..fe638e141003 100644
> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> @@ -322,12 +322,18 @@ static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
>   
>   	opp = dev_pm_opp_find_freq_floor(dev, &freq_hz);
>   	if (IS_ERR(opp) && PTR_ERR(opp) == -ERANGE)
> -		dev_pm_opp_find_freq_ceil(dev, &freq_hz);
> +		opp = dev_pm_opp_find_freq_ceil(dev, &freq_hz);
>   
> -	throttled_freq = freq_hz / HZ_PER_KHZ;
> +	if (IS_ERR(opp)) {
> +		dev_warn_ratelimited(dev, "Can't find the OPP for throttling: %pe!\n", opp);
> +	} else {
> +		throttled_freq = freq_hz / HZ_PER_KHZ;
> +
> +		/* Update thermal pressure (the boost frequencies are accepted) */
> +		arch_update_thermal_pressure(policy->related_cpus, throttled_freq);
>   
> -	/* Update thermal pressure (the boost frequencies are accepted) */
> -	arch_update_thermal_pressure(policy->related_cpus, throttled_freq);
> +		dev_pm_opp_put(opp);
> +	}
>   
>   	/*
>   	 * In the unlikely case policy is unregistered do not enable
> 

Tested-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Reviewed-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>

--
Best wishes,
Vladimir

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 4/4] cpufreq: qcom-hw: provide online/offline operations
  2022-03-09 22:39 ` [PATCH v2 4/4] cpufreq: qcom-hw: provide online/offline operations Dmitry Baryshkov
@ 2022-03-17 23:05   ` Vladimir Zapolskiy
  2022-03-24 16:08   ` Bjorn Andersson
  1 sibling, 0 replies; 16+ messages in thread
From: Vladimir Zapolskiy @ 2022-03-17 23:05 UTC (permalink / raw)
  To: Dmitry Baryshkov, Andy Gross, Bjorn Andersson, Rafael J. Wysocki,
	Viresh Kumar
  Cc: linux-arm-msm, linux-pm, Thara Gopinath

Hi Dmitry,

On 3/10/22 12:39 AM, Dmitry Baryshkov wrote:
> Provide lightweight online and offline operations. This saves us from
> parsing all the resources each time the CPU is put online.
> 
> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> ---
>   drivers/cpufreq/qcom-cpufreq-hw.c | 39 +++++++++++++++++++++++++------
>   1 file changed, 32 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> index fe638e141003..d38b1552ec13 100644
> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> @@ -403,11 +403,12 @@ static const struct of_device_id qcom_cpufreq_hw_match[] = {
>   };
>   MODULE_DEVICE_TABLE(of, qcom_cpufreq_hw_match);
>   
> +static int qcom_cpufreq_hw_lmh_online(struct cpufreq_policy *policy);
> +
>   static int qcom_cpufreq_hw_lmh_init(struct cpufreq_policy *policy, int index)
>   {
>   	struct qcom_cpufreq_data *data = policy->driver_data;
>   	struct platform_device *pdev = cpufreq_get_driver_data();
> -	int ret;
>   
>   	/*
>   	 * Look for LMh interrupt. If no interrupt line is specified /
> @@ -419,12 +420,21 @@ static int qcom_cpufreq_hw_lmh_init(struct cpufreq_policy *policy, int index)
>   	if (data->throttle_irq < 0)
>   		return data->throttle_irq;
>   
> -	data->cancel_throttle = false;
> -	data->policy = policy;
> -
>   	mutex_init(&data->throttle_lock);
>   	INIT_DEFERRABLE_WORK(&data->throttle_work, qcom_lmh_dcvs_poll);
>   
> +	return qcom_cpufreq_hw_lmh_online(policy);
> +}
> +
> +static int qcom_cpufreq_hw_lmh_online(struct cpufreq_policy *policy)
> +{
> +	struct qcom_cpufreq_data *data = policy->driver_data;
> +	struct platform_device *pdev = cpufreq_get_driver_data();
> +	int ret;
> +
> +	data->cancel_throttle = false;
> +	data->policy = policy;
> +
>   	snprintf(data->irq_name, sizeof(data->irq_name), "dcvsh-irq-%u", policy->cpu);
>   	ret = request_threaded_irq(data->throttle_irq, NULL, qcom_lmh_dcvs_handle_irq,
>   				   IRQF_ONESHOT | IRQF_NO_AUTOEN, data->irq_name, data);
> @@ -441,10 +451,12 @@ static int qcom_cpufreq_hw_lmh_init(struct cpufreq_policy *policy, int index)
>   	return 0;
>   }
>   
> -static void qcom_cpufreq_hw_lmh_exit(struct qcom_cpufreq_data *data)
> +static int qcom_cpufreq_hw_lmh_offline(struct cpufreq_policy *policy)
>   {
> +	struct qcom_cpufreq_data *data = policy->driver_data;
> +
>   	if (data->throttle_irq <= 0)
> -		return;
> +		return 0;
>   
>   	mutex_lock(&data->throttle_lock);
>   	data->cancel_throttle = true;
> @@ -453,6 +465,8 @@ static void qcom_cpufreq_hw_lmh_exit(struct qcom_cpufreq_data *data)
>   	cancel_delayed_work_sync(&data->throttle_work);
>   	irq_set_affinity_hint(data->throttle_irq, NULL);
>   	free_irq(data->throttle_irq, data);
> +
> +	return 0;
>   }
>   
>   static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
> @@ -567,6 +581,16 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
>   	return ret;
>   }
>   
> +static int qcom_cpufreq_hw_cpu_online(struct cpufreq_policy *policy)
> +{
> +	return qcom_cpufreq_hw_lmh_online(policy);
> +}
> +
> +static int qcom_cpufreq_hw_cpu_offline(struct cpufreq_policy *policy)
> +{
> +	return qcom_cpufreq_hw_lmh_offline(policy);
> +}
> +
>   static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy)
>   {
>   	struct device *cpu_dev = get_cpu_device(policy->cpu);
> @@ -576,7 +600,6 @@ static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy)
>   
>   	dev_pm_opp_remove_all_dynamic(cpu_dev);
>   	dev_pm_opp_of_cpumask_remove_table(policy->related_cpus);
> -	qcom_cpufreq_hw_lmh_exit(data);
>   	kfree(policy->freq_table);
>   	kfree(data);
>   	iounmap(base);
> @@ -608,6 +631,8 @@ static struct cpufreq_driver cpufreq_qcom_hw_driver = {
>   	.get		= qcom_cpufreq_hw_get,
>   	.init		= qcom_cpufreq_hw_cpu_init,
>   	.exit		= qcom_cpufreq_hw_cpu_exit,
> +	.online		= qcom_cpufreq_hw_cpu_online,
> +	.offline	= qcom_cpufreq_hw_cpu_offline,
>   	.register_em	= cpufreq_register_em_with_opp,
>   	.fast_switch    = qcom_cpufreq_hw_fast_switch,
>   	.name		= "qcom-cpufreq-hw",
> 

Tested-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Reviewed-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>

--
Best wishes,
Vladimir

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/4] cpufreq: qcom-hw: fix the race between LMH worker and cpuhp
  2022-03-09 22:39 ` [PATCH v2 2/4] cpufreq: qcom-hw: fix the race between LMH worker and cpuhp Dmitry Baryshkov
@ 2022-03-17 23:10   ` Vladimir Zapolskiy
  2022-03-17 23:21     ` Dmitry Baryshkov
  2022-03-24 16:00   ` Bjorn Andersson
  1 sibling, 1 reply; 16+ messages in thread
From: Vladimir Zapolskiy @ 2022-03-17 23:10 UTC (permalink / raw)
  To: Dmitry Baryshkov, Andy Gross, Bjorn Andersson, Rafael J. Wysocki,
	Viresh Kumar
  Cc: linux-arm-msm, linux-pm, Thara Gopinath

Hi Dmitry,

On 3/10/22 12:39 AM, Dmitry Baryshkov wrote:
> qcom_lmh_dcvs_poll() can be running when the cpu is being put offline.
> This results in the following warnings and an oops. The driver would
> disable the worker, but it happens closer to the end of
> cpufreq_offline(). Change the locking in the qcom_lmh_dcvs_poll(), so
> that the worker can not run in parallel with cpufreq_offline() call.
> 
> [   55.650435] (NULL device *): dev_pm_opp_find_freq_floor: Invalid argument freq=00000000709ccbf9
> [   55.659420] (NULL device *): Can't find the OPP for throttling: -EINVAL!
> [   55.666329] Unable to handle kernel paging request at virtual address ffffadfba4bb6d81
> [   55.674491] Mem abort info:
> [   55.677363]   ESR = 0x96000004
> [   55.680527]   EC = 0x25: DABT (current EL), IL = 32 bits
> [   55.686001]   SET = 0, FnV = 0
> [   55.689164]   EA = 0, S1PTW = 0
> [   55.692418]   FSC = 0x04: level 0 translation fault
> [   55.697449] Data abort info:
> [   55.700426]   ISV = 0, ISS = 0x00000004
> [   55.704383]   CM = 0, WnR = 0
> [   55.707455] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000a98e9000
> [   55.714354] [ffffadfba4bb6d81] pgd=0000000000000000, p4d=0000000000000000
> [   55.721388] Internal error: Oops: 96000004 [#1] SMP
> [   55.726397] Modules linked in:
> [   55.729542] CPU: 7 PID: 162 Comm: kworker/7:1H Tainted: G S      W         5.17.0-rc6-00100-g04890a1d9672 #724
> [   55.746066] Workqueue: events_highpri qcom_lmh_dcvs_poll
> [   55.751527] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [   55.758669] pc : cpufreq_cpu_get_raw+0x20/0x44
> [   55.763233] lr : qcom_cpufreq_hw_get+0x10/0x64
> [   55.767799] sp : ffff800009983d10
> [   55.771207] x29: ffff800009983d10 x28: ffffaa13a4f2b000 x27: ffff7b31329f9305
> [   55.778530] x26: ffffaa13a4f30af8 x25: ffffaa13a4f4e4c8 x24: ffff7b2ec2eda000
> [   55.785851] x23: ffffffffffffffea x22: ffff7b2ec2e9fc18 x21: ffff7b2ec2e9fc00
> [   55.793170] x20: 0000000000000100 x19: ffff7b2ec2e9fcc0 x18: ffffffffffffffff
> [   55.800490] x17: 726620746e656d75 x16: 6772612064696c61 x15: ffff8000899839c7
> [   55.807812] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
> [   55.815140] x11: ffff7b2ec2e9fc80 x10: ffffaa13a59a1a70 x9 : 0000000000000000
> [   55.822468] x8 : ffff7b2eca6917c0 x7 : ffffaa13a528b000 x6 : 0000000000000001
> [   55.829788] x5 : 0000000000040000 x4 : 000000000000024f x3 : 0000000000000000
> [   55.837116] x2 : 0000000000000100 x1 : ffffaa13a4bb6d80 x0 : 000003e800000001
> [   55.844439] Call trace:
> [   55.846951]  cpufreq_cpu_get_raw+0x20/0x44
> [   55.851155]  qcom_lmh_dcvs_poll+0x104/0x160
> [   55.855452]  process_one_work+0x288/0x69c
> [   55.859574]  worker_thread+0x74/0x470
> [   55.863336]  kthread+0xfc/0x100
> [   55.866568]  ret_from_fork+0x10/0x20
> [   55.870246] Code: b00065c1 91360021 d503233f f8625800 (f8616800)
> [   55.876501] ---[ end trace 0000000000000000 ]---
> 
> Fixes: 275157b367f4 ("cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support")
> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> ---
>   drivers/cpufreq/qcom-cpufreq-hw.c | 24 +++++++++++++++++++++---
>   1 file changed, 21 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> index 44d46e52baea..7c1bb002e1c3 100644
> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> @@ -296,6 +296,23 @@ static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
>   	struct dev_pm_opp *opp;
>   	unsigned int freq;
>   
> +	/*
> +	 * Synchronize against CPU going offline.
> +	 * cpufreq_offline() will get the write lock on policy->rwsem.
> +	 */
> +retry:
> +	if (unlikely(!down_read_trylock(&policy->rwsem))) {
> +		mutex_lock(&data->throttle_lock);
> +		if (data->cancel_throttle) {
> +			mutex_unlock(&data->throttle_lock);
> +			return;
> +		}
> +
> +		mutex_unlock(&data->throttle_lock);
> +
> +		schedule();
> +		goto retry;
> +	}

do you see an option here to base the logic on policy->cpus mask value instead of
the semaphore lock status? I believe it should be possible, and, if so, likely
it would be preferable.

>   	/*
>   	 * Get the h/w throttled frequency, normalize it using the
>   	 * registered opp table and use it to calculate thermal pressure.
> @@ -314,9 +331,10 @@ static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
>   
>   	/*
>   	 * In the unlikely case policy is unregistered do not enable
> -	 * polling or h/w interrupt
> +	 * polling or h/w interrupt.
> +	 * If we are here, we have the policy->rwsem read lock,
> +	 * cancel_throttle can be toggled only with the write lock.
>   	 */
> -	mutex_lock(&data->throttle_lock);
>   	if (data->cancel_throttle)
>   		goto out;
>   
> @@ -331,7 +349,7 @@ static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
>   				 msecs_to_jiffies(10));
>   
>   out:
> -	mutex_unlock(&data->throttle_lock);
> +	up_read(&policy->rwsem);
>   }
>   
>   static void qcom_lmh_dcvs_poll(struct work_struct *work)
> 

FWIW I can add my:

Tested-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>

--
Best wishes,
Vladimir

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/4] cpufreq: qcom-hw: fix the race between LMH worker and cpuhp
  2022-03-17 23:10   ` Vladimir Zapolskiy
@ 2022-03-17 23:21     ` Dmitry Baryshkov
  0 siblings, 0 replies; 16+ messages in thread
From: Dmitry Baryshkov @ 2022-03-17 23:21 UTC (permalink / raw)
  To: Vladimir Zapolskiy, Andy Gross, Bjorn Andersson,
	Rafael J. Wysocki, Viresh Kumar
  Cc: linux-arm-msm, linux-pm, Thara Gopinath

On 18/03/2022 02:10, Vladimir Zapolskiy wrote:
> Hi Dmitry,
> 
> On 3/10/22 12:39 AM, Dmitry Baryshkov wrote:
>> qcom_lmh_dcvs_poll() can be running when the cpu is being put offline.
>> This results in the following warnings and an oops. The driver would
>> disable the worker, but it happens closer to the end of
>> cpufreq_offline(). Change the locking in the qcom_lmh_dcvs_poll(), so
>> that the worker can not run in parallel with cpufreq_offline() call.
>>
>> [   55.650435] (NULL device *): dev_pm_opp_find_freq_floor: Invalid 
>> argument freq=00000000709ccbf9
>> [   55.659420] (NULL device *): Can't find the OPP for throttling: 
>> -EINVAL!
>> [   55.666329] Unable to handle kernel paging request at virtual 
>> address ffffadfba4bb6d81
>> [   55.674491] Mem abort info:
>> [   55.677363]   ESR = 0x96000004
>> [   55.680527]   EC = 0x25: DABT (current EL), IL = 32 bits
>> [   55.686001]   SET = 0, FnV = 0
>> [   55.689164]   EA = 0, S1PTW = 0
>> [   55.692418]   FSC = 0x04: level 0 translation fault
>> [   55.697449] Data abort info:
>> [   55.700426]   ISV = 0, ISS = 0x00000004
>> [   55.704383]   CM = 0, WnR = 0
>> [   55.707455] swapper pgtable: 4k pages, 48-bit VAs, 
>> pgdp=00000000a98e9000
>> [   55.714354] [ffffadfba4bb6d81] pgd=0000000000000000, 
>> p4d=0000000000000000
>> [   55.721388] Internal error: Oops: 96000004 [#1] SMP
>> [   55.726397] Modules linked in:
>> [   55.729542] CPU: 7 PID: 162 Comm: kworker/7:1H Tainted: G S      
>> W         5.17.0-rc6-00100-g04890a1d9672 #724
>> [   55.746066] Workqueue: events_highpri qcom_lmh_dcvs_poll
>> [   55.751527] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS 
>> BTYPE=--)
>> [   55.758669] pc : cpufreq_cpu_get_raw+0x20/0x44
>> [   55.763233] lr : qcom_cpufreq_hw_get+0x10/0x64
>> [   55.767799] sp : ffff800009983d10
>> [   55.771207] x29: ffff800009983d10 x28: ffffaa13a4f2b000 x27: 
>> ffff7b31329f9305
>> [   55.778530] x26: ffffaa13a4f30af8 x25: ffffaa13a4f4e4c8 x24: 
>> ffff7b2ec2eda000
>> [   55.785851] x23: ffffffffffffffea x22: ffff7b2ec2e9fc18 x21: 
>> ffff7b2ec2e9fc00
>> [   55.793170] x20: 0000000000000100 x19: ffff7b2ec2e9fcc0 x18: 
>> ffffffffffffffff
>> [   55.800490] x17: 726620746e656d75 x16: 6772612064696c61 x15: 
>> ffff8000899839c7
>> [   55.807812] x14: 0000000000000000 x13: 0000000000000000 x12: 
>> 0000000000000000
>> [   55.815140] x11: ffff7b2ec2e9fc80 x10: ffffaa13a59a1a70 x9 : 
>> 0000000000000000
>> [   55.822468] x8 : ffff7b2eca6917c0 x7 : ffffaa13a528b000 x6 : 
>> 0000000000000001
>> [   55.829788] x5 : 0000000000040000 x4 : 000000000000024f x3 : 
>> 0000000000000000
>> [   55.837116] x2 : 0000000000000100 x1 : ffffaa13a4bb6d80 x0 : 
>> 000003e800000001
>> [   55.844439] Call trace:
>> [   55.846951]  cpufreq_cpu_get_raw+0x20/0x44
>> [   55.851155]  qcom_lmh_dcvs_poll+0x104/0x160
>> [   55.855452]  process_one_work+0x288/0x69c
>> [   55.859574]  worker_thread+0x74/0x470
>> [   55.863336]  kthread+0xfc/0x100
>> [   55.866568]  ret_from_fork+0x10/0x20
>> [   55.870246] Code: b00065c1 91360021 d503233f f8625800 (f8616800)
>> [   55.876501] ---[ end trace 0000000000000000 ]---
>>
>> Fixes: 275157b367f4 ("cpufreq: qcom-cpufreq-hw: Add dcvs interrupt 
>> support")
>> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>> ---
>>   drivers/cpufreq/qcom-cpufreq-hw.c | 24 +++++++++++++++++++++---
>>   1 file changed, 21 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c 
>> b/drivers/cpufreq/qcom-cpufreq-hw.c
>> index 44d46e52baea..7c1bb002e1c3 100644
>> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
>> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
>> @@ -296,6 +296,23 @@ static void qcom_lmh_dcvs_notify(struct 
>> qcom_cpufreq_data *data)
>>       struct dev_pm_opp *opp;
>>       unsigned int freq;
>> +    /*
>> +     * Synchronize against CPU going offline.
>> +     * cpufreq_offline() will get the write lock on policy->rwsem.
>> +     */
>> +retry:
>> +    if (unlikely(!down_read_trylock(&policy->rwsem))) {
>> +        mutex_lock(&data->throttle_lock);
>> +        if (data->cancel_throttle) {
>> +            mutex_unlock(&data->throttle_lock);
>> +            return;
>> +        }
>> +
>> +        mutex_unlock(&data->throttle_lock);
>> +
>> +        schedule();
>> +        goto retry;
>> +    }
> 
> do you see an option here to base the logic on policy->cpus mask value 
> instead of
> the semaphore lock status? I believe it should be possible, and, if so, 
> likely
> it would be preferable.

I used semaphore to remove a race with cpufreq_offline() code.
Using policy->cpus seems like a hack.

> 
>>       /*
>>        * Get the h/w throttled frequency, normalize it using the
>>        * registered opp table and use it to calculate thermal pressure.
>> @@ -314,9 +331,10 @@ static void qcom_lmh_dcvs_notify(struct 
>> qcom_cpufreq_data *data)
>>       /*
>>        * In the unlikely case policy is unregistered do not enable
>> -     * polling or h/w interrupt
>> +     * polling or h/w interrupt.
>> +     * If we are here, we have the policy->rwsem read lock,
>> +     * cancel_throttle can be toggled only with the write lock.
>>        */
>> -    mutex_lock(&data->throttle_lock);
>>       if (data->cancel_throttle)
>>           goto out;
>> @@ -331,7 +349,7 @@ static void qcom_lmh_dcvs_notify(struct 
>> qcom_cpufreq_data *data)
>>                    msecs_to_jiffies(10));
>>   out:
>> -    mutex_unlock(&data->throttle_lock);
>> +    up_read(&policy->rwsem);
>>   }
>>   static void qcom_lmh_dcvs_poll(struct work_struct *work)
>>
> 
> FWIW I can add my:
> 
> Tested-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
> 
> -- 
> Best wishes,
> Vladimir


-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/4] cpufreq: qcom-hw: fix the race between LMH worker and cpuhp
  2022-03-09 22:39 ` [PATCH v2 2/4] cpufreq: qcom-hw: fix the race between LMH worker and cpuhp Dmitry Baryshkov
  2022-03-17 23:10   ` Vladimir Zapolskiy
@ 2022-03-24 16:00   ` Bjorn Andersson
  2022-03-25 18:52     ` Dmitry Baryshkov
  1 sibling, 1 reply; 16+ messages in thread
From: Bjorn Andersson @ 2022-03-24 16:00 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Andy Gross, Rafael J. Wysocki, Viresh Kumar, linux-arm-msm,
	linux-pm, Thara Gopinath

On Wed 09 Mar 14:39 PST 2022, Dmitry Baryshkov wrote:

> qcom_lmh_dcvs_poll() can be running when the cpu is being put offline.
> This results in the following warnings and an oops. The driver would
> disable the worker, but it happens closer to the end of
> cpufreq_offline(). Change the locking in the qcom_lmh_dcvs_poll(), so
> that the worker can not run in parallel with cpufreq_offline() call.
> 
> [   55.650435] (NULL device *): dev_pm_opp_find_freq_floor: Invalid argument freq=00000000709ccbf9
> [   55.659420] (NULL device *): Can't find the OPP for throttling: -EINVAL!
> [   55.666329] Unable to handle kernel paging request at virtual address ffffadfba4bb6d81
> [   55.674491] Mem abort info:
> [   55.677363]   ESR = 0x96000004
> [   55.680527]   EC = 0x25: DABT (current EL), IL = 32 bits
> [   55.686001]   SET = 0, FnV = 0
> [   55.689164]   EA = 0, S1PTW = 0
> [   55.692418]   FSC = 0x04: level 0 translation fault
> [   55.697449] Data abort info:
> [   55.700426]   ISV = 0, ISS = 0x00000004
> [   55.704383]   CM = 0, WnR = 0
> [   55.707455] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000a98e9000
> [   55.714354] [ffffadfba4bb6d81] pgd=0000000000000000, p4d=0000000000000000
> [   55.721388] Internal error: Oops: 96000004 [#1] SMP
> [   55.726397] Modules linked in:
> [   55.729542] CPU: 7 PID: 162 Comm: kworker/7:1H Tainted: G S      W         5.17.0-rc6-00100-g04890a1d9672 #724
> [   55.746066] Workqueue: events_highpri qcom_lmh_dcvs_poll
> [   55.751527] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [   55.758669] pc : cpufreq_cpu_get_raw+0x20/0x44
> [   55.763233] lr : qcom_cpufreq_hw_get+0x10/0x64
> [   55.767799] sp : ffff800009983d10
> [   55.771207] x29: ffff800009983d10 x28: ffffaa13a4f2b000 x27: ffff7b31329f9305
> [   55.778530] x26: ffffaa13a4f30af8 x25: ffffaa13a4f4e4c8 x24: ffff7b2ec2eda000
> [   55.785851] x23: ffffffffffffffea x22: ffff7b2ec2e9fc18 x21: ffff7b2ec2e9fc00
> [   55.793170] x20: 0000000000000100 x19: ffff7b2ec2e9fcc0 x18: ffffffffffffffff
> [   55.800490] x17: 726620746e656d75 x16: 6772612064696c61 x15: ffff8000899839c7
> [   55.807812] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
> [   55.815140] x11: ffff7b2ec2e9fc80 x10: ffffaa13a59a1a70 x9 : 0000000000000000
> [   55.822468] x8 : ffff7b2eca6917c0 x7 : ffffaa13a528b000 x6 : 0000000000000001
> [   55.829788] x5 : 0000000000040000 x4 : 000000000000024f x3 : 0000000000000000
> [   55.837116] x2 : 0000000000000100 x1 : ffffaa13a4bb6d80 x0 : 000003e800000001
> [   55.844439] Call trace:
> [   55.846951]  cpufreq_cpu_get_raw+0x20/0x44

While I don't have the line numbers, I presume this would be cause by
policy->cpus being empty and:

   int cpu = cpumask_first(policy->cpus);

returning >= nr_cpu_ids, which means that the get_cpu_device(cpu); on
the next line will return NULL and then we keep playing opp on that
NULL?


Seems like this would be exactly the same mistake as I did wrt
policy->cpus vs policy->related_cpus and we don't actually need the
specific CPU, we just need a cpu device in the frequency domain.

As such we should actually do cpumaks_first(policy->related_cpus)
instead.

> [   55.851155]  qcom_lmh_dcvs_poll+0x104/0x160
> [   55.855452]  process_one_work+0x288/0x69c
> [   55.859574]  worker_thread+0x74/0x470
> [   55.863336]  kthread+0xfc/0x100
> [   55.866568]  ret_from_fork+0x10/0x20
> [   55.870246] Code: b00065c1 91360021 d503233f f8625800 (f8616800)
> [   55.876501] ---[ end trace 0000000000000000 ]---
> 
> Fixes: 275157b367f4 ("cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support")
> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> ---
>  drivers/cpufreq/qcom-cpufreq-hw.c | 24 +++++++++++++++++++++---
>  1 file changed, 21 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> index 44d46e52baea..7c1bb002e1c3 100644
> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> @@ -296,6 +296,23 @@ static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
>  	struct dev_pm_opp *opp;
>  	unsigned int freq;
>  
> +	/*
> +	 * Synchronize against CPU going offline.
> +	 * cpufreq_offline() will get the write lock on policy->rwsem.
> +	 */
> +retry:
> +	if (unlikely(!down_read_trylock(&policy->rwsem))) {
> +		mutex_lock(&data->throttle_lock);
> +		if (data->cancel_throttle) {
> +			mutex_unlock(&data->throttle_lock);
> +			return;
> +		}
> +
> +		mutex_unlock(&data->throttle_lock);
> +
> +		schedule();
> +		goto retry;
> +	}

And doing that instead would remove the need for doing this crazy
locking between the cpufreq core and qcom driver.

Above change would be trivial and -rc material.

Regards,
Bjorn

>  	/*
>  	 * Get the h/w throttled frequency, normalize it using the
>  	 * registered opp table and use it to calculate thermal pressure.
> @@ -314,9 +331,10 @@ static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
>  
>  	/*
>  	 * In the unlikely case policy is unregistered do not enable
> -	 * polling or h/w interrupt
> +	 * polling or h/w interrupt.
> +	 * If we are here, we have the policy->rwsem read lock,
> +	 * cancel_throttle can be toggled only with the write lock.
>  	 */
> -	mutex_lock(&data->throttle_lock);
>  	if (data->cancel_throttle)
>  		goto out;
>  
> @@ -331,7 +349,7 @@ static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
>  				 msecs_to_jiffies(10));
>  
>  out:
> -	mutex_unlock(&data->throttle_lock);
> +	up_read(&policy->rwsem);
>  }
>  
>  static void qcom_lmh_dcvs_poll(struct work_struct *work)
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/4] cpufreq: qcom-hw: drop affinity hint before freeing the IRQ
  2022-03-09 22:39 ` [PATCH v2 1/4] cpufreq: qcom-hw: drop affinity hint before freeing the IRQ Dmitry Baryshkov
  2022-03-11  8:03   ` Vladimir Zapolskiy
  2022-03-17 23:03   ` Vladimir Zapolskiy
@ 2022-03-24 16:05   ` Bjorn Andersson
  2 siblings, 0 replies; 16+ messages in thread
From: Bjorn Andersson @ 2022-03-24 16:05 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Andy Gross, Rafael J. Wysocki, Viresh Kumar, linux-arm-msm,
	linux-pm, Thara Gopinath

On Wed 09 Mar 14:39 PST 2022, Dmitry Baryshkov wrote:

> Drop affinity hint before freeing the throttling IRQ to fix the
> following trace. One is not allowed to call free_irq() with an affinity
> hint in place (which was set by qcom_cpufreq_hw_lmh_init()).
> 
> [  185.114773] ------------[ cut here ]------------
> [  185.119517] WARNING: CPU: 7 PID: 43 at kernel/irq/manage.c:1887 free_irq+0x3a4/0x3dc
> [  185.127474] Modules linked in:
> [  185.130618] CPU: 7 PID: 43 Comm: cpuhp/7 Tainted: G S      W         5.17.0-rc6-00386-g67382a5b705d-dirty #690
> [  185.147125] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [  185.154269] pc : free_irq+0x3a4/0x3dc
> [  185.158031] lr : free_irq+0x33c/0x3dc
> [  185.161792] sp : ffff80000841bc90
> [  185.165195] x29: ffff80000841bc90 x28: ffffa6edc5c3d000 x27: ffff6d93729e5908
> [  185.172515] x26: 0000000000000000 x25: ffff6d910109fc00 x24: ffff6d91011490e0
> [  185.179838] x23: ffff6d9101149218 x22: 0000000000000080 x21: 0000000000000000
> [  185.187163] x20: ffff6d9101149000 x19: ffff6d910ab61500 x18: ffffffffffffffff
> [  185.194487] x17: 2e35202020202020 x16: 2020202020202020 x15: ffff80008841b9a7
> [  185.201805] x14: 00000000000003c9 x13: 0000000000000001 x12: 0000000000000040
> [  185.209135] x11: ffff6d91005aab58 x10: ffff6d91005aab5a x9 : ffffc6a5ad1c5408
> [  185.216455] x8 : ffff6d91005adb88 x7 : 0000000000000000 x6 : ffffc6a5ab5a91f4
> [  185.223776] x5 : 0000000000000000 x4 : ffff6d91011490a8 x3 : ffffc6a5ad266108
> [  185.231098] x2 : 0000000013033204 x1 : ffff6d9101149000 x0 : ffff6d910a9cc000
> [  185.238421] Call trace:
> [  185.240932]  free_irq+0x3a4/0x3dc
> [  185.244334]  qcom_cpufreq_hw_cpu_exit+0x78/0xcc
> [  185.248985]  cpufreq_offline.isra.0+0x228/0x270
> [  185.253639]  cpuhp_cpufreq_offline+0x10/0x20
> [  185.258027]  cpuhp_invoke_callback+0x16c/0x2b0
> [  185.262592]  cpuhp_thread_fun+0x190/0x250
> [  185.266710]  smpboot_thread_fn+0x12c/0x230
> [  185.270914]  kthread+0xfc/0x100
> [  185.274145]  ret_from_fork+0x10/0x20
> [  185.277820] irq event stamp: 212
> [  185.281136] hardirqs last  enabled at (211): [<ffffc6a5ac57973c>] _raw_spin_unlock_irqrestore+0x8c/0xa0
> [  185.290775] hardirqs last disabled at (212): [<ffffc6a5ac572100>] __schedule+0x710/0xa10
> [  185.299081] softirqs last  enabled at (0): [<ffffc6a5ab50f7b0>] copy_process+0x7d0/0x1a14
> [  185.307475] softirqs last disabled at (0): [<0000000000000000>] 0x0
> 
> Fixes: 275157b367f4 ("cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support")
> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>

> ---
>  drivers/cpufreq/qcom-cpufreq-hw.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> index 618e436018c0..44d46e52baea 100644
> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> @@ -427,6 +427,7 @@ static void qcom_cpufreq_hw_lmh_exit(struct qcom_cpufreq_data *data)
>  	mutex_unlock(&data->throttle_lock);
>  
>  	cancel_delayed_work_sync(&data->throttle_work);
> +	irq_set_affinity_hint(data->throttle_irq, NULL);
>  	free_irq(data->throttle_irq, data);
>  }
>  
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 3/4] cpufreq: qcom-hw: fix the opp entries refcounting
  2022-03-09 22:39 ` [PATCH v2 3/4] cpufreq: qcom-hw: fix the opp entries refcounting Dmitry Baryshkov
  2022-03-17 23:04   ` Vladimir Zapolskiy
@ 2022-03-24 16:06   ` Bjorn Andersson
  1 sibling, 0 replies; 16+ messages in thread
From: Bjorn Andersson @ 2022-03-24 16:06 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Andy Gross, Rafael J. Wysocki, Viresh Kumar, linux-arm-msm,
	linux-pm, Thara Gopinath

On Wed 09 Mar 14:39 PST 2022, Dmitry Baryshkov wrote:

> The qcom_lmh_dcvs_notify() will get the dev_pm_opp instance for
> throttling, but will not put it, ending up with leaking a reference
> count and the following backtrace when putting the CPU offline.
> 
> Correctly put the reference count of the returned opp instance.
> 
> [   84.418025] ------------[ cut here ]------------
> [   84.422770] WARNING: CPU: 7 PID: 43 at drivers/opp/core.c:1396 _opp_table_kref_release+0x188/0x190
> [   84.431966] Modules linked in:
> [   84.435106] CPU: 7 PID: 43 Comm: cpuhp/7 Tainted: G S                5.17.0-rc6-00388-g7cf3c0d89c44-dirty #721
> [   84.451631] pstate: 82400005 (Nzcv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
> [   84.458781] pc : _opp_table_kref_release+0x188/0x190
> [   84.463878] lr : _opp_table_kref_release+0x78/0x190
> [   84.468885] sp : ffff80000841bc70
> [   84.472294] x29: ffff80000841bc70 x28: ffff6664afe3d000 x27: ffff1db6729e5908
> [   84.479621] x26: 0000000000000000 x25: 0000000000000000 x24: ffff1db6729e58e0
> [   84.486946] x23: ffff8000080a5000 x22: ffff1db40aad80e0 x21: ffff1db4002fec80
> [   84.494277] x20: ffff1db40aad8000 x19: ffffb751c3186300 x18: ffffffffffffffff
> [   84.501603] x17: 5300326563697665 x16: 645f676e696c6f6f x15: 00001186c1df5448
> [   84.508928] x14: 00000000000002e9 x13: 0000000000000000 x12: 0000000000000000
> [   84.516256] x11: ffffb751c3186368 x10: ffffb751c39a2a70 x9 : 0000000000000000
> [   84.523585] x8 : ffff1db4008edf00 x7 : ffffb751c328c000 x6 : 0000000000000001
> [   84.530916] x5 : 0000000000040000 x4 : 0000000000000001 x3 : ffff1db4008edf00
> [   84.538247] x2 : 0000000000000000 x1 : ffff1db400aa6100 x0 : ffff1db40aad80d0
> [   84.545579] Call trace:
> [   84.548101]  _opp_table_kref_release+0x188/0x190
> [   84.552842]  dev_pm_opp_remove_all_dynamic+0x8c/0xc0
> [   84.557949]  qcom_cpufreq_hw_cpu_exit+0x30/0xdc
> [   84.562608]  cpufreq_offline.isra.0+0x1b4/0x1d8
> [   84.567270]  cpuhp_cpufreq_offline+0x10/0x6c
> [   84.571663]  cpuhp_invoke_callback+0x16c/0x2b0
> [   84.576231]  cpuhp_thread_fun+0x190/0x250
> [   84.580353]  smpboot_thread_fn+0x12c/0x230
> [   84.584568]  kthread+0xfc/0x100
> [   84.587810]  ret_from_fork+0x10/0x20
> [   84.591490] irq event stamp: 3482
> [   84.594901] hardirqs last  enabled at (3481): [<ffffb751c13c3db0>] call_rcu+0x39c/0x50c
> [   84.603119] hardirqs last disabled at (3482): [<ffffb751c236b518>] el1_dbg+0x24/0x8c
> [   84.611074] softirqs last  enabled at (310): [<ffffb751c1290410>] _stext+0x410/0x588
> [   84.619028] softirqs last disabled at (305): [<ffffb751c131bf68>] __irq_exit_rcu+0x158/0x174
> [   84.627691] ---[ end trace 0000000000000000 ]---
> 
> Fixes: 275157b367f4 ("cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support")
> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>

> ---
>  drivers/cpufreq/qcom-cpufreq-hw.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> index 7c1bb002e1c3..fe638e141003 100644
> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> @@ -322,12 +322,18 @@ static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
>  
>  	opp = dev_pm_opp_find_freq_floor(dev, &freq_hz);
>  	if (IS_ERR(opp) && PTR_ERR(opp) == -ERANGE)
> -		dev_pm_opp_find_freq_ceil(dev, &freq_hz);
> +		opp = dev_pm_opp_find_freq_ceil(dev, &freq_hz);
>  
> -	throttled_freq = freq_hz / HZ_PER_KHZ;
> +	if (IS_ERR(opp)) {
> +		dev_warn_ratelimited(dev, "Can't find the OPP for throttling: %pe!\n", opp);
> +	} else {
> +		throttled_freq = freq_hz / HZ_PER_KHZ;
> +
> +		/* Update thermal pressure (the boost frequencies are accepted) */
> +		arch_update_thermal_pressure(policy->related_cpus, throttled_freq);
>  
> -	/* Update thermal pressure (the boost frequencies are accepted) */
> -	arch_update_thermal_pressure(policy->related_cpus, throttled_freq);
> +		dev_pm_opp_put(opp);
> +	}
>  
>  	/*
>  	 * In the unlikely case policy is unregistered do not enable
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 4/4] cpufreq: qcom-hw: provide online/offline operations
  2022-03-09 22:39 ` [PATCH v2 4/4] cpufreq: qcom-hw: provide online/offline operations Dmitry Baryshkov
  2022-03-17 23:05   ` Vladimir Zapolskiy
@ 2022-03-24 16:08   ` Bjorn Andersson
  1 sibling, 0 replies; 16+ messages in thread
From: Bjorn Andersson @ 2022-03-24 16:08 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Andy Gross, Rafael J. Wysocki, Viresh Kumar, linux-arm-msm,
	linux-pm, Thara Gopinath

On Wed 09 Mar 14:39 PST 2022, Dmitry Baryshkov wrote:

> Provide lightweight online and offline operations. This saves us from
> parsing all the resources each time the CPU is put online.
> 
> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>

I think the other 3 patches are -rc material, this seems like v5.19.

Regards,
Bjorn

> ---
>  drivers/cpufreq/qcom-cpufreq-hw.c | 39 +++++++++++++++++++++++++------
>  1 file changed, 32 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> index fe638e141003..d38b1552ec13 100644
> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> @@ -403,11 +403,12 @@ static const struct of_device_id qcom_cpufreq_hw_match[] = {
>  };
>  MODULE_DEVICE_TABLE(of, qcom_cpufreq_hw_match);
>  
> +static int qcom_cpufreq_hw_lmh_online(struct cpufreq_policy *policy);
> +
>  static int qcom_cpufreq_hw_lmh_init(struct cpufreq_policy *policy, int index)
>  {
>  	struct qcom_cpufreq_data *data = policy->driver_data;
>  	struct platform_device *pdev = cpufreq_get_driver_data();
> -	int ret;
>  
>  	/*
>  	 * Look for LMh interrupt. If no interrupt line is specified /
> @@ -419,12 +420,21 @@ static int qcom_cpufreq_hw_lmh_init(struct cpufreq_policy *policy, int index)
>  	if (data->throttle_irq < 0)
>  		return data->throttle_irq;
>  
> -	data->cancel_throttle = false;
> -	data->policy = policy;
> -
>  	mutex_init(&data->throttle_lock);
>  	INIT_DEFERRABLE_WORK(&data->throttle_work, qcom_lmh_dcvs_poll);
>  
> +	return qcom_cpufreq_hw_lmh_online(policy);
> +}
> +
> +static int qcom_cpufreq_hw_lmh_online(struct cpufreq_policy *policy)
> +{
> +	struct qcom_cpufreq_data *data = policy->driver_data;
> +	struct platform_device *pdev = cpufreq_get_driver_data();
> +	int ret;
> +
> +	data->cancel_throttle = false;
> +	data->policy = policy;
> +
>  	snprintf(data->irq_name, sizeof(data->irq_name), "dcvsh-irq-%u", policy->cpu);
>  	ret = request_threaded_irq(data->throttle_irq, NULL, qcom_lmh_dcvs_handle_irq,
>  				   IRQF_ONESHOT | IRQF_NO_AUTOEN, data->irq_name, data);
> @@ -441,10 +451,12 @@ static int qcom_cpufreq_hw_lmh_init(struct cpufreq_policy *policy, int index)
>  	return 0;
>  }
>  
> -static void qcom_cpufreq_hw_lmh_exit(struct qcom_cpufreq_data *data)
> +static int qcom_cpufreq_hw_lmh_offline(struct cpufreq_policy *policy)
>  {
> +	struct qcom_cpufreq_data *data = policy->driver_data;
> +
>  	if (data->throttle_irq <= 0)
> -		return;
> +		return 0;
>  
>  	mutex_lock(&data->throttle_lock);
>  	data->cancel_throttle = true;
> @@ -453,6 +465,8 @@ static void qcom_cpufreq_hw_lmh_exit(struct qcom_cpufreq_data *data)
>  	cancel_delayed_work_sync(&data->throttle_work);
>  	irq_set_affinity_hint(data->throttle_irq, NULL);
>  	free_irq(data->throttle_irq, data);
> +
> +	return 0;
>  }
>  
>  static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
> @@ -567,6 +581,16 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
>  	return ret;
>  }
>  
> +static int qcom_cpufreq_hw_cpu_online(struct cpufreq_policy *policy)
> +{
> +	return qcom_cpufreq_hw_lmh_online(policy);
> +}
> +
> +static int qcom_cpufreq_hw_cpu_offline(struct cpufreq_policy *policy)
> +{
> +	return qcom_cpufreq_hw_lmh_offline(policy);
> +}
> +
>  static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy)
>  {
>  	struct device *cpu_dev = get_cpu_device(policy->cpu);
> @@ -576,7 +600,6 @@ static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy)
>  
>  	dev_pm_opp_remove_all_dynamic(cpu_dev);
>  	dev_pm_opp_of_cpumask_remove_table(policy->related_cpus);
> -	qcom_cpufreq_hw_lmh_exit(data);
>  	kfree(policy->freq_table);
>  	kfree(data);
>  	iounmap(base);
> @@ -608,6 +631,8 @@ static struct cpufreq_driver cpufreq_qcom_hw_driver = {
>  	.get		= qcom_cpufreq_hw_get,
>  	.init		= qcom_cpufreq_hw_cpu_init,
>  	.exit		= qcom_cpufreq_hw_cpu_exit,
> +	.online		= qcom_cpufreq_hw_cpu_online,
> +	.offline	= qcom_cpufreq_hw_cpu_offline,
>  	.register_em	= cpufreq_register_em_with_opp,
>  	.fast_switch    = qcom_cpufreq_hw_fast_switch,
>  	.name		= "qcom-cpufreq-hw",
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/4] cpufreq: qcom-hw: fix the race between LMH worker and cpuhp
  2022-03-24 16:00   ` Bjorn Andersson
@ 2022-03-25 18:52     ` Dmitry Baryshkov
  0 siblings, 0 replies; 16+ messages in thread
From: Dmitry Baryshkov @ 2022-03-25 18:52 UTC (permalink / raw)
  To: Bjorn Andersson
  Cc: Andy Gross, Rafael J. Wysocki, Viresh Kumar, linux-arm-msm,
	linux-pm, Thara Gopinath

On 24/03/2022 19:00, Bjorn Andersson wrote:
> On Wed 09 Mar 14:39 PST 2022, Dmitry Baryshkov wrote:
> 
>> qcom_lmh_dcvs_poll() can be running when the cpu is being put offline.
>> This results in the following warnings and an oops. The driver would
>> disable the worker, but it happens closer to the end of
>> cpufreq_offline(). Change the locking in the qcom_lmh_dcvs_poll(), so
>> that the worker can not run in parallel with cpufreq_offline() call.
>>
>> [   55.650435] (NULL device *): dev_pm_opp_find_freq_floor: Invalid argument freq=00000000709ccbf9
>> [   55.659420] (NULL device *): Can't find the OPP for throttling: -EINVAL!
>> [   55.666329] Unable to handle kernel paging request at virtual address ffffadfba4bb6d81
>> [   55.674491] Mem abort info:
>> [   55.677363]   ESR = 0x96000004
>> [   55.680527]   EC = 0x25: DABT (current EL), IL = 32 bits
>> [   55.686001]   SET = 0, FnV = 0
>> [   55.689164]   EA = 0, S1PTW = 0
>> [   55.692418]   FSC = 0x04: level 0 translation fault
>> [   55.697449] Data abort info:
>> [   55.700426]   ISV = 0, ISS = 0x00000004
>> [   55.704383]   CM = 0, WnR = 0
>> [   55.707455] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000a98e9000
>> [   55.714354] [ffffadfba4bb6d81] pgd=0000000000000000, p4d=0000000000000000
>> [   55.721388] Internal error: Oops: 96000004 [#1] SMP
>> [   55.726397] Modules linked in:
>> [   55.729542] CPU: 7 PID: 162 Comm: kworker/7:1H Tainted: G S      W         5.17.0-rc6-00100-g04890a1d9672 #724
>> [   55.746066] Workqueue: events_highpri qcom_lmh_dcvs_poll
>> [   55.751527] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>> [   55.758669] pc : cpufreq_cpu_get_raw+0x20/0x44
>> [   55.763233] lr : qcom_cpufreq_hw_get+0x10/0x64
>> [   55.767799] sp : ffff800009983d10
>> [   55.771207] x29: ffff800009983d10 x28: ffffaa13a4f2b000 x27: ffff7b31329f9305
>> [   55.778530] x26: ffffaa13a4f30af8 x25: ffffaa13a4f4e4c8 x24: ffff7b2ec2eda000
>> [   55.785851] x23: ffffffffffffffea x22: ffff7b2ec2e9fc18 x21: ffff7b2ec2e9fc00
>> [   55.793170] x20: 0000000000000100 x19: ffff7b2ec2e9fcc0 x18: ffffffffffffffff
>> [   55.800490] x17: 726620746e656d75 x16: 6772612064696c61 x15: ffff8000899839c7
>> [   55.807812] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
>> [   55.815140] x11: ffff7b2ec2e9fc80 x10: ffffaa13a59a1a70 x9 : 0000000000000000
>> [   55.822468] x8 : ffff7b2eca6917c0 x7 : ffffaa13a528b000 x6 : 0000000000000001
>> [   55.829788] x5 : 0000000000040000 x4 : 000000000000024f x3 : 0000000000000000
>> [   55.837116] x2 : 0000000000000100 x1 : ffffaa13a4bb6d80 x0 : 000003e800000001
>> [   55.844439] Call trace:
>> [   55.846951]  cpufreq_cpu_get_raw+0x20/0x44
> 
> While I don't have the line numbers, I presume this would be cause by
> policy->cpus being empty and:
> 
>     int cpu = cpumask_first(policy->cpus);
> 
> returning >= nr_cpu_ids, which means that the get_cpu_device(cpu); on
> the next line will return NULL and then we keep playing opp on that
> NULL?
> 
> 
> Seems like this would be exactly the same mistake as I did wrt
> policy->cpus vs policy->related_cpus and we don't actually need the
> specific CPU, we just need a cpu device in the frequency domain.
> 
> As such we should actually do cpumaks_first(policy->related_cpus)
> instead.
> 
>> [   55.851155]  qcom_lmh_dcvs_poll+0x104/0x160
>> [   55.855452]  process_one_work+0x288/0x69c
>> [   55.859574]  worker_thread+0x74/0x470
>> [   55.863336]  kthread+0xfc/0x100
>> [   55.866568]  ret_from_fork+0x10/0x20
>> [   55.870246] Code: b00065c1 91360021 d503233f f8625800 (f8616800)
>> [   55.876501] ---[ end trace 0000000000000000 ]---
>>
>> Fixes: 275157b367f4 ("cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support")
>> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>> ---
>>   drivers/cpufreq/qcom-cpufreq-hw.c | 24 +++++++++++++++++++++---
>>   1 file changed, 21 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
>> index 44d46e52baea..7c1bb002e1c3 100644
>> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
>> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
>> @@ -296,6 +296,23 @@ static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
>>   	struct dev_pm_opp *opp;
>>   	unsigned int freq;
>>   
>> +	/*
>> +	 * Synchronize against CPU going offline.
>> +	 * cpufreq_offline() will get the write lock on policy->rwsem.
>> +	 */
>> +retry:
>> +	if (unlikely(!down_read_trylock(&policy->rwsem))) {
>> +		mutex_lock(&data->throttle_lock);
>> +		if (data->cancel_throttle) {
>> +			mutex_unlock(&data->throttle_lock);
>> +			return;
>> +		}
>> +
>> +		mutex_unlock(&data->throttle_lock);
>> +
>> +		schedule();
>> +		goto retry;
>> +	}
> 
> And doing that instead would remove the need for doing this crazy
> locking between the cpufreq core and qcom driver.

I didn't like the idea that the notifier is running in parallel with the 
cpufreq code modifying the policy. We can check if that's an issue 
separately.

> 
> Above change would be trivial and -rc material.
> 
> Regards,
> Bjorn
> 
>>   	/*
>>   	 * Get the h/w throttled frequency, normalize it using the
>>   	 * registered opp table and use it to calculate thermal pressure.
>> @@ -314,9 +331,10 @@ static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
>>   
>>   	/*
>>   	 * In the unlikely case policy is unregistered do not enable
>> -	 * polling or h/w interrupt
>> +	 * polling or h/w interrupt.
>> +	 * If we are here, we have the policy->rwsem read lock,
>> +	 * cancel_throttle can be toggled only with the write lock.
>>   	 */
>> -	mutex_lock(&data->throttle_lock);
>>   	if (data->cancel_throttle)
>>   		goto out;
>>   
>> @@ -331,7 +349,7 @@ static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
>>   				 msecs_to_jiffies(10));
>>   
>>   out:
>> -	mutex_unlock(&data->throttle_lock);
>> +	up_read(&policy->rwsem);
>>   }
>>   
>>   static void qcom_lmh_dcvs_poll(struct work_struct *work)
>> -- 
>> 2.34.1
>>


-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-03-25 19:26 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-09 22:39 [PATCH v2 0/4] cpufreq: qcom-hw: Fixes for cpu hotplug support Dmitry Baryshkov
2022-03-09 22:39 ` [PATCH v2 1/4] cpufreq: qcom-hw: drop affinity hint before freeing the IRQ Dmitry Baryshkov
2022-03-11  8:03   ` Vladimir Zapolskiy
2022-03-17 23:03   ` Vladimir Zapolskiy
2022-03-24 16:05   ` Bjorn Andersson
2022-03-09 22:39 ` [PATCH v2 2/4] cpufreq: qcom-hw: fix the race between LMH worker and cpuhp Dmitry Baryshkov
2022-03-17 23:10   ` Vladimir Zapolskiy
2022-03-17 23:21     ` Dmitry Baryshkov
2022-03-24 16:00   ` Bjorn Andersson
2022-03-25 18:52     ` Dmitry Baryshkov
2022-03-09 22:39 ` [PATCH v2 3/4] cpufreq: qcom-hw: fix the opp entries refcounting Dmitry Baryshkov
2022-03-17 23:04   ` Vladimir Zapolskiy
2022-03-24 16:06   ` Bjorn Andersson
2022-03-09 22:39 ` [PATCH v2 4/4] cpufreq: qcom-hw: provide online/offline operations Dmitry Baryshkov
2022-03-17 23:05   ` Vladimir Zapolskiy
2022-03-24 16:08   ` Bjorn Andersson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.