linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG] dev_pm_opp refcount issue on Arm Juno r0
@ 2018-12-20 15:27 Valentin Schneider
  2019-01-03  7:05 ` Viresh Kumar
  2019-01-04  9:44 ` [PATCH] cpufreq: scpi/scmi: Fix freeing of dynamic OPPs Viresh Kumar
  0 siblings, 2 replies; 11+ messages in thread
From: Valentin Schneider @ 2018-12-20 15:27 UTC (permalink / raw)
  To: linux-kernel, Linux PM, LAK
  Cc: Sudeep Holla, Rafael J. Wysocki, Viresh Kumar, nm, sboyd,
	Quentin Perret, Dietmar Eggemann, Douglas Raillard

Hi,

While running some hotplug torture test [1] on my Juno r0 I came across
the follow splat:

[  716.561862] ------------[ cut here ]------------
[  716.566451] refcount_t: underflow; use-after-free.
[  716.571240] WARNING: CPU: 2 PID: 18 at lib/refcount.c:280 refcount_dec_not_one+0x9c/0xc0
[  716.579246] Modules linked in:
[  716.582269] CPU: 2 PID: 18 Comm: cpuhp/2 Not tainted 4.20.0-rc7 #39
[  716.588469] Hardware name: ARM Juno development board (r0) (DT)
[  716.594326] pstate: 40000005 (nZcv daif -PAN -UAO)
[  716.599065] pc : refcount_dec_not_one+0x9c/0xc0
[  716.603546] lr : refcount_dec_not_one+0x9c/0xc0
[  716.608024] sp : ffff00000a063c70
[  716.611299] x29: ffff00000a063c70 x28: 0000000000000000 
[  716.616555] x27: 0000000000000000 x26: 0000000000000002 
[  716.621810] x25: ffff000009169000 x24: ffff000008f8e1b0 
[  716.627065] x23: ffff000008ce0920 x22: 00000000ffffffff 
[  716.632319] x21: ffff000009169000 x20: ffff8009762a2664 
[  716.637574] x19: ffff000009294a90 x18: 0000000000000400 
[  716.642828] x17: 0000000000000000 x16: 0000000000000000 
[  716.648082] x15: 0000000000000000 x14: 0000000000000400 
[  716.653336] x13: 000000000000023f x12: 0000000000043705 
[  716.658590] x11: 0000000000000108 x10: 0000000000000960 
[  716.663844] x9 : ffff00000a063970 x8 : ffff800976943ec0 
[  716.669098] x7 : 0000000000000000 x6 : ffff80097ff720b8 
[  716.674353] x5 : ffff80097ff720b8 x4 : 0000000000000000 
[  716.679607] x3 : ffff80097ff78e68 x2 : ffff80097ff720b8 
[  716.684861] x1 : 6374e2a7925c1100 x0 : 0000000000000000 
[  716.690115] Call trace:
[  716.692532]  refcount_dec_not_one+0x9c/0xc0
[  716.696669]  refcount_dec_and_mutex_lock+0x18/0x70
[  716.701409]  _put_opp_list_kref+0x28/0x50
[  716.705373]  _dev_pm_opp_find_and_remove_table+0x24/0x88
[  716.710628]  _dev_pm_opp_cpumask_remove_table+0x50/0xa0
[  716.715796]  dev_pm_opp_cpumask_remove_table+0x10/0x18
[  716.720879]  scpi_cpufreq_exit+0x40/0x50
[  716.724758]  cpufreq_offline+0x108/0x1e0
[  716.728637]  cpuhp_cpufreq_offline+0xc/0x18
[  716.732775]  cpuhp_invoke_callback+0x84/0x248
[  716.737084]  cpuhp_thread_fun+0xc4/0x148
[  716.740963]  smpboot_thread_fn+0x168/0x268
[  716.745013]  kthread+0x128/0x130
[  716.748204]  ret_from_fork+0x10/0x18
[  716.751738] ---[ end trace 0c658e0103aac29d ]---

The test produces a script [2] that can be found at the end of this email.

Kernel:
	7566ec393f41 ("Linux 4.20-rc7")
Config:
	arm64 defconfig w/ CONFIG_MOUSE_PS2=n
Firmware:
	ARM V2M_Juno Firmware v1.4.4
	Build Date: Jul 26 2016

	NOTICE:  BL31: v1.3(debug):v1.3-567-g3fb340a2
	NOTICE:  BL31: Built : 18:52:35, Apr 25 2017

Cheers,
Valentin

---

[1]: https://github.com/ARM-software/lisa/blob/next/lisa/tests/kernel/hotplug/torture.py

[2]: random_cpuhp.sh
#!/bin/sh
set -e
while true
do
    echo 0 > /sys/devices/system/cpu/cpu5/online
    sleep 0.055
    echo 0 > /sys/devices/system/cpu/cpu2/online
    sleep 0.084
    echo 0 > /sys/devices/system/cpu/cpu0/online
    sleep 0.014
    echo 1 > /sys/devices/system/cpu/cpu0/online
    sleep 0.069
    echo 1 > /sys/devices/system/cpu/cpu5/online
    sleep 0.037
    echo 1 > /sys/devices/system/cpu/cpu2/online
    sleep 0.075
    echo 0 > /sys/devices/system/cpu/cpu5/online
    sleep 0.088
    echo 0 > /sys/devices/system/cpu/cpu0/online
    sleep 0.064
    echo 0 > /sys/devices/system/cpu/cpu4/online
    sleep 0.049
    echo 0 > /sys/devices/system/cpu/cpu1/online
    sleep 0.024
    echo 0 > /sys/devices/system/cpu/cpu2/online
    sleep 0.097
    echo 1 > /sys/devices/system/cpu/cpu2/online
    sleep 0.013
    echo 1 > /sys/devices/system/cpu/cpu4/online
    sleep 0.094
    echo 0 > /sys/devices/system/cpu/cpu4/online
    sleep 0.073
    echo 1 > /sys/devices/system/cpu/cpu0/online
    sleep 0.022
    echo 1 > /sys/devices/system/cpu/cpu5/online
    sleep 0.057
    echo 0 > /sys/devices/system/cpu/cpu3/online
    sleep 0.054
    echo 1 > /sys/devices/system/cpu/cpu4/online
    sleep 0.022
    echo 1 > /sys/devices/system/cpu/cpu1/online
    sleep 0.018
    echo 1 > /sys/devices/system/cpu/cpu3/online
    sleep 0.057
    echo 0 > /sys/devices/system/cpu/cpu0/online
    sleep 0.046
    echo 0 > /sys/devices/system/cpu/cpu5/online
    sleep 0.018
    echo 0 > /sys/devices/system/cpu/cpu2/online
    sleep 0.016
    echo 0 > /sys/devices/system/cpu/cpu4/online
    sleep 0.016
    echo 0 > /sys/devices/system/cpu/cpu3/online
    sleep 0.044
    echo 1 > /sys/devices/system/cpu/cpu2/online
    sleep 0.046
    echo 1 > /sys/devices/system/cpu/cpu4/online
    sleep 0.093
    echo 0 > /sys/devices/system/cpu/cpu1/online
    sleep 0.098
    echo 0 > /sys/devices/system/cpu/cpu4/online
    sleep 0.072
    echo 1 > /sys/devices/system/cpu/cpu5/online
    sleep 0.013
    echo 1 > /sys/devices/system/cpu/cpu1/online
    sleep 0.099
    echo 0 > /sys/devices/system/cpu/cpu5/online
    sleep 0.07
    echo 1 > /sys/devices/system/cpu/cpu5/online
    sleep 0.022
    echo 0 > /sys/devices/system/cpu/cpu1/online
    sleep 0.041
    echo 0 > /sys/devices/system/cpu/cpu5/online
    sleep 0.098
    echo 1 > /sys/devices/system/cpu/cpu4/online
    sleep 0.032
    echo 0 > /sys/devices/system/cpu/cpu4/online
    sleep 0.043
    echo 1 > /sys/devices/system/cpu/cpu0/online
    sleep 0.076
    echo 0 > /sys/devices/system/cpu/cpu2/online
    sleep 0.072
    echo 1 > /sys/devices/system/cpu/cpu5/online
    sleep 0.036
    echo 1 > /sys/devices/system/cpu/cpu2/online
    sleep 0.042
    echo 0 > /sys/devices/system/cpu/cpu2/online
    sleep 0.016
    echo 0 > /sys/devices/system/cpu/cpu0/online
    sleep 0.07
    echo 1 > /sys/devices/system/cpu/cpu4/online
    sleep 0.018
    echo 1 > /sys/devices/system/cpu/cpu0/online
    sleep 0.055
    echo 1 > /sys/devices/system/cpu/cpu2/online
    sleep 0.096
    echo 1 > /sys/devices/system/cpu/cpu1/online
    sleep 0.012
    echo 1 > /sys/devices/system/cpu/cpu3/online
    sleep 0.093
    echo 0 > /sys/devices/system/cpu/cpu1/online
    sleep 0.086
    echo 1 > /sys/devices/system/cpu/cpu1/online
    sleep 0.09
    echo 0 > /sys/devices/system/cpu/cpu0/online
    sleep 0.077
    echo 1 > /sys/devices/system/cpu/cpu0/online
    sleep 0.01
    echo 0 > /sys/devices/system/cpu/cpu2/online
    sleep 0.026
    echo 0 > /sys/devices/system/cpu/cpu1/online
    sleep 0.049
    echo 0 > /sys/devices/system/cpu/cpu0/online
    sleep 0.083
    echo 0 > /sys/devices/system/cpu/cpu3/online
    sleep 0.096
    echo 1 > /sys/devices/system/cpu/cpu0/online
    sleep 0.067
    echo 1 > /sys/devices/system/cpu/cpu1/online
    sleep 0.083
    echo 0 > /sys/devices/system/cpu/cpu4/online
    sleep 0.089
    echo 1 > /sys/devices/system/cpu/cpu2/online
    sleep 0.065
    echo 1 > /sys/devices/system/cpu/cpu4/online
    sleep 0.066
    echo 1 > /sys/devices/system/cpu/cpu3/online
    sleep 0.099
    echo 0 > /sys/devices/system/cpu/cpu5/online
    sleep 0.08
    echo 1 > /sys/devices/system/cpu/cpu5/online
    sleep 0.076
    echo 0 > /sys/devices/system/cpu/cpu1/online
    sleep 0.01
    echo 0 > /sys/devices/system/cpu/cpu3/online
    sleep 0.017
    echo 0 > /sys/devices/system/cpu/cpu5/online
    sleep 0.049
    echo 0 > /sys/devices/system/cpu/cpu2/online
    sleep 0.057
    echo 1 > /sys/devices/system/cpu/cpu1/online
    sleep 0.083
    echo 0 > /sys/devices/system/cpu/cpu4/online
    sleep 0.037
    echo 0 > /sys/devices/system/cpu/cpu0/online
    sleep 0.04
    echo 1 > /sys/devices/system/cpu/cpu2/online
    sleep 0.051
    echo 0 > /sys/devices/system/cpu/cpu1/online
    sleep 0.03
    echo 1 > /sys/devices/system/cpu/cpu3/online
    sleep 0.067
    echo 0 > /sys/devices/system/cpu/cpu3/online
    sleep 0.011
    echo 1 > /sys/devices/system/cpu/cpu4/online
    sleep 0.041
    echo 1 > /sys/devices/system/cpu/cpu3/online
    sleep 0.057
    echo 0 > /sys/devices/system/cpu/cpu2/online
    sleep 0.082
    echo 0 > /sys/devices/system/cpu/cpu3/online
    sleep 0.067
    echo 1 > /sys/devices/system/cpu/cpu2/online
    sleep 0.069
    echo 1 > /sys/devices/system/cpu/cpu3/online
    sleep 0.062
    echo 0 > /sys/devices/system/cpu/cpu2/online
    sleep 0.074
    echo 1 > /sys/devices/system/cpu/cpu5/online
    sleep 0.025
    echo 0 > /sys/devices/system/cpu/cpu5/online
    sleep 0.016
    echo 1 > /sys/devices/system/cpu/cpu1/online
    sleep 0.017
    echo 0 > /sys/devices/system/cpu/cpu1/online
    sleep 0.025
    echo 0 > /sys/devices/system/cpu/cpu3/online
    sleep 0.016
    echo 1 > /sys/devices/system/cpu/cpu5/online
    sleep 0.082
    echo 1 > /sys/devices/system/cpu/cpu2/online
    sleep 0.021
    echo 0 > /sys/devices/system/cpu/cpu5/online
    sleep 0.02
    echo 0 > /sys/devices/system/cpu/cpu2/online
    sleep 0.035
    echo 1 > /sys/devices/system/cpu/cpu1/online
    sleep 0.063
    echo 0 > /sys/devices/system/cpu/cpu1/online
    sleep 0.064
    echo 1 > /sys/devices/system/cpu/cpu2/online
    sleep 0.029
    echo 1 > /sys/devices/system/cpu/cpu1/online
    sleep 0.096
    echo 1 > /sys/devices/system/cpu/cpu3/online
    sleep 0.073
    echo 1 > /sys/devices/system/cpu/cpu0/online
    sleep 0.049
    echo 1 > /sys/devices/system/cpu/cpu5/online
    sleep 0.065
    echo 0 > /sys/devices/system/cpu/cpu2/online
    sleep 0.024
    echo 1 > /sys/devices/system/cpu/cpu2/online
    sleep 0.092
done &
LOOP_PID=$!
sleep 10
[ $(ps -q $LOOP_PID | wc -l) -gt 1 ] && kill -9 $LOOP_PID
set +e

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG] dev_pm_opp refcount issue on Arm Juno r0
  2018-12-20 15:27 [BUG] dev_pm_opp refcount issue on Arm Juno r0 Valentin Schneider
@ 2019-01-03  7:05 ` Viresh Kumar
  2019-01-03 10:33   ` Sudeep Holla
  2019-01-04  9:44 ` [PATCH] cpufreq: scpi/scmi: Fix freeing of dynamic OPPs Viresh Kumar
  1 sibling, 1 reply; 11+ messages in thread
From: Viresh Kumar @ 2019-01-03  7:05 UTC (permalink / raw)
  To: Valentin Schneider, Sudeep Holla
  Cc: linux-kernel, Linux PM, LAK, Rafael J. Wysocki, nm, sboyd,
	Quentin Perret, Dietmar Eggemann, Douglas Raillard

On 20-12-18, 15:27, Valentin Schneider wrote:
> Hi,
> 
> While running some hotplug torture test [1] on my Juno r0 I came across
> the follow splat:
> 
> [  716.561862] ------------[ cut here ]------------
> [  716.566451] refcount_t: underflow; use-after-free.
> [  716.571240] WARNING: CPU: 2 PID: 18 at lib/refcount.c:280 refcount_dec_not_one+0x9c/0xc0
> [  716.579246] Modules linked in:
> [  716.582269] CPU: 2 PID: 18 Comm: cpuhp/2 Not tainted 4.20.0-rc7 #39
> [  716.588469] Hardware name: ARM Juno development board (r0) (DT)
> [  716.594326] pstate: 40000005 (nZcv daif -PAN -UAO)
> [  716.599065] pc : refcount_dec_not_one+0x9c/0xc0
> [  716.603546] lr : refcount_dec_not_one+0x9c/0xc0
> [  716.608024] sp : ffff00000a063c70
> [  716.611299] x29: ffff00000a063c70 x28: 0000000000000000 
> [  716.616555] x27: 0000000000000000 x26: 0000000000000002 
> [  716.621810] x25: ffff000009169000 x24: ffff000008f8e1b0 
> [  716.627065] x23: ffff000008ce0920 x22: 00000000ffffffff 
> [  716.632319] x21: ffff000009169000 x20: ffff8009762a2664 
> [  716.637574] x19: ffff000009294a90 x18: 0000000000000400 
> [  716.642828] x17: 0000000000000000 x16: 0000000000000000 
> [  716.648082] x15: 0000000000000000 x14: 0000000000000400 
> [  716.653336] x13: 000000000000023f x12: 0000000000043705 
> [  716.658590] x11: 0000000000000108 x10: 0000000000000960 
> [  716.663844] x9 : ffff00000a063970 x8 : ffff800976943ec0 
> [  716.669098] x7 : 0000000000000000 x6 : ffff80097ff720b8 
> [  716.674353] x5 : ffff80097ff720b8 x4 : 0000000000000000 
> [  716.679607] x3 : ffff80097ff78e68 x2 : ffff80097ff720b8 
> [  716.684861] x1 : 6374e2a7925c1100 x0 : 0000000000000000 
> [  716.690115] Call trace:
> [  716.692532]  refcount_dec_not_one+0x9c/0xc0
> [  716.696669]  refcount_dec_and_mutex_lock+0x18/0x70
> [  716.701409]  _put_opp_list_kref+0x28/0x50
> [  716.705373]  _dev_pm_opp_find_and_remove_table+0x24/0x88
> [  716.710628]  _dev_pm_opp_cpumask_remove_table+0x50/0xa0
> [  716.715796]  dev_pm_opp_cpumask_remove_table+0x10/0x18
> [  716.720879]  scpi_cpufreq_exit+0x40/0x50
> [  716.724758]  cpufreq_offline+0x108/0x1e0
> [  716.728637]  cpuhp_cpufreq_offline+0xc/0x18

This probably happened due to some of recent OPP core changes and I missed
updating this platform (I updated mvebu though). The problem is completely
different from what you logs show :)

Please try the below patch.

@Sudeep: Please help review it as well.

-- 
viresh

-------------------------8<-------------------------

From f3913738618031e9d71ebf64461cee22909e6e20 Mon Sep 17 00:00:00 2001
Message-Id: <f3913738618031e9d71ebf64461cee22909e6e20.1546498940.git.viresh.kumar@linaro.org>
From: Viresh Kumar <viresh.kumar@linaro.org>
Date: Thu, 3 Jan 2019 12:28:26 +0530
Subject: [PATCH] cpufreq: scpi: Fix freeing of OPPs

Since the commit 2a4eb7358aba ("OPP: Don't remove dynamic OPPs from
_dev_pm_opp_remove_table()"), dynamically created OPP aren't
automatically removed anymore by dev_pm_opp_cpumask_remove_table().

The OPPs for scpi cpufreq driver aren't getting freed currently, fix
that by adding a new callback scpi_ops->remove_device_opps() which will
remove those OPPs.

Cc: 4.20 <stable@vger.kernel.org> # v4.20
Reported-by: Valentin Schneider <valentin.schneider@arm.com>
Fixes: 2a4eb7358aba ("OPP: Don't remove dynamic OPPs from _dev_pm_opp_remove_table()")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/scpi-cpufreq.c |  4 ++--
 drivers/firmware/arm_scpi.c    | 15 +++++++++++++++
 include/linux/scpi_protocol.h  |  1 +
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/scpi-cpufreq.c b/drivers/cpufreq/scpi-cpufreq.c
index 87a98ec77773..1bfd168de0b2 100644
--- a/drivers/cpufreq/scpi-cpufreq.c
+++ b/drivers/cpufreq/scpi-cpufreq.c
@@ -177,7 +177,7 @@ static int scpi_cpufreq_init(struct cpufreq_policy *policy)
 out_free_priv:
 	kfree(priv);
 out_free_opp:
-	dev_pm_opp_cpumask_remove_table(policy->cpus);
+	scpi_ops->remove_device_opps(cpu_dev);
 
 	return ret;
 }
@@ -190,7 +190,7 @@ static int scpi_cpufreq_exit(struct cpufreq_policy *policy)
 	clk_put(priv->clk);
 	dev_pm_opp_free_cpufreq_table(priv->cpu_dev, &policy->freq_table);
 	kfree(priv);
-	dev_pm_opp_cpumask_remove_table(policy->related_cpus);
+	scpi_ops->remove_device_opps(priv->cpu_dev);
 
 	return 0;
 }
diff --git a/drivers/firmware/arm_scpi.c b/drivers/firmware/arm_scpi.c
index c7d06a36b23a..963f2ffbd820 100644
--- a/drivers/firmware/arm_scpi.c
+++ b/drivers/firmware/arm_scpi.c
@@ -716,6 +716,20 @@ static int scpi_dvfs_add_opps_to_device(struct device *dev)
 	return 0;
 }
 
+static void scpi_dvfs_remove_device_opps(struct device *dev)
+{
+	int idx;
+	struct scpi_opp *opp;
+	struct scpi_dvfs_info *info = scpi_dvfs_info(dev);
+
+	/* We already added OPPs successfully, this data can't be invalid */
+	if (WARN_ON(IS_ERR(info) || !info->opps))
+		return;
+
+	for (opp = info->opps, idx = 0; idx < info->count; idx++, opp++)
+		dev_pm_opp_remove(dev, opp->freq);
+}
+
 static int scpi_sensor_get_capability(u16 *sensors)
 {
 	__le16 cap;
@@ -799,6 +813,7 @@ static struct scpi_ops scpi_ops = {
 	.device_domain_id = scpi_dev_domain_id,
 	.get_transition_latency = scpi_dvfs_get_transition_latency,
 	.add_opps_to_device = scpi_dvfs_add_opps_to_device,
+	.remove_device_opps = scpi_dvfs_remove_device_opps,
 	.sensor_get_capability = scpi_sensor_get_capability,
 	.sensor_get_info = scpi_sensor_get_info,
 	.sensor_get_value = scpi_sensor_get_value,
diff --git a/include/linux/scpi_protocol.h b/include/linux/scpi_protocol.h
index 327d65663dbf..d020d517ecaa 100644
--- a/include/linux/scpi_protocol.h
+++ b/include/linux/scpi_protocol.h
@@ -70,6 +70,7 @@ struct scpi_ops {
 	int (*device_domain_id)(struct device *);
 	int (*get_transition_latency)(struct device *);
 	int (*add_opps_to_device)(struct device *);
+	void (*remove_device_opps)(struct device *);
 	int (*sensor_get_capability)(u16 *sensors);
 	int (*sensor_get_info)(u16 sensor_id, struct scpi_sensor_info *);
 	int (*sensor_get_value)(u16, u64 *);

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [BUG] dev_pm_opp refcount issue on Arm Juno r0
  2019-01-03  7:05 ` Viresh Kumar
@ 2019-01-03 10:33   ` Sudeep Holla
  2019-01-03 10:38     ` Viresh Kumar
  0 siblings, 1 reply; 11+ messages in thread
From: Sudeep Holla @ 2019-01-03 10:33 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Valentin Schneider, linux-kernel, Linux PM, LAK,
	Rafael J. Wysocki, nm, sboyd, Quentin Perret, Dietmar Eggemann,
	Sudeep Holla, Douglas Raillard

On Thu, Jan 03, 2019 at 12:35:14PM +0530, Viresh Kumar wrote:

[...]

> @Sudeep: Please help review it as well.
> 
> -- 
> viresh
> 
> -------------------------8<-------------------------
> 
> From f3913738618031e9d71ebf64461cee22909e6e20 Mon Sep 17 00:00:00 2001
> Message-Id: <f3913738618031e9d71ebf64461cee22909e6e20.1546498940.git.viresh.kumar@linaro.org>
> From: Viresh Kumar <viresh.kumar@linaro.org>
> Date: Thu, 3 Jan 2019 12:28:26 +0530
> Subject: [PATCH] cpufreq: scpi: Fix freeing of OPPs
> 
> Since the commit 2a4eb7358aba ("OPP: Don't remove dynamic OPPs from
> _dev_pm_opp_remove_table()"), dynamically created OPP aren't
> automatically removed anymore by dev_pm_opp_cpumask_remove_table().
> 
> The OPPs for scpi cpufreq driver aren't getting freed currently, fix
> that by adding a new callback scpi_ops->remove_device_opps() which will
> remove those OPPs.
> 
> Cc: 4.20 <stable@vger.kernel.org> # v4.20
> Reported-by: Valentin Schneider <valentin.schneider@arm.com>
> Fixes: 2a4eb7358aba ("OPP: Don't remove dynamic OPPs from _dev_pm_opp_remove_table()")
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  drivers/cpufreq/scpi-cpufreq.c |  4 ++--
>  drivers/firmware/arm_scpi.c    | 15 +++++++++++++++
>  include/linux/scpi_protocol.h  |  1 +
>  3 files changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cpufreq/scpi-cpufreq.c b/drivers/cpufreq/scpi-cpufreq.c
> index 87a98ec77773..1bfd168de0b2 100644
> --- a/drivers/cpufreq/scpi-cpufreq.c
> +++ b/drivers/cpufreq/scpi-cpufreq.c
> @@ -177,7 +177,7 @@ static int scpi_cpufreq_init(struct cpufreq_policy *policy)
>  out_free_priv:
>  	kfree(priv);
>  out_free_opp:
> -	dev_pm_opp_cpumask_remove_table(policy->cpus);
> +	scpi_ops->remove_device_opps(cpu_dev);
>  
>  	return ret;
>  }
> @@ -190,7 +190,7 @@ static int scpi_cpufreq_exit(struct cpufreq_policy *policy)
>  	clk_put(priv->clk);
>  	dev_pm_opp_free_cpufreq_table(priv->cpu_dev, &policy->freq_table);
>  	kfree(priv);
> -	dev_pm_opp_cpumask_remove_table(policy->related_cpus);
> +	scpi_ops->remove_device_opps(priv->cpu_dev);
>  
>  	return 0;
>  }
> diff --git a/drivers/firmware/arm_scpi.c b/drivers/firmware/arm_scpi.c
> index c7d06a36b23a..963f2ffbd820 100644
> --- a/drivers/firmware/arm_scpi.c
> +++ b/drivers/firmware/arm_scpi.c
> @@ -716,6 +716,20 @@ static int scpi_dvfs_add_opps_to_device(struct device *dev)
>  	return 0;
>  }
>  
> +static void scpi_dvfs_remove_device_opps(struct device *dev)
> +{
> +	int idx;
> +	struct scpi_opp *opp;
> +	struct scpi_dvfs_info *info = scpi_dvfs_info(dev);
> +
> +	/* We already added OPPs successfully, this data can't be invalid */

As you already state the checks are unnecessary, if we drop them we don't need
to add any firmware specific callbacks. I am thinking if it makes sense to
add a generic helper function to remove the OPPs from a device. If we have
that any driver needing that can use it. The main reason I think helper is
useful is that we need exactly same fix for SCMI driver too.

Thoughts ?

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG] dev_pm_opp refcount issue on Arm Juno r0
  2019-01-03 10:33   ` Sudeep Holla
@ 2019-01-03 10:38     ` Viresh Kumar
  2019-01-03 10:41       ` Sudeep Holla
  0 siblings, 1 reply; 11+ messages in thread
From: Viresh Kumar @ 2019-01-03 10:38 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Valentin Schneider, linux-kernel, Linux PM, LAK,
	Rafael J. Wysocki, nm, sboyd, Quentin Perret, Dietmar Eggemann,
	Douglas Raillard

On 03-01-19, 10:33, Sudeep Holla wrote:
> As you already state the checks are unnecessary, if we drop them we don't need
> to add any firmware specific callbacks. I am thinking if it makes sense to
> add a generic helper function to remove the OPPs from a device. If we have
> that any driver needing that can use it. The main reason I think helper is
> useful is that we need exactly same fix for SCMI driver too.
> 
> Thoughts ?

Okay, will do that and fix both the drivers as well.

-- 
viresh

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG] dev_pm_opp refcount issue on Arm Juno r0
  2019-01-03 10:38     ` Viresh Kumar
@ 2019-01-03 10:41       ` Sudeep Holla
  0 siblings, 0 replies; 11+ messages in thread
From: Sudeep Holla @ 2019-01-03 10:41 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Valentin Schneider, linux-kernel, Linux PM, LAK,
	Rafael J. Wysocki, nm, sboyd, Quentin Perret, Dietmar Eggemann,
	Sudeep Holla, Douglas Raillard

On Thu, Jan 03, 2019 at 04:08:29PM +0530, Viresh Kumar wrote:
> On 03-01-19, 10:33, Sudeep Holla wrote:
> > As you already state the checks are unnecessary, if we drop them we don't need
> > to add any firmware specific callbacks. I am thinking if it makes sense to
> > add a generic helper function to remove the OPPs from a device. If we have
> > that any driver needing that can use it. The main reason I think helper is
> > useful is that we need exactly same fix for SCMI driver too.
> >
> > Thoughts ?
>
> Okay, will do that and fix both the drivers as well.
>

Thanks, happy to test here.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] cpufreq: scpi/scmi: Fix freeing of dynamic OPPs
  2018-12-20 15:27 [BUG] dev_pm_opp refcount issue on Arm Juno r0 Valentin Schneider
  2019-01-03  7:05 ` Viresh Kumar
@ 2019-01-04  9:44 ` Viresh Kumar
  2019-01-04 10:10   ` Rafael J. Wysocki
                     ` (2 more replies)
  1 sibling, 3 replies; 11+ messages in thread
From: Viresh Kumar @ 2019-01-04  9:44 UTC (permalink / raw)
  To: Valentin Schneider, Sudeep Holla, Rafael J. Wysocki,
	Nishanth Menon, Stephen Boyd
  Cc: Viresh Kumar, linux-pm, Vincent Guittot, quentin.perret,
	dietmar.eggemann, Douglas.Raillard, 4 . 20, linux-arm-kernel,
	linux-kernel

Since the commit 2a4eb7358aba ("OPP: Don't remove dynamic OPPs from
_dev_pm_opp_remove_table()"), dynamically created OPP aren't
automatically removed anymore by dev_pm_opp_cpumask_remove_table(). This
affects the scpi and scmi cpufreq drivers which no longer free OPPs on
failures or on invocations of the policy->exit() callback.

Create a generic OPP helper dev_pm_opp_remove_all_dynamic() which can be
called from these drivers instead of dev_pm_opp_cpumask_remove_table().

In dev_pm_opp_remove_all_dynamic(), we need to make sure that the
opp_list isn't getting accessed simultaneously from other parts of the
OPP core while the helper is freeing dynamic OPPs, i.e. we can't drop
the opp_table->lock while traversing through the OPP list. And to
accomplish that, this patch also creates _opp_kref_release_unlocked()
which can be called from this new helper with the opp_table lock already
held.

Cc: 4.20 <stable@vger.kernel.org> # v4.20
Reported-by: Valentin Schneider <valentin.schneider@arm.com>
Fixes: 2a4eb7358aba ("OPP: Don't remove dynamic OPPs from _dev_pm_opp_remove_table()")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/scmi-cpufreq.c |  4 +--
 drivers/cpufreq/scpi-cpufreq.c |  4 +--
 drivers/opp/core.c             | 63 +++++++++++++++++++++++++++++++---
 include/linux/pm_opp.h         |  5 +++
 4 files changed, 67 insertions(+), 9 deletions(-)

diff --git a/drivers/cpufreq/scmi-cpufreq.c b/drivers/cpufreq/scmi-cpufreq.c
index 50b1551ba894..c2e66528f5ee 100644
--- a/drivers/cpufreq/scmi-cpufreq.c
+++ b/drivers/cpufreq/scmi-cpufreq.c
@@ -176,7 +176,7 @@ static int scmi_cpufreq_init(struct cpufreq_policy *policy)
 out_free_priv:
 	kfree(priv);
 out_free_opp:
-	dev_pm_opp_cpumask_remove_table(policy->cpus);
+	dev_pm_opp_remove_all_dynamic(cpu_dev);
 
 	return ret;
 }
@@ -188,7 +188,7 @@ static int scmi_cpufreq_exit(struct cpufreq_policy *policy)
 	cpufreq_cooling_unregister(priv->cdev);
 	dev_pm_opp_free_cpufreq_table(priv->cpu_dev, &policy->freq_table);
 	kfree(priv);
-	dev_pm_opp_cpumask_remove_table(policy->related_cpus);
+	dev_pm_opp_remove_all_dynamic(priv->cpu_dev);
 
 	return 0;
 }
diff --git a/drivers/cpufreq/scpi-cpufreq.c b/drivers/cpufreq/scpi-cpufreq.c
index 87a98ec77773..99449738faa4 100644
--- a/drivers/cpufreq/scpi-cpufreq.c
+++ b/drivers/cpufreq/scpi-cpufreq.c
@@ -177,7 +177,7 @@ static int scpi_cpufreq_init(struct cpufreq_policy *policy)
 out_free_priv:
 	kfree(priv);
 out_free_opp:
-	dev_pm_opp_cpumask_remove_table(policy->cpus);
+	dev_pm_opp_remove_all_dynamic(cpu_dev);
 
 	return ret;
 }
@@ -190,7 +190,7 @@ static int scpi_cpufreq_exit(struct cpufreq_policy *policy)
 	clk_put(priv->clk);
 	dev_pm_opp_free_cpufreq_table(priv->cpu_dev, &policy->freq_table);
 	kfree(priv);
-	dev_pm_opp_cpumask_remove_table(policy->related_cpus);
+	dev_pm_opp_remove_all_dynamic(priv->cpu_dev);
 
 	return 0;
 }
diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index e5507add8f04..18f1639dbc4a 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -988,11 +988,9 @@ void _opp_free(struct dev_pm_opp *opp)
 	kfree(opp);
 }
 
-static void _opp_kref_release(struct kref *kref)
+static void _opp_kref_release(struct dev_pm_opp *opp,
+			      struct opp_table *opp_table)
 {
-	struct dev_pm_opp *opp = container_of(kref, struct dev_pm_opp, kref);
-	struct opp_table *opp_table = opp->opp_table;
-
 	/*
 	 * Notify the changes in the availability of the operable
 	 * frequency/voltage list.
@@ -1002,7 +1000,22 @@ static void _opp_kref_release(struct kref *kref)
 	opp_debug_remove_one(opp);
 	list_del(&opp->node);
 	kfree(opp);
+}
 
+static void _opp_kref_release_unlocked(struct kref *kref)
+{
+	struct dev_pm_opp *opp = container_of(kref, struct dev_pm_opp, kref);
+	struct opp_table *opp_table = opp->opp_table;
+
+	_opp_kref_release(opp, opp_table);
+}
+
+static void _opp_kref_release_locked(struct kref *kref)
+{
+	struct dev_pm_opp *opp = container_of(kref, struct dev_pm_opp, kref);
+	struct opp_table *opp_table = opp->opp_table;
+
+	_opp_kref_release(opp, opp_table);
 	mutex_unlock(&opp_table->lock);
 }
 
@@ -1013,10 +1026,16 @@ void dev_pm_opp_get(struct dev_pm_opp *opp)
 
 void dev_pm_opp_put(struct dev_pm_opp *opp)
 {
-	kref_put_mutex(&opp->kref, _opp_kref_release, &opp->opp_table->lock);
+	kref_put_mutex(&opp->kref, _opp_kref_release_locked,
+		       &opp->opp_table->lock);
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_put);
 
+static void dev_pm_opp_put_unlocked(struct dev_pm_opp *opp)
+{
+	kref_put(&opp->kref, _opp_kref_release_unlocked);
+}
+
 /**
  * dev_pm_opp_remove()  - Remove an OPP from OPP table
  * @dev:	device for which we do this operation
@@ -1060,6 +1079,40 @@ void dev_pm_opp_remove(struct device *dev, unsigned long freq)
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_remove);
 
+/**
+ * dev_pm_opp_remove_all_dynamic() - Remove all dynamically created OPPs
+ * @dev:	device for which we do this operation
+ *
+ * This function removes all dynamically created OPPs from the opp table.
+ */
+void dev_pm_opp_remove_all_dynamic(struct device *dev)
+{
+	struct opp_table *opp_table;
+	struct dev_pm_opp *opp, *temp;
+	int count = 0;
+
+	opp_table = _find_opp_table(dev);
+	if (IS_ERR(opp_table))
+		return;
+
+	mutex_lock(&opp_table->lock);
+	list_for_each_entry_safe(opp, temp, &opp_table->opp_list, node) {
+		if (opp->dynamic) {
+			dev_pm_opp_put_unlocked(opp);
+			count++;
+		}
+	}
+	mutex_unlock(&opp_table->lock);
+
+	/* Drop the references taken by dev_pm_opp_add() */
+	while (count--)
+		dev_pm_opp_put_opp_table(opp_table);
+
+	/* Drop the reference taken by _find_opp_table() */
+	dev_pm_opp_put_opp_table(opp_table);
+}
+EXPORT_SYMBOL_GPL(dev_pm_opp_remove_all_dynamic);
+
 struct dev_pm_opp *_opp_allocate(struct opp_table *table)
 {
 	struct dev_pm_opp *opp;
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index 0a2a88e5a383..b895f4e79868 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -108,6 +108,7 @@ void dev_pm_opp_put(struct dev_pm_opp *opp);
 int dev_pm_opp_add(struct device *dev, unsigned long freq,
 		   unsigned long u_volt);
 void dev_pm_opp_remove(struct device *dev, unsigned long freq);
+void dev_pm_opp_remove_all_dynamic(struct device *dev);
 
 int dev_pm_opp_enable(struct device *dev, unsigned long freq);
 
@@ -217,6 +218,10 @@ static inline void dev_pm_opp_remove(struct device *dev, unsigned long freq)
 {
 }
 
+static inline void dev_pm_opp_remove_all_dynamic(struct device *dev)
+{
+}
+
 static inline int dev_pm_opp_enable(struct device *dev, unsigned long freq)
 {
 	return 0;
-- 
2.19.1.568.g152ad8e3369a


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] cpufreq: scpi/scmi: Fix freeing of dynamic OPPs
  2019-01-04  9:44 ` [PATCH] cpufreq: scpi/scmi: Fix freeing of dynamic OPPs Viresh Kumar
@ 2019-01-04 10:10   ` Rafael J. Wysocki
  2019-01-04 10:16     ` Viresh Kumar
  2019-01-04 10:40   ` Valentin Schneider
  2019-01-04 11:01   ` Sudeep Holla
  2 siblings, 1 reply; 11+ messages in thread
From: Rafael J. Wysocki @ 2019-01-04 10:10 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Valentin Schneider, Sudeep Holla, Rafael J. Wysocki,
	Nishanth Menon, Stephen Boyd, Linux PM, Vincent Guittot,
	Quentin Perret, Dietmar Eggemann, Douglas.Raillard, 4 . 20,
	Linux ARM, Linux Kernel Mailing List

On Fri, Jan 4, 2019 at 10:44 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> Since the commit 2a4eb7358aba ("OPP: Don't remove dynamic OPPs from
> _dev_pm_opp_remove_table()"), dynamically created OPP aren't
> automatically removed anymore by dev_pm_opp_cpumask_remove_table(). This
> affects the scpi and scmi cpufreq drivers which no longer free OPPs on
> failures or on invocations of the policy->exit() callback.
>
> Create a generic OPP helper dev_pm_opp_remove_all_dynamic() which can be
> called from these drivers instead of dev_pm_opp_cpumask_remove_table().
>
> In dev_pm_opp_remove_all_dynamic(), we need to make sure that the
> opp_list isn't getting accessed simultaneously from other parts of the
> OPP core while the helper is freeing dynamic OPPs, i.e. we can't drop
> the opp_table->lock while traversing through the OPP list. And to
> accomplish that, this patch also creates _opp_kref_release_unlocked()
> which can be called from this new helper with the opp_table lock already
> held.
>
> Cc: 4.20 <stable@vger.kernel.org> # v4.20
> Reported-by: Valentin Schneider <valentin.schneider@arm.com>
> Fixes: 2a4eb7358aba ("OPP: Don't remove dynamic OPPs from _dev_pm_opp_remove_table()")
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

I guess I'll pick it up by hand.

I'm assuming that you have tested it, have you?

> ---
>  drivers/cpufreq/scmi-cpufreq.c |  4 +--
>  drivers/cpufreq/scpi-cpufreq.c |  4 +--
>  drivers/opp/core.c             | 63 +++++++++++++++++++++++++++++++---
>  include/linux/pm_opp.h         |  5 +++
>  4 files changed, 67 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/cpufreq/scmi-cpufreq.c b/drivers/cpufreq/scmi-cpufreq.c
> index 50b1551ba894..c2e66528f5ee 100644
> --- a/drivers/cpufreq/scmi-cpufreq.c
> +++ b/drivers/cpufreq/scmi-cpufreq.c
> @@ -176,7 +176,7 @@ static int scmi_cpufreq_init(struct cpufreq_policy *policy)
>  out_free_priv:
>         kfree(priv);
>  out_free_opp:
> -       dev_pm_opp_cpumask_remove_table(policy->cpus);
> +       dev_pm_opp_remove_all_dynamic(cpu_dev);
>
>         return ret;
>  }
> @@ -188,7 +188,7 @@ static int scmi_cpufreq_exit(struct cpufreq_policy *policy)
>         cpufreq_cooling_unregister(priv->cdev);
>         dev_pm_opp_free_cpufreq_table(priv->cpu_dev, &policy->freq_table);
>         kfree(priv);
> -       dev_pm_opp_cpumask_remove_table(policy->related_cpus);
> +       dev_pm_opp_remove_all_dynamic(priv->cpu_dev);
>
>         return 0;
>  }
> diff --git a/drivers/cpufreq/scpi-cpufreq.c b/drivers/cpufreq/scpi-cpufreq.c
> index 87a98ec77773..99449738faa4 100644
> --- a/drivers/cpufreq/scpi-cpufreq.c
> +++ b/drivers/cpufreq/scpi-cpufreq.c
> @@ -177,7 +177,7 @@ static int scpi_cpufreq_init(struct cpufreq_policy *policy)
>  out_free_priv:
>         kfree(priv);
>  out_free_opp:
> -       dev_pm_opp_cpumask_remove_table(policy->cpus);
> +       dev_pm_opp_remove_all_dynamic(cpu_dev);
>
>         return ret;
>  }
> @@ -190,7 +190,7 @@ static int scpi_cpufreq_exit(struct cpufreq_policy *policy)
>         clk_put(priv->clk);
>         dev_pm_opp_free_cpufreq_table(priv->cpu_dev, &policy->freq_table);
>         kfree(priv);
> -       dev_pm_opp_cpumask_remove_table(policy->related_cpus);
> +       dev_pm_opp_remove_all_dynamic(priv->cpu_dev);
>
>         return 0;
>  }
> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index e5507add8f04..18f1639dbc4a 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -988,11 +988,9 @@ void _opp_free(struct dev_pm_opp *opp)
>         kfree(opp);
>  }
>
> -static void _opp_kref_release(struct kref *kref)
> +static void _opp_kref_release(struct dev_pm_opp *opp,
> +                             struct opp_table *opp_table)
>  {
> -       struct dev_pm_opp *opp = container_of(kref, struct dev_pm_opp, kref);
> -       struct opp_table *opp_table = opp->opp_table;
> -
>         /*
>          * Notify the changes in the availability of the operable
>          * frequency/voltage list.
> @@ -1002,7 +1000,22 @@ static void _opp_kref_release(struct kref *kref)
>         opp_debug_remove_one(opp);
>         list_del(&opp->node);
>         kfree(opp);
> +}
>
> +static void _opp_kref_release_unlocked(struct kref *kref)
> +{
> +       struct dev_pm_opp *opp = container_of(kref, struct dev_pm_opp, kref);
> +       struct opp_table *opp_table = opp->opp_table;
> +
> +       _opp_kref_release(opp, opp_table);
> +}
> +
> +static void _opp_kref_release_locked(struct kref *kref)
> +{
> +       struct dev_pm_opp *opp = container_of(kref, struct dev_pm_opp, kref);
> +       struct opp_table *opp_table = opp->opp_table;
> +
> +       _opp_kref_release(opp, opp_table);
>         mutex_unlock(&opp_table->lock);
>  }
>
> @@ -1013,10 +1026,16 @@ void dev_pm_opp_get(struct dev_pm_opp *opp)
>
>  void dev_pm_opp_put(struct dev_pm_opp *opp)
>  {
> -       kref_put_mutex(&opp->kref, _opp_kref_release, &opp->opp_table->lock);
> +       kref_put_mutex(&opp->kref, _opp_kref_release_locked,
> +                      &opp->opp_table->lock);
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_opp_put);
>
> +static void dev_pm_opp_put_unlocked(struct dev_pm_opp *opp)
> +{
> +       kref_put(&opp->kref, _opp_kref_release_unlocked);
> +}
> +
>  /**
>   * dev_pm_opp_remove()  - Remove an OPP from OPP table
>   * @dev:       device for which we do this operation
> @@ -1060,6 +1079,40 @@ void dev_pm_opp_remove(struct device *dev, unsigned long freq)
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_opp_remove);
>
> +/**
> + * dev_pm_opp_remove_all_dynamic() - Remove all dynamically created OPPs
> + * @dev:       device for which we do this operation
> + *
> + * This function removes all dynamically created OPPs from the opp table.
> + */
> +void dev_pm_opp_remove_all_dynamic(struct device *dev)
> +{
> +       struct opp_table *opp_table;
> +       struct dev_pm_opp *opp, *temp;
> +       int count = 0;
> +
> +       opp_table = _find_opp_table(dev);
> +       if (IS_ERR(opp_table))
> +               return;
> +
> +       mutex_lock(&opp_table->lock);
> +       list_for_each_entry_safe(opp, temp, &opp_table->opp_list, node) {
> +               if (opp->dynamic) {
> +                       dev_pm_opp_put_unlocked(opp);
> +                       count++;
> +               }
> +       }
> +       mutex_unlock(&opp_table->lock);
> +
> +       /* Drop the references taken by dev_pm_opp_add() */
> +       while (count--)
> +               dev_pm_opp_put_opp_table(opp_table);
> +
> +       /* Drop the reference taken by _find_opp_table() */
> +       dev_pm_opp_put_opp_table(opp_table);
> +}
> +EXPORT_SYMBOL_GPL(dev_pm_opp_remove_all_dynamic);
> +
>  struct dev_pm_opp *_opp_allocate(struct opp_table *table)
>  {
>         struct dev_pm_opp *opp;
> diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
> index 0a2a88e5a383..b895f4e79868 100644
> --- a/include/linux/pm_opp.h
> +++ b/include/linux/pm_opp.h
> @@ -108,6 +108,7 @@ void dev_pm_opp_put(struct dev_pm_opp *opp);
>  int dev_pm_opp_add(struct device *dev, unsigned long freq,
>                    unsigned long u_volt);
>  void dev_pm_opp_remove(struct device *dev, unsigned long freq);
> +void dev_pm_opp_remove_all_dynamic(struct device *dev);
>
>  int dev_pm_opp_enable(struct device *dev, unsigned long freq);
>
> @@ -217,6 +218,10 @@ static inline void dev_pm_opp_remove(struct device *dev, unsigned long freq)
>  {
>  }
>
> +static inline void dev_pm_opp_remove_all_dynamic(struct device *dev)
> +{
> +}
> +
>  static inline int dev_pm_opp_enable(struct device *dev, unsigned long freq)
>  {
>         return 0;
> --
> 2.19.1.568.g152ad8e3369a
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] cpufreq: scpi/scmi: Fix freeing of dynamic OPPs
  2019-01-04 10:10   ` Rafael J. Wysocki
@ 2019-01-04 10:16     ` Viresh Kumar
  0 siblings, 0 replies; 11+ messages in thread
From: Viresh Kumar @ 2019-01-04 10:16 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Valentin Schneider, Sudeep Holla, Rafael J. Wysocki,
	Nishanth Menon, Stephen Boyd, Linux PM, Vincent Guittot,
	Quentin Perret, Dietmar Eggemann, Douglas.Raillard, 4 . 20,
	Linux ARM, Linux Kernel Mailing List

On 04-01-19, 11:10, Rafael J. Wysocki wrote:
> On Fri, Jan 4, 2019 at 10:44 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >
> > Since the commit 2a4eb7358aba ("OPP: Don't remove dynamic OPPs from
> > _dev_pm_opp_remove_table()"), dynamically created OPP aren't
> > automatically removed anymore by dev_pm_opp_cpumask_remove_table(). This
> > affects the scpi and scmi cpufreq drivers which no longer free OPPs on
> > failures or on invocations of the policy->exit() callback.
> >
> > Create a generic OPP helper dev_pm_opp_remove_all_dynamic() which can be
> > called from these drivers instead of dev_pm_opp_cpumask_remove_table().
> >
> > In dev_pm_opp_remove_all_dynamic(), we need to make sure that the
> > opp_list isn't getting accessed simultaneously from other parts of the
> > OPP core while the helper is freeing dynamic OPPs, i.e. we can't drop
> > the opp_table->lock while traversing through the OPP list. And to
> > accomplish that, this patch also creates _opp_kref_release_unlocked()
> > which can be called from this new helper with the opp_table lock already
> > held.
> >
> > Cc: 4.20 <stable@vger.kernel.org> # v4.20
> > Reported-by: Valentin Schneider <valentin.schneider@arm.com>
> > Fixes: 2a4eb7358aba ("OPP: Don't remove dynamic OPPs from _dev_pm_opp_remove_table()")
> > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> 
> I guess I'll pick it up by hand.

Sure.

> I'm assuming that you have tested it, have you?

Yes, but I had to fake few dynamic OPPs and ignore the static ones
coming from DT. Lets wait for Sudeep or Valentin to test this, who
have real hardware to fix with this patch.

-- 
viresh

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] cpufreq: scpi/scmi: Fix freeing of dynamic OPPs
  2019-01-04  9:44 ` [PATCH] cpufreq: scpi/scmi: Fix freeing of dynamic OPPs Viresh Kumar
  2019-01-04 10:10   ` Rafael J. Wysocki
@ 2019-01-04 10:40   ` Valentin Schneider
  2019-01-04 11:01   ` Sudeep Holla
  2 siblings, 0 replies; 11+ messages in thread
From: Valentin Schneider @ 2019-01-04 10:40 UTC (permalink / raw)
  To: Viresh Kumar, Sudeep Holla, Rafael J. Wysocki, Nishanth Menon,
	Stephen Boyd
  Cc: linux-pm, Vincent Guittot, quentin.perret, dietmar.eggemann,
	Douglas.Raillard, 4 . 20, linux-arm-kernel, linux-kernel

Hi,

On 04/01/2019 09:44, Viresh Kumar wrote:
> Since the commit 2a4eb7358aba ("OPP: Don't remove dynamic OPPs from
> _dev_pm_opp_remove_table()"), dynamically created OPP aren't
> automatically removed anymore by dev_pm_opp_cpumask_remove_table(). This
> affects the scpi and scmi cpufreq drivers which no longer free OPPs on
> failures or on invocations of the policy->exit() callback.
> 
> Create a generic OPP helper dev_pm_opp_remove_all_dynamic() which can be
> called from these drivers instead of dev_pm_opp_cpumask_remove_table().
> 
> In dev_pm_opp_remove_all_dynamic(), we need to make sure that the
> opp_list isn't getting accessed simultaneously from other parts of the
> OPP core while the helper is freeing dynamic OPPs, i.e. we can't drop
> the opp_table->lock while traversing through the OPP list. And to
> accomplish that, this patch also creates _opp_kref_release_unlocked()
> which can be called from this new helper with the opp_table lock already
> held.
> 
> Cc: 4.20 <stable@vger.kernel.org> # v4.20
> Reported-by: Valentin Schneider <valentin.schneider@arm.com>
> Fixes: 2a4eb7358aba ("OPP: Don't remove dynamic OPPs from _dev_pm_opp_remove_table()")
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

Gave it a spin on my Juno, and I don't see the refcount warning anymore so:

Tested-by: Valentin Schneider <valentin.schneider@arm.com>

Thanks for having a look at this.

[...]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] cpufreq: scpi/scmi: Fix freeing of dynamic OPPs
  2019-01-04  9:44 ` [PATCH] cpufreq: scpi/scmi: Fix freeing of dynamic OPPs Viresh Kumar
  2019-01-04 10:10   ` Rafael J. Wysocki
  2019-01-04 10:40   ` Valentin Schneider
@ 2019-01-04 11:01   ` Sudeep Holla
  2019-01-11 10:37     ` Rafael J. Wysocki
  2 siblings, 1 reply; 11+ messages in thread
From: Sudeep Holla @ 2019-01-04 11:01 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Valentin Schneider, Rafael J. Wysocki, Nishanth Menon,
	Stephen Boyd, linux-pm, Vincent Guittot, quentin.perret,
	dietmar.eggemann, Sudeep Holla, Douglas.Raillard, 4 . 20,
	linux-arm-kernel, linux-kernel

On Fri, Jan 04, 2019 at 03:14:33PM +0530, Viresh Kumar wrote:
> Since the commit 2a4eb7358aba ("OPP: Don't remove dynamic OPPs from
> _dev_pm_opp_remove_table()"), dynamically created OPP aren't
> automatically removed anymore by dev_pm_opp_cpumask_remove_table(). This
> affects the scpi and scmi cpufreq drivers which no longer free OPPs on
> failures or on invocations of the policy->exit() callback.
>
> Create a generic OPP helper dev_pm_opp_remove_all_dynamic() which can be
> called from these drivers instead of dev_pm_opp_cpumask_remove_table().
>
> In dev_pm_opp_remove_all_dynamic(), we need to make sure that the
> opp_list isn't getting accessed simultaneously from other parts of the
> OPP core while the helper is freeing dynamic OPPs, i.e. we can't drop
> the opp_table->lock while traversing through the OPP list. And to
> accomplish that, this patch also creates _opp_kref_release_unlocked()
> which can be called from this new helper with the opp_table lock already
> held.
>

I did test it but since I couldn't reproduce the original issue, my
tested-by is not that worth. Anyways the changes look fine to me.

Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] cpufreq: scpi/scmi: Fix freeing of dynamic OPPs
  2019-01-04 11:01   ` Sudeep Holla
@ 2019-01-11 10:37     ` Rafael J. Wysocki
  0 siblings, 0 replies; 11+ messages in thread
From: Rafael J. Wysocki @ 2019-01-11 10:37 UTC (permalink / raw)
  To: Sudeep Holla, Viresh Kumar
  Cc: Valentin Schneider, Nishanth Menon, Stephen Boyd, linux-pm,
	Vincent Guittot, quentin.perret, dietmar.eggemann,
	Douglas.Raillard, 4 . 20, linux-arm-kernel, linux-kernel

On Friday, January 4, 2019 12:01:42 PM CET Sudeep Holla wrote:
> On Fri, Jan 04, 2019 at 03:14:33PM +0530, Viresh Kumar wrote:
> > Since the commit 2a4eb7358aba ("OPP: Don't remove dynamic OPPs from
> > _dev_pm_opp_remove_table()"), dynamically created OPP aren't
> > automatically removed anymore by dev_pm_opp_cpumask_remove_table(). This
> > affects the scpi and scmi cpufreq drivers which no longer free OPPs on
> > failures or on invocations of the policy->exit() callback.
> >
> > Create a generic OPP helper dev_pm_opp_remove_all_dynamic() which can be
> > called from these drivers instead of dev_pm_opp_cpumask_remove_table().
> >
> > In dev_pm_opp_remove_all_dynamic(), we need to make sure that the
> > opp_list isn't getting accessed simultaneously from other parts of the
> > OPP core while the helper is freeing dynamic OPPs, i.e. we can't drop
> > the opp_table->lock while traversing through the OPP list. And to
> > accomplish that, this patch also creates _opp_kref_release_unlocked()
> > which can be called from this new helper with the opp_table lock already
> > held.
> >
> 
> I did test it but since I couldn't reproduce the original issue, my
> tested-by is not that worth. Anyways the changes look fine to me.
> 
> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>

Applied and pushed to Linus, thanks!


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-01-11 10:38 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-20 15:27 [BUG] dev_pm_opp refcount issue on Arm Juno r0 Valentin Schneider
2019-01-03  7:05 ` Viresh Kumar
2019-01-03 10:33   ` Sudeep Holla
2019-01-03 10:38     ` Viresh Kumar
2019-01-03 10:41       ` Sudeep Holla
2019-01-04  9:44 ` [PATCH] cpufreq: scpi/scmi: Fix freeing of dynamic OPPs Viresh Kumar
2019-01-04 10:10   ` Rafael J. Wysocki
2019-01-04 10:16     ` Viresh Kumar
2019-01-04 10:40   ` Valentin Schneider
2019-01-04 11:01   ` Sudeep Holla
2019-01-11 10:37     ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).