All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] cpufreq/opp: rework regulator initialization
       [not found] <CGME20190207122255eucas1p1cdebed838c799eca46cce6a654a26187@eucas1p1.samsung.com>
@ 2019-02-07 12:22 ` Marek Szyprowski
       [not found]   ` <CGME20190207122255eucas1p1444023f01217a43cfb958fe0bd48ef4d@eucas1p1.samsung.com>
                     ` (4 more replies)
  0 siblings, 5 replies; 38+ messages in thread
From: Marek Szyprowski @ 2019-02-07 12:22 UTC (permalink / raw)
  To: linux-kernel, linux-pm
  Cc: Marek Szyprowski, linux-samsung-soc, Viresh Kumar,
	Rafael J . Wysocki, Nishanth Menon, Stephen Boyd,
	Bartlomiej Zolnierkiewicz, Dave Gerlach, Wolfram Sang

Dear All,

Recent commit 9ac6cb5fbb17 ("i2c: add suspended flag and accessors for
i2c adapters") added a visible warning for an attempt to do i2c transfer
over a suspended i2c bus. This revealed a long standing issue in the
cpufreq-dt driver, which gives a following warning during system
suspend/resume cycle:

--->8---
Enabling non-boot CPUs ...
CPU1 is up
CPU2 is up
CPU3 is up
------------[ cut here ]------------
WARNING: CPU: 4 PID: 29 at drivers/i2c/i2c-core-base.c:1869 __i2c_transfer+0x6f8/0xa50
Modules linked in:
CPU: 4 PID: 29 Comm: cpuhp/4 Tainted: G        W         5.0.0-rc4-next-20190131-00024-g54b06b29cc65 #5324
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[<c01110e8>] (unwind_backtrace) from [<c010d11c>] (show_stack+0x10/0x14)
[<c010d11c>] (show_stack) from [<c09a2584>] (dump_stack+0x90/0xc8)
[<c09a2584>] (dump_stack) from [<c0120bd0>] (__warn+0xf8/0x124)
[<c0120bd0>] (__warn) from [<c0120c3c>] (warn_slowpath_null+0x40/0x48)
[<c0120c3c>] (warn_slowpath_null) from [<c065cda0>] (__i2c_transfer+0x6f8/0xa50)
[<c065cda0>] (__i2c_transfer) from [<c065d168>] (i2c_transfer+0x70/0xe4)
[<c065d168>] (i2c_transfer) from [<c053ce4c>] (regmap_i2c_read+0x48/0x64)
[<c053ce4c>] (regmap_i2c_read) from [<c0536f1c>] (_regmap_raw_read+0xf8/0x450)
[<c0536f1c>] (_regmap_raw_read) from [<c053772c>] (_regmap_bus_read+0x38/0x68)
[<c053772c>] (_regmap_bus_read) from [<c05365a0>] (_regmap_read+0x60/0x250)
[<c05365a0>] (_regmap_read) from [<c05367cc>] (regmap_read+0x3c/0x5c)
[<c05367cc>] (regmap_read) from [<c047cfc0>] (regulator_is_enabled_regmap+0x20/0x90)
[<c047cfc0>] (regulator_is_enabled_regmap) from [<c0477660>] (_regulator_is_enabled+0x34/0x40)
[<c0477660>] (_regulator_is_enabled) from [<c0478674>] (create_regulator+0x1a4/0x25c)
[<c0478674>] (create_regulator) from [<c047c818>] (_regulator_get+0xe4/0x278)
[<c047c818>] (_regulator_get) from [<c068f1dc>] (dev_pm_opp_set_regulators+0xa0/0x1c0)
[<c068f1dc>] (dev_pm_opp_set_regulators) from [<c0698cc8>] (cpufreq_init+0x98/0x2d0)
[<c0698cc8>] (cpufreq_init) from [<c06959e4>] (cpufreq_online+0xc8/0x71c)
[<c06959e4>] (cpufreq_online) from [<c06960fc>] (cpuhp_cpufreq_online+0x8/0x10)
[<c06960fc>] (cpuhp_cpufreq_online) from [<c01213d4>] (cpuhp_invoke_callback+0xf4/0xebc)
[<c01213d4>] (cpuhp_invoke_callback) from [<c0122e4c>] (cpuhp_thread_fun+0x1d8/0x320)
[<c0122e4c>] (cpuhp_thread_fun) from [<c0149858>] (smpboot_thread_fn+0x194/0x340)
[<c0149858>] (smpboot_thread_fn) from [<c014573c>] (kthread+0x124/0x160)
[<c014573c>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
Exception stack(0xe897dfb0 to 0xe897dff8)
dfa0:                                     00000000 00000000 00000000 00000000
dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
irq event stamp: 3865
hardirqs last  enabled at (3873): [<c0186dec>] vprintk_emit+0x228/0x2a4
hardirqs last disabled at (3880): [<c0186cf0>] vprintk_emit+0x12c/0x2a4
softirqs last  enabled at (3052): [<c0102564>] __do_softirq+0x3a4/0x66c
softirqs last disabled at (3043): [<c0128464>] irq_exit+0x140/0x168
---[ end trace db48b455d924fec2 ]---
CPU4 is up
CPU5 is up
CPU6 is up
CPU7 is up
--->8---

This is a scenario that triggers the above issue:

1. system disables non-boot cpu's at the end of system suspend procedure,
2. this in turn deinitializes cpufreq drivers for the disabled cpus,
3. early in the system resume procedure all cpus are got back to online
   state,
4. this in turn causes cpufreq to be initialized for the newly onlined
   cpus,
5. cpufreq-dt acquires all its resources (clocks, regulators) during
   ->init() callback,
6. getting regulator require to check its state, what in turn requires
   i2c transfer,
7. during system early resume stage this is not really possible.

The issue is caused by cpufreq-dt driver not keeping its resources for
the whole driver lifetime and relying that they can be always acquired
at any system context. This problem has been observed on Samsung
Exynos based Odroid XU3/XU4 boards, but it happens on all boards, which
have separate regulators for different CPU clusters.

To fix this issue I've rewritten resources (clock and regulators)
handling in cpufreq-dt driver. Now the driver gathers them in its
->probe() and keeps using them until the ->remove() happens. This
required also some rework in common opp regulators handling code.
Namely, regulators are no longer assigned by name, instead the caller
has to pass proper regulator object. The side-effect of this work
is also a removal of the FIXME item and correct handling of deferred
probe caused by lack of non-CPU0 regulator.

Best regards
Marek Szyprowski
Samsung R&D Institute Poland


Patch summary:

Marek Szyprowski (2):
  cpufreq: dt/ti/opp: move regulators initialization to the drivers
  cpufreq: dt: rework resources initialization

 drivers/cpufreq/cpufreq-dt.c | 127 +++++++++++++++--------------------
 drivers/cpufreq/ti-cpufreq.c |  42 ++++++++++--
 drivers/opp/core.c           |  40 ++---------
 include/linux/pm_opp.h       |   2 +-
 4 files changed, 96 insertions(+), 115 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 1/2] cpufreq: dt/ti/opp: move regulators initialization to the drivers
       [not found]   ` <CGME20190207122255eucas1p1444023f01217a43cfb958fe0bd48ef4d@eucas1p1.samsung.com>
@ 2019-02-07 12:22     ` Marek Szyprowski
  0 siblings, 0 replies; 38+ messages in thread
From: Marek Szyprowski @ 2019-02-07 12:22 UTC (permalink / raw)
  To: linux-kernel, linux-pm
  Cc: Marek Szyprowski, linux-samsung-soc, Viresh Kumar,
	Rafael J . Wysocki, Nishanth Menon, Stephen Boyd,
	Bartlomiej Zolnierkiewicz, Dave Gerlach, Wolfram Sang

dev_pm_opp_set_regulators() helper is used to assign the regulators to
the used operation points. This helper however only got the names of the
passed regulators and performs their initialization on its own.

Change this by requiring proper regulator objects and move regulator
gathering to the client drivers. This will be later needed to avoid
regulator initialization in forbidden context (i.e. during early system
resume, when no irqs are available yet). Both clients of the
dev_pm_opp_set_regulators() function are adapted to the new signature.

ti-cpufreq driver is also marked with 'suppress_bind_attrs', as it really
doesn't properly support driver removal.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
---
 drivers/cpufreq/cpufreq-dt.c | 39 ++++++++++++++++++++++-----------
 drivers/cpufreq/ti-cpufreq.c | 42 ++++++++++++++++++++++++++++++------
 drivers/opp/core.c           | 40 ++++------------------------------
 include/linux/pm_opp.h       |  2 +-
 4 files changed, 67 insertions(+), 56 deletions(-)

diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index 36a011ea0039..02a344e9d818 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -30,6 +30,7 @@ struct private_data {
 	struct opp_table *opp_table;
 	struct device *cpu_dev;
 	const char *reg_name;
+	struct regulator *reg;
 	bool have_static_opps;
 };
 
@@ -153,6 +154,7 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 	struct cpufreq_frequency_table *freq_table;
 	struct opp_table *opp_table = NULL;
 	struct private_data *priv;
+	struct regulator *reg;
 	struct device *cpu_dev;
 	struct clk *cpu_clk;
 	unsigned int transition_latency;
@@ -188,27 +190,34 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 			fallback = true;
 	}
 
+	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+	if (!priv) {
+		ret = -ENOMEM;
+		goto out_put_clk;
+	}
+
 	/*
-	 * OPP layer will be taking care of regulators now, but it needs to know
-	 * the name of the regulator first.
+	 * OPP layer will be taking care of regulators.
 	 */
 	name = find_supply_name(cpu_dev);
 	if (name) {
-		opp_table = dev_pm_opp_set_regulators(cpu_dev, &name, 1);
+		reg = regulator_get_optional(cpu_dev, name);
+		ret = PTR_ERR_OR_ZERO(reg);
+		if (ret) {
+			dev_err(cpu_dev, "Failed to get regulator for cpu%d: %d\n",
+				policy->cpu, ret);
+			goto out_free_priv;
+		}
+		priv->reg = reg;
+		opp_table = dev_pm_opp_set_regulators(cpu_dev, &priv->reg, 1);
 		if (IS_ERR(opp_table)) {
 			ret = PTR_ERR(opp_table);
 			dev_err(cpu_dev, "Failed to set regulator for cpu%d: %d\n",
 				policy->cpu, ret);
-			goto out_put_clk;
+			goto out_put_regulator;
 		}
 	}
 
-	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
-	if (!priv) {
-		ret = -ENOMEM;
-		goto out_put_regulator;
-	}
-
 	priv->reg_name = name;
 	priv->opp_table = opp_table;
 
@@ -287,10 +296,14 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 out_free_opp:
 	if (priv->have_static_opps)
 		dev_pm_opp_of_cpumask_remove_table(policy->cpus);
-	kfree(priv);
-out_put_regulator:
+out_put_opp_regulator:
 	if (name)
 		dev_pm_opp_put_regulators(opp_table);
+out_put_regulator:
+	if (priv->reg)
+		regulator_put(priv->reg);
+out_free_priv:
+	kfree(priv);
 out_put_clk:
 	clk_put(cpu_clk);
 
@@ -306,6 +319,8 @@ static int cpufreq_exit(struct cpufreq_policy *policy)
 		dev_pm_opp_of_cpumask_remove_table(policy->related_cpus);
 	if (priv->reg_name)
 		dev_pm_opp_put_regulators(priv->opp_table);
+	if (priv->reg)
+		regulator_put(priv->reg);
 
 	clk_put(policy->clk);
 	kfree(priv);
diff --git a/drivers/cpufreq/ti-cpufreq.c b/drivers/cpufreq/ti-cpufreq.c
index 22b53bf26817..623ae7fa34f9 100644
--- a/drivers/cpufreq/ti-cpufreq.c
+++ b/drivers/cpufreq/ti-cpufreq.c
@@ -23,6 +23,7 @@
 #include <linux/of_platform.h>
 #include <linux/pm_opp.h>
 #include <linux/regmap.h>
+#include <linux/regulator/consumer.h>
 #include <linux/slab.h>
 
 #define REVISION_MASK				0xF
@@ -213,13 +214,43 @@ static const struct of_device_id *ti_cpufreq_match_node(void)
 	return match;
 }
 
+#define TI_MULTIREGULATOR_COUNT 2
+static struct regulator *multi_regulators[TI_MULTIREGULATOR_COUNT];
+
+static int ti_setup_multi_regulators(struct device *cpu_dev)
+{
+	struct opp_table *ti_opp_table;
+	int ret = 0;
+
+	multi_regulators[0] = regulator_get(cpu_dev, "vdd");
+	if (IS_ERR(multi_regulators[0]))
+		return PTR_ERR(multi_regulators[0]);
+	multi_regulators[1] = regulator_get(cpu_dev, "vbb");
+	if (IS_ERR(multi_regulators[1])) {
+		ret = PTR_ERR(multi_regulators[1]);
+		goto free0;
+	}
+
+	ti_opp_table = dev_pm_opp_set_regulators(cpu_dev, multi_regulators,
+						 TI_MULTIREGULATOR_COUNT);
+	if (IS_ERR(ti_opp_table)) {
+		ret = PTR_ERR(ti_opp_table);
+		goto free1;
+	}
+	return 0;
+free1:
+	regulator_put(multi_regulators[1]);
+free0:
+	regulator_put(multi_regulators[0]);
+	return ret;
+}
+
 static int ti_cpufreq_probe(struct platform_device *pdev)
 {
 	u32 version[VERSION_COUNT];
 	const struct of_device_id *match;
 	struct opp_table *ti_opp_table;
 	struct ti_cpufreq_data *opp_data;
-	const char * const reg_names[] = {"vdd", "vbb"};
 	int ret;
 
 	match = dev_get_platdata(&pdev->dev);
@@ -273,14 +304,10 @@ static int ti_cpufreq_probe(struct platform_device *pdev)
 	}
 
 	opp_data->opp_table = ti_opp_table;
-
 	if (opp_data->soc_data->multi_regulator) {
-		ti_opp_table = dev_pm_opp_set_regulators(opp_data->cpu_dev,
-							 reg_names,
-							 ARRAY_SIZE(reg_names));
-		if (IS_ERR(ti_opp_table)) {
+		ret = ti_setup_multi_regulators(opp_data->cpu_dev);
+		if (ret) {
 			dev_pm_opp_put_supported_hw(opp_data->opp_table);
-			ret =  PTR_ERR(ti_opp_table);
 			goto fail_put_node;
 		}
 	}
@@ -316,6 +343,7 @@ static struct platform_driver ti_cpufreq_driver = {
 	.driver = {
 		.name = "ti-cpufreq",
 	},
+	.suppress_bind_attrs = true,
 };
 builtin_platform_driver(ti_cpufreq_driver);
 
diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index d7f97167cac3..fc143a38fe8d 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -1477,11 +1477,10 @@ static void _free_set_opp_data(struct opp_table *opp_table)
  * This must be called before any OPPs are initialized for the device.
  */
 struct opp_table *dev_pm_opp_set_regulators(struct device *dev,
-					    const char * const names[],
+					    struct regulator **regulators,
 					    unsigned int count)
 {
 	struct opp_table *opp_table;
-	struct regulator *reg;
 	int ret, i;
 
 	opp_table = dev_pm_opp_get_opp_table(dev);
@@ -1498,41 +1497,14 @@ struct opp_table *dev_pm_opp_set_regulators(struct device *dev,
 	if (opp_table->regulators)
 		return opp_table;
 
-	opp_table->regulators = kmalloc_array(count,
-					      sizeof(*opp_table->regulators),
-					      GFP_KERNEL);
-	if (!opp_table->regulators) {
-		ret = -ENOMEM;
-		goto err;
-	}
-
-	for (i = 0; i < count; i++) {
-		reg = regulator_get_optional(dev, names[i]);
-		if (IS_ERR(reg)) {
-			ret = PTR_ERR(reg);
-			if (ret != -EPROBE_DEFER)
-				dev_err(dev, "%s: no regulator (%s) found: %d\n",
-					__func__, names[i], ret);
-			goto free_regulators;
-		}
-
-		opp_table->regulators[i] = reg;
-	}
-
+	opp_table->regulators = regulators;
 	opp_table->regulator_count = count;
 
 	/* Allocate block only once to pass to set_opp() routines */
 	ret = _allocate_set_opp_data(opp_table);
-	if (ret)
-		goto free_regulators;
-
-	return opp_table;
-
-free_regulators:
-	while (i != 0)
-		regulator_put(opp_table->regulators[--i]);
+	if (ret == 0)
+		return opp_table;
 
-	kfree(opp_table->regulators);
 	opp_table->regulators = NULL;
 	opp_table->regulator_count = -1;
 err:
@@ -1556,12 +1528,8 @@ void dev_pm_opp_put_regulators(struct opp_table *opp_table)
 	/* Make sure there are no concurrent readers while updating opp_table */
 	WARN_ON(!list_empty(&opp_table->opp_list));
 
-	for (i = opp_table->regulator_count - 1; i >= 0; i--)
-		regulator_put(opp_table->regulators[i]);
-
 	_free_set_opp_data(opp_table);
 
-	kfree(opp_table->regulators);
 	opp_table->regulators = NULL;
 	opp_table->regulator_count = -1;
 
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index 24c757a32a7b..3cf24c2c4969 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -123,7 +123,7 @@ struct opp_table *dev_pm_opp_set_supported_hw(struct device *dev, const u32 *ver
 void dev_pm_opp_put_supported_hw(struct opp_table *opp_table);
 struct opp_table *dev_pm_opp_set_prop_name(struct device *dev, const char *name);
 void dev_pm_opp_put_prop_name(struct opp_table *opp_table);
-struct opp_table *dev_pm_opp_set_regulators(struct device *dev, const char * const names[], unsigned int count);
+struct opp_table *dev_pm_opp_set_regulators(struct device *dev, struct regulator **regulators, unsigned int count);
 void dev_pm_opp_put_regulators(struct opp_table *opp_table);
 struct opp_table *dev_pm_opp_set_clkname(struct device *dev, const char * name);
 void dev_pm_opp_put_clkname(struct opp_table *opp_table);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 2/2] cpufreq: dt: rework resources initialization
       [not found]   ` <CGME20190207122256eucas1p17e8742176bda911263d2d14d2797a886@eucas1p1.samsung.com>
@ 2019-02-07 12:22     ` Marek Szyprowski
  2019-02-08  1:26         ` kbuild test robot
  0 siblings, 1 reply; 38+ messages in thread
From: Marek Szyprowski @ 2019-02-07 12:22 UTC (permalink / raw)
  To: linux-kernel, linux-pm
  Cc: Marek Szyprowski, linux-samsung-soc, Viresh Kumar,
	Rafael J . Wysocki, Nishanth Menon, Stephen Boyd,
	Bartlomiej Zolnierkiewicz, Dave Gerlach, Wolfram Sang

All resources needed for driver operation (clocks and regulators) are now
gathered in driver ->probe() and kept for the whole driver lifetime. This
allows to get rid of the re-acquiring resources always in ->init()
callback, which might be called in a context not approperiate for such
operation.

This fixes following warning during system suspend/resume cycle on Samsung
Exynos based Odroid XU3/XU4 boards:

--->8---
Enabling non-boot CPUs ...
CPU1 is up
CPU2 is up
CPU3 is up
------------[ cut here ]------------
WARNING: CPU: 4 PID: 29 at drivers/i2c/i2c-core-base.c:1869 __i2c_transfer+0x6f8/0xa50
Modules linked in:
CPU: 4 PID: 29 Comm: cpuhp/4 Tainted: G        W         5.0.0-rc4-next-20190131-00024-g54b06b29cc65 #5324
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[<c01110e8>] (unwind_backtrace) from [<c010d11c>] (show_stack+0x10/0x14)
[<c010d11c>] (show_stack) from [<c09a2584>] (dump_stack+0x90/0xc8)
[<c09a2584>] (dump_stack) from [<c0120bd0>] (__warn+0xf8/0x124)
[<c0120bd0>] (__warn) from [<c0120c3c>] (warn_slowpath_null+0x40/0x48)
[<c0120c3c>] (warn_slowpath_null) from [<c065cda0>] (__i2c_transfer+0x6f8/0xa50)
[<c065cda0>] (__i2c_transfer) from [<c065d168>] (i2c_transfer+0x70/0xe4)
[<c065d168>] (i2c_transfer) from [<c053ce4c>] (regmap_i2c_read+0x48/0x64)
[<c053ce4c>] (regmap_i2c_read) from [<c0536f1c>] (_regmap_raw_read+0xf8/0x450)
[<c0536f1c>] (_regmap_raw_read) from [<c053772c>] (_regmap_bus_read+0x38/0x68)
[<c053772c>] (_regmap_bus_read) from [<c05365a0>] (_regmap_read+0x60/0x250)
[<c05365a0>] (_regmap_read) from [<c05367cc>] (regmap_read+0x3c/0x5c)
[<c05367cc>] (regmap_read) from [<c047cfc0>] (regulator_is_enabled_regmap+0x20/0x90)
[<c047cfc0>] (regulator_is_enabled_regmap) from [<c0477660>] (_regulator_is_enabled+0x34/0x40)
[<c0477660>] (_regulator_is_enabled) from [<c0478674>] (create_regulator+0x1a4/0x25c)
[<c0478674>] (create_regulator) from [<c047c818>] (_regulator_get+0xe4/0x278)
[<c047c818>] (_regulator_get) from [<c068f1dc>] (dev_pm_opp_set_regulators+0xa0/0x1c0)
[<c068f1dc>] (dev_pm_opp_set_regulators) from [<c0698cc8>] (cpufreq_init+0x98/0x2d0)
[<c0698cc8>] (cpufreq_init) from [<c06959e4>] (cpufreq_online+0xc8/0x71c)
[<c06959e4>] (cpufreq_online) from [<c06960fc>] (cpuhp_cpufreq_online+0x8/0x10)
[<c06960fc>] (cpuhp_cpufreq_online) from [<c01213d4>] (cpuhp_invoke_callback+0xf4/0xebc)
[<c01213d4>] (cpuhp_invoke_callback) from [<c0122e4c>] (cpuhp_thread_fun+0x1d8/0x320)
[<c0122e4c>] (cpuhp_thread_fun) from [<c0149858>] (smpboot_thread_fn+0x194/0x340)
[<c0149858>] (smpboot_thread_fn) from [<c014573c>] (kthread+0x124/0x160)
[<c014573c>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
Exception stack(0xe897dfb0 to 0xe897dff8)
dfa0:                                     00000000 00000000 00000000 00000000
dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
irq event stamp: 3865
hardirqs last  enabled at (3873): [<c0186dec>] vprintk_emit+0x228/0x2a4
hardirqs last disabled at (3880): [<c0186cf0>] vprintk_emit+0x12c/0x2a4
softirqs last  enabled at (3052): [<c0102564>] __do_softirq+0x3a4/0x66c
softirqs last disabled at (3043): [<c0128464>] irq_exit+0x140/0x168
---[ end trace db48b455d924fec2 ]---
CPU4 is up
CPU5 is up
CPU6 is up
CPU7 is up
--->8---

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
---
 drivers/cpufreq/cpufreq-dt.c | 134 ++++++++++++++---------------------
 1 file changed, 52 insertions(+), 82 deletions(-)

diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index 02a344e9d818..ef17bf29dc7c 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -29,11 +29,13 @@
 struct private_data {
 	struct opp_table *opp_table;
 	struct device *cpu_dev;
-	const char *reg_name;
+	struct clk *clk;
 	struct regulator *reg;
 	bool have_static_opps;
 };
 
+static struct private_data *priv_data;
+
 static struct freq_attr *cpufreq_dt_attr[] = {
 	&cpufreq_freq_attr_scaling_available_freqs,
 	NULL,   /* Extra space for boost-attr if required */
@@ -94,7 +96,7 @@ static const char *find_supply_name(struct device *dev)
 	return name;
 }
 
-static int resources_available(void)
+static int get_cpu_resources(struct private_data *priv, unsigned int cpu)
 {
 	struct device *cpu_dev;
 	struct regulator *cpu_reg;
@@ -102,9 +104,9 @@ static int resources_available(void)
 	int ret = 0;
 	const char *name;
 
-	cpu_dev = get_cpu_device(0);
+	cpu_dev = get_cpu_device(cpu);
 	if (!cpu_dev) {
-		pr_err("failed to get cpu0 device\n");
+		pr_err("failed to get cpu%d device\n", cpu);
 		return -ENODEV;
 	}
 
@@ -123,12 +125,10 @@ static int resources_available(void)
 		return ret;
 	}
 
-	clk_put(cpu_clk);
-
 	name = find_supply_name(cpu_dev);
 	/* Platform doesn't require regulator */
 	if (!name)
-		return 0;
+		goto no_regulator;
 
 	cpu_reg = regulator_get_optional(cpu_dev, name);
 	ret = PTR_ERR_OR_ZERO(cpu_reg);
@@ -138,48 +138,42 @@ static int resources_available(void)
 		 * not yet registered, we should try defering probe.
 		 */
 		if (ret == -EPROBE_DEFER)
-			dev_dbg(cpu_dev, "cpu0 regulator not ready, retry\n");
+			dev_dbg(cpu_dev, "regulator not ready, retry\n");
 		else
-			dev_dbg(cpu_dev, "no regulator for cpu0: %d\n", ret);
-
-		return ret;
+			dev_dbg(cpu_dev, "no regulator for cpu: %d\n", ret);
+		goto free;
 	}
-
-	regulator_put(cpu_reg);
+no_regulator:
+	priv->cpu_dev = cpu_dev;
+	priv->clk = cpu_clk;
+	priv->reg = cpu_reg;
 	return 0;
+free:
+	clk_put(cpu_clk);
+	return ret;
+}
+
+static void put_cpu_resources(struct private_data *priv)
+{
+	clk_put(priv->clk);
+	regulator_put(priv->reg);
 }
 
 static int cpufreq_init(struct cpufreq_policy *policy)
 {
 	struct cpufreq_frequency_table *freq_table;
 	struct opp_table *opp_table = NULL;
-	struct private_data *priv;
-	struct regulator *reg;
-	struct device *cpu_dev;
-	struct clk *cpu_clk;
+	struct private_data *priv = &priv_data[policy->cpu];
+	struct device *cpu_dev = priv->cpu_dev;
 	unsigned int transition_latency;
 	bool fallback = false;
-	const char *name;
 	int ret;
 
-	cpu_dev = get_cpu_device(policy->cpu);
-	if (!cpu_dev) {
-		pr_err("failed to get cpu%d device\n", policy->cpu);
-		return -ENODEV;
-	}
-
-	cpu_clk = clk_get(cpu_dev, NULL);
-	if (IS_ERR(cpu_clk)) {
-		ret = PTR_ERR(cpu_clk);
-		dev_err(cpu_dev, "%s: failed to get clk: %d\n", __func__, ret);
-		return ret;
-	}
-
 	/* Get OPP-sharing information from "operating-points-v2" bindings */
 	ret = dev_pm_opp_of_get_sharing_cpus(cpu_dev, policy->cpus);
 	if (ret) {
 		if (ret != -ENOENT)
-			goto out_put_clk;
+			return ret;
 
 		/*
 		 * operating-points-v2 not supported, fallback to old method of
@@ -190,35 +184,18 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 			fallback = true;
 	}
 
-	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
-	if (!priv) {
-		ret = -ENOMEM;
-		goto out_put_clk;
-	}
-
 	/*
 	 * OPP layer will be taking care of regulators.
 	 */
-	name = find_supply_name(cpu_dev);
-	if (name) {
-		reg = regulator_get_optional(cpu_dev, name);
-		ret = PTR_ERR_OR_ZERO(reg);
-		if (ret) {
-			dev_err(cpu_dev, "Failed to get regulator for cpu%d: %d\n",
-				policy->cpu, ret);
-			goto out_free_priv;
-		}
-		priv->reg = reg;
+	if (priv->reg) {
 		opp_table = dev_pm_opp_set_regulators(cpu_dev, &priv->reg, 1);
 		if (IS_ERR(opp_table)) {
 			ret = PTR_ERR(opp_table);
-			dev_err(cpu_dev, "Failed to set regulator for cpu%d: %d\n",
-				policy->cpu, ret);
-			goto out_put_regulator;
+			dev_err(cpu_dev, "Failed to set regulator for cpu: %d\n",
+				ret);
+			return ret;
 		}
 	}
-
-	priv->reg_name = name;
 	priv->opp_table = opp_table;
 
 	/*
@@ -264,11 +241,9 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 		goto out_free_opp;
 	}
 
-	priv->cpu_dev = cpu_dev;
 	policy->driver_data = priv;
-	policy->clk = cpu_clk;
+	policy->clk = priv->clk;
 	policy->freq_table = freq_table;
-
 	policy->suspend_freq = dev_pm_opp_get_suspend_opp_freq(cpu_dev) / 1000;
 
 	/* Support turbo/boost mode */
@@ -296,16 +271,8 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 out_free_opp:
 	if (priv->have_static_opps)
 		dev_pm_opp_of_cpumask_remove_table(policy->cpus);
-out_put_opp_regulator:
-	if (name)
-		dev_pm_opp_put_regulators(opp_table);
-out_put_regulator:
 	if (priv->reg)
-		regulator_put(priv->reg);
-out_free_priv:
-	kfree(priv);
-out_put_clk:
-	clk_put(cpu_clk);
+		dev_pm_opp_put_regulators(opp_table);
 
 	return ret;
 }
@@ -317,13 +284,8 @@ static int cpufreq_exit(struct cpufreq_policy *policy)
 	dev_pm_opp_free_cpufreq_table(priv->cpu_dev, &policy->freq_table);
 	if (priv->have_static_opps)
 		dev_pm_opp_of_cpumask_remove_table(policy->related_cpus);
-	if (priv->reg_name)
-		dev_pm_opp_put_regulators(priv->opp_table);
 	if (priv->reg)
-		regulator_put(priv->reg);
-
-	clk_put(policy->clk);
-	kfree(priv);
+		dev_pm_opp_put_regulators(priv->opp_table);
 
 	return 0;
 }
@@ -344,18 +306,21 @@ static struct cpufreq_driver dt_cpufreq_driver = {
 static int dt_cpufreq_probe(struct platform_device *pdev)
 {
 	struct cpufreq_dt_platform_data *data = dev_get_platdata(&pdev->dev);
-	int ret;
+	int i, ret;
 
-	/*
-	 * All per-cluster (CPUs sharing clock/voltages) initialization is done
-	 * from ->init(). In probe(), we just need to make sure that clk and
-	 * regulators are available. Else defer probe and retry.
-	 *
-	 * FIXME: Is checking this only for CPU0 sufficient ?
-	 */
-	ret = resources_available();
-	if (ret)
-		return ret;
+	priv_data = devm_kcalloc(&pdev->dev, num_possible_cpus(),
+				 sizeof(*priv_data), GFP_KERNEL);
+	if (!priv_data)
+		return -ENOMEM;
+
+	for (i = 0; i < num_possible_cpus(); i++) {
+		ret = get_cpu_resources(&priv_data[i], i);
+		if (ret) {
+			while (i-- > 0)
+				put_cpu_resources(&priv_data[i]);
+			return ret;
+		}
+	}
 
 	if (data) {
 		if (data->have_governor_per_policy)
@@ -375,7 +340,12 @@ static int dt_cpufreq_probe(struct platform_device *pdev)
 
 static int dt_cpufreq_remove(struct platform_device *pdev)
 {
+	int i;
+
 	cpufreq_unregister_driver(&dt_cpufreq_driver);
+	for (i = 0; i < num_possible_cpus(); i++)
+		put_cpu_resources(&priv_data[i]);
+
 	return 0;
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/2] cpufreq: dt: rework resources initialization
  2019-02-07 12:22     ` [PATCH 2/2] cpufreq: dt: rework resources initialization Marek Szyprowski
@ 2019-02-08  1:26         ` kbuild test robot
  0 siblings, 0 replies; 38+ messages in thread
From: kbuild test robot @ 2019-02-08  1:26 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: kbuild-all, linux-kernel, linux-pm, Marek Szyprowski,
	linux-samsung-soc, Viresh Kumar, Rafael J . Wysocki,
	Nishanth Menon, Stephen Boyd, Bartlomiej Zolnierkiewicz,
	Dave Gerlach, Wolfram Sang

[-- Attachment #1: Type: text/plain, Size: 3301 bytes --]

Hi Marek,

I love your patch! Perhaps something to improve:

[auto build test WARNING on pm/linux-next]
[also build test WARNING on next-20190207]
[cannot apply to v5.0-rc4]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Marek-Szyprowski/cpufreq-opp-rework-regulator-initialization/20190208-071454
base:   https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
config: sh-allmodconfig (attached as .config)
compiler: sh4-linux-gnu-gcc (Debian 8.2.0-11) 8.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=8.2.0 make.cross ARCH=sh 

Note: it may well be a FALSE warning. FWIW you are at least aware of it now.
http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings

All warnings (new ones prefixed by >>):

   drivers//cpufreq/cpufreq-dt.c: In function 'dt_cpufreq_probe':
>> drivers//cpufreq/cpufreq-dt.c:149:12: warning: 'cpu_reg' may be used uninitialized in this function [-Wmaybe-uninitialized]
     priv->reg = cpu_reg;
     ~~~~~~~~~~^~~~~~~~~
   drivers//cpufreq/cpufreq-dt.c:102:20: note: 'cpu_reg' was declared here
     struct regulator *cpu_reg;
                       ^~~~~~~

vim +/cpu_reg +149 drivers//cpufreq/cpufreq-dt.c

    98	
    99	static int get_cpu_resources(struct private_data *priv, unsigned int cpu)
   100	{
   101		struct device *cpu_dev;
   102		struct regulator *cpu_reg;
   103		struct clk *cpu_clk;
   104		int ret = 0;
   105		const char *name;
   106	
   107		cpu_dev = get_cpu_device(cpu);
   108		if (!cpu_dev) {
   109			pr_err("failed to get cpu%d device\n", cpu);
   110			return -ENODEV;
   111		}
   112	
   113		cpu_clk = clk_get(cpu_dev, NULL);
   114		ret = PTR_ERR_OR_ZERO(cpu_clk);
   115		if (ret) {
   116			/*
   117			 * If cpu's clk node is present, but clock is not yet
   118			 * registered, we should try defering probe.
   119			 */
   120			if (ret == -EPROBE_DEFER)
   121				dev_dbg(cpu_dev, "clock not ready, retry\n");
   122			else
   123				dev_err(cpu_dev, "failed to get clock: %d\n", ret);
   124	
   125			return ret;
   126		}
   127	
   128		name = find_supply_name(cpu_dev);
   129		/* Platform doesn't require regulator */
   130		if (!name)
   131			goto no_regulator;
   132	
   133		cpu_reg = regulator_get_optional(cpu_dev, name);
   134		ret = PTR_ERR_OR_ZERO(cpu_reg);
   135		if (ret) {
   136			/*
   137			 * If cpu's regulator supply node is present, but regulator is
   138			 * not yet registered, we should try defering probe.
   139			 */
   140			if (ret == -EPROBE_DEFER)
   141				dev_dbg(cpu_dev, "regulator not ready, retry\n");
   142			else
   143				dev_dbg(cpu_dev, "no regulator for cpu: %d\n", ret);
   144			goto free;
   145		}
   146	no_regulator:
   147		priv->cpu_dev = cpu_dev;
   148		priv->clk = cpu_clk;
 > 149		priv->reg = cpu_reg;
   150		return 0;
   151	free:
   152		clk_put(cpu_clk);
   153		return ret;
   154	}
   155	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 50796 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/2] cpufreq: dt: rework resources initialization
@ 2019-02-08  1:26         ` kbuild test robot
  0 siblings, 0 replies; 38+ messages in thread
From: kbuild test robot @ 2019-02-08  1:26 UTC (permalink / raw)
  Cc: kbuild-all, linux-kernel, linux-pm, Marek Szyprowski,
	linux-samsung-soc, Viresh Kumar, Rafael J . Wysocki,
	Nishanth Menon, Stephen Boyd, Bartlomiej Zolnierkiewicz,
	Dave Gerlach, Wolfram Sang

[-- Attachment #1: Type: text/plain, Size: 3301 bytes --]

Hi Marek,

I love your patch! Perhaps something to improve:

[auto build test WARNING on pm/linux-next]
[also build test WARNING on next-20190207]
[cannot apply to v5.0-rc4]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Marek-Szyprowski/cpufreq-opp-rework-regulator-initialization/20190208-071454
base:   https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
config: sh-allmodconfig (attached as .config)
compiler: sh4-linux-gnu-gcc (Debian 8.2.0-11) 8.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=8.2.0 make.cross ARCH=sh 

Note: it may well be a FALSE warning. FWIW you are at least aware of it now.
http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings

All warnings (new ones prefixed by >>):

   drivers//cpufreq/cpufreq-dt.c: In function 'dt_cpufreq_probe':
>> drivers//cpufreq/cpufreq-dt.c:149:12: warning: 'cpu_reg' may be used uninitialized in this function [-Wmaybe-uninitialized]
     priv->reg = cpu_reg;
     ~~~~~~~~~~^~~~~~~~~
   drivers//cpufreq/cpufreq-dt.c:102:20: note: 'cpu_reg' was declared here
     struct regulator *cpu_reg;
                       ^~~~~~~

vim +/cpu_reg +149 drivers//cpufreq/cpufreq-dt.c

    98	
    99	static int get_cpu_resources(struct private_data *priv, unsigned int cpu)
   100	{
   101		struct device *cpu_dev;
   102		struct regulator *cpu_reg;
   103		struct clk *cpu_clk;
   104		int ret = 0;
   105		const char *name;
   106	
   107		cpu_dev = get_cpu_device(cpu);
   108		if (!cpu_dev) {
   109			pr_err("failed to get cpu%d device\n", cpu);
   110			return -ENODEV;
   111		}
   112	
   113		cpu_clk = clk_get(cpu_dev, NULL);
   114		ret = PTR_ERR_OR_ZERO(cpu_clk);
   115		if (ret) {
   116			/*
   117			 * If cpu's clk node is present, but clock is not yet
   118			 * registered, we should try defering probe.
   119			 */
   120			if (ret == -EPROBE_DEFER)
   121				dev_dbg(cpu_dev, "clock not ready, retry\n");
   122			else
   123				dev_err(cpu_dev, "failed to get clock: %d\n", ret);
   124	
   125			return ret;
   126		}
   127	
   128		name = find_supply_name(cpu_dev);
   129		/* Platform doesn't require regulator */
   130		if (!name)
   131			goto no_regulator;
   132	
   133		cpu_reg = regulator_get_optional(cpu_dev, name);
   134		ret = PTR_ERR_OR_ZERO(cpu_reg);
   135		if (ret) {
   136			/*
   137			 * If cpu's regulator supply node is present, but regulator is
   138			 * not yet registered, we should try defering probe.
   139			 */
   140			if (ret == -EPROBE_DEFER)
   141				dev_dbg(cpu_dev, "regulator not ready, retry\n");
   142			else
   143				dev_dbg(cpu_dev, "no regulator for cpu: %d\n", ret);
   144			goto free;
   145		}
   146	no_regulator:
   147		priv->cpu_dev = cpu_dev;
   148		priv->clk = cpu_clk;
 > 149		priv->reg = cpu_reg;
   150		return 0;
   151	free:
   152		clk_put(cpu_clk);
   153		return ret;
   154	}
   155	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 50796 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-07 12:22 ` [PATCH 0/2] cpufreq/opp: rework regulator initialization Marek Szyprowski
       [not found]   ` <CGME20190207122255eucas1p1444023f01217a43cfb958fe0bd48ef4d@eucas1p1.samsung.com>
       [not found]   ` <CGME20190207122256eucas1p17e8742176bda911263d2d14d2797a886@eucas1p1.samsung.com>
@ 2019-02-08  6:49   ` Viresh Kumar
  2019-02-08  8:12     ` Marek Szyprowski
  2019-02-08 10:22     ` Rafael J. Wysocki
  2019-02-08 11:00   ` Sudeep Holla
  2019-02-11  8:44   ` Viresh Kumar
  4 siblings, 2 replies; 38+ messages in thread
From: Viresh Kumar @ 2019-02-08  6:49 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-kernel, linux-pm, linux-samsung-soc, Rafael J . Wysocki,
	Nishanth Menon, Stephen Boyd, Bartlomiej Zolnierkiewicz,
	Dave Gerlach, Wolfram Sang

On 07-02-19, 13:22, Marek Szyprowski wrote:
> Dear All,
> 
> Recent commit 9ac6cb5fbb17 ("i2c: add suspended flag and accessors for
> i2c adapters") added a visible warning for an attempt to do i2c transfer
> over a suspended i2c bus. This revealed a long standing issue in the
> cpufreq-dt driver, which gives a following warning during system
> suspend/resume cycle:
> 
> --->8---
> Enabling non-boot CPUs ...
> CPU1 is up
> CPU2 is up
> CPU3 is up
> ------------[ cut here ]------------
> WARNING: CPU: 4 PID: 29 at drivers/i2c/i2c-core-base.c:1869 __i2c_transfer+0x6f8/0xa50
> Modules linked in:
> CPU: 4 PID: 29 Comm: cpuhp/4 Tainted: G        W         5.0.0-rc4-next-20190131-00024-g54b06b29cc65 #5324
> Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
> [<c01110e8>] (unwind_backtrace) from [<c010d11c>] (show_stack+0x10/0x14)
> [<c010d11c>] (show_stack) from [<c09a2584>] (dump_stack+0x90/0xc8)
> [<c09a2584>] (dump_stack) from [<c0120bd0>] (__warn+0xf8/0x124)
> [<c0120bd0>] (__warn) from [<c0120c3c>] (warn_slowpath_null+0x40/0x48)
> [<c0120c3c>] (warn_slowpath_null) from [<c065cda0>] (__i2c_transfer+0x6f8/0xa50)
> [<c065cda0>] (__i2c_transfer) from [<c065d168>] (i2c_transfer+0x70/0xe4)
> [<c065d168>] (i2c_transfer) from [<c053ce4c>] (regmap_i2c_read+0x48/0x64)
> [<c053ce4c>] (regmap_i2c_read) from [<c0536f1c>] (_regmap_raw_read+0xf8/0x450)
> [<c0536f1c>] (_regmap_raw_read) from [<c053772c>] (_regmap_bus_read+0x38/0x68)
> [<c053772c>] (_regmap_bus_read) from [<c05365a0>] (_regmap_read+0x60/0x250)
> [<c05365a0>] (_regmap_read) from [<c05367cc>] (regmap_read+0x3c/0x5c)
> [<c05367cc>] (regmap_read) from [<c047cfc0>] (regulator_is_enabled_regmap+0x20/0x90)
> [<c047cfc0>] (regulator_is_enabled_regmap) from [<c0477660>] (_regulator_is_enabled+0x34/0x40)
> [<c0477660>] (_regulator_is_enabled) from [<c0478674>] (create_regulator+0x1a4/0x25c)
> [<c0478674>] (create_regulator) from [<c047c818>] (_regulator_get+0xe4/0x278)
> [<c047c818>] (_regulator_get) from [<c068f1dc>] (dev_pm_opp_set_regulators+0xa0/0x1c0)
> [<c068f1dc>] (dev_pm_opp_set_regulators) from [<c0698cc8>] (cpufreq_init+0x98/0x2d0)
> [<c0698cc8>] (cpufreq_init) from [<c06959e4>] (cpufreq_online+0xc8/0x71c)
> [<c06959e4>] (cpufreq_online) from [<c06960fc>] (cpuhp_cpufreq_online+0x8/0x10)
> [<c06960fc>] (cpuhp_cpufreq_online) from [<c01213d4>] (cpuhp_invoke_callback+0xf4/0xebc)
> [<c01213d4>] (cpuhp_invoke_callback) from [<c0122e4c>] (cpuhp_thread_fun+0x1d8/0x320)
> [<c0122e4c>] (cpuhp_thread_fun) from [<c0149858>] (smpboot_thread_fn+0x194/0x340)
> [<c0149858>] (smpboot_thread_fn) from [<c014573c>] (kthread+0x124/0x160)
> [<c014573c>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
> Exception stack(0xe897dfb0 to 0xe897dff8)
> dfa0:                                     00000000 00000000 00000000 00000000
> dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
> irq event stamp: 3865
> hardirqs last  enabled at (3873): [<c0186dec>] vprintk_emit+0x228/0x2a4
> hardirqs last disabled at (3880): [<c0186cf0>] vprintk_emit+0x12c/0x2a4
> softirqs last  enabled at (3052): [<c0102564>] __do_softirq+0x3a4/0x66c
> softirqs last disabled at (3043): [<c0128464>] irq_exit+0x140/0x168
> ---[ end trace db48b455d924fec2 ]---
> CPU4 is up
> CPU5 is up
> CPU6 is up
> CPU7 is up
> --->8---
> 
> This is a scenario that triggers the above issue:
> 
> 1. system disables non-boot cpu's at the end of system suspend procedure,
> 2. this in turn deinitializes cpufreq drivers for the disabled cpus,
> 3. early in the system resume procedure all cpus are got back to online
>    state,
> 4. this in turn causes cpufreq to be initialized for the newly onlined
>    cpus,
> 5. cpufreq-dt acquires all its resources (clocks, regulators) during
>    ->init() callback,
> 6. getting regulator require to check its state, what in turn requires
>    i2c transfer,
> 7. during system early resume stage this is not really possible.
> 
> The issue is caused by cpufreq-dt driver not keeping its resources for
> the whole driver lifetime and relying that they can be always acquired
> at any system context. This problem has been observed on Samsung
> Exynos based Odroid XU3/XU4 boards, but it happens on all boards, which
> have separate regulators for different CPU clusters.

Why don't you get similar problem during suspend? I think you can get
it when the CPUs are offlined as I2C would have gone by then. The
cpufreq or OPP core can try and run some regulator or genpd or clk
calls to disable resources, etc. Even if doesn't happen, it certainly
can.

Also at resume the cpufreq core may try to change the frequency right
from ->init() on certain cases, though not everytime and so the
problem can come despite of this series.

We guarantee that the resources are available during probe but not
during resume, that's where the problem is.

@Rafael any suggestions on how to fix this ?

-- 
viresh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08  6:49   ` [PATCH 0/2] cpufreq/opp: rework regulator initialization Viresh Kumar
@ 2019-02-08  8:12     ` Marek Szyprowski
  2019-02-08  8:55       ` Viresh Kumar
  2019-02-08 10:22     ` Rafael J. Wysocki
  1 sibling, 1 reply; 38+ messages in thread
From: Marek Szyprowski @ 2019-02-08  8:12 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-kernel, linux-pm, linux-samsung-soc, Rafael J . Wysocki,
	Nishanth Menon, Stephen Boyd, Bartlomiej Zolnierkiewicz,
	Dave Gerlach, Wolfram Sang

Hi Viresh,

On 2019-02-08 07:49, Viresh Kumar wrote:
> On 07-02-19, 13:22, Marek Szyprowski wrote:
>> Recent commit 9ac6cb5fbb17 ("i2c: add suspended flag and accessors for
>> i2c adapters") added a visible warning for an attempt to do i2c transfer
>> over a suspended i2c bus. This revealed a long standing issue in the
>> cpufreq-dt driver, which gives a following warning during system
>> suspend/resume cycle:
>>
>> --->8---
>> Enabling non-boot CPUs ...
>> CPU1 is up
>> CPU2 is up
>> CPU3 is up
>> ------------[ cut here ]------------
>> WARNING: CPU: 4 PID: 29 at drivers/i2c/i2c-core-base.c:1869 __i2c_transfer+0x6f8/0xa50
>> Modules linked in:
>> CPU: 4 PID: 29 Comm: cpuhp/4 Tainted: G        W         5.0.0-rc4-next-20190131-00024-g54b06b29cc65 #5324
>> Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
>> [<c01110e8>] (unwind_backtrace) from [<c010d11c>] (show_stack+0x10/0x14)
>> [<c010d11c>] (show_stack) from [<c09a2584>] (dump_stack+0x90/0xc8)
>> [<c09a2584>] (dump_stack) from [<c0120bd0>] (__warn+0xf8/0x124)
>> [<c0120bd0>] (__warn) from [<c0120c3c>] (warn_slowpath_null+0x40/0x48)
>> [<c0120c3c>] (warn_slowpath_null) from [<c065cda0>] (__i2c_transfer+0x6f8/0xa50)
>> [<c065cda0>] (__i2c_transfer) from [<c065d168>] (i2c_transfer+0x70/0xe4)
>> [<c065d168>] (i2c_transfer) from [<c053ce4c>] (regmap_i2c_read+0x48/0x64)
>> [<c053ce4c>] (regmap_i2c_read) from [<c0536f1c>] (_regmap_raw_read+0xf8/0x450)
>> [<c0536f1c>] (_regmap_raw_read) from [<c053772c>] (_regmap_bus_read+0x38/0x68)
>> [<c053772c>] (_regmap_bus_read) from [<c05365a0>] (_regmap_read+0x60/0x250)
>> [<c05365a0>] (_regmap_read) from [<c05367cc>] (regmap_read+0x3c/0x5c)
>> [<c05367cc>] (regmap_read) from [<c047cfc0>] (regulator_is_enabled_regmap+0x20/0x90)
>> [<c047cfc0>] (regulator_is_enabled_regmap) from [<c0477660>] (_regulator_is_enabled+0x34/0x40)
>> [<c0477660>] (_regulator_is_enabled) from [<c0478674>] (create_regulator+0x1a4/0x25c)
>> [<c0478674>] (create_regulator) from [<c047c818>] (_regulator_get+0xe4/0x278)
>> [<c047c818>] (_regulator_get) from [<c068f1dc>] (dev_pm_opp_set_regulators+0xa0/0x1c0)
>> [<c068f1dc>] (dev_pm_opp_set_regulators) from [<c0698cc8>] (cpufreq_init+0x98/0x2d0)
>> [<c0698cc8>] (cpufreq_init) from [<c06959e4>] (cpufreq_online+0xc8/0x71c)
>> [<c06959e4>] (cpufreq_online) from [<c06960fc>] (cpuhp_cpufreq_online+0x8/0x10)
>> [<c06960fc>] (cpuhp_cpufreq_online) from [<c01213d4>] (cpuhp_invoke_callback+0xf4/0xebc)
>> [<c01213d4>] (cpuhp_invoke_callback) from [<c0122e4c>] (cpuhp_thread_fun+0x1d8/0x320)
>> [<c0122e4c>] (cpuhp_thread_fun) from [<c0149858>] (smpboot_thread_fn+0x194/0x340)
>> [<c0149858>] (smpboot_thread_fn) from [<c014573c>] (kthread+0x124/0x160)
>> [<c014573c>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
>> Exception stack(0xe897dfb0 to 0xe897dff8)
>> dfa0:                                     00000000 00000000 00000000 00000000
>> dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>> dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
>> irq event stamp: 3865
>> hardirqs last  enabled at (3873): [<c0186dec>] vprintk_emit+0x228/0x2a4
>> hardirqs last disabled at (3880): [<c0186cf0>] vprintk_emit+0x12c/0x2a4
>> softirqs last  enabled at (3052): [<c0102564>] __do_softirq+0x3a4/0x66c
>> softirqs last disabled at (3043): [<c0128464>] irq_exit+0x140/0x168
>> ---[ end trace db48b455d924fec2 ]---
>> CPU4 is up
>> CPU5 is up
>> CPU6 is up
>> CPU7 is up
>> --->8---
>>
>> This is a scenario that triggers the above issue:
>>
>> 1. system disables non-boot cpu's at the end of system suspend procedure,
>> 2. this in turn deinitializes cpufreq drivers for the disabled cpus,
>> 3. early in the system resume procedure all cpus are got back to online
>>    state,
>> 4. this in turn causes cpufreq to be initialized for the newly onlined
>>    cpus,
>> 5. cpufreq-dt acquires all its resources (clocks, regulators) during
>>    ->init() callback,
>> 6. getting regulator require to check its state, what in turn requires
>>    i2c transfer,
>> 7. during system early resume stage this is not really possible.
>>
>> The issue is caused by cpufreq-dt driver not keeping its resources for
>> the whole driver lifetime and relying that they can be always acquired
>> at any system context. This problem has been observed on Samsung
>> Exynos based Odroid XU3/XU4 boards, but it happens on all boards, which
>> have separate regulators for different CPU clusters.
> Why don't you get similar problem during suspend? I think you can get
> it when the CPUs are offlined as I2C would have gone by then. The
> cpufreq or OPP core can try and run some regulator or genpd or clk
> calls to disable resources, etc. Even if doesn't happen, it certainly
> can.

CPUfreq is suspended very early during system suspend and thus it does
nothing when CPUs are being offlined.

> Also at resume the cpufreq core may try to change the frequency right
> from ->init() on certain cases, though not everytime and so the
> problem can come despite of this series.

cpufreq is still in suspended state (it is being 'resume' very late in
the system resume procedure), so if driver doesn't explicitly change any
opp in ->init(), then cpufreq core waits until everything is resumed. To
sum up, this seems to be fine, beside the issue with regulator
initialization I've addressed in this patchset.

> We guarantee that the resources are available during probe but not
> during resume, that's where the problem is.

Yes, so I've changed cpufreq-dt to the common approach, in which the
driver keeps all needed resources for the whole lifetime of the device.

> @Rafael any suggestions on how to fix this ?

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08  8:12     ` Marek Szyprowski
@ 2019-02-08  8:55       ` Viresh Kumar
  2019-02-08  9:15         ` Marek Szyprowski
  2019-02-08 10:18         ` Rafael J. Wysocki
  0 siblings, 2 replies; 38+ messages in thread
From: Viresh Kumar @ 2019-02-08  8:55 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-kernel, linux-pm, linux-samsung-soc, Rafael J . Wysocki,
	Nishanth Menon, Stephen Boyd, Bartlomiej Zolnierkiewicz,
	Dave Gerlach, Wolfram Sang

On 08-02-19, 09:12, Marek Szyprowski wrote:
> On 2019-02-08 07:49, Viresh Kumar wrote:
> > Why don't you get similar problem during suspend? I think you can get
> > it when the CPUs are offlined as I2C would have gone by then. The
> > cpufreq or OPP core can try and run some regulator or genpd or clk
> > calls to disable resources, etc. Even if doesn't happen, it certainly
> > can.
> 
> CPUfreq is suspended very early during system suspend and thus it does
> nothing when CPUs are being offlined.

> > Also at resume the cpufreq core may try to change the frequency right
> > from ->init() on certain cases, though not everytime and so the
> > problem can come despite of this series.
> 
> cpufreq is still in suspended state (it is being 'resume' very late in
> the system resume procedure), so if driver doesn't explicitly change any
> opp in ->init(), then cpufreq core waits until everything is resumed. To
> sum up, this seems to be fine, beside the issue with regulator
> initialization I've addressed in this patchset.

Yeah, the governors are suspended very soon, but any frequency change
starting from cpufreq core can still happen. There are at least two
points in cpufreq_online() where we may end up changing the frequency,
but that is conditional and may not be getting hit.

> > We guarantee that the resources are available during probe but not
> > during resume, that's where the problem is.
> 
> Yes, so I've changed cpufreq-dt to the common approach, in which the
> driver keeps all needed resources for the whole lifetime of the device.

That's not what I was saying actually. I was saying that it should be
fine to do a I2C transfer during resume, else we will always have
problems and have to fix them with hacks like the one you proposed
where you acquire resources for all the possible CPUs. Maybe we can
fix it once and for all.

-- 
viresh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08  8:55       ` Viresh Kumar
@ 2019-02-08  9:15         ` Marek Szyprowski
  2019-02-08  9:23           ` Viresh Kumar
  2019-02-08 10:18         ` Rafael J. Wysocki
  1 sibling, 1 reply; 38+ messages in thread
From: Marek Szyprowski @ 2019-02-08  9:15 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-kernel, linux-pm, linux-samsung-soc, Rafael J . Wysocki,
	Nishanth Menon, Stephen Boyd, Bartlomiej Zolnierkiewicz,
	Dave Gerlach, Wolfram Sang

Hi Viresh,

On 2019-02-08 09:55, Viresh Kumar wrote:
> On 08-02-19, 09:12, Marek Szyprowski wrote:
>> On 2019-02-08 07:49, Viresh Kumar wrote:
>>> Why don't you get similar problem during suspend? I think you can get
>>> it when the CPUs are offlined as I2C would have gone by then. The
>>> cpufreq or OPP core can try and run some regulator or genpd or clk
>>> calls to disable resources, etc. Even if doesn't happen, it certainly
>>> can.
>> CPUfreq is suspended very early during system suspend and thus it does
>> nothing when CPUs are being offlined.
>>> Also at resume the cpufreq core may try to change the frequency right
>>> from ->init() on certain cases, though not everytime and so the
>>> problem can come despite of this series.
>> cpufreq is still in suspended state (it is being 'resume' very late in
>> the system resume procedure), so if driver doesn't explicitly change any
>> opp in ->init(), then cpufreq core waits until everything is resumed. To
>> sum up, this seems to be fine, beside the issue with regulator
>> initialization I've addressed in this patchset.
> Yeah, the governors are suspended very soon, but any frequency change
> starting from cpufreq core can still happen. There are at least two
> points in cpufreq_online() where we may end up changing the frequency,
> but that is conditional and may not be getting hit.

Then probably cpufreq core suspend should handle this.

>>> We guarantee that the resources are available during probe but not
>>> during resume, that's where the problem is.
>> Yes, so I've changed cpufreq-dt to the common approach, in which the
>> driver keeps all needed resources for the whole lifetime of the device.
> That's not what I was saying actually. I was saying that it should be
> fine to do a I2C transfer during resume, else we will always have
> problems and have to fix them with hacks like the one you proposed
> where you acquire resources for all the possible CPUs. Maybe we can
> fix it once and for all.

It is fine to do i2c transfers during cpufreq resume (see
drivers/base/power/main.c dpm_resume() function for exact call place).
The problem is that such calls are not allowed in ->init(), which might
be called very early from CPU hotplug path (CPUs are resumed in the
first step of system resume procedure).

What's wrong with my proposed fix? It is not that uncommon to gather all
resources in probe() and keep them until remove() happens.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08  9:15         ` Marek Szyprowski
@ 2019-02-08  9:23           ` Viresh Kumar
  2019-02-08 10:02             ` Marek Szyprowski
  2019-02-08 10:08             ` Rafael J. Wysocki
  0 siblings, 2 replies; 38+ messages in thread
From: Viresh Kumar @ 2019-02-08  9:23 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-kernel, linux-pm, linux-samsung-soc, Rafael J . Wysocki,
	Nishanth Menon, Stephen Boyd, Bartlomiej Zolnierkiewicz,
	Dave Gerlach, Wolfram Sang

On 08-02-19, 10:15, Marek Szyprowski wrote:
> On 2019-02-08 09:55, Viresh Kumar wrote:
> > On 08-02-19, 09:12, Marek Szyprowski wrote:
> >> On 2019-02-08 07:49, Viresh Kumar wrote:
> >>> Why don't you get similar problem during suspend? I think you can get
> >>> it when the CPUs are offlined as I2C would have gone by then. The
> >>> cpufreq or OPP core can try and run some regulator or genpd or clk
> >>> calls to disable resources, etc. Even if doesn't happen, it certainly
> >>> can.
> >> CPUfreq is suspended very early during system suspend and thus it does
> >> nothing when CPUs are being offlined.
> >>> Also at resume the cpufreq core may try to change the frequency right
> >>> from ->init() on certain cases, though not everytime and so the
> >>> problem can come despite of this series.
> >> cpufreq is still in suspended state (it is being 'resume' very late in
> >> the system resume procedure), so if driver doesn't explicitly change any
> >> opp in ->init(), then cpufreq core waits until everything is resumed. To
> >> sum up, this seems to be fine, beside the issue with regulator
> >> initialization I've addressed in this patchset.
> > Yeah, the governors are suspended very soon, but any frequency change
> > starting from cpufreq core can still happen. There are at least two
> > points in cpufreq_online() where we may end up changing the frequency,
> > but that is conditional and may not be getting hit.
> 
> Then probably cpufreq core suspend should handle this.

Handle what ? That code is part of cpufreq_online() and needs to be
there only.

> >>> We guarantee that the resources are available during probe but not
> >>> during resume, that's where the problem is.
> >> Yes, so I've changed cpufreq-dt to the common approach, in which the
> >> driver keeps all needed resources for the whole lifetime of the device.
> > That's not what I was saying actually. I was saying that it should be
> > fine to do a I2C transfer during resume, else we will always have
> > problems and have to fix them with hacks like the one you proposed
> > where you acquire resources for all the possible CPUs. Maybe we can
> > fix it once and for all.
> 
> It is fine to do i2c transfers during cpufreq resume (see

By resume I meant system resume and the whole onlining process of
non-boot CPUs.

> drivers/base/power/main.c dpm_resume() function for exact call place).
> The problem is that such calls are not allowed in ->init(), which might
> be called very early from CPU hotplug path (CPUs are resumed in the
> first step of system resume procedure).

Right and that's where I think we can do something to fix it in a
proper way.

> What's wrong with my proposed fix? It is not that uncommon to gather all
> resources in probe() and keep them until remove() happens.

For cpufreq drivers, we must be doing most of the stuff in init/exit
only as far as possible. I am not saying your patch is bad, that is
the best we can do in such situations. But I don't like that we have
to get the resources for all CPUs, despite the fact that many of them
would be part of the same policy and hence share resources. The
problem was that we need to get sharing-cpus detail as well in probe
then, etc.

I was thinking about doing disable_nonboot_cpus() much earlier as the
userspace is already frozen by then.

@Rafael: Will that slowdown the suspend/resume process? Maybe not as
we are doing everything from a single CPU/thread anyways ?

-- 
viresh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08  9:23           ` Viresh Kumar
@ 2019-02-08 10:02             ` Marek Szyprowski
  2019-02-08 10:08             ` Rafael J. Wysocki
  1 sibling, 0 replies; 38+ messages in thread
From: Marek Szyprowski @ 2019-02-08 10:02 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-kernel, linux-pm, linux-samsung-soc, Rafael J . Wysocki,
	Nishanth Menon, Stephen Boyd, Bartlomiej Zolnierkiewicz,
	Dave Gerlach, Wolfram Sang

Hi Viresh,

On 2019-02-08 10:23, Viresh Kumar wrote:
> On 08-02-19, 10:15, Marek Szyprowski wrote:
>> On 2019-02-08 09:55, Viresh Kumar wrote:
>>> On 08-02-19, 09:12, Marek Szyprowski wrote:
>>>> On 2019-02-08 07:49, Viresh Kumar wrote:
>>>>> Why don't you get similar problem during suspend? I think you can get
>>>>> it when the CPUs are offlined as I2C would have gone by then. The
>>>>> cpufreq or OPP core can try and run some regulator or genpd or clk
>>>>> calls to disable resources, etc. Even if doesn't happen, it certainly
>>>>> can.
>>>> CPUfreq is suspended very early during system suspend and thus it does
>>>> nothing when CPUs are being offlined.
>>>>> Also at resume the cpufreq core may try to change the frequency right
>>>>> from ->init() on certain cases, though not everytime and so the
>>>>> problem can come despite of this series.
>>>> cpufreq is still in suspended state (it is being 'resume' very late in
>>>> the system resume procedure), so if driver doesn't explicitly change any
>>>> opp in ->init(), then cpufreq core waits until everything is resumed. To
>>>> sum up, this seems to be fine, beside the issue with regulator
>>>> initialization I've addressed in this patchset.
>>> Yeah, the governors are suspended very soon, but any frequency change
>>> starting from cpufreq core can still happen. There are at least two
>>> points in cpufreq_online() where we may end up changing the frequency,
>>> but that is conditional and may not be getting hit.
>> Then probably cpufreq core suspend should handle this.
> Handle what ? That code is part of cpufreq_online() and needs to be
> there only.

If got it right, it is a matter of handling
CPUFREQ_NEED_INITIAL_FREQ_CHECK flag. Maybe there should be additional
check if CPUfreq is not suspended?

>>>>> We guarantee that the resources are available during probe but not
>>>>> during resume, that's where the problem is.
>>>> Yes, so I've changed cpufreq-dt to the common approach, in which the
>>>> driver keeps all needed resources for the whole lifetime of the device.
>>> That's not what I was saying actually. I was saying that it should be
>>> fine to do a I2C transfer during resume, else we will always have
>>> problems and have to fix them with hacks like the one you proposed
>>> where you acquire resources for all the possible CPUs. Maybe we can
>>> fix it once and for all.
>> It is fine to do i2c transfers during cpufreq resume (see
> By resume I meant system resume and the whole onlining process of
> non-boot CPUs.

Right now those are 2 separate things in cpufreq core.

>> drivers/base/power/main.c dpm_resume() function for exact call place).
>> The problem is that such calls are not allowed in ->init(), which might
>> be called very early from CPU hotplug path (CPUs are resumed in the
>> first step of system resume procedure).
> Right and that's where I think we can do something to fix it in a
> proper way.
>
>> What's wrong with my proposed fix? It is not that uncommon to gather all
>> resources in probe() and keep them until remove() happens.
> For cpufreq drivers, we must be doing most of the stuff in init/exit
> only as far as possible. I am not saying your patch is bad, that is
> the best we can do in such situations. But I don't like that we have
> to get the resources for all CPUs, despite the fact that many of them
> would be part of the same policy and hence share resources. The
> problem was that we need to get sharing-cpus detail as well in probe
> then, etc.

Both resources in this case: clocks and regulators are refcounted by
their frameworks, so the cost of getting a few more references for them
is imho negligible.

> I was thinking about doing disable_nonboot_cpus() much earlier as the
> userspace is already frozen by then.
>
> @Rafael: Will that slowdown the suspend/resume process? Maybe not as
> we are doing everything from a single CPU/thread anyways ?

For some reasons drivers are handled partially asynchronously in
suspend/resume procedure and can do suspend and resume in parallel. I
don't think that limiting the whole suspend/resume process to a single
cpu is the best we can do.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08  9:23           ` Viresh Kumar
  2019-02-08 10:02             ` Marek Szyprowski
@ 2019-02-08 10:08             ` Rafael J. Wysocki
  1 sibling, 0 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2019-02-08 10:08 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Marek Szyprowski, Linux Kernel Mailing List, Linux PM,
	Linux Samsung SoC, Rafael J . Wysocki, Nishanth Menon,
	Stephen Boyd, Bartlomiej Zolnierkiewicz, Dave Gerlach,
	Wolfram Sang

On Fri, Feb 8, 2019 at 10:23 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 08-02-19, 10:15, Marek Szyprowski wrote:
> > On 2019-02-08 09:55, Viresh Kumar wrote:
> > > On 08-02-19, 09:12, Marek Szyprowski wrote:
> > >> On 2019-02-08 07:49, Viresh Kumar wrote:
> > >>> Why don't you get similar problem during suspend? I think you can get
> > >>> it when the CPUs are offlined as I2C would have gone by then. The
> > >>> cpufreq or OPP core can try and run some regulator or genpd or clk
> > >>> calls to disable resources, etc. Even if doesn't happen, it certainly
> > >>> can.
> > >> CPUfreq is suspended very early during system suspend and thus it does
> > >> nothing when CPUs are being offlined.
> > >>> Also at resume the cpufreq core may try to change the frequency right
> > >>> from ->init() on certain cases, though not everytime and so the
> > >>> problem can come despite of this series.
> > >> cpufreq is still in suspended state (it is being 'resume' very late in
> > >> the system resume procedure), so if driver doesn't explicitly change any
> > >> opp in ->init(), then cpufreq core waits until everything is resumed. To
> > >> sum up, this seems to be fine, beside the issue with regulator
> > >> initialization I've addressed in this patchset.
> > > Yeah, the governors are suspended very soon, but any frequency change
> > > starting from cpufreq core can still happen. There are at least two
> > > points in cpufreq_online() where we may end up changing the frequency,
> > > but that is conditional and may not be getting hit.
> >
> > Then probably cpufreq core suspend should handle this.
>
> Handle what ? That code is part of cpufreq_online() and needs to be
> there only.
>
> > >>> We guarantee that the resources are available during probe but not
> > >>> during resume, that's where the problem is.
> > >> Yes, so I've changed cpufreq-dt to the common approach, in which the
> > >> driver keeps all needed resources for the whole lifetime of the device.
> > > That's not what I was saying actually. I was saying that it should be
> > > fine to do a I2C transfer during resume, else we will always have
> > > problems and have to fix them with hacks like the one you proposed
> > > where you acquire resources for all the possible CPUs. Maybe we can
> > > fix it once and for all.
> >
> > It is fine to do i2c transfers during cpufreq resume (see
>
> By resume I meant system resume and the whole onlining process of
> non-boot CPUs.
>
> > drivers/base/power/main.c dpm_resume() function for exact call place).
> > The problem is that such calls are not allowed in ->init(), which might
> > be called very early from CPU hotplug path (CPUs are resumed in the
> > first step of system resume procedure).
>
> Right and that's where I think we can do something to fix it in a
> proper way.
>
> > What's wrong with my proposed fix? It is not that uncommon to gather all
> > resources in probe() and keep them until remove() happens.
>
> For cpufreq drivers, we must be doing most of the stuff in init/exit
> only as far as possible. I am not saying your patch is bad, that is
> the best we can do in such situations. But I don't like that we have
> to get the resources for all CPUs, despite the fact that many of them
> would be part of the same policy and hence share resources. The
> problem was that we need to get sharing-cpus detail as well in probe
> then, etc.
>
> I was thinking about doing disable_nonboot_cpus() much earlier as the
> userspace is already frozen by then.
>
> @Rafael: Will that slowdown the suspend/resume process? Maybe not as
> we are doing everything from a single CPU/thread anyways ?

First, we used to do that and we switched over to what we have right
now several years ago, because it didn't work reliably then.

Arguably, CPU hotplug is in a much better shape now, so it might be
working better, but that would be a huge change and lots of
documentation would need to be revised. :-)

Also it is not true that everything is done on a single CPU, but I'm
not really sure about the possible slowdown.

Second, there are (many) systems for which that change is not really
necessary and it is risky because of possible regressions.

I guess that CPUs depending on I2C for online/offline could be taken
offline earlier and brought back online later during suspend/resume,
like before/after the _noirq suspend of devices, but doing that on all
systems altogether is almost guaranteed to introduce regressions.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08  8:55       ` Viresh Kumar
  2019-02-08  9:15         ` Marek Szyprowski
@ 2019-02-08 10:18         ` Rafael J. Wysocki
  2019-02-08 10:28           ` Viresh Kumar
  1 sibling, 1 reply; 38+ messages in thread
From: Rafael J. Wysocki @ 2019-02-08 10:18 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Marek Szyprowski, Linux Kernel Mailing List, Linux PM,
	Linux Samsung SoC, Rafael J . Wysocki, Nishanth Menon,
	Stephen Boyd, Bartlomiej Zolnierkiewicz, Dave Gerlach,
	Wolfram Sang

On Fri, Feb 8, 2019 at 9:55 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 08-02-19, 09:12, Marek Szyprowski wrote:
> > On 2019-02-08 07:49, Viresh Kumar wrote:
> > > Why don't you get similar problem during suspend? I think you can get
> > > it when the CPUs are offlined as I2C would have gone by then. The
> > > cpufreq or OPP core can try and run some regulator or genpd or clk
> > > calls to disable resources, etc. Even if doesn't happen, it certainly
> > > can.
> >
> > CPUfreq is suspended very early during system suspend and thus it does
> > nothing when CPUs are being offlined.
>
> > > Also at resume the cpufreq core may try to change the frequency right
> > > from ->init() on certain cases, though not everytime and so the
> > > problem can come despite of this series.
> >
> > cpufreq is still in suspended state (it is being 'resume' very late in
> > the system resume procedure), so if driver doesn't explicitly change any
> > opp in ->init(), then cpufreq core waits until everything is resumed. To
> > sum up, this seems to be fine, beside the issue with regulator
> > initialization I've addressed in this patchset.
>
> Yeah, the governors are suspended very soon, but any frequency change
> starting from cpufreq core can still happen. There are at least two
> points in cpufreq_online() where we may end up changing the frequency,
> but that is conditional and may not be getting hit.
>
> > > We guarantee that the resources are available during probe but not
> > > during resume, that's where the problem is.
> >
> > Yes, so I've changed cpufreq-dt to the common approach, in which the
> > driver keeps all needed resources for the whole lifetime of the device.
>
> That's not what I was saying actually. I was saying that it should be
> fine to do a I2C transfer during resume,

Surely not before resuming the I2C controller involved?

> else we will always have
> problems and have to fix them with hacks like the one you proposed
> where you acquire resources for all the possible CPUs. Maybe we can
> fix it once and for all.

Obviously, all I2C transfers need to take place either before
suspending the I2C controller or after resuming it.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08  6:49   ` [PATCH 0/2] cpufreq/opp: rework regulator initialization Viresh Kumar
  2019-02-08  8:12     ` Marek Szyprowski
@ 2019-02-08 10:22     ` Rafael J. Wysocki
  2019-02-08 10:31       ` Marek Szyprowski
  2019-02-08 10:31       ` Viresh Kumar
  1 sibling, 2 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2019-02-08 10:22 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Marek Szyprowski, Linux Kernel Mailing List, Linux PM,
	Linux Samsung SoC, Rafael J . Wysocki, Nishanth Menon,
	Stephen Boyd, Bartlomiej Zolnierkiewicz, Dave Gerlach,
	Wolfram Sang

On Fri, Feb 8, 2019 at 7:50 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 07-02-19, 13:22, Marek Szyprowski wrote:
> > Dear All,
> >
> > Recent commit 9ac6cb5fbb17 ("i2c: add suspended flag and accessors for
> > i2c adapters") added a visible warning for an attempt to do i2c transfer
> > over a suspended i2c bus. This revealed a long standing issue in the
> > cpufreq-dt driver, which gives a following warning during system
> > suspend/resume cycle:
> >
> > --->8---
> > Enabling non-boot CPUs ...
> > CPU1 is up
> > CPU2 is up
> > CPU3 is up
> > ------------[ cut here ]------------
> > WARNING: CPU: 4 PID: 29 at drivers/i2c/i2c-core-base.c:1869 __i2c_transfer+0x6f8/0xa50
> > Modules linked in:
> > CPU: 4 PID: 29 Comm: cpuhp/4 Tainted: G        W         5.0.0-rc4-next-20190131-00024-g54b06b29cc65 #5324
> > Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
> > [<c01110e8>] (unwind_backtrace) from [<c010d11c>] (show_stack+0x10/0x14)
> > [<c010d11c>] (show_stack) from [<c09a2584>] (dump_stack+0x90/0xc8)
> > [<c09a2584>] (dump_stack) from [<c0120bd0>] (__warn+0xf8/0x124)
> > [<c0120bd0>] (__warn) from [<c0120c3c>] (warn_slowpath_null+0x40/0x48)
> > [<c0120c3c>] (warn_slowpath_null) from [<c065cda0>] (__i2c_transfer+0x6f8/0xa50)
> > [<c065cda0>] (__i2c_transfer) from [<c065d168>] (i2c_transfer+0x70/0xe4)
> > [<c065d168>] (i2c_transfer) from [<c053ce4c>] (regmap_i2c_read+0x48/0x64)
> > [<c053ce4c>] (regmap_i2c_read) from [<c0536f1c>] (_regmap_raw_read+0xf8/0x450)
> > [<c0536f1c>] (_regmap_raw_read) from [<c053772c>] (_regmap_bus_read+0x38/0x68)
> > [<c053772c>] (_regmap_bus_read) from [<c05365a0>] (_regmap_read+0x60/0x250)
> > [<c05365a0>] (_regmap_read) from [<c05367cc>] (regmap_read+0x3c/0x5c)
> > [<c05367cc>] (regmap_read) from [<c047cfc0>] (regulator_is_enabled_regmap+0x20/0x90)
> > [<c047cfc0>] (regulator_is_enabled_regmap) from [<c0477660>] (_regulator_is_enabled+0x34/0x40)
> > [<c0477660>] (_regulator_is_enabled) from [<c0478674>] (create_regulator+0x1a4/0x25c)
> > [<c0478674>] (create_regulator) from [<c047c818>] (_regulator_get+0xe4/0x278)
> > [<c047c818>] (_regulator_get) from [<c068f1dc>] (dev_pm_opp_set_regulators+0xa0/0x1c0)
> > [<c068f1dc>] (dev_pm_opp_set_regulators) from [<c0698cc8>] (cpufreq_init+0x98/0x2d0)
> > [<c0698cc8>] (cpufreq_init) from [<c06959e4>] (cpufreq_online+0xc8/0x71c)
> > [<c06959e4>] (cpufreq_online) from [<c06960fc>] (cpuhp_cpufreq_online+0x8/0x10)
> > [<c06960fc>] (cpuhp_cpufreq_online) from [<c01213d4>] (cpuhp_invoke_callback+0xf4/0xebc)
> > [<c01213d4>] (cpuhp_invoke_callback) from [<c0122e4c>] (cpuhp_thread_fun+0x1d8/0x320)
> > [<c0122e4c>] (cpuhp_thread_fun) from [<c0149858>] (smpboot_thread_fn+0x194/0x340)
> > [<c0149858>] (smpboot_thread_fn) from [<c014573c>] (kthread+0x124/0x160)
> > [<c014573c>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
> > Exception stack(0xe897dfb0 to 0xe897dff8)
> > dfa0:                                     00000000 00000000 00000000 00000000
> > dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
> > irq event stamp: 3865
> > hardirqs last  enabled at (3873): [<c0186dec>] vprintk_emit+0x228/0x2a4
> > hardirqs last disabled at (3880): [<c0186cf0>] vprintk_emit+0x12c/0x2a4
> > softirqs last  enabled at (3052): [<c0102564>] __do_softirq+0x3a4/0x66c
> > softirqs last disabled at (3043): [<c0128464>] irq_exit+0x140/0x168
> > ---[ end trace db48b455d924fec2 ]---
> > CPU4 is up
> > CPU5 is up
> > CPU6 is up
> > CPU7 is up
> > --->8---
> >
> > This is a scenario that triggers the above issue:
> >
> > 1. system disables non-boot cpu's at the end of system suspend procedure,
> > 2. this in turn deinitializes cpufreq drivers for the disabled cpus,
> > 3. early in the system resume procedure all cpus are got back to online
> >    state,
> > 4. this in turn causes cpufreq to be initialized for the newly onlined
> >    cpus,
> > 5. cpufreq-dt acquires all its resources (clocks, regulators) during
> >    ->init() callback,
> > 6. getting regulator require to check its state, what in turn requires
> >    i2c transfer,
> > 7. during system early resume stage this is not really possible.
> >
> > The issue is caused by cpufreq-dt driver not keeping its resources for
> > the whole driver lifetime and relying that they can be always acquired
> > at any system context. This problem has been observed on Samsung
> > Exynos based Odroid XU3/XU4 boards, but it happens on all boards, which
> > have separate regulators for different CPU clusters.
>
> Why don't you get similar problem during suspend? I think you can get
> it when the CPUs are offlined as I2C would have gone by then. The
> cpufreq or OPP core can try and run some regulator or genpd or clk
> calls to disable resources, etc. Even if doesn't happen, it certainly
> can.
>
> Also at resume the cpufreq core may try to change the frequency right
> from ->init() on certain cases, though not everytime and so the
> problem can come despite of this series.
>
> We guarantee that the resources are available during probe but not
> during resume, that's where the problem is.
>
> @Rafael any suggestions on how to fix this ?

There are cpufreq driver suspend and resume callbacks, maybe use them?

The driver could do the I2C transactions in its suspend/resume
callbacks and do nothing in online/offline if those are part of
system-wide suspend/resume.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08 10:18         ` Rafael J. Wysocki
@ 2019-02-08 10:28           ` Viresh Kumar
  0 siblings, 0 replies; 38+ messages in thread
From: Viresh Kumar @ 2019-02-08 10:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Marek Szyprowski, Linux Kernel Mailing List, Linux PM,
	Linux Samsung SoC, Rafael J . Wysocki, Nishanth Menon,
	Stephen Boyd, Bartlomiej Zolnierkiewicz, Dave Gerlach,
	Wolfram Sang

On 08-02-19, 11:18, Rafael J. Wysocki wrote:
> Obviously, all I2C transfers need to take place either before
> suspending the I2C controller or after resuming it.

Right. But all I am saying is that it is the responsibility of the
cpufreq core/driver to make sure we call ->init() only when the
resources are available to be used. For example during driver load, we
defer probe if resources aren't available and that makes sure ->init()
gets called when we can actually change the frequency.

Shouldn't the same be true during resume? That is make sure all
resources are in place before ->init() gets called ?

-- 
viresh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08 10:22     ` Rafael J. Wysocki
@ 2019-02-08 10:31       ` Marek Szyprowski
  2019-02-08 10:31       ` Viresh Kumar
  1 sibling, 0 replies; 38+ messages in thread
From: Marek Szyprowski @ 2019-02-08 10:31 UTC (permalink / raw)
  To: Rafael J. Wysocki, Viresh Kumar
  Cc: Linux Kernel Mailing List, Linux PM, Linux Samsung SoC,
	Rafael J . Wysocki, Nishanth Menon, Stephen Boyd,
	Bartlomiej Zolnierkiewicz, Dave Gerlach, Wolfram Sang

Hi Rafael,

On 2019-02-08 11:22, Rafael J. Wysocki wrote:
> On Fri, Feb 8, 2019 at 7:50 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>> On 07-02-19, 13:22, Marek Szyprowski wrote:
>>> Dear All,
>>>
>>> Recent commit 9ac6cb5fbb17 ("i2c: add suspended flag and accessors for
>>> i2c adapters") added a visible warning for an attempt to do i2c transfer
>>> over a suspended i2c bus. This revealed a long standing issue in the
>>> cpufreq-dt driver, which gives a following warning during system
>>> suspend/resume cycle:
>>>
>>> --->8---
>>> Enabling non-boot CPUs ...
>>> CPU1 is up
>>> CPU2 is up
>>> CPU3 is up
>>> ------------[ cut here ]------------
>>> WARNING: CPU: 4 PID: 29 at drivers/i2c/i2c-core-base.c:1869 __i2c_transfer+0x6f8/0xa50
>>> Modules linked in:
>>> CPU: 4 PID: 29 Comm: cpuhp/4 Tainted: G        W         5.0.0-rc4-next-20190131-00024-g54b06b29cc65 #5324
>>> Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
>>> [<c01110e8>] (unwind_backtrace) from [<c010d11c>] (show_stack+0x10/0x14)
>>> [<c010d11c>] (show_stack) from [<c09a2584>] (dump_stack+0x90/0xc8)
>>> [<c09a2584>] (dump_stack) from [<c0120bd0>] (__warn+0xf8/0x124)
>>> [<c0120bd0>] (__warn) from [<c0120c3c>] (warn_slowpath_null+0x40/0x48)
>>> [<c0120c3c>] (warn_slowpath_null) from [<c065cda0>] (__i2c_transfer+0x6f8/0xa50)
>>> [<c065cda0>] (__i2c_transfer) from [<c065d168>] (i2c_transfer+0x70/0xe4)
>>> [<c065d168>] (i2c_transfer) from [<c053ce4c>] (regmap_i2c_read+0x48/0x64)
>>> [<c053ce4c>] (regmap_i2c_read) from [<c0536f1c>] (_regmap_raw_read+0xf8/0x450)
>>> [<c0536f1c>] (_regmap_raw_read) from [<c053772c>] (_regmap_bus_read+0x38/0x68)
>>> [<c053772c>] (_regmap_bus_read) from [<c05365a0>] (_regmap_read+0x60/0x250)
>>> [<c05365a0>] (_regmap_read) from [<c05367cc>] (regmap_read+0x3c/0x5c)
>>> [<c05367cc>] (regmap_read) from [<c047cfc0>] (regulator_is_enabled_regmap+0x20/0x90)
>>> [<c047cfc0>] (regulator_is_enabled_regmap) from [<c0477660>] (_regulator_is_enabled+0x34/0x40)
>>> [<c0477660>] (_regulator_is_enabled) from [<c0478674>] (create_regulator+0x1a4/0x25c)
>>> [<c0478674>] (create_regulator) from [<c047c818>] (_regulator_get+0xe4/0x278)
>>> [<c047c818>] (_regulator_get) from [<c068f1dc>] (dev_pm_opp_set_regulators+0xa0/0x1c0)
>>> [<c068f1dc>] (dev_pm_opp_set_regulators) from [<c0698cc8>] (cpufreq_init+0x98/0x2d0)
>>> [<c0698cc8>] (cpufreq_init) from [<c06959e4>] (cpufreq_online+0xc8/0x71c)
>>> [<c06959e4>] (cpufreq_online) from [<c06960fc>] (cpuhp_cpufreq_online+0x8/0x10)
>>> [<c06960fc>] (cpuhp_cpufreq_online) from [<c01213d4>] (cpuhp_invoke_callback+0xf4/0xebc)
>>> [<c01213d4>] (cpuhp_invoke_callback) from [<c0122e4c>] (cpuhp_thread_fun+0x1d8/0x320)
>>> [<c0122e4c>] (cpuhp_thread_fun) from [<c0149858>] (smpboot_thread_fn+0x194/0x340)
>>> [<c0149858>] (smpboot_thread_fn) from [<c014573c>] (kthread+0x124/0x160)
>>> [<c014573c>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
>>> Exception stack(0xe897dfb0 to 0xe897dff8)
>>> dfa0:                                     00000000 00000000 00000000 00000000
>>> dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>>> dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
>>> irq event stamp: 3865
>>> hardirqs last  enabled at (3873): [<c0186dec>] vprintk_emit+0x228/0x2a4
>>> hardirqs last disabled at (3880): [<c0186cf0>] vprintk_emit+0x12c/0x2a4
>>> softirqs last  enabled at (3052): [<c0102564>] __do_softirq+0x3a4/0x66c
>>> softirqs last disabled at (3043): [<c0128464>] irq_exit+0x140/0x168
>>> ---[ end trace db48b455d924fec2 ]---
>>> CPU4 is up
>>> CPU5 is up
>>> CPU6 is up
>>> CPU7 is up
>>> --->8---
>>>
>>> This is a scenario that triggers the above issue:
>>>
>>> 1. system disables non-boot cpu's at the end of system suspend procedure,
>>> 2. this in turn deinitializes cpufreq drivers for the disabled cpus,
>>> 3. early in the system resume procedure all cpus are got back to online
>>>    state,
>>> 4. this in turn causes cpufreq to be initialized for the newly onlined
>>>    cpus,
>>> 5. cpufreq-dt acquires all its resources (clocks, regulators) during
>>>    ->init() callback,
>>> 6. getting regulator require to check its state, what in turn requires
>>>    i2c transfer,
>>> 7. during system early resume stage this is not really possible.
>>>
>>> The issue is caused by cpufreq-dt driver not keeping its resources for
>>> the whole driver lifetime and relying that they can be always acquired
>>> at any system context. This problem has been observed on Samsung
>>> Exynos based Odroid XU3/XU4 boards, but it happens on all boards, which
>>> have separate regulators for different CPU clusters.
>> Why don't you get similar problem during suspend? I think you can get
>> it when the CPUs are offlined as I2C would have gone by then. The
>> cpufreq or OPP core can try and run some regulator or genpd or clk
>> calls to disable resources, etc. Even if doesn't happen, it certainly
>> can.
>>
>> Also at resume the cpufreq core may try to change the frequency right
>> from ->init() on certain cases, though not everytime and so the
>> problem can come despite of this series.
>>
>> We guarantee that the resources are available during probe but not
>> during resume, that's where the problem is.
>>
>> @Rafael any suggestions on how to fix this ?
> There are cpufreq driver suspend and resume callbacks, maybe use them?
>
> The driver could do the I2C transactions in its suspend/resume
> callbacks and do nothing in online/offline if those are part of
> system-wide suspend/resume.

This is exactly what I suggested, when I wrote to handle it in cpufreq
suspend/resume.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08 10:22     ` Rafael J. Wysocki
  2019-02-08 10:31       ` Marek Szyprowski
@ 2019-02-08 10:31       ` Viresh Kumar
  2019-02-08 10:42         ` Rafael J. Wysocki
  1 sibling, 1 reply; 38+ messages in thread
From: Viresh Kumar @ 2019-02-08 10:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Marek Szyprowski, Linux Kernel Mailing List, Linux PM,
	Linux Samsung SoC, Rafael J . Wysocki, Nishanth Menon,
	Stephen Boyd, Bartlomiej Zolnierkiewicz, Dave Gerlach,
	Wolfram Sang

On 08-02-19, 11:22, Rafael J. Wysocki wrote:
> There are cpufreq driver suspend and resume callbacks, maybe use them?
> 
> The driver could do the I2C transactions in its suspend/resume
> callbacks and do nothing in online/offline if those are part of
> system-wide suspend/resume.

These are per-policy things that we need to do, not sure if driver
suspend/resume is a good place for that. It is more for a case where
CPU 0-3 are in one policy and 4-7 in another. Now 1-7 are
hot-unplugged during system suspend and hotplugged later on. This is
more like complete removal/addition of devices instead of
suspend/resume.

-- 
viresh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08 10:31       ` Viresh Kumar
@ 2019-02-08 10:42         ` Rafael J. Wysocki
  2019-02-08 10:52           ` Rafael J. Wysocki
  2019-02-08 11:39           ` Sudeep Holla
  0 siblings, 2 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2019-02-08 10:42 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Marek Szyprowski, Linux Kernel Mailing List,
	Linux PM, Linux Samsung SoC, Rafael J . Wysocki, Nishanth Menon,
	Stephen Boyd, Bartlomiej Zolnierkiewicz, Dave Gerlach,
	Wolfram Sang

On Fri, Feb 8, 2019 at 11:31 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 08-02-19, 11:22, Rafael J. Wysocki wrote:
> > There are cpufreq driver suspend and resume callbacks, maybe use them?
> >
> > The driver could do the I2C transactions in its suspend/resume
> > callbacks and do nothing in online/offline if those are part of
> > system-wide suspend/resume.
>
> These are per-policy things that we need to do, not sure if driver
> suspend/resume is a good place for that. It is more for a case where
> CPU 0-3 are in one policy and 4-7 in another. Now 1-7 are
> hot-unplugged during system suspend and hotplugged later on. This is
> more like complete removal/addition of devices instead of
> suspend/resume.

No, it isn't.  We don't remove devices on offline.  We migrate stuff
away from them and (opportunistically) power them down.

If this is system suspend, the driver kind of knows that offline will
take place, so it can prepare for it.  Likewise, when online takes
place during system-wide resume, it generally is known that this is
system-wide resume (there is a flag to indicate that in CPU hotplug),
it can be "smart" and avoid accessing suspended devices.  Deferring
the frequency set up until the driver resume time should do the trick
I suppose.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08 10:42         ` Rafael J. Wysocki
@ 2019-02-08 10:52           ` Rafael J. Wysocki
  2019-02-08 11:39           ` Sudeep Holla
  1 sibling, 0 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2019-02-08 10:52 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Marek Szyprowski, Linux Kernel Mailing List,
	Linux PM, Linux Samsung SoC, Rafael J . Wysocki, Nishanth Menon,
	Stephen Boyd, Bartlomiej Zolnierkiewicz, Dave Gerlach,
	Wolfram Sang

On Fri, Feb 8, 2019 at 11:42 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Fri, Feb 8, 2019 at 11:31 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >
> > On 08-02-19, 11:22, Rafael J. Wysocki wrote:
> > > There are cpufreq driver suspend and resume callbacks, maybe use them?
> > >
> > > The driver could do the I2C transactions in its suspend/resume
> > > callbacks and do nothing in online/offline if those are part of
> > > system-wide suspend/resume.
> >
> > These are per-policy things that we need to do, not sure if driver
> > suspend/resume is a good place for that. It is more for a case where
> > CPU 0-3 are in one policy and 4-7 in another. Now 1-7 are
> > hot-unplugged during system suspend and hotplugged later on. This is
> > more like complete removal/addition of devices instead of
> > suspend/resume.
>
> No, it isn't.  We don't remove devices on offline.  We migrate stuff
> away from them and (opportunistically) power them down.
>
> If this is system suspend, the driver kind of knows that offline will
> take place, so it can prepare for it.  Likewise, when online takes
> place during system-wide resume, it generally is known that this is
> system-wide resume (there is a flag to indicate that in CPU hotplug),
> it can be "smart" and avoid accessing suspended devices.  Deferring
> the frequency set up until the driver resume time should do the trick
> I suppose.

BTW, do transitions to idle states involve I2C transactions on the
systems in question?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-07 12:22 ` [PATCH 0/2] cpufreq/opp: rework regulator initialization Marek Szyprowski
                     ` (2 preceding siblings ...)
  2019-02-08  6:49   ` [PATCH 0/2] cpufreq/opp: rework regulator initialization Viresh Kumar
@ 2019-02-08 11:00   ` Sudeep Holla
  2019-02-08 11:47     ` Marek Szyprowski
  2019-02-11  8:44   ` Viresh Kumar
  4 siblings, 1 reply; 38+ messages in thread
From: Sudeep Holla @ 2019-02-08 11:00 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-kernel, linux-pm, linux-samsung-soc, Viresh Kumar,
	Rafael J . Wysocki, Nishanth Menon, Stephen Boyd,
	Bartlomiej Zolnierkiewicz, Dave Gerlach, Wolfram Sang

On Thu, Feb 07, 2019 at 01:22:25PM +0100, Marek Szyprowski wrote:
> Dear All,
>
> This is a scenario that triggers the above issue:
>

[...]

> 1. system disables non-boot cpu's at the end of system suspend procedure,
> 2. this in turn deinitializes cpufreq drivers for the disabled cpus,
> 3. early in the system resume procedure all cpus are got back to online
>    state,
> 4. this in turn causes cpufreq to be initialized for the newly onlined
>    cpus,
> 5. cpufreq-dt acquires all its resources (clocks, regulators) during
>    ->init() callback,

This is strictly not just restricted to cpufreq-dt, but to any driver
supporting multiple policies. So we need a generic fix not just
cpufreq-dt specific.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08 10:42         ` Rafael J. Wysocki
  2019-02-08 10:52           ` Rafael J. Wysocki
@ 2019-02-08 11:39           ` Sudeep Holla
  2019-02-08 12:03             ` Rafael J. Wysocki
  1 sibling, 1 reply; 38+ messages in thread
From: Sudeep Holla @ 2019-02-08 11:39 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Viresh Kumar, Marek Szyprowski, Linux Kernel Mailing List,
	Linux PM, Linux Samsung SoC, Rafael J . Wysocki, Nishanth Menon,
	Stephen Boyd, Bartlomiej Zolnierkiewicz, Sudeep Holla,
	Dave Gerlach, Wolfram Sang

On Fri, Feb 08, 2019 at 11:42:20AM +0100, Rafael J. Wysocki wrote:
> On Fri, Feb 8, 2019 at 11:31 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >
> > On 08-02-19, 11:22, Rafael J. Wysocki wrote:
> > > There are cpufreq driver suspend and resume callbacks, maybe use them?
> > >
> > > The driver could do the I2C transactions in its suspend/resume
> > > callbacks and do nothing in online/offline if those are part of
> > > system-wide suspend/resume.
> >
> > These are per-policy things that we need to do, not sure if driver
> > suspend/resume is a good place for that. It is more for a case where
> > CPU 0-3 are in one policy and 4-7 in another. Now 1-7 are
> > hot-unplugged during system suspend and hotplugged later on. This is
> > more like complete removal/addition of devices instead of
> > suspend/resume.
>
> No, it isn't.  We don't remove devices on offline.  We migrate stuff
> away from them and (opportunistically) power them down.
>
> If this is system suspend, the driver kind of knows that offline will
> take place, so it can prepare for it.  Likewise, when online takes
> place during system-wide resume, it generally is known that this is
> system-wide resume (there is a flag to indicate that in CPU hotplug),
> it can be "smart" and avoid accessing suspended devices.  Deferring
> the frequency set up until the driver resume time should do the trick
> I suppose.

I agree. The reason we don't see this generally on boot is because all
the CPUs are brought online before CPUfreq is initialised. While during
system suspend, we call cpufreq_online which in turn calls ->init in
the hotplug state machine.

So as Rafael suggests we need to do some trick, but can it be done in
the core itself ? I may be missing something, but how about the patch
below:

Regards,
Sudeep

--
diff --git i/drivers/cpufreq/cpufreq.c w/drivers/cpufreq/cpufreq.c
index e35a886e00bc..7d8b0b99f91d 100644
--- i/drivers/cpufreq/cpufreq.c
+++ w/drivers/cpufreq/cpufreq.c
@@ -1241,7 +1241,8 @@ static int cpufreq_online(unsigned int cpu)
                policy->max = policy->user_policy.max;
        }

-       if (cpufreq_driver->get && !cpufreq_driver->setpolicy) {
+       if (cpufreq_driver->get && !cpufreq_driver->setpolicy &&
+           !cpufreq_suspended) {
                policy->cur = cpufreq_driver->get(policy->cpu);
                if (!policy->cur) {
                        pr_err("%s: ->get() failed\n", __func__);
@@ -1702,6 +1703,11 @@ void cpufreq_resume(void)
                                pr_err("%s: Failed to start governor for policy: %p\n",
                                       __func__, policy);
                }
+               policy->cur = cpufreq_driver->get(policy->cpu);
+               if (!policy->cur) {
+                       pr_err("%s: ->get() failed\n", __func__);
+                       goto out_destroy_policy;
+               }
        }
 }


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08 11:00   ` Sudeep Holla
@ 2019-02-08 11:47     ` Marek Szyprowski
  2019-02-08 11:51       ` Sudeep Holla
  0 siblings, 1 reply; 38+ messages in thread
From: Marek Szyprowski @ 2019-02-08 11:47 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: linux-kernel, linux-pm, linux-samsung-soc, Viresh Kumar,
	Rafael J . Wysocki, Nishanth Menon, Stephen Boyd,
	Bartlomiej Zolnierkiewicz, Dave Gerlach, Wolfram Sang

Hi Sudeep,

On 2019-02-08 12:00, Sudeep Holla wrote:
> On Thu, Feb 07, 2019 at 01:22:25PM +0100, Marek Szyprowski wrote:
>> Dear All,
>>
>> This is a scenario that triggers the above issue:
> [...]
>> 1. system disables non-boot cpu's at the end of system suspend procedure,
>> 2. this in turn deinitializes cpufreq drivers for the disabled cpus,
>> 3. early in the system resume procedure all cpus are got back to online
>>    state,
>> 4. this in turn causes cpufreq to be initialized for the newly onlined
>>    cpus,
>> 5. cpufreq-dt acquires all its resources (clocks, regulators) during
>>    ->init() callback,
> This is strictly not just restricted to cpufreq-dt, but to any driver
> supporting multiple policies. So we need a generic fix not just
> cpufreq-dt specific.

Could you point which other driver needs similar fix? Here in cpufreq-dt
the problem was caused by using regulator api (indirectly) from
->init(). All other drivers, which have regulators support, are for old,
obsolete, uni-processor systems, which don't have the problem of
secondary cpu suspend during system suspend/resume cycle.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08 11:47     ` Marek Szyprowski
@ 2019-02-08 11:51       ` Sudeep Holla
  2019-02-08 12:04         ` Marek Szyprowski
  0 siblings, 1 reply; 38+ messages in thread
From: Sudeep Holla @ 2019-02-08 11:51 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-kernel, linux-pm, linux-samsung-soc, Viresh Kumar,
	Rafael J . Wysocki, Nishanth Menon, Stephen Boyd,
	Bartlomiej Zolnierkiewicz, Dave Gerlach, Wolfram Sang

On Fri, Feb 08, 2019 at 12:47:06PM +0100, Marek Szyprowski wrote:
> Hi Sudeep,
>
> On 2019-02-08 12:00, Sudeep Holla wrote:
> > On Thu, Feb 07, 2019 at 01:22:25PM +0100, Marek Szyprowski wrote:
> >> Dear All,
> >>
> >> This is a scenario that triggers the above issue:
> > [...]
> >> 1. system disables non-boot cpu's at the end of system suspend procedure,
> >> 2. this in turn deinitializes cpufreq drivers for the disabled cpus,
> >> 3. early in the system resume procedure all cpus are got back to online
> >>    state,
> >> 4. this in turn causes cpufreq to be initialized for the newly onlined
> >>    cpus,
> >> 5. cpufreq-dt acquires all its resources (clocks, regulators) during
> >>    ->init() callback,
> > This is strictly not just restricted to cpufreq-dt, but to any driver
> > supporting multiple policies. So we need a generic fix not just
> > cpufreq-dt specific.
>
> Could you point which other driver needs similar fix? Here in cpufreq-dt
> the problem was caused by using regulator api (indirectly) from
> ->init(). All other drivers, which have regulators support, are for old,
> obsolete, uni-processor systems, which don't have the problem of
> secondary cpu suspend during system suspend/resume cycle.
>

scmi_cpufreq for instance. We can fix that in driver my moving to polling
to get cpufreq_get_rate, but we support both polling and interrupt based.
We may wait for remote processor interrupt in get_rate.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08 11:39           ` Sudeep Holla
@ 2019-02-08 12:03             ` Rafael J. Wysocki
  2019-02-08 12:09               ` Sudeep Holla
  0 siblings, 1 reply; 38+ messages in thread
From: Rafael J. Wysocki @ 2019-02-08 12:03 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Rafael J. Wysocki, Viresh Kumar, Marek Szyprowski,
	Linux Kernel Mailing List, Linux PM, Linux Samsung SoC,
	Rafael J . Wysocki, Nishanth Menon, Stephen Boyd,
	Bartlomiej Zolnierkiewicz, Dave Gerlach, Wolfram Sang

On Fri, Feb 8, 2019 at 12:39 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> On Fri, Feb 08, 2019 at 11:42:20AM +0100, Rafael J. Wysocki wrote:
> > On Fri, Feb 8, 2019 at 11:31 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > >
> > > On 08-02-19, 11:22, Rafael J. Wysocki wrote:
> > > > There are cpufreq driver suspend and resume callbacks, maybe use them?
> > > >
> > > > The driver could do the I2C transactions in its suspend/resume
> > > > callbacks and do nothing in online/offline if those are part of
> > > > system-wide suspend/resume.
> > >
> > > These are per-policy things that we need to do, not sure if driver
> > > suspend/resume is a good place for that. It is more for a case where
> > > CPU 0-3 are in one policy and 4-7 in another. Now 1-7 are
> > > hot-unplugged during system suspend and hotplugged later on. This is
> > > more like complete removal/addition of devices instead of
> > > suspend/resume.
> >
> > No, it isn't.  We don't remove devices on offline.  We migrate stuff
> > away from them and (opportunistically) power them down.
> >
> > If this is system suspend, the driver kind of knows that offline will
> > take place, so it can prepare for it.  Likewise, when online takes
> > place during system-wide resume, it generally is known that this is
> > system-wide resume (there is a flag to indicate that in CPU hotplug),
> > it can be "smart" and avoid accessing suspended devices.  Deferring
> > the frequency set up until the driver resume time should do the trick
> > I suppose.
>
> I agree. The reason we don't see this generally on boot is because all
> the CPUs are brought online before CPUfreq is initialised. While during
> system suspend, we call cpufreq_online which in turn calls ->init in
> the hotplug state machine.
>
> So as Rafael suggests we need to do some trick, but can it be done in
> the core itself ? I may be missing something, but how about the patch
> below:
>
> Regards,
> Sudeep
>
> --
> diff --git i/drivers/cpufreq/cpufreq.c w/drivers/cpufreq/cpufreq.c
> index e35a886e00bc..7d8b0b99f91d 100644
> --- i/drivers/cpufreq/cpufreq.c
> +++ w/drivers/cpufreq/cpufreq.c
> @@ -1241,7 +1241,8 @@ static int cpufreq_online(unsigned int cpu)
>                 policy->max = policy->user_policy.max;
>         }
>
> -       if (cpufreq_driver->get && !cpufreq_driver->setpolicy) {
> +       if (cpufreq_driver->get && !cpufreq_driver->setpolicy &&
> +           !cpufreq_suspended) {
>                 policy->cur = cpufreq_driver->get(policy->cpu);
>                 if (!policy->cur) {
>                         pr_err("%s: ->get() failed\n", __func__);

It looks like we need to skip the "initial freq check" block below.

Also this doesn't really help the case when the driver ->init() messes
up with things.

> @@ -1702,6 +1703,11 @@ void cpufreq_resume(void)
>                                 pr_err("%s: Failed to start governor for policy: %p\n",
>                                        __func__, policy);
>                 }
> +               policy->cur = cpufreq_driver->get(policy->cpu);
> +               if (!policy->cur) {
> +                       pr_err("%s: ->get() failed\n", __func__);
> +                       goto out_destroy_policy;
> +               }
>         }
>  }
>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08 11:51       ` Sudeep Holla
@ 2019-02-08 12:04         ` Marek Szyprowski
  2019-02-08 12:11           ` Rafael J. Wysocki
                             ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Marek Szyprowski @ 2019-02-08 12:04 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: linux-kernel, linux-pm, linux-samsung-soc, Viresh Kumar,
	Rafael J . Wysocki, Nishanth Menon, Stephen Boyd,
	Bartlomiej Zolnierkiewicz, Dave Gerlach, Wolfram Sang

Hi Sudeep,

On 2019-02-08 12:51, Sudeep Holla wrote:
> On Fri, Feb 08, 2019 at 12:47:06PM +0100, Marek Szyprowski wrote:
>> On 2019-02-08 12:00, Sudeep Holla wrote:
>>> On Thu, Feb 07, 2019 at 01:22:25PM +0100, Marek Szyprowski wrote:
>>>> Dear All,
>>>>
>>>> This is a scenario that triggers the above issue:
>>> [...]
>>>> 1. system disables non-boot cpu's at the end of system suspend procedure,
>>>> 2. this in turn deinitializes cpufreq drivers for the disabled cpus,
>>>> 3. early in the system resume procedure all cpus are got back to online
>>>>    state,
>>>> 4. this in turn causes cpufreq to be initialized for the newly onlined
>>>>    cpus,
>>>> 5. cpufreq-dt acquires all its resources (clocks, regulators) during
>>>>    ->init() callback,
>>> This is strictly not just restricted to cpufreq-dt, but to any driver
>>> supporting multiple policies. So we need a generic fix not just
>>> cpufreq-dt specific.
>> Could you point which other driver needs similar fix? Here in cpufreq-dt
>> the problem was caused by using regulator api (indirectly) from
>> ->init(). All other drivers, which have regulators support, are for old,
>> obsolete, uni-processor systems, which don't have the problem of
>> secondary cpu suspend during system suspend/resume cycle.
>>
> scmi_cpufreq for instance. We can fix that in driver my moving to polling
> to get cpufreq_get_rate, but we support both polling and interrupt based.
> We may wait for remote processor interrupt in get_rate.

Frankly, I don't feel I know enough to touch this driver and I don't
think that this can even be fixed in a generic way in the cpufreq core.
Maybe a comment somewhere is needed that ->init() might be called during
early system resume with irqs off and driver is responsible for handling
such case until the proper resume?

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08 12:03             ` Rafael J. Wysocki
@ 2019-02-08 12:09               ` Sudeep Holla
  2019-02-08 12:23                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 38+ messages in thread
From: Sudeep Holla @ 2019-02-08 12:09 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Viresh Kumar, Marek Szyprowski, Linux Kernel Mailing List,
	Linux PM, Linux Samsung SoC, Rafael J . Wysocki, Nishanth Menon,
	Stephen Boyd, Bartlomiej Zolnierkiewicz, Dave Gerlach,
	Wolfram Sang

On Fri, Feb 08, 2019 at 01:03:10PM +0100, Rafael J. Wysocki wrote:
> On Fri, Feb 8, 2019 at 12:39 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
> >
> > On Fri, Feb 08, 2019 at 11:42:20AM +0100, Rafael J. Wysocki wrote:
> > > On Fri, Feb 8, 2019 at 11:31 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > >
> > > > On 08-02-19, 11:22, Rafael J. Wysocki wrote:
> > > > > There are cpufreq driver suspend and resume callbacks, maybe use them?
> > > > >
> > > > > The driver could do the I2C transactions in its suspend/resume
> > > > > callbacks and do nothing in online/offline if those are part of
> > > > > system-wide suspend/resume.
> > > >
> > > > These are per-policy things that we need to do, not sure if driver
> > > > suspend/resume is a good place for that. It is more for a case where
> > > > CPU 0-3 are in one policy and 4-7 in another. Now 1-7 are
> > > > hot-unplugged during system suspend and hotplugged later on. This is
> > > > more like complete removal/addition of devices instead of
> > > > suspend/resume.
> > >
> > > No, it isn't.  We don't remove devices on offline.  We migrate stuff
> > > away from them and (opportunistically) power them down.
> > >
> > > If this is system suspend, the driver kind of knows that offline will
> > > take place, so it can prepare for it.  Likewise, when online takes
> > > place during system-wide resume, it generally is known that this is
> > > system-wide resume (there is a flag to indicate that in CPU hotplug),
> > > it can be "smart" and avoid accessing suspended devices.  Deferring
> > > the frequency set up until the driver resume time should do the trick
> > > I suppose.
> >
> > I agree. The reason we don't see this generally on boot is because all
> > the CPUs are brought online before CPUfreq is initialised. While during
> > system suspend, we call cpufreq_online which in turn calls ->init in
> > the hotplug state machine.
> >
> > So as Rafael suggests we need to do some trick, but can it be done in
> > the core itself ? I may be missing something, but how about the patch
> > below:
> >
> > Regards,
> > Sudeep
> >
> > --
> > diff --git i/drivers/cpufreq/cpufreq.c w/drivers/cpufreq/cpufreq.c
> > index e35a886e00bc..7d8b0b99f91d 100644
> > --- i/drivers/cpufreq/cpufreq.c
> > +++ w/drivers/cpufreq/cpufreq.c
> > @@ -1241,7 +1241,8 @@ static int cpufreq_online(unsigned int cpu)
> >                 policy->max = policy->user_policy.max;
> >         }
> >
> > -       if (cpufreq_driver->get && !cpufreq_driver->setpolicy) {
> > +       if (cpufreq_driver->get && !cpufreq_driver->setpolicy &&
> > +           !cpufreq_suspended) {
> >                 policy->cur = cpufreq_driver->get(policy->cpu);
> >                 if (!policy->cur) {
> >                         pr_err("%s: ->get() failed\n", __func__);
> 
> It looks like we need to skip the "initial freq check" block below.
> 

Indeed, copy pasted an earlier version of diff. I found that I even
used a goto label wrong which I fixed along with the additional check
in "initial freq check" when I tried to compile :).

> Also this doesn't really help the case when the driver ->init() messes
> up with things.
>

Yes, in that case additional logic in the driver also needed. I am fine
if we enforce driver to deal with this issue, but was thinking if we can
make it generic. Also I was just trying to avoid adding _suspend/resume
to driver just to avoid this issue.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08 12:04         ` Marek Szyprowski
@ 2019-02-08 12:11           ` Rafael J. Wysocki
  2019-02-08 12:16           ` Sudeep Holla
  2019-02-08 17:41           ` Sudeep Holla
  2 siblings, 0 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2019-02-08 12:11 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Sudeep Holla, Linux Kernel Mailing List, Linux PM,
	Linux Samsung SoC, Viresh Kumar, Rafael J . Wysocki,
	Nishanth Menon, Stephen Boyd, Bartlomiej Zolnierkiewicz,
	Dave Gerlach, Wolfram Sang

On Fri, Feb 8, 2019 at 1:04 PM Marek Szyprowski
<m.szyprowski@samsung.com> wrote:
>
> Hi Sudeep,
>
> On 2019-02-08 12:51, Sudeep Holla wrote:
> > On Fri, Feb 08, 2019 at 12:47:06PM +0100, Marek Szyprowski wrote:
> >> On 2019-02-08 12:00, Sudeep Holla wrote:
> >>> On Thu, Feb 07, 2019 at 01:22:25PM +0100, Marek Szyprowski wrote:
> >>>> Dear All,
> >>>>
> >>>> This is a scenario that triggers the above issue:
> >>> [...]
> >>>> 1. system disables non-boot cpu's at the end of system suspend procedure,
> >>>> 2. this in turn deinitializes cpufreq drivers for the disabled cpus,
> >>>> 3. early in the system resume procedure all cpus are got back to online
> >>>>    state,
> >>>> 4. this in turn causes cpufreq to be initialized for the newly onlined
> >>>>    cpus,
> >>>> 5. cpufreq-dt acquires all its resources (clocks, regulators) during
> >>>>    ->init() callback,
> >>> This is strictly not just restricted to cpufreq-dt, but to any driver
> >>> supporting multiple policies. So we need a generic fix not just
> >>> cpufreq-dt specific.
> >> Could you point which other driver needs similar fix? Here in cpufreq-dt
> >> the problem was caused by using regulator api (indirectly) from
> >> ->init(). All other drivers, which have regulators support, are for old,
> >> obsolete, uni-processor systems, which don't have the problem of
> >> secondary cpu suspend during system suspend/resume cycle.
> >>
> > scmi_cpufreq for instance. We can fix that in driver my moving to polling
> > to get cpufreq_get_rate, but we support both polling and interrupt based.
> > We may wait for remote processor interrupt in get_rate.
>
> Frankly, I don't feel I know enough to touch this driver and I don't
> think that this can even be fixed in a generic way in the cpufreq core.
> Maybe a comment somewhere is needed that ->init() might be called during
> early system resume with irqs off and driver is responsible for handling
> such case until the proper resume?

Well, adding a comment to that effect certainly won't hurt as that's
how things work now.

However, it looks like something needs to be done in addition to that.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08 12:04         ` Marek Szyprowski
  2019-02-08 12:11           ` Rafael J. Wysocki
@ 2019-02-08 12:16           ` Sudeep Holla
  2019-02-08 17:41           ` Sudeep Holla
  2 siblings, 0 replies; 38+ messages in thread
From: Sudeep Holla @ 2019-02-08 12:16 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-kernel, linux-pm, linux-samsung-soc, Viresh Kumar,
	Rafael J . Wysocki, Nishanth Menon, Stephen Boyd,
	Bartlomiej Zolnierkiewicz, Dave Gerlach, Wolfram Sang

On Fri, Feb 08, 2019 at 01:04:18PM +0100, Marek Szyprowski wrote:
> Hi Sudeep,
>
> On 2019-02-08 12:51, Sudeep Holla wrote:
> > On Fri, Feb 08, 2019 at 12:47:06PM +0100, Marek Szyprowski wrote:
> >> On 2019-02-08 12:00, Sudeep Holla wrote:
> >>> On Thu, Feb 07, 2019 at 01:22:25PM +0100, Marek Szyprowski wrote:
> >>>> Dear All,
> >>>>
> >>>> This is a scenario that triggers the above issue:
> >>> [...]
> >>>> 1. system disables non-boot cpu's at the end of system suspend procedure,
> >>>> 2. this in turn deinitializes cpufreq drivers for the disabled cpus,
> >>>> 3. early in the system resume procedure all cpus are got back to online
> >>>>    state,
> >>>> 4. this in turn causes cpufreq to be initialized for the newly onlined
> >>>>    cpus,
> >>>> 5. cpufreq-dt acquires all its resources (clocks, regulators) during
> >>>>    ->init() callback,
> >>> This is strictly not just restricted to cpufreq-dt, but to any driver
> >>> supporting multiple policies. So we need a generic fix not just
> >>> cpufreq-dt specific.
> >> Could you point which other driver needs similar fix? Here in cpufreq-dt
> >> the problem was caused by using regulator api (indirectly) from
> >> ->init(). All other drivers, which have regulators support, are for old,
> >> obsolete, uni-processor systems, which don't have the problem of
> >> secondary cpu suspend during system suspend/resume cycle.
> >>
> > scmi_cpufreq for instance. We can fix that in driver my moving to polling
> > to get cpufreq_get_rate, but we support both polling and interrupt based.
> > We may wait for remote processor interrupt in get_rate.
>
> Frankly, I don't feel I know enough to touch this driver and I don't
> think that this can even be fixed in a generic way in the cpufreq core.

Why not ? IIUC it's only to get/set the frequency you would need to talk
to remote processor or external chip over I2C which can be deferred until
resume. Rafael has valid concerns on messed up init implementations, still
wondering if there's any chance to solve this in the core.

> Maybe a comment somewhere is needed that ->init() might be called during
> early system resume with irqs off and driver is responsible for handling
> such case until the proper resume?
>

I agree and +1 for comment, but every driver needs to deal with that,
which is fine. Just trying to see if we can avoid it.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08 12:09               ` Sudeep Holla
@ 2019-02-08 12:23                 ` Rafael J. Wysocki
  2019-02-08 14:28                   ` Sudeep Holla
  0 siblings, 1 reply; 38+ messages in thread
From: Rafael J. Wysocki @ 2019-02-08 12:23 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Rafael J. Wysocki, Viresh Kumar, Marek Szyprowski,
	Linux Kernel Mailing List, Linux PM, Linux Samsung SoC,
	Rafael J . Wysocki, Nishanth Menon, Stephen Boyd,
	Bartlomiej Zolnierkiewicz, Dave Gerlach, Wolfram Sang

On Fri, Feb 8, 2019 at 1:09 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> On Fri, Feb 08, 2019 at 01:03:10PM +0100, Rafael J. Wysocki wrote:
> > On Fri, Feb 8, 2019 at 12:39 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
> > >
> > > On Fri, Feb 08, 2019 at 11:42:20AM +0100, Rafael J. Wysocki wrote:
> > > > On Fri, Feb 8, 2019 at 11:31 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > > >
> > > > > On 08-02-19, 11:22, Rafael J. Wysocki wrote:
> > > > > > There are cpufreq driver suspend and resume callbacks, maybe use them?
> > > > > >
> > > > > > The driver could do the I2C transactions in its suspend/resume
> > > > > > callbacks and do nothing in online/offline if those are part of
> > > > > > system-wide suspend/resume.
> > > > >
> > > > > These are per-policy things that we need to do, not sure if driver
> > > > > suspend/resume is a good place for that. It is more for a case where
> > > > > CPU 0-3 are in one policy and 4-7 in another. Now 1-7 are
> > > > > hot-unplugged during system suspend and hotplugged later on. This is
> > > > > more like complete removal/addition of devices instead of
> > > > > suspend/resume.
> > > >
> > > > No, it isn't.  We don't remove devices on offline.  We migrate stuff
> > > > away from them and (opportunistically) power them down.
> > > >
> > > > If this is system suspend, the driver kind of knows that offline will
> > > > take place, so it can prepare for it.  Likewise, when online takes
> > > > place during system-wide resume, it generally is known that this is
> > > > system-wide resume (there is a flag to indicate that in CPU hotplug),
> > > > it can be "smart" and avoid accessing suspended devices.  Deferring
> > > > the frequency set up until the driver resume time should do the trick
> > > > I suppose.
> > >
> > > I agree. The reason we don't see this generally on boot is because all
> > > the CPUs are brought online before CPUfreq is initialised. While during
> > > system suspend, we call cpufreq_online which in turn calls ->init in
> > > the hotplug state machine.
> > >
> > > So as Rafael suggests we need to do some trick, but can it be done in
> > > the core itself ? I may be missing something, but how about the patch
> > > below:
> > >
> > > Regards,
> > > Sudeep
> > >
> > > --
> > > diff --git i/drivers/cpufreq/cpufreq.c w/drivers/cpufreq/cpufreq.c
> > > index e35a886e00bc..7d8b0b99f91d 100644
> > > --- i/drivers/cpufreq/cpufreq.c
> > > +++ w/drivers/cpufreq/cpufreq.c
> > > @@ -1241,7 +1241,8 @@ static int cpufreq_online(unsigned int cpu)
> > >                 policy->max = policy->user_policy.max;
> > >         }
> > >
> > > -       if (cpufreq_driver->get && !cpufreq_driver->setpolicy) {
> > > +       if (cpufreq_driver->get && !cpufreq_driver->setpolicy &&
> > > +           !cpufreq_suspended) {
> > >                 policy->cur = cpufreq_driver->get(policy->cpu);
> > >                 if (!policy->cur) {
> > >                         pr_err("%s: ->get() failed\n", __func__);
> >
> > It looks like we need to skip the "initial freq check" block below.
> >
>
> Indeed, copy pasted an earlier version of diff. I found that I even
> used a goto label wrong which I fixed along with the additional check
> in "initial freq check" when I tried to compile :).
>
> > Also this doesn't really help the case when the driver ->init() messes
> > up with things.
> >
>
> Yes, in that case additional logic in the driver also needed. I am fine
> if we enforce driver to deal with this issue, but was thinking if we can
> make it generic. Also I was just trying to avoid adding _suspend/resume
> to driver just to avoid this issue.

I was wondering if cpufreq_offline()/online() could be invoked from
cpufreq_suspend()/resume() for the nonboot CPUs - if the driver needs
it (there could be a driver flag to indicate that).

If they are made exit immediately when cpufreq_suspended is set (and
the requisite driver flag is set too), that might work AFAICS.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08 12:23                 ` Rafael J. Wysocki
@ 2019-02-08 14:28                   ` Sudeep Holla
  0 siblings, 0 replies; 38+ messages in thread
From: Sudeep Holla @ 2019-02-08 14:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Viresh Kumar, Marek Szyprowski, Linux Kernel Mailing List,
	Linux PM, Linux Samsung SoC, Rafael J . Wysocki, Nishanth Menon,
	Stephen Boyd, Bartlomiej Zolnierkiewicz, Dave Gerlach,
	Wolfram Sang

On Fri, Feb 08, 2019 at 01:23:37PM +0100, Rafael J. Wysocki wrote:
> On Fri, Feb 8, 2019 at 1:09 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
> >

[...]

> > Yes, in that case additional logic in the driver also needed. I am fine
> > if we enforce driver to deal with this issue, but was thinking if we can
> > make it generic. Also I was just trying to avoid adding _suspend/resume
> > to driver just to avoid this issue.
>
> I was wondering if cpufreq_offline()/online() could be invoked from
> cpufreq_suspend()/resume() for the nonboot CPUs - if the driver needs
> it (there could be a driver flag to indicate that).
>
> If they are made exit immediately when cpufreq_suspended is set (and
> the requisite driver flag is set too), that might work AFAICS.

Yes that sounds feasible. It should be fine to assume it's safe to call
cpufreq_online on a CPU even for CPU that might have failed to come
online or didn't reached a state in CPUHP from where CPUFreq callback
is executed or am I missing something ?

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08 12:04         ` Marek Szyprowski
  2019-02-08 12:11           ` Rafael J. Wysocki
  2019-02-08 12:16           ` Sudeep Holla
@ 2019-02-08 17:41           ` Sudeep Holla
  2019-02-11  8:47             ` Viresh Kumar
  2 siblings, 1 reply; 38+ messages in thread
From: Sudeep Holla @ 2019-02-08 17:41 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-kernel, linux-pm, linux-samsung-soc, Viresh Kumar,
	Rafael J . Wysocki, Nishanth Menon, Stephen Boyd,
	Bartlomiej Zolnierkiewicz, Dave Gerlach, Wolfram Sang

On Fri, Feb 08, 2019 at 01:04:18PM +0100, Marek Szyprowski wrote:
> Hi Sudeep,
> 
> On 2019-02-08 12:51, Sudeep Holla wrote:
> > On Fri, Feb 08, 2019 at 12:47:06PM +0100, Marek Szyprowski wrote:
> >> On 2019-02-08 12:00, Sudeep Holla wrote:
> >>> On Thu, Feb 07, 2019 at 01:22:25PM +0100, Marek Szyprowski wrote:
> >>>> Dear All,
> >>>>
> >>>> This is a scenario that triggers the above issue:
> >>> [...]
> >>>> 1. system disables non-boot cpu's at the end of system suspend procedure,
> >>>> 2. this in turn deinitializes cpufreq drivers for the disabled cpus,
> >>>> 3. early in the system resume procedure all cpus are got back to online
> >>>>    state,
> >>>> 4. this in turn causes cpufreq to be initialized for the newly onlined
> >>>>    cpus,
> >>>> 5. cpufreq-dt acquires all its resources (clocks, regulators) during
> >>>>    ->init() callback,
> >>> This is strictly not just restricted to cpufreq-dt, but to any driver
> >>> supporting multiple policies. So we need a generic fix not just
> >>> cpufreq-dt specific.
> >> Could you point which other driver needs similar fix? Here in cpufreq-dt
> >> the problem was caused by using regulator api (indirectly) from
> >> ->init(). All other drivers, which have regulators support, are for old,
> >> obsolete, uni-processor systems, which don't have the problem of
> >> secondary cpu suspend during system suspend/resume cycle.
> >>
> > scmi_cpufreq for instance. We can fix that in driver my moving to polling
> > to get cpufreq_get_rate, but we support both polling and interrupt based.
> > We may wait for remote processor interrupt in get_rate.
> 
> Frankly, I don't feel I know enough to touch this driver and I don't
> think that this can even be fixed in a generic way in the cpufreq core.

Based on Rafael's suggestion, I cooked up something. See if this helps ?
The policy to cpu dance can be removed and we can just run through the
online cpumask I think.

Regards,
Sudeep

-->8

diff --git i/drivers/cpufreq/cpufreq.c w/drivers/cpufreq/cpufreq.c
index e35a886e00bc..03d65a02a542 100644
--- i/drivers/cpufreq/cpufreq.c
+++ w/drivers/cpufreq/cpufreq.c
@@ -1640,6 +1640,7 @@ EXPORT_SYMBOL(cpufreq_generic_suspend);
 void cpufreq_suspend(void)
 {
 	struct cpufreq_policy *policy;
+	int cpu;
 
 	if (!cpufreq_driver)
 		return;
@@ -1662,6 +1663,11 @@ void cpufreq_suspend(void)
 	}
 
 suspend:
+	if (cpufreq_driver->flags & CPUFREQ_DEFER_INIT_DURING_RESUME)
+		for_each_active_policy(policy)
+			for_each_cpu(cpu, policy->cpus)
+				cpufreq_offline(cpu);
+
 	cpufreq_suspended = true;
 }
 
@@ -1674,7 +1680,7 @@ void cpufreq_suspend(void)
 void cpufreq_resume(void)
 {
 	struct cpufreq_policy *policy;
-	int ret;
+	int ret, cpu;
 
 	if (!cpufreq_driver)
 		return;
@@ -1682,6 +1688,11 @@ void cpufreq_resume(void)
 	if (unlikely(!cpufreq_suspended))
 		return;
 
+	if (cpufreq_driver->flags & CPUFREQ_DEFER_INIT_DURING_RESUME)
+		for_each_active_policy(policy)
+			for_each_cpu(cpu, policy->cpus)
+				cpufreq_online(cpu);
+
 	cpufreq_suspended = false;
 
 	if (!has_target() && !cpufreq_driver->resume)
@@ -2444,14 +2455,16 @@ static enum cpuhp_state hp_online;
 
 static int cpuhp_cpufreq_online(unsigned int cpu)
 {
-	cpufreq_online(cpu);
+	if (!(cpufreq_driver->flags & CPUFREQ_DEFER_INIT_DURING_RESUME))
+		cpufreq_online(cpu);
 
 	return 0;
 }
 
 static int cpuhp_cpufreq_offline(unsigned int cpu)
 {
-	cpufreq_offline(cpu);
+	if (!(cpufreq_driver->flags & CPUFREQ_DEFER_INIT_DURING_RESUME))
+		cpufreq_offline(cpu);
 
 	return 0;
 }
diff --git i/drivers/cpufreq/scmi-cpufreq.c w/drivers/cpufreq/scmi-cpufreq.c
index 01871418ffde..0bfc96102739 100644
--- i/drivers/cpufreq/scmi-cpufreq.c
+++ w/drivers/cpufreq/scmi-cpufreq.c
@@ -203,7 +203,8 @@ static void scmi_cpufreq_ready(struct cpufreq_policy *policy)
 static struct cpufreq_driver scmi_cpufreq_driver = {
 	.name	= "scmi",
 	.flags	= CPUFREQ_STICKY | CPUFREQ_HAVE_GOVERNOR_PER_POLICY |
-		  CPUFREQ_NEED_INITIAL_FREQ_CHECK,
+		  CPUFREQ_NEED_INITIAL_FREQ_CHECK |
+		  CPUFREQ_DEFER_INIT_DURING_RESUME,
 	.verify	= cpufreq_generic_frequency_table_verify,
 	.attr	= cpufreq_generic_attr,
 	.target_index	= scmi_cpufreq_set_target,
diff --git i/include/linux/cpufreq.h w/include/linux/cpufreq.h
index c86d6d8bdfed..9cf6b3ce063a 100644
--- i/include/linux/cpufreq.h
+++ w/include/linux/cpufreq.h
@@ -385,6 +385,12 @@ struct cpufreq_driver {
  */
 #define CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING (1 << 6)
 
+/*
+ * Set by drivers to advance/defer the cpufreq online/offline operation during
+ * system suspend/resume.
+ */
+#define CPUFREQ_DEFER_INIT_DURING_RESUME (1 << 7)
+
 int cpufreq_register_driver(struct cpufreq_driver *driver_data);
 int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);
 

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-07 12:22 ` [PATCH 0/2] cpufreq/opp: rework regulator initialization Marek Szyprowski
                     ` (3 preceding siblings ...)
  2019-02-08 11:00   ` Sudeep Holla
@ 2019-02-11  8:44   ` Viresh Kumar
  2019-02-11  9:52     ` Marek Szyprowski
  4 siblings, 1 reply; 38+ messages in thread
From: Viresh Kumar @ 2019-02-11  8:44 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-kernel, linux-pm, linux-samsung-soc, Rafael J . Wysocki,
	Nishanth Menon, Stephen Boyd, Bartlomiej Zolnierkiewicz,
	Dave Gerlach, Wolfram Sang, Sudeep.Holla

On 07-02-19, 13:22, Marek Szyprowski wrote:
> Dear All,
> 
> Recent commit 9ac6cb5fbb17 ("i2c: add suspended flag and accessors for
> i2c adapters") added a visible warning for an attempt to do i2c transfer
> over a suspended i2c bus. This revealed a long standing issue in the
> cpufreq-dt driver, which gives a following warning during system
> suspend/resume cycle:

Marek,

I have sent a patchset which is not directly related to the problem
you are facing, but it may solve your problem as a side effect. Can
you please see if that works to solve this issue as well ?

https://lore.kernel.org/lkml/cover.1549874368.git.viresh.kumar@linaro.org/T/#u

Thanks.

-- 
viresh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-08 17:41           ` Sudeep Holla
@ 2019-02-11  8:47             ` Viresh Kumar
  2019-02-11 14:08               ` Sudeep Holla
  0 siblings, 1 reply; 38+ messages in thread
From: Viresh Kumar @ 2019-02-11  8:47 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Marek Szyprowski, linux-kernel, linux-pm, linux-samsung-soc,
	Rafael J . Wysocki, Nishanth Menon, Stephen Boyd,
	Bartlomiej Zolnierkiewicz, Dave Gerlach, Wolfram Sang

On 08-02-19, 17:41, Sudeep Holla wrote:
> Based on Rafael's suggestion, I cooked up something. See if this helps ?
> The policy to cpu dance can be removed and we can just run through the
> online cpumask I think.
> 
> Regards,
> Sudeep
> 
> -->8
> 
> diff --git i/drivers/cpufreq/cpufreq.c w/drivers/cpufreq/cpufreq.c
> index e35a886e00bc..03d65a02a542 100644
> --- i/drivers/cpufreq/cpufreq.c
> +++ w/drivers/cpufreq/cpufreq.c
> @@ -1640,6 +1640,7 @@ EXPORT_SYMBOL(cpufreq_generic_suspend);
>  void cpufreq_suspend(void)
>  {
>  	struct cpufreq_policy *policy;
> +	int cpu;
>  
>  	if (!cpufreq_driver)
>  		return;
> @@ -1662,6 +1663,11 @@ void cpufreq_suspend(void)
>  	}
>  
>  suspend:
> +	if (cpufreq_driver->flags & CPUFREQ_DEFER_INIT_DURING_RESUME)
> +		for_each_active_policy(policy)
> +			for_each_cpu(cpu, policy->cpus)
> +				cpufreq_offline(cpu);

You will offline boot-cpu as well :)

> +
>  	cpufreq_suspended = true;
>  }
>  
> @@ -1674,7 +1680,7 @@ void cpufreq_suspend(void)
>  void cpufreq_resume(void)
>  {
>  	struct cpufreq_policy *policy;
> -	int ret;
> +	int ret, cpu;
>  
>  	if (!cpufreq_driver)
>  		return;
> @@ -1682,6 +1688,11 @@ void cpufreq_resume(void)
>  	if (unlikely(!cpufreq_suspended))
>  		return;
>  
> +	if (cpufreq_driver->flags & CPUFREQ_DEFER_INIT_DURING_RESUME)
> +		for_each_active_policy(policy)
> +			for_each_cpu(cpu, policy->cpus)
> +				cpufreq_online(cpu);
> +
>  	cpufreq_suspended = false;
>  
>  	if (!has_target() && !cpufreq_driver->resume)
> @@ -2444,14 +2455,16 @@ static enum cpuhp_state hp_online;
>  
>  static int cpuhp_cpufreq_online(unsigned int cpu)
>  {
> -	cpufreq_online(cpu);
> +	if (!(cpufreq_driver->flags & CPUFREQ_DEFER_INIT_DURING_RESUME))
> +		cpufreq_online(cpu);

This isn't correct as we can offline the CPUs without suspend as well
and cpufreq_online/offline should always be called in such cases.

Anyways, I have cc'd you on another series which may end up fixing
this problem as well.

-- 
viresh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-11  8:44   ` Viresh Kumar
@ 2019-02-11  9:52     ` Marek Szyprowski
  2019-02-11  9:55       ` Viresh Kumar
  0 siblings, 1 reply; 38+ messages in thread
From: Marek Szyprowski @ 2019-02-11  9:52 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-kernel, linux-pm, linux-samsung-soc, Rafael J . Wysocki,
	Nishanth Menon, Stephen Boyd, Bartlomiej Zolnierkiewicz,
	Dave Gerlach, Wolfram Sang, Sudeep.Holla

Hi Viresh,

On 2019-02-11 09:44, Viresh Kumar wrote:
> On 07-02-19, 13:22, Marek Szyprowski wrote:
>> Dear All,
>>
>> Recent commit 9ac6cb5fbb17 ("i2c: add suspended flag and accessors for
>> i2c adapters") added a visible warning for an attempt to do i2c transfer
>> over a suspended i2c bus. This revealed a long standing issue in the
>> cpufreq-dt driver, which gives a following warning during system
>> suspend/resume cycle:
> Marek,
>
> I have sent a patchset which is not directly related to the problem
> you are facing, but it may solve your problem as a side effect. Can
> you please see if that works to solve this issue as well ?
>
> https://lore.kernel.org/lkml/cover.1549874368.git.viresh.kumar@linaro.org/T/#u

Thanks for the patch. It indeed solves the problem of the i2c transfer
in cpu hotplug procedure during system resume, although my resources
management rewrite is still valid as a way to fix remaining 'todos' in
the cpufreq-dt driver.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-11  9:52     ` Marek Szyprowski
@ 2019-02-11  9:55       ` Viresh Kumar
  2019-02-11 12:22         ` Marek Szyprowski
  0 siblings, 1 reply; 38+ messages in thread
From: Viresh Kumar @ 2019-02-11  9:55 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-kernel, linux-pm, linux-samsung-soc, Rafael J . Wysocki,
	Nishanth Menon, Stephen Boyd, Bartlomiej Zolnierkiewicz,
	Dave Gerlach, Wolfram Sang, Sudeep.Holla

On 11-02-19, 10:52, Marek Szyprowski wrote:
> Hi Viresh,
> 
> On 2019-02-11 09:44, Viresh Kumar wrote:
> > On 07-02-19, 13:22, Marek Szyprowski wrote:
> >> Dear All,
> >>
> >> Recent commit 9ac6cb5fbb17 ("i2c: add suspended flag and accessors for
> >> i2c adapters") added a visible warning for an attempt to do i2c transfer
> >> over a suspended i2c bus. This revealed a long standing issue in the
> >> cpufreq-dt driver, which gives a following warning during system
> >> suspend/resume cycle:
> > Marek,
> >
> > I have sent a patchset which is not directly related to the problem
> > you are facing, but it may solve your problem as a side effect. Can
> > you please see if that works to solve this issue as well ?
> >
> > https://lore.kernel.org/lkml/cover.1549874368.git.viresh.kumar@linaro.org/T/#u
> 
> Thanks for the patch. It indeed solves the problem of the i2c transfer
> in cpu hotplug procedure during system resume, although my resources
> management rewrite is still valid as a way to fix remaining 'todos' in
> the cpufreq-dt driver.

Which remaining todos are you talking about ? Sorry I am not able to
recall :(

-- 
viresh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-11  9:55       ` Viresh Kumar
@ 2019-02-11 12:22         ` Marek Szyprowski
  2019-02-12  5:08           ` Viresh Kumar
  0 siblings, 1 reply; 38+ messages in thread
From: Marek Szyprowski @ 2019-02-11 12:22 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-kernel, linux-pm, linux-samsung-soc, Rafael J . Wysocki,
	Nishanth Menon, Stephen Boyd, Bartlomiej Zolnierkiewicz,
	Dave Gerlach, Wolfram Sang, Sudeep.Holla

Hi Viresh,

On 2019-02-11 10:55, Viresh Kumar wrote:
> On 11-02-19, 10:52, Marek Szyprowski wrote:
>> On 2019-02-11 09:44, Viresh Kumar wrote:
>>> On 07-02-19, 13:22, Marek Szyprowski wrote:
>>>> Dear All,
>>>>
>>>> Recent commit 9ac6cb5fbb17 ("i2c: add suspended flag and accessors for
>>>> i2c adapters") added a visible warning for an attempt to do i2c transfer
>>>> over a suspended i2c bus. This revealed a long standing issue in the
>>>> cpufreq-dt driver, which gives a following warning during system
>>>> suspend/resume cycle:
>>> Marek,
>>>
>>> I have sent a patchset which is not directly related to the problem
>>> you are facing, but it may solve your problem as a side effect. Can
>>> you please see if that works to solve this issue as well ?
>>>
>>> https://lore.kernel.org/lkml/cover.1549874368.git.viresh.kumar@linaro.org/T/#u
>> Thanks for the patch. It indeed solves the problem of the i2c transfer
>> in cpu hotplug procedure during system resume, although my resources
>> management rewrite is still valid as a way to fix remaining 'todos' in
>> the cpufreq-dt driver.
> Which remaining todos are you talking about ? Sorry I am not able to
> recall :(

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/cpufreq/cpufreq-dt.c#n347

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-11  8:47             ` Viresh Kumar
@ 2019-02-11 14:08               ` Sudeep Holla
  0 siblings, 0 replies; 38+ messages in thread
From: Sudeep Holla @ 2019-02-11 14:08 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Marek Szyprowski, linux-kernel, linux-pm, linux-samsung-soc,
	Rafael J . Wysocki, Nishanth Menon, Stephen Boyd,
	Bartlomiej Zolnierkiewicz, Dave Gerlach, Wolfram Sang

On Mon, Feb 11, 2019 at 02:17:14PM +0530, Viresh Kumar wrote:
> On 08-02-19, 17:41, Sudeep Holla wrote:
> > Based on Rafael's suggestion, I cooked up something. See if this helps ?
> > The policy to cpu dance can be removed and we can just run through the
> > online cpumask I think.
> >
> > Regards,
> > Sudeep
> >
> > -->8
> >
> > diff --git i/drivers/cpufreq/cpufreq.c w/drivers/cpufreq/cpufreq.c
> > index e35a886e00bc..03d65a02a542 100644
> > --- i/drivers/cpufreq/cpufreq.c
> > +++ w/drivers/cpufreq/cpufreq.c
> > @@ -1640,6 +1640,7 @@ EXPORT_SYMBOL(cpufreq_generic_suspend);
> >  void cpufreq_suspend(void)
> >  {
> >  	struct cpufreq_policy *policy;
> > +	int cpu;
> >
> >  	if (!cpufreq_driver)
> >  		return;
> > @@ -1662,6 +1663,11 @@ void cpufreq_suspend(void)
> >  	}
> >
> >  suspend:
> > +	if (cpufreq_driver->flags & CPUFREQ_DEFER_INIT_DURING_RESUME)
> > +		for_each_active_policy(policy)
> > +			for_each_cpu(cpu, policy->cpus)
> > +				cpufreq_offline(cpu);
>
> You will offline boot-cpu as well :)
>

Indeed, I was just trying to check the idea of flags and clearly missed
the boot cpu :(


[..]

> > @@ -2444,14 +2455,16 @@ static enum cpuhp_state hp_online;
> >
> >  static int cpuhp_cpufreq_online(unsigned int cpu)
> >  {
> > -	cpufreq_online(cpu);
> > +	if (!(cpufreq_driver->flags & CPUFREQ_DEFER_INIT_DURING_RESUME))
> > +		cpufreq_online(cpu);
>
> This isn't correct as we can offline the CPUs without suspend as well
> and cpufreq_online/offline should always be called in such cases.
>

Understood

> Anyways, I have cc'd you on another series which may end up fixing
> this problem as well.

Sure, will have a look.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization
  2019-02-11 12:22         ` Marek Szyprowski
@ 2019-02-12  5:08           ` Viresh Kumar
  0 siblings, 0 replies; 38+ messages in thread
From: Viresh Kumar @ 2019-02-12  5:08 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-kernel, linux-pm, linux-samsung-soc, Rafael J . Wysocki,
	Nishanth Menon, Stephen Boyd, Bartlomiej Zolnierkiewicz,
	Dave Gerlach, Wolfram Sang, Sudeep.Holla

On 11-02-19, 13:22, Marek Szyprowski wrote:
> Hi Viresh,
> 
> On 2019-02-11 10:55, Viresh Kumar wrote:
> > On 11-02-19, 10:52, Marek Szyprowski wrote:
> >> On 2019-02-11 09:44, Viresh Kumar wrote:
> >>> On 07-02-19, 13:22, Marek Szyprowski wrote:
> >>>> Dear All,
> >>>>
> >>>> Recent commit 9ac6cb5fbb17 ("i2c: add suspended flag and accessors for
> >>>> i2c adapters") added a visible warning for an attempt to do i2c transfer
> >>>> over a suspended i2c bus. This revealed a long standing issue in the
> >>>> cpufreq-dt driver, which gives a following warning during system
> >>>> suspend/resume cycle:
> >>> Marek,
> >>>
> >>> I have sent a patchset which is not directly related to the problem
> >>> you are facing, but it may solve your problem as a side effect. Can
> >>> you please see if that works to solve this issue as well ?
> >>>
> >>> https://lore.kernel.org/lkml/cover.1549874368.git.viresh.kumar@linaro.org/T/#u
> >> Thanks for the patch. It indeed solves the problem of the i2c transfer
> >> in cpu hotplug procedure during system resume, although my resources
> >> management rewrite is still valid as a way to fix remaining 'todos' in
> >> the cpufreq-dt driver.
> > Which remaining todos are you talking about ? Sorry I am not able to
> > recall :(
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/cpufreq/cpufreq-dt.c#n347

Ah, okay. Thanks for the pointer.

Ideally we shouldn't be doing anything in probe, like
resources_available(), but we are forced to do it as
subsys_interface_register() doesn't return the errors returned from
cpufreq_add_dev() currently. I tried to fix it once but there were
some complications I believe and then left it.

The main problem is that if cpufreq_online() doesn't find the required
resources and returns -EPROBE_DEFER, it never comes back to the
probe() routine which registers the cpufreq driver. And so we have to
add the duplicate checks in probe() itself before it registers the
cpufreq driver.

Now all we need to do there is to guarantee that the resources are
available when the cpufreq driver registers and so we do it only for
CPU0 currently. Logically speaking, if the resources are available for
CPU0, it will normally be available for any other CPU as well and so
there never was a requirement to test it for other CPUs. And so I left
a FIXME there, so that we know what's going on there in case a
platform comes up for which it doesn't work.

I won't fix it with something like what this patch series tried to do,
rather I would try to make sure that EPROBE_DEFER gets returned from
cpufreq_driver_register() instead.

Thanks.

-- 
viresh

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2019-02-12  5:09 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CGME20190207122255eucas1p1cdebed838c799eca46cce6a654a26187@eucas1p1.samsung.com>
2019-02-07 12:22 ` [PATCH 0/2] cpufreq/opp: rework regulator initialization Marek Szyprowski
     [not found]   ` <CGME20190207122255eucas1p1444023f01217a43cfb958fe0bd48ef4d@eucas1p1.samsung.com>
2019-02-07 12:22     ` [PATCH 1/2] cpufreq: dt/ti/opp: move regulators initialization to the drivers Marek Szyprowski
     [not found]   ` <CGME20190207122256eucas1p17e8742176bda911263d2d14d2797a886@eucas1p1.samsung.com>
2019-02-07 12:22     ` [PATCH 2/2] cpufreq: dt: rework resources initialization Marek Szyprowski
2019-02-08  1:26       ` kbuild test robot
2019-02-08  1:26         ` kbuild test robot
2019-02-08  6:49   ` [PATCH 0/2] cpufreq/opp: rework regulator initialization Viresh Kumar
2019-02-08  8:12     ` Marek Szyprowski
2019-02-08  8:55       ` Viresh Kumar
2019-02-08  9:15         ` Marek Szyprowski
2019-02-08  9:23           ` Viresh Kumar
2019-02-08 10:02             ` Marek Szyprowski
2019-02-08 10:08             ` Rafael J. Wysocki
2019-02-08 10:18         ` Rafael J. Wysocki
2019-02-08 10:28           ` Viresh Kumar
2019-02-08 10:22     ` Rafael J. Wysocki
2019-02-08 10:31       ` Marek Szyprowski
2019-02-08 10:31       ` Viresh Kumar
2019-02-08 10:42         ` Rafael J. Wysocki
2019-02-08 10:52           ` Rafael J. Wysocki
2019-02-08 11:39           ` Sudeep Holla
2019-02-08 12:03             ` Rafael J. Wysocki
2019-02-08 12:09               ` Sudeep Holla
2019-02-08 12:23                 ` Rafael J. Wysocki
2019-02-08 14:28                   ` Sudeep Holla
2019-02-08 11:00   ` Sudeep Holla
2019-02-08 11:47     ` Marek Szyprowski
2019-02-08 11:51       ` Sudeep Holla
2019-02-08 12:04         ` Marek Szyprowski
2019-02-08 12:11           ` Rafael J. Wysocki
2019-02-08 12:16           ` Sudeep Holla
2019-02-08 17:41           ` Sudeep Holla
2019-02-11  8:47             ` Viresh Kumar
2019-02-11 14:08               ` Sudeep Holla
2019-02-11  8:44   ` Viresh Kumar
2019-02-11  9:52     ` Marek Szyprowski
2019-02-11  9:55       ` Viresh Kumar
2019-02-11 12:22         ` Marek Szyprowski
2019-02-12  5:08           ` Viresh Kumar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.