Linux-ACPI Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v4 0/8] Make fw_devlink=on more forgiving
@ 2021-02-05 22:26 ` Saravana Kannan
  2021-02-05 22:26   ` [PATCH v4 1/8] driver core: fw_devlink: Detect supplier devices that will never be added Saravana Kannan
                     ` (12 more replies)
  0 siblings, 13 replies; 55+ messages in thread
From: Saravana Kannan @ 2021-02-05 22:26 UTC (permalink / raw)
  To: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner
  Cc: Saravana Kannan, linux-doc, linux-kernel, linux-pm, linux-clk,
	devicetree, linux-acpi, Marek Szyprowski, Geert Uytterhoeven,
	kernel-team

There are a lot of devices/drivers where they never have a struct device
created for them or the driver initializes the hardware without ever
binding to the struct device.

This series is intended to avoid any boot regressions due to such
devices/drivers when fw_devlink=on and also address the handling of
optional suppliers.

Patch 1 and 2 addresses the issue of firmware nodes that look like
they'll have struct devices created for them, but will never actually
have struct devices added for them. For example, DT nodes with a
compatible property that don't have devices added for them.

Patch 3 and 4 allow for handling optional DT bindings.

Patch 5 sets up a generic API to handle drivers that never bind with
their devices.

Patch 6 through 8 update different frameworks to use the new API.

Thanks,
Saravana

Saravana Kannan (8):
  driver core: fw_devlink: Detect supplier devices that will never be
    added
  of: property: Don't add links to absent suppliers
  driver core: Add fw_devlink.strict kernel param
  of: property: Add fw_devlink support for optional properties
  driver core: fw_devlink: Handle suppliers that don't use driver core
  irqdomain: Mark fwnodes when their irqdomain is added/removed
  PM: domains: Mark fwnodes when their powerdomain is added/removed
  clk: Mark fwnodes when their clock provider is added/removed

 .../admin-guide/kernel-parameters.txt         |  5 ++
 drivers/base/core.c                           | 58 ++++++++++++++++++-
 drivers/base/power/domain.c                   |  2 +
 drivers/clk/clk.c                             |  3 +
 drivers/of/property.c                         | 16 +++--
 include/linux/fwnode.h                        | 20 ++++++-
 kernel/irq/irqdomain.c                        |  2 +
 7 files changed, 98 insertions(+), 8 deletions(-)

-- 
2.30.0.478.g8a0d178c01-goog


^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH v4 1/8] driver core: fw_devlink: Detect supplier devices that will never be added
  2021-02-05 22:26 ` [PATCH v4 0/8] Make fw_devlink=on more forgiving Saravana Kannan
@ 2021-02-05 22:26   ` Saravana Kannan
  2021-02-05 22:26   ` [PATCH v4 2/8] of: property: Don't add links to absent suppliers Saravana Kannan
                     ` (11 subsequent siblings)
  12 siblings, 0 replies; 55+ messages in thread
From: Saravana Kannan @ 2021-02-05 22:26 UTC (permalink / raw)
  To: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, Saravana Kannan
  Cc: linux-doc, linux-kernel, linux-pm, linux-clk, devicetree,
	linux-acpi, Marek Szyprowski, Geert Uytterhoeven, kernel-team

During the initial parsing of firmware by fw_devlink, fw_devlink might
infer that some supplier firmware nodes would get populated as devices.
But the inference is not always correct. This patch tries to logically
detect and fix such mistakes as boot progresses or more devices probe.

fw_devlink makes a fundamental assumption that once a device binds to a
driver, it will populate (i.e: add as struct devices) all the child
firmware nodes that could be populated as devices (if they aren't
populated already).

So, whenever a device probes, we check all its child firmware nodes. If
a child firmware node has a corresponding device populated, we don't
modify the child node or its descendants. However, if a child firmware
node has not been populated as a device, we delete all the fwnode links
where the child node or its descendants are suppliers. This ensures that
no other device is blocked on a firmware node that will never be
populated as a device. We also mark such fwnodes as NOT_DEVICE, so that
no new fwnode links are created with these nodes as suppliers.

Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
---
 drivers/base/core.c    | 31 ++++++++++++++++++++++++++++---
 include/linux/fwnode.h |  2 ++
 2 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 484a942884ba..c95b1daabac7 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -148,6 +148,21 @@ void fwnode_links_purge(struct fwnode_handle *fwnode)
 	fwnode_links_purge_consumers(fwnode);
 }
 
+static void fw_devlink_purge_absent_suppliers(struct fwnode_handle *fwnode)
+{
+	struct fwnode_handle *child;
+
+	/* Don't purge consumer links of an added child */
+	if (fwnode->dev)
+		return;
+
+	fwnode->flags |= FWNODE_FLAG_NOT_DEVICE;
+	fwnode_links_purge_consumers(fwnode);
+
+	fwnode_for_each_available_child_node(fwnode, child)
+		fw_devlink_purge_absent_suppliers(child);
+}
+
 #ifdef CONFIG_SRCU
 static DEFINE_MUTEX(device_links_lock);
 DEFINE_STATIC_SRCU(device_links_srcu);
@@ -1154,12 +1169,22 @@ void device_links_driver_bound(struct device *dev)
 	LIST_HEAD(sync_list);
 
 	/*
-	 * If a device probes successfully, it's expected to have created all
+	 * If a device binds successfully, it's expected to have created all
 	 * the device links it needs to or make new device links as it needs
-	 * them. So, it no longer needs to wait on any suppliers.
+	 * them. So, fw_devlink no longer needs to create device links to any
+	 * of the device's suppliers.
+	 *
+	 * Also, if a child firmware node of this bound device is not added as
+	 * a device by now, assume it is never going to be added and make sure
+	 * other devices don't defer probe indefinitely by waiting for such a
+	 * child device.
 	 */
-	if (dev->fwnode && dev->fwnode->dev == dev)
+	if (dev->fwnode && dev->fwnode->dev == dev) {
+		struct fwnode_handle *child;
 		fwnode_links_purge_suppliers(dev->fwnode);
+		fwnode_for_each_available_child_node(dev->fwnode, child)
+			fw_devlink_purge_absent_suppliers(child);
+	}
 	device_remove_file(dev, &dev_attr_waiting_for_supplier);
 
 	device_links_write_lock();
diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h
index fde4ad97564c..21082f11473f 100644
--- a/include/linux/fwnode.h
+++ b/include/linux/fwnode.h
@@ -19,8 +19,10 @@ struct device;
  * fwnode link flags
  *
  * LINKS_ADDED: The fwnode has already be parsed to add fwnode links.
+ * NOT_DEVICE: The fwnode will never be populated as a struct device.
  */
 #define FWNODE_FLAG_LINKS_ADDED		BIT(0)
+#define FWNODE_FLAG_NOT_DEVICE		BIT(1)
 
 struct fwnode_handle {
 	struct fwnode_handle *secondary;
-- 
2.30.0.478.g8a0d178c01-goog


^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH v4 2/8] of: property: Don't add links to absent suppliers
  2021-02-05 22:26 ` [PATCH v4 0/8] Make fw_devlink=on more forgiving Saravana Kannan
  2021-02-05 22:26   ` [PATCH v4 1/8] driver core: fw_devlink: Detect supplier devices that will never be added Saravana Kannan
@ 2021-02-05 22:26   ` Saravana Kannan
  2021-02-09 21:25     ` Rob Herring
  2021-02-05 22:26   ` [PATCH v4 3/8] driver core: Add fw_devlink.strict kernel param Saravana Kannan
                     ` (10 subsequent siblings)
  12 siblings, 1 reply; 55+ messages in thread
From: Saravana Kannan @ 2021-02-05 22:26 UTC (permalink / raw)
  To: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, Saravana Kannan
  Cc: linux-doc, linux-kernel, linux-pm, linux-clk, devicetree,
	linux-acpi, Marek Szyprowski, Geert Uytterhoeven, kernel-team

If driver core marks a firmware node as not a device, don't add fwnode
links where it's a supplier.

Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default")
Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/of/property.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/of/property.c b/drivers/of/property.c
index 6287c6d60bb7..53d163c8d39b 100644
--- a/drivers/of/property.c
+++ b/drivers/of/property.c
@@ -1103,7 +1103,9 @@ static int of_link_to_phandle(struct device_node *con_np,
 	 * created for them.
 	 */
 	sup_dev = get_dev_from_fwnode(&sup_np->fwnode);
-	if (!sup_dev && of_node_check_flag(sup_np, OF_POPULATED)) {
+	if (!sup_dev &&
+	    (of_node_check_flag(sup_np, OF_POPULATED) ||
+	     sup_np->fwnode.flags & FWNODE_FLAG_NOT_DEVICE)) {
 		pr_debug("Not linking %pOFP to %pOFP - No struct device\n",
 			 con_np, sup_np);
 		of_node_put(sup_np);
-- 
2.30.0.478.g8a0d178c01-goog


^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH v4 3/8] driver core: Add fw_devlink.strict kernel param
  2021-02-05 22:26 ` [PATCH v4 0/8] Make fw_devlink=on more forgiving Saravana Kannan
  2021-02-05 22:26   ` [PATCH v4 1/8] driver core: fw_devlink: Detect supplier devices that will never be added Saravana Kannan
  2021-02-05 22:26   ` [PATCH v4 2/8] of: property: Don't add links to absent suppliers Saravana Kannan
@ 2021-02-05 22:26   ` Saravana Kannan
  2021-02-05 22:26   ` [PATCH v4 4/8] of: property: Add fw_devlink support for optional properties Saravana Kannan
                     ` (9 subsequent siblings)
  12 siblings, 0 replies; 55+ messages in thread
From: Saravana Kannan @ 2021-02-05 22:26 UTC (permalink / raw)
  To: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner
  Cc: Saravana Kannan, linux-doc, linux-kernel, linux-pm, linux-clk,
	devicetree, linux-acpi, Marek Szyprowski, Geert Uytterhoeven,
	kernel-team

This param allows forcing all dependencies to be treated as mandatory.
This will be useful for boards in which all optional dependencies like
IOMMUs and DMAs need to be treated as mandatory dependencies.

Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 Documentation/admin-guide/kernel-parameters.txt |  5 +++++
 drivers/base/core.c                             | 12 ++++++++++++
 include/linux/fwnode.h                          |  1 +
 3 files changed, 18 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index a10b545c2070..692b63644133 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1433,6 +1433,11 @@
 				to enforce probe and suspend/resume ordering.
 			rpm --	Like "on", but also use to order runtime PM.
 
+	fw_devlink.strict=<bool>
+			[KNL] Treat all inferred dependencies as mandatory
+			dependencies. This only applies for fw_devlink=on|rpm.
+			Format: <bool>
+
 	gamecon.map[2|3]=
 			[HW,JOY] Multisystem joystick and NES/SNES/PSX pad
 			support via parallel port (up to 5 devices per port)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index c95b1daabac7..f466ab4f1c35 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1521,6 +1521,13 @@ static int __init fw_devlink_setup(char *arg)
 }
 early_param("fw_devlink", fw_devlink_setup);
 
+static bool fw_devlink_strict;
+static int __init fw_devlink_strict_setup(char *arg)
+{
+	return strtobool(arg, &fw_devlink_strict);
+}
+early_param("fw_devlink.strict", fw_devlink_strict_setup);
+
 u32 fw_devlink_get_flags(void)
 {
 	return fw_devlink_flags;
@@ -1531,6 +1538,11 @@ static bool fw_devlink_is_permissive(void)
 	return fw_devlink_flags == FW_DEVLINK_FLAGS_PERMISSIVE;
 }
 
+bool fw_devlink_is_strict(void)
+{
+	return fw_devlink_strict && !fw_devlink_is_permissive();
+}
+
 static void fw_devlink_parse_fwnode(struct fwnode_handle *fwnode)
 {
 	if (fwnode->flags & FWNODE_FLAG_LINKS_ADDED)
diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h
index 21082f11473f..d5caefe39d93 100644
--- a/include/linux/fwnode.h
+++ b/include/linux/fwnode.h
@@ -162,6 +162,7 @@ static inline void fwnode_init(struct fwnode_handle *fwnode,
 }
 
 extern u32 fw_devlink_get_flags(void);
+extern bool fw_devlink_is_strict(void);
 int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup);
 void fwnode_links_purge(struct fwnode_handle *fwnode);
 
-- 
2.30.0.478.g8a0d178c01-goog


^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH v4 4/8] of: property: Add fw_devlink support for optional properties
  2021-02-05 22:26 ` [PATCH v4 0/8] Make fw_devlink=on more forgiving Saravana Kannan
                     ` (2 preceding siblings ...)
  2021-02-05 22:26   ` [PATCH v4 3/8] driver core: Add fw_devlink.strict kernel param Saravana Kannan
@ 2021-02-05 22:26   ` Saravana Kannan
  2021-02-09 21:33     ` Rob Herring
  2021-02-05 22:26   ` [PATCH v4 5/8] driver core: fw_devlink: Handle suppliers that don't use driver core Saravana Kannan
                     ` (8 subsequent siblings)
  12 siblings, 1 reply; 55+ messages in thread
From: Saravana Kannan @ 2021-02-05 22:26 UTC (permalink / raw)
  To: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner
  Cc: Saravana Kannan, linux-doc, linux-kernel, linux-pm, linux-clk,
	devicetree, linux-acpi, Marek Szyprowski, Geert Uytterhoeven,
	kernel-team

Not all DT bindings are mandatory bindings. Add support for optional DT
bindings and mark iommus, iommu-map, dmas as optional DT bindings.

Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/of/property.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/of/property.c b/drivers/of/property.c
index 53d163c8d39b..962109082df1 100644
--- a/drivers/of/property.c
+++ b/drivers/of/property.c
@@ -1235,6 +1235,7 @@ static struct device_node *parse_##fname(struct device_node *np,	     \
 struct supplier_bindings {
 	struct device_node *(*parse_prop)(struct device_node *np,
 					  const char *prop_name, int index);
+	bool optional;
 };
 
 DEFINE_SIMPLE_PROP(clocks, "clocks", "#clock-cells")
@@ -1308,12 +1309,12 @@ static struct device_node *parse_interrupts(struct device_node *np,
 static const struct supplier_bindings of_supplier_bindings[] = {
 	{ .parse_prop = parse_clocks, },
 	{ .parse_prop = parse_interconnects, },
-	{ .parse_prop = parse_iommus, },
-	{ .parse_prop = parse_iommu_maps, },
+	{ .parse_prop = parse_iommus, .optional = true, },
+	{ .parse_prop = parse_iommu_maps, .optional = true, },
 	{ .parse_prop = parse_mboxes, },
 	{ .parse_prop = parse_io_channels, },
 	{ .parse_prop = parse_interrupt_parent, },
-	{ .parse_prop = parse_dmas, },
+	{ .parse_prop = parse_dmas, .optional = true, },
 	{ .parse_prop = parse_power_domains, },
 	{ .parse_prop = parse_hwlocks, },
 	{ .parse_prop = parse_extcon, },
@@ -1368,6 +1369,11 @@ static int of_link_property(struct device_node *con_np, const char *prop_name)
 
 	/* Do not stop at first failed link, link all available suppliers. */
 	while (!matched && s->parse_prop) {
+		if (s->optional && !fw_devlink_is_strict()) {
+			s++;
+			continue;
+		}
+
 		while ((phandle = s->parse_prop(con_np, prop_name, i))) {
 			matched = true;
 			i++;
-- 
2.30.0.478.g8a0d178c01-goog


^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH v4 5/8] driver core: fw_devlink: Handle suppliers that don't use driver core
  2021-02-05 22:26 ` [PATCH v4 0/8] Make fw_devlink=on more forgiving Saravana Kannan
                     ` (3 preceding siblings ...)
  2021-02-05 22:26   ` [PATCH v4 4/8] of: property: Add fw_devlink support for optional properties Saravana Kannan
@ 2021-02-05 22:26   ` Saravana Kannan
  2021-02-05 22:26   ` [PATCH v4 6/8] irqdomain: Mark fwnodes when their irqdomain is added/removed Saravana Kannan
                     ` (7 subsequent siblings)
  12 siblings, 0 replies; 55+ messages in thread
From: Saravana Kannan @ 2021-02-05 22:26 UTC (permalink / raw)
  To: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner
  Cc: Saravana Kannan, linux-doc, linux-kernel, linux-pm, linux-clk,
	devicetree, linux-acpi, Marek Szyprowski, Geert Uytterhoeven,
	kernel-team

Device links only work between devices that use the driver core to match
and bind a driver to a device. So, add an API for frameworks to let the
driver core know that a fwnode has been initialized by a driver without
using the driver core.

Then use this information to make sure that fw_devlink doesn't make the
consumers wait indefinitely on suppliers that'll never bind to a driver.

Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/base/core.c    | 15 +++++++++++++++
 include/linux/fwnode.h | 19 +++++++++++++++++--
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index f466ab4f1c35..ea710b33bda6 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1636,6 +1636,17 @@ static int fw_devlink_create_devlink(struct device *con,
 
 	sup_dev = get_dev_from_fwnode(sup_handle);
 	if (sup_dev) {
+		/*
+		 * If it's one of those drivers that don't actually bind to
+		 * their device using driver core, then don't wait on this
+		 * supplier device indefinitely.
+		 */
+		if (sup_dev->links.status == DL_DEV_NO_DRIVER &&
+		    sup_handle->flags & FWNODE_FLAG_INITIALIZED) {
+			ret = -EINVAL;
+			goto out;
+		}
+
 		/*
 		 * If this fails, it is due to cycles in device links.  Just
 		 * give up on this link and treat it as invalid.
@@ -1655,6 +1666,10 @@ static int fw_devlink_create_devlink(struct device *con,
 		goto out;
 	}
 
+	/* Supplier that's already initialized without a struct device. */
+	if (sup_handle->flags & FWNODE_FLAG_INITIALIZED)
+		return -EINVAL;
+
 	/*
 	 * DL_FLAG_SYNC_STATE_ONLY doesn't block probing and supports
 	 * cycles. So cycle detection isn't necessary and shouldn't be
diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h
index d5caefe39d93..dfefd43a737c 100644
--- a/include/linux/fwnode.h
+++ b/include/linux/fwnode.h
@@ -11,6 +11,7 @@
 
 #include <linux/types.h>
 #include <linux/list.h>
+#include <linux/err.h>
 
 struct fwnode_operations;
 struct device;
@@ -18,11 +19,13 @@ struct device;
 /*
  * fwnode link flags
  *
- * LINKS_ADDED: The fwnode has already be parsed to add fwnode links.
- * NOT_DEVICE: The fwnode will never be populated as a struct device.
+ * LINKS_ADDED:	The fwnode has already be parsed to add fwnode links.
+ * NOT_DEVICE:	The fwnode will never be populated as a struct device.
+ * INITIALIZED: The hardware corresponding to fwnode has been initialized.
  */
 #define FWNODE_FLAG_LINKS_ADDED		BIT(0)
 #define FWNODE_FLAG_NOT_DEVICE		BIT(1)
+#define FWNODE_FLAG_INITIALIZED		BIT(2)
 
 struct fwnode_handle {
 	struct fwnode_handle *secondary;
@@ -161,6 +164,18 @@ static inline void fwnode_init(struct fwnode_handle *fwnode,
 	INIT_LIST_HEAD(&fwnode->suppliers);
 }
 
+static inline void fwnode_dev_initialized(struct fwnode_handle *fwnode,
+					  bool initialized)
+{
+	if (IS_ERR_OR_NULL(fwnode))
+		return;
+
+	if (initialized)
+		fwnode->flags |= FWNODE_FLAG_INITIALIZED;
+	else
+		fwnode->flags &= ~FWNODE_FLAG_INITIALIZED;
+}
+
 extern u32 fw_devlink_get_flags(void);
 extern bool fw_devlink_is_strict(void);
 int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup);
-- 
2.30.0.478.g8a0d178c01-goog


^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH v4 6/8] irqdomain: Mark fwnodes when their irqdomain is added/removed
  2021-02-05 22:26 ` [PATCH v4 0/8] Make fw_devlink=on more forgiving Saravana Kannan
                     ` (4 preceding siblings ...)
  2021-02-05 22:26   ` [PATCH v4 5/8] driver core: fw_devlink: Handle suppliers that don't use driver core Saravana Kannan
@ 2021-02-05 22:26   ` Saravana Kannan
  2021-02-05 22:26   ` [PATCH v4 7/8] PM: domains: Mark fwnodes when their powerdomain " Saravana Kannan
                     ` (6 subsequent siblings)
  12 siblings, 0 replies; 55+ messages in thread
From: Saravana Kannan @ 2021-02-05 22:26 UTC (permalink / raw)
  To: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner
  Cc: Saravana Kannan, linux-doc, linux-kernel, linux-pm, linux-clk,
	devicetree, linux-acpi, Marek Szyprowski, Geert Uytterhoeven,
	kernel-team

This allows fw_devlink to recognize irqdomain drivers that don't use the
device-driver model to initialize the device. fw_devlink will use this
information to make sure consumers of such irqdomain aren't indefinitely
blocked from probing, waiting for the irqdomain device to appear and
bind to a driver.

Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 kernel/irq/irqdomain.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 6aacd342cd14..288151393a06 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -205,6 +205,7 @@ struct irq_domain *__irq_domain_add(struct fwnode_handle *fwnode, int size,
 	}
 
 	fwnode_handle_get(fwnode);
+	fwnode_dev_initialized(fwnode, true);
 
 	/* Fill structure */
 	INIT_RADIX_TREE(&domain->revmap_tree, GFP_KERNEL);
@@ -253,6 +254,7 @@ void irq_domain_remove(struct irq_domain *domain)
 
 	pr_debug("Removed domain %s\n", domain->name);
 
+	fwnode_dev_initialized(domain->fwnode, false);
 	fwnode_handle_put(domain->fwnode);
 	if (domain->flags & IRQ_DOMAIN_NAME_ALLOCATED)
 		kfree(domain->name);
-- 
2.30.0.478.g8a0d178c01-goog


^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH v4 7/8] PM: domains: Mark fwnodes when their powerdomain is added/removed
  2021-02-05 22:26 ` [PATCH v4 0/8] Make fw_devlink=on more forgiving Saravana Kannan
                     ` (5 preceding siblings ...)
  2021-02-05 22:26   ` [PATCH v4 6/8] irqdomain: Mark fwnodes when their irqdomain is added/removed Saravana Kannan
@ 2021-02-05 22:26   ` Saravana Kannan
  2021-02-05 22:26   ` [PATCH v4 8/8] clk: Mark fwnodes when their clock provider " Saravana Kannan
                     ` (5 subsequent siblings)
  12 siblings, 0 replies; 55+ messages in thread
From: Saravana Kannan @ 2021-02-05 22:26 UTC (permalink / raw)
  To: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, Rafael J. Wysocki
  Cc: Saravana Kannan, linux-doc, linux-kernel, linux-pm, linux-clk,
	devicetree, linux-acpi, Marek Szyprowski, Geert Uytterhoeven,
	kernel-team

This allows fw_devlink to recognize power domain drivers that don't use
the device-driver model to initialize the device. fw_devlink will use
this information to make sure consumers of such power domain aren't
indefinitely blocked from probing, waiting for the power domain device
to appear and bind to a driver.

Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/base/power/domain.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 9a14eedacb92..6ac52a038bb9 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -2164,6 +2164,7 @@ static int genpd_add_provider(struct device_node *np, genpd_xlate_t xlate,
 	cp->node = of_node_get(np);
 	cp->data = data;
 	cp->xlate = xlate;
+	fwnode_dev_initialized(&np->fwnode, true);
 
 	mutex_lock(&of_genpd_mutex);
 	list_add(&cp->link, &of_genpd_providers);
@@ -2353,6 +2354,7 @@ void of_genpd_del_provider(struct device_node *np)
 				}
 			}
 
+			fwnode_dev_initialized(&cp->node->fwnode, false);
 			list_del(&cp->link);
 			of_node_put(cp->node);
 			kfree(cp);
-- 
2.30.0.478.g8a0d178c01-goog


^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH v4 8/8] clk: Mark fwnodes when their clock provider is added/removed
  2021-02-05 22:26 ` [PATCH v4 0/8] Make fw_devlink=on more forgiving Saravana Kannan
                     ` (6 preceding siblings ...)
  2021-02-05 22:26   ` [PATCH v4 7/8] PM: domains: Mark fwnodes when their powerdomain " Saravana Kannan
@ 2021-02-05 22:26   ` Saravana Kannan
  2021-02-08 15:38     ` Rob Herring
                       ` (2 more replies)
  2021-02-06  2:45   ` [PATCH v4 0/8] Make fw_devlink=on more forgiving Saravana Kannan
                     ` (4 subsequent siblings)
  12 siblings, 3 replies; 55+ messages in thread
From: Saravana Kannan @ 2021-02-05 22:26 UTC (permalink / raw)
  To: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner
  Cc: Saravana Kannan, linux-doc, linux-kernel, linux-pm, linux-clk,
	devicetree, linux-acpi, Marek Szyprowski, Geert Uytterhoeven,
	kernel-team

This allows fw_devlink to recognize clock provider drivers that don't
use the device-driver model to initialize the device. fw_devlink will
use this information to make sure consumers of such clock providers
aren't indefinitely blocked from probing, waiting for the power domain
device to appear and bind to a driver.

Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/clk/clk.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
index 8c1d04db990d..27ff90eacb1f 100644
--- a/drivers/clk/clk.c
+++ b/drivers/clk/clk.c
@@ -4555,6 +4555,8 @@ int of_clk_add_provider(struct device_node *np,
 	if (ret < 0)
 		of_clk_del_provider(np);
 
+	fwnode_dev_initialized(&np->fwnode, true);
+
 	return ret;
 }
 EXPORT_SYMBOL_GPL(of_clk_add_provider);
@@ -4672,6 +4674,7 @@ void of_clk_del_provider(struct device_node *np)
 	list_for_each_entry(cp, &of_clk_providers, link) {
 		if (cp->node == np) {
 			list_del(&cp->link);
+			fwnode_dev_initialized(&np->fwnode, false);
 			of_node_put(cp->node);
 			kfree(cp);
 			break;
-- 
2.30.0.478.g8a0d178c01-goog


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-05 22:26 ` [PATCH v4 0/8] Make fw_devlink=on more forgiving Saravana Kannan
                     ` (7 preceding siblings ...)
  2021-02-05 22:26   ` [PATCH v4 8/8] clk: Mark fwnodes when their clock provider " Saravana Kannan
@ 2021-02-06  2:45   ` Saravana Kannan
  2021-02-06 19:41   ` Geert Uytterhoeven
                     ` (3 subsequent siblings)
  12 siblings, 0 replies; 55+ messages in thread
From: Saravana Kannan @ 2021-02-06  2:45 UTC (permalink / raw)
  To: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner
  Cc: Linux Doc Mailing List, LKML, Linux PM, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Geert Uytterhoeven,
	Android Kernel Team

On Fri, Feb 5, 2021 at 2:26 PM Saravana Kannan <saravanak@google.com> wrote:
>
> There are a lot of devices/drivers where they never have a struct device
> created for them or the driver initializes the hardware without ever
> binding to the struct device.
>
> This series is intended to avoid any boot regressions due to such
> devices/drivers when fw_devlink=on and also address the handling of
> optional suppliers.
>
> Patch 1 and 2 addresses the issue of firmware nodes that look like
> they'll have struct devices created for them, but will never actually
> have struct devices added for them. For example, DT nodes with a
> compatible property that don't have devices added for them.
>
> Patch 3 and 4 allow for handling optional DT bindings.
>
> Patch 5 sets up a generic API to handle drivers that never bind with
> their devices.
>
> Patch 6 through 8 update different frameworks to use the new API.
>
> Thanks,
> Saravana
>

Forgot to add version history:

v1 -> v2:
Patch 1: Added a flag to fwnodes that aren't devices.
Patch 3: New patch to ise the flag set in patch 1 to not create bad links.

v2 -> v3:
- Patch 1: Added Rafael's Ack
- New patches 3 and 4

v3 -> v4:
- No changes to patches 1-4.
- New patches 5-8.

-Saravana

> Saravana Kannan (8):
>   driver core: fw_devlink: Detect supplier devices that will never be
>     added
>   of: property: Don't add links to absent suppliers
>   driver core: Add fw_devlink.strict kernel param
>   of: property: Add fw_devlink support for optional properties
>   driver core: fw_devlink: Handle suppliers that don't use driver core
>   irqdomain: Mark fwnodes when their irqdomain is added/removed
>   PM: domains: Mark fwnodes when their powerdomain is added/removed
>   clk: Mark fwnodes when their clock provider is added/removed
>
>  .../admin-guide/kernel-parameters.txt         |  5 ++
>  drivers/base/core.c                           | 58 ++++++++++++++++++-
>  drivers/base/power/domain.c                   |  2 +
>  drivers/clk/clk.c                             |  3 +
>  drivers/of/property.c                         | 16 +++--
>  include/linux/fwnode.h                        | 20 ++++++-
>  kernel/irq/irqdomain.c                        |  2 +
>  7 files changed, 98 insertions(+), 8 deletions(-)
>
> --
> 2.30.0.478.g8a0d178c01-goog
>

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-05 22:26 ` [PATCH v4 0/8] Make fw_devlink=on more forgiving Saravana Kannan
                     ` (8 preceding siblings ...)
  2021-02-06  2:45   ` [PATCH v4 0/8] Make fw_devlink=on more forgiving Saravana Kannan
@ 2021-02-06 19:41   ` Geert Uytterhoeven
  2021-02-06 20:47     ` Saravana Kannan
  2021-02-08  8:40   ` Marek Szyprowski
                     ` (2 subsequent siblings)
  12 siblings, 1 reply; 55+ messages in thread
From: Geert Uytterhoeven @ 2021-02-06 19:41 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Linux PM list, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Android Kernel Team

Hi Saravana,

On Fri, Feb 5, 2021 at 11:26 PM Saravana Kannan <saravanak@google.com> wrote:
> There are a lot of devices/drivers where they never have a struct device
> created for them or the driver initializes the hardware without ever
> binding to the struct device.
>
> This series is intended to avoid any boot regressions due to such
> devices/drivers when fw_devlink=on and also address the handling of
> optional suppliers.

Thanks for your series!

> Patch 5 sets up a generic API to handle drivers that never bind with
> their devices.
>
> Patch 6 through 8 update different frameworks to use the new API.

>   driver core: fw_devlink: Handle suppliers that don't use driver core
>   irqdomain: Mark fwnodes when their irqdomain is added/removed
>   PM: domains: Mark fwnodes when their powerdomain is added/removed
>   clk: Mark fwnodes when their clock provider is added/removed

I take it this is an automatic alternative for letting drivers set the
OF_POPULATED flag manually?

Is this actually safe?  It's not uncommon for a driver to register
multiple providers, sometimes even of different types (clock, genpd,
irq, reset[1], ...).
Can you be sure consumer drivers do not start probing while their
dependency is still busy registering providers?

[1] Which brings my attention to the fact that devlink does not consider
    "resets" properties yet.

Gr{oetje,eeting}s,

                        Geert


--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-06 19:41   ` Geert Uytterhoeven
@ 2021-02-06 20:47     ` Saravana Kannan
  0 siblings, 0 replies; 55+ messages in thread
From: Saravana Kannan @ 2021-02-06 20:47 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Linux PM list, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Android Kernel Team

On Sat, Feb 6, 2021 at 11:41 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Hi Saravana,
>
> On Fri, Feb 5, 2021 at 11:26 PM Saravana Kannan <saravanak@google.com> wrote:
> > There are a lot of devices/drivers where they never have a struct device
> > created for them or the driver initializes the hardware without ever
> > binding to the struct device.
> >
> > This series is intended to avoid any boot regressions due to such
> > devices/drivers when fw_devlink=on and also address the handling of
> > optional suppliers.
>
> Thanks for your series!
>
> > Patch 5 sets up a generic API to handle drivers that never bind with
> > their devices.
> >
> > Patch 6 through 8 update different frameworks to use the new API.
>
> >   driver core: fw_devlink: Handle suppliers that don't use driver core
> >   irqdomain: Mark fwnodes when their irqdomain is added/removed
> >   PM: domains: Mark fwnodes when their powerdomain is added/removed
> >   clk: Mark fwnodes when their clock provider is added/removed
>
> I take it this is an automatic alternative for letting drivers set the
> OF_POPULATED flag manually?

The frameworks can still continue setting it to avoid creating dead
"struct devices" that'll never be used. This new flag handles cases
where the device is already created, but will never bind to a driver.
So, they are meant to do slightly different things, but the end result
is removing the need for individual drivers to set OF_POPULATED (and
Rob hates that too).

> Is this actually safe?  It's not uncommon for a driver to register
> multiple providers, sometimes even of different types (clock, genpd,
> irq, reset[1], ...).

This flag is just an indication that the fwnode has been initialized
by a driver. It's okay if the flag gets set multiple times when a
driver is registering with multiple frameworks. It's also okay if the
flag is cleared multiple times as the driver is uninitializing the
hardware (although, this is very unlikely for drivers that don't use
device-driver model). When we actually try to create device links, we
just check if this happened without a driver actually binding to this
device. There's no "probing" race because the "status" I check goes
through NO_DRIVER -> PROBING -(registering happens)-> BOUND ->
UNBINDING -(deregistering happens) -> NO_DRIVER. So if the fwnode flag
is getting set as part of the driver's probe function, the "status"
value will never be NO_DRIVER.

> Can you be sure consumer drivers do not start probing while their
> dependency is still busy registering providers?

The code only acts on that flag when trying to create device links
from the consumer to the supplier. This is just a way to tell "hey,
don't bother creating a device link, this supplier will never bind".
So it just avoids blocking the consumer. Doesn't really make the
consumers probe earlier than they would have.

> [1] Which brings my attention to the fact that devlink does not consider
>     "resets" properties yet.
>

Yeah, we can add that and other bindings as we go.

-Saravana

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-05 22:26 ` [PATCH v4 0/8] Make fw_devlink=on more forgiving Saravana Kannan
                     ` (9 preceding siblings ...)
  2021-02-06 19:41   ` Geert Uytterhoeven
@ 2021-02-08  8:40   ` Marek Szyprowski
  2021-02-08 23:57     ` Saravana Kannan
  2021-02-10  8:19   ` Tudor.Ambarus
  2021-02-11 13:00   ` Geert Uytterhoeven
  12 siblings, 1 reply; 55+ messages in thread
From: Marek Szyprowski @ 2021-02-08  8:40 UTC (permalink / raw)
  To: Saravana Kannan, Jonathan Corbet, Greg Kroah-Hartman,
	Rafael J. Wysocki, Kevin Hilman, Ulf Hansson, Len Brown,
	Len Brown, Pavel Machek, Michael Turquette, Stephen Boyd,
	Rob Herring, Frank Rowand, Marc Zyngier, Thomas Gleixner
  Cc: linux-doc, linux-kernel, linux-pm, linux-clk, devicetree,
	linux-acpi, Geert Uytterhoeven, kernel-team

Hi Saravana,

On 05.02.2021 23:26, Saravana Kannan wrote:
> There are a lot of devices/drivers where they never have a struct device
> created for them or the driver initializes the hardware without ever
> binding to the struct device.
>
> This series is intended to avoid any boot regressions due to such
> devices/drivers when fw_devlink=on and also address the handling of
> optional suppliers.
>
> Patch 1 and 2 addresses the issue of firmware nodes that look like
> they'll have struct devices created for them, but will never actually
> have struct devices added for them. For example, DT nodes with a
> compatible property that don't have devices added for them.
>
> Patch 3 and 4 allow for handling optional DT bindings.
>
> Patch 5 sets up a generic API to handle drivers that never bind with
> their devices.
>
> Patch 6 through 8 update different frameworks to use the new API.

This patchset fixes probing issue observed on various Exynos based 
boards even with commit c09a3e6c97f0 ("soc: samsung: pm_domains: Convert 
to regular platform driver") reverted. Thanks!

Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>

Best regards

-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 8/8] clk: Mark fwnodes when their clock provider is added/removed
  2021-02-05 22:26   ` [PATCH v4 8/8] clk: Mark fwnodes when their clock provider " Saravana Kannan
@ 2021-02-08 15:38     ` Rob Herring
  2021-02-08 23:34       ` Saravana Kannan
  2021-02-10 11:44     ` [PATCH] clk: Mark fwnodes when their clock provider is added Tudor Ambarus
       [not found]     ` <161317679292.1254594.15797939257637374295@swboyd.mtv.corp.google.com>
  2 siblings, 1 reply; 55+ messages in thread
From: Rob Herring @ 2021-02-08 15:38 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Frank Rowand, Marc Zyngier,
	Thomas Gleixner, Linux Doc Mailing List, linux-kernel,
	open list:THERMAL, linux-clk, devicetree,
	open list:ACPI FOR ARM64 (ACPI/arm64),
	Marek Szyprowski, Geert Uytterhoeven, Android Kernel Team

On Fri, Feb 5, 2021 at 4:27 PM Saravana Kannan <saravanak@google.com> wrote:
>
> This allows fw_devlink to recognize clock provider drivers that don't
> use the device-driver model to initialize the device. fw_devlink will
> use this information to make sure consumers of such clock providers
> aren't indefinitely blocked from probing, waiting for the power domain
> device to appear and bind to a driver.

Don't we have cases that are a mixture? IOW, a subset of the clock
provider is initialized early, then the full driver takes over. You'd
want consumers that are not a driver to succeed, but drivers to defer
until the full driver is up.

>
> Signed-off-by: Saravana Kannan <saravanak@google.com>
> ---
>  drivers/clk/clk.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> index 8c1d04db990d..27ff90eacb1f 100644
> --- a/drivers/clk/clk.c
> +++ b/drivers/clk/clk.c
> @@ -4555,6 +4555,8 @@ int of_clk_add_provider(struct device_node *np,
>         if (ret < 0)
>                 of_clk_del_provider(np);
>
> +       fwnode_dev_initialized(&np->fwnode, true);
> +
>         return ret;
>  }
>  EXPORT_SYMBOL_GPL(of_clk_add_provider);
> @@ -4672,6 +4674,7 @@ void of_clk_del_provider(struct device_node *np)
>         list_for_each_entry(cp, &of_clk_providers, link) {
>                 if (cp->node == np) {
>                         list_del(&cp->link);
> +                       fwnode_dev_initialized(&np->fwnode, false);
>                         of_node_put(cp->node);
>                         kfree(cp);
>                         break;
> --
> 2.30.0.478.g8a0d178c01-goog
>

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 8/8] clk: Mark fwnodes when their clock provider is added/removed
  2021-02-08 15:38     ` Rob Herring
@ 2021-02-08 23:34       ` Saravana Kannan
  0 siblings, 0 replies; 55+ messages in thread
From: Saravana Kannan @ 2021-02-08 23:34 UTC (permalink / raw)
  To: Rob Herring
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Frank Rowand, Marc Zyngier,
	Thomas Gleixner, Linux Doc Mailing List, linux-kernel,
	open list:THERMAL, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	open list:ACPI FOR ARM64 (ACPI/arm64),
	Marek Szyprowski, Geert Uytterhoeven, Android Kernel Team

On Mon, Feb 8, 2021 at 7:39 AM Rob Herring <robh+dt@kernel.org> wrote:
>
> On Fri, Feb 5, 2021 at 4:27 PM Saravana Kannan <saravanak@google.com> wrote:
> >
> > This allows fw_devlink to recognize clock provider drivers that don't
> > use the device-driver model to initialize the device. fw_devlink will
> > use this information to make sure consumers of such clock providers
> > aren't indefinitely blocked from probing, waiting for the power domain
> > device to appear and bind to a driver.
>
> Don't we have cases that are a mixture? IOW, a subset of the clock
> provider is initialized early, then the full driver takes over. You'd
> want consumers that are not a driver to succeed, but drivers to defer
> until the full driver is up.

You probably just made a typo, but to clarify, this is about ignoring
suppliers that never bind. So, in your case the clock device is the
supplier.

To answer your question, consumer devices added after the full
supplier driver takes over will still have device links created to the
supplier clock device. But consumers added before the full driver
takes over won't. So, nothing is worse off with fw_devlink=on and we
get way more dependency tracking (device links) created than what we
have today.

-Saravana

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-08  8:40   ` Marek Szyprowski
@ 2021-02-08 23:57     ` Saravana Kannan
  0 siblings, 0 replies; 55+ messages in thread
From: Saravana Kannan @ 2021-02-08 23:57 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, Linux Doc Mailing List, LKML,
	Linux PM, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Geert Uytterhoeven, Android Kernel Team

On Mon, Feb 8, 2021 at 12:40 AM Marek Szyprowski
<m.szyprowski@samsung.com> wrote:
>
> Hi Saravana,
>
> On 05.02.2021 23:26, Saravana Kannan wrote:
> > There are a lot of devices/drivers where they never have a struct device
> > created for them or the driver initializes the hardware without ever
> > binding to the struct device.
> >
> > This series is intended to avoid any boot regressions due to such
> > devices/drivers when fw_devlink=on and also address the handling of
> > optional suppliers.
> >
> > Patch 1 and 2 addresses the issue of firmware nodes that look like
> > they'll have struct devices created for them, but will never actually
> > have struct devices added for them. For example, DT nodes with a
> > compatible property that don't have devices added for them.
> >
> > Patch 3 and 4 allow for handling optional DT bindings.
> >
> > Patch 5 sets up a generic API to handle drivers that never bind with
> > their devices.
> >
> > Patch 6 through 8 update different frameworks to use the new API.
>
> This patchset fixes probing issue observed on various Exynos based
> boards even with commit c09a3e6c97f0 ("soc: samsung: pm_domains: Convert
> to regular platform driver") reverted. Thanks!
>
> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
>

Thanks for testing!

-Saravana

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 2/8] of: property: Don't add links to absent suppliers
  2021-02-05 22:26   ` [PATCH v4 2/8] of: property: Don't add links to absent suppliers Saravana Kannan
@ 2021-02-09 21:25     ` Rob Herring
  0 siblings, 0 replies; 55+ messages in thread
From: Rob Herring @ 2021-02-09 21:25 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: linux-clk, Kevin Hilman, Jonathan Corbet, linux-acpi, devicetree,
	Greg Kroah-Hartman, Thomas Gleixner, kernel-team,
	Michael Turquette, Rafael J. Wysocki, linux-doc, Len Brown,
	Frank Rowand, Pavel Machek, Geert Uytterhoeven, linux-kernel,
	Marc Zyngier, linux-pm, Ulf Hansson, Len Brown, Rob Herring,
	Stephen Boyd, Marek Szyprowski

On Fri, 05 Feb 2021 14:26:38 -0800, Saravana Kannan wrote:
> If driver core marks a firmware node as not a device, don't add fwnode
> links where it's a supplier.
> 
> Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default")
> Signed-off-by: Saravana Kannan <saravanak@google.com>
> ---
>  drivers/of/property.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 

Acked-by: Rob Herring <robh@kernel.org>

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 4/8] of: property: Add fw_devlink support for optional properties
  2021-02-05 22:26   ` [PATCH v4 4/8] of: property: Add fw_devlink support for optional properties Saravana Kannan
@ 2021-02-09 21:33     ` Rob Herring
  2021-02-09 21:54       ` Saravana Kannan
  0 siblings, 1 reply; 55+ messages in thread
From: Rob Herring @ 2021-02-09 21:33 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Frank Rowand, Marc Zyngier,
	Thomas Gleixner, linux-doc, linux-kernel, linux-pm, linux-clk,
	devicetree, linux-acpi, Marek Szyprowski, Geert Uytterhoeven,
	kernel-team

On Fri, Feb 05, 2021 at 02:26:40PM -0800, Saravana Kannan wrote:
> Not all DT bindings are mandatory bindings. Add support for optional DT
> bindings and mark iommus, iommu-map, dmas as optional DT bindings.

I don't think we can say these are optional or not. It's got to be a 
driver decision somehow.

For example, if IOMMU is optional, what happens with this sequence:

driver probes without IOMMU
driver calls dma_map_?()
IOMMU driver probes
h/w accesses DMA buffer --> BOOM!

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 4/8] of: property: Add fw_devlink support for optional properties
  2021-02-09 21:33     ` Rob Herring
@ 2021-02-09 21:54       ` Saravana Kannan
  2021-02-10  8:25         ` Geert Uytterhoeven
  0 siblings, 1 reply; 55+ messages in thread
From: Saravana Kannan @ 2021-02-09 21:54 UTC (permalink / raw)
  To: Rob Herring
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Frank Rowand, Marc Zyngier,
	Thomas Gleixner, Linux Doc Mailing List, LKML, Linux PM,
	linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Geert Uytterhoeven,
	Android Kernel Team

On Tue, Feb 9, 2021 at 1:33 PM Rob Herring <robh@kernel.org> wrote:
>
> On Fri, Feb 05, 2021 at 02:26:40PM -0800, Saravana Kannan wrote:
> > Not all DT bindings are mandatory bindings. Add support for optional DT
> > bindings and mark iommus, iommu-map, dmas as optional DT bindings.
>
> I don't think we can say these are optional or not. It's got to be a
> driver decision somehow.

Right, so maybe the word "optional" isn't a good name for it. I can
change that if you want.

The point being, fw_devlink can't block the probe of this driver based
on iommu property. We let the driver decide if it wants to
-EPROBE_DEFER or not or however it wants to handle this.

> For example, if IOMMU is optional, what happens with this sequence:
>
> driver probes without IOMMU
> driver calls dma_map_?()
> IOMMU driver probes
> h/w accesses DMA buffer --> BOOM!

Right. But how is this really related to fw_devlink? AFAICT, this is
an issue even today. If the driver needs the IOMMU, then it needs to
make sure the IOMMU has probed? What am I missing?

-Saravana

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-05 22:26 ` [PATCH v4 0/8] Make fw_devlink=on more forgiving Saravana Kannan
                     ` (10 preceding siblings ...)
  2021-02-08  8:40   ` Marek Szyprowski
@ 2021-02-10  8:19   ` Tudor.Ambarus
  2021-02-10  8:54     ` Saravana Kannan
  2021-02-11 13:00   ` Geert Uytterhoeven
  12 siblings, 1 reply; 55+ messages in thread
From: Tudor.Ambarus @ 2021-02-10  8:19 UTC (permalink / raw)
  To: saravanak, corbet, gregkh, rafael, khilman, ulf.hansson,
	len.brown, lenb, pavel, mturquette, sboyd, robh+dt, frowand.list,
	maz, tglx, sboyd
  Cc: linux-doc, linux-kernel, linux-pm, linux-clk, devicetree,
	linux-acpi, m.szyprowski, geert, kernel-team

Hi, Saravana,

On 2/6/21 12:26 AM, Saravana Kannan wrote:
> There are a lot of devices/drivers where they never have a struct device
> created for them or the driver initializes the hardware without ever
> binding to the struct device.
> 
> This series is intended to avoid any boot regressions due to such
> devices/drivers when fw_devlink=on and also address the handling of
> optional suppliers.
> 
> Patch 1 and 2 addresses the issue of firmware nodes that look like
> they'll have struct devices created for them, but will never actually
> have struct devices added for them. For example, DT nodes with a
> compatible property that don't have devices added for them.
> 
> Patch 3 and 4 allow for handling optional DT bindings.
> 
> Patch 5 sets up a generic API to handle drivers that never bind with
> their devices.
> 
> Patch 6 through 8 update different frameworks to use the new API.
> 
> Thanks,
> Saravana
> 
> Saravana Kannan (8):
>   driver core: fw_devlink: Detect supplier devices that will never be
>     added
>   of: property: Don't add links to absent suppliers
>   driver core: Add fw_devlink.strict kernel param
>   of: property: Add fw_devlink support for optional properties
>   driver core: fw_devlink: Handle suppliers that don't use driver core
>   irqdomain: Mark fwnodes when their irqdomain is added/removed
>   PM: domains: Mark fwnodes when their powerdomain is added/removed
>   clk: Mark fwnodes when their clock provider is added/removed
> 
>  .../admin-guide/kernel-parameters.txt         |  5 ++
>  drivers/base/core.c                           | 58 ++++++++++++++++++-
>  drivers/base/power/domain.c                   |  2 +
>  drivers/clk/clk.c                             |  3 +
>  drivers/of/property.c                         | 16 +++--
>  include/linux/fwnode.h                        | 20 ++++++-
>  kernel/irq/irqdomain.c                        |  2 +
>  7 files changed, 98 insertions(+), 8 deletions(-)
> 

Even with this patch set applied, sama5d2_xplained can not boot.
Patch at [1] makes sama5d2_xplained boot again. Stephen applied it
to clk-next.

Cheers,
ta

[1] https://lore.kernel.org/lkml/20210203154332.470587-1-tudor.ambarus@microchip.com/

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 4/8] of: property: Add fw_devlink support for optional properties
  2021-02-09 21:54       ` Saravana Kannan
@ 2021-02-10  8:25         ` Geert Uytterhoeven
  0 siblings, 0 replies; 55+ messages in thread
From: Geert Uytterhoeven @ 2021-02-10  8:25 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Rob Herring, Jonathan Corbet, Greg Kroah-Hartman,
	Rafael J. Wysocki, Kevin Hilman, Ulf Hansson, Len Brown,
	Len Brown, Pavel Machek, Michael Turquette, Stephen Boyd,
	Frank Rowand, Marc Zyngier, Thomas Gleixner,
	Linux Doc Mailing List, LKML, Linux PM, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Android Kernel Team,
	Joerg Roedel, Linux IOMMU

Hi Saravana,

CC iommu

On Tue, Feb 9, 2021 at 10:55 PM Saravana Kannan <saravanak@google.com> wrote:
> On Tue, Feb 9, 2021 at 1:33 PM Rob Herring <robh@kernel.org> wrote:
> > On Fri, Feb 05, 2021 at 02:26:40PM -0800, Saravana Kannan wrote:
> > > Not all DT bindings are mandatory bindings. Add support for optional DT
> > > bindings and mark iommus, iommu-map, dmas as optional DT bindings.
> >
> > I don't think we can say these are optional or not. It's got to be a
> > driver decision somehow.
>
> Right, so maybe the word "optional" isn't a good name for it. I can
> change that if you want.
>
> The point being, fw_devlink can't block the probe of this driver based
> on iommu property. We let the driver decide if it wants to
> -EPROBE_DEFER or not or however it wants to handle this.

The driver cannot make that decision, cfr. below.

> > For example, if IOMMU is optional, what happens with this sequence:
> >
> > driver probes without IOMMU
> > driver calls dma_map_?()
> > IOMMU driver probes
> > h/w accesses DMA buffer --> BOOM!

Does it really behave that way? Or does it continue without IOMMU?

> Right. But how is this really related to fw_devlink? AFAICT, this is
> an issue even today. If the driver needs the IOMMU, then it needs to
> make sure the IOMMU has probed? What am I missing?

Individual I/O (IOMMU slave) drivers are completely unaware of the
presence or absence of an IOMMU; they just use the DMA API, which is the
same regardless of an IOMMU being used or not.
While for GPIO/IRQ/CLK/DMA/... have request/get_{gpio,irq,clk,dma,...}
APIs for a driver to get a reference, which can return -EPROBE_DEFER, no
such thing exists for IOMMUs.  This is handled by the IOMMU core
instead.

Using the IOMMU or not is more like a system policy decision.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-10  8:19   ` Tudor.Ambarus
@ 2021-02-10  8:54     ` Saravana Kannan
  2021-02-10 10:02       ` Tudor.Ambarus
  0 siblings, 1 reply; 55+ messages in thread
From: Saravana Kannan @ 2021-02-10  8:54 UTC (permalink / raw)
  To: Tudor Ambarus
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Brown, Len, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, Linux Doc Mailing List, LKML,
	Linux PM, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Geert Uytterhoeven,
	Android Kernel Team

On Wed, Feb 10, 2021 at 12:19 AM <Tudor.Ambarus@microchip.com> wrote:
>
> Hi, Saravana,
>
> On 2/6/21 12:26 AM, Saravana Kannan wrote:
> > There are a lot of devices/drivers where they never have a struct device
> > created for them or the driver initializes the hardware without ever
> > binding to the struct device.
> >
> > This series is intended to avoid any boot regressions due to such
> > devices/drivers when fw_devlink=on and also address the handling of
> > optional suppliers.
> >
> > Patch 1 and 2 addresses the issue of firmware nodes that look like
> > they'll have struct devices created for them, but will never actually
> > have struct devices added for them. For example, DT nodes with a
> > compatible property that don't have devices added for them.
> >
> > Patch 3 and 4 allow for handling optional DT bindings.
> >
> > Patch 5 sets up a generic API to handle drivers that never bind with
> > their devices.
> >
> > Patch 6 through 8 update different frameworks to use the new API.
> >
> > Thanks,
> > Saravana
> >
> > Saravana Kannan (8):
> >   driver core: fw_devlink: Detect supplier devices that will never be
> >     added
> >   of: property: Don't add links to absent suppliers
> >   driver core: Add fw_devlink.strict kernel param
> >   of: property: Add fw_devlink support for optional properties
> >   driver core: fw_devlink: Handle suppliers that don't use driver core
> >   irqdomain: Mark fwnodes when their irqdomain is added/removed
> >   PM: domains: Mark fwnodes when their powerdomain is added/removed
> >   clk: Mark fwnodes when their clock provider is added/removed
> >
> >  .../admin-guide/kernel-parameters.txt         |  5 ++
> >  drivers/base/core.c                           | 58 ++++++++++++++++++-
> >  drivers/base/power/domain.c                   |  2 +
> >  drivers/clk/clk.c                             |  3 +
> >  drivers/of/property.c                         | 16 +++--
> >  include/linux/fwnode.h                        | 20 ++++++-
> >  kernel/irq/irqdomain.c                        |  2 +
> >  7 files changed, 98 insertions(+), 8 deletions(-)
> >
>
> Even with this patch set applied, sama5d2_xplained can not boot.
> Patch at [1] makes sama5d2_xplained boot again. Stephen applied it
> to clk-next.

I'm glad you won't actually have any boot issues in 5.12, but the fact
you need [1] with this series doesn't make a lot of sense to me
because:

1. The FWNODE_FLAG_INITIALIZED flag will be set for the clock fwnode
in question way before any consumer devices are added.
2. Any consumer device added after (1) will stop trying to link to the
clock device.

Are you somehow adding a consumer to the clock fwnode before (1)?

Can you try this patch without your clk fix? I was trying to avoid
looping through a list, but looks like your case might somehow need
it?

-Saravana

+++ b/drivers/base/core.c
@@ -943,6 +943,31 @@ static void device_links_missing_supplier(struct
device *dev)
        }
 }

+static int fw_devlink_check_suppliers(struct device *dev)
+{
+       struct fwnode_link *link;
+       int ret = 0;
+
+       if (!dev->fwnode ||fw_devlink_is_permissive())
+               return 0;
+
+       /*
+        * Device waiting for supplier to become available is not allowed to
+        * probe.
+        */
+       mutex_lock(&fwnode_link_lock);
+       list_for_each_entry(link, &dev->fwnode->suppliers, c_hook) {
+               if (link->supplier->flags & FWNODE_FLAG_INITIALIZED)
+                       continue;
+
+               ret = -EPROBE_DEFER;
+               break;
+       }
+       mutex_unlock(&fwnode_link_lock);
+
+       return ret;
+}
+
 /**
  * device_links_check_suppliers - Check presence of supplier drivers.
  * @dev: Consumer device.
@@ -964,21 +989,13 @@ int device_links_check_suppliers(struct device *dev)
        struct device_link *link;
        int ret = 0;

-       /*
-        * Device waiting for supplier to become available is not allowed to
-        * probe.
-        */
-       mutex_lock(&fwnode_link_lock);
-       if (dev->fwnode && !list_empty(&dev->fwnode->suppliers) &&
-           !fw_devlink_is_permissive()) {
+       if (fw_devlink_check_suppliers(dev)) {
                dev_dbg(dev, "probe deferral - wait for supplier %pfwP\n",
                        list_first_entry(&dev->fwnode->suppliers,
                        struct fwnode_link,
                        c_hook)->supplier);
-               mutex_unlock(&fwnode_link_lock);
                return -EPROBE_DEFER;
        }
-       mutex_unlock(&fwnode_link_lock);

        device_links_write_lock();



>
> Cheers,
> ta
>
> [1] https://lore.kernel.org/lkml/20210203154332.470587-1-tudor.ambarus@microchip.com/

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-10  8:54     ` Saravana Kannan
@ 2021-02-10 10:02       ` Tudor.Ambarus
  2021-02-10 19:14         ` Saravana Kannan
  0 siblings, 1 reply; 55+ messages in thread
From: Tudor.Ambarus @ 2021-02-10 10:02 UTC (permalink / raw)
  To: saravanak
  Cc: corbet, gregkh, rafael, khilman, ulf.hansson, len.brown, lenb,
	pavel, mturquette, sboyd, robh+dt, frowand.list, maz, tglx,
	linux-doc, linux-kernel, linux-pm, linux-clk, devicetree,
	linux-acpi, m.szyprowski, geert, kernel-team

On 2/10/21 10:54 AM, Saravana Kannan wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
> 
> On Wed, Feb 10, 2021 at 12:19 AM <Tudor.Ambarus@microchip.com> wrote:
>>
>> Hi, Saravana,
>>
>> On 2/6/21 12:26 AM, Saravana Kannan wrote:
>>> There are a lot of devices/drivers where they never have a struct device
>>> created for them or the driver initializes the hardware without ever
>>> binding to the struct device.
>>>
>>> This series is intended to avoid any boot regressions due to such
>>> devices/drivers when fw_devlink=on and also address the handling of
>>> optional suppliers.
>>>
>>> Patch 1 and 2 addresses the issue of firmware nodes that look like
>>> they'll have struct devices created for them, but will never actually
>>> have struct devices added for them. For example, DT nodes with a
>>> compatible property that don't have devices added for them.
>>>
>>> Patch 3 and 4 allow for handling optional DT bindings.
>>>
>>> Patch 5 sets up a generic API to handle drivers that never bind with
>>> their devices.
>>>
>>> Patch 6 through 8 update different frameworks to use the new API.
>>>
>>> Thanks,
>>> Saravana
>>>
>>> Saravana Kannan (8):
>>>   driver core: fw_devlink: Detect supplier devices that will never be
>>>     added
>>>   of: property: Don't add links to absent suppliers
>>>   driver core: Add fw_devlink.strict kernel param
>>>   of: property: Add fw_devlink support for optional properties
>>>   driver core: fw_devlink: Handle suppliers that don't use driver core
>>>   irqdomain: Mark fwnodes when their irqdomain is added/removed
>>>   PM: domains: Mark fwnodes when their powerdomain is added/removed
>>>   clk: Mark fwnodes when their clock provider is added/removed
>>>
>>>  .../admin-guide/kernel-parameters.txt         |  5 ++
>>>  drivers/base/core.c                           | 58 ++++++++++++++++++-
>>>  drivers/base/power/domain.c                   |  2 +
>>>  drivers/clk/clk.c                             |  3 +
>>>  drivers/of/property.c                         | 16 +++--
>>>  include/linux/fwnode.h                        | 20 ++++++-
>>>  kernel/irq/irqdomain.c                        |  2 +
>>>  7 files changed, 98 insertions(+), 8 deletions(-)
>>>
>>
>> Even with this patch set applied, sama5d2_xplained can not boot.
>> Patch at [1] makes sama5d2_xplained boot again. Stephen applied it
>> to clk-next.
> 
> I'm glad you won't actually have any boot issues in 5.12, but the fact
> you need [1] with this series doesn't make a lot of sense to me
> because:
> 
> 1. The FWNODE_FLAG_INITIALIZED flag will be set for the clock fwnode
> in question way before any consumer devices are added.

Looks like in my case FWNODE_FLAG_INITIALIZED is not set, because
drivers/clk/at91/sama5d2.c uses of_clk_add_hw_provider().

> 2. Any consumer device added after (1) will stop trying to link to the
> clock device.
> 
> Are you somehow adding a consumer to the clock fwnode before (1)?
> 
> Can you try this patch without your clk fix? I was trying to avoid
> looping through a list, but looks like your case might somehow need
> it?
> 

I tried it, didn't solve my boot problem. The following patch makes the
sama5d2_xplained boot again, even without the patch from [1]:

diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
index 27ff90eacb1f..9370e4dfecae 100644
--- a/drivers/clk/clk.c
+++ b/drivers/clk/clk.c
@@ -4594,6 +4594,8 @@ int of_clk_add_hw_provider(struct device_node *np,
        if (ret < 0)
                of_clk_del_provider(np);
 
+       fwnode_dev_initialized(&np->fwnode, true);
+
        return ret;
 }
 EXPORT_SYMBOL_GPL(of_clk_add_hw_provider);

Cheers,
ta

> -Saravana
> 
> +++ b/drivers/base/core.c
> @@ -943,6 +943,31 @@ static void device_links_missing_supplier(struct
> device *dev)
>         }
>  }
> 
> +static int fw_devlink_check_suppliers(struct device *dev)
> +{
> +       struct fwnode_link *link;
> +       int ret = 0;
> +
> +       if (!dev->fwnode ||fw_devlink_is_permissive())
> +               return 0;
> +
> +       /*
> +        * Device waiting for supplier to become available is not allowed to
> +        * probe.
> +        */
> +       mutex_lock(&fwnode_link_lock);
> +       list_for_each_entry(link, &dev->fwnode->suppliers, c_hook) {
> +               if (link->supplier->flags & FWNODE_FLAG_INITIALIZED)
> +                       continue;
> +
> +               ret = -EPROBE_DEFER;
> +               break;
> +       }
> +       mutex_unlock(&fwnode_link_lock);
> +
> +       return ret;
> +}
> +
>  /**
>   * device_links_check_suppliers - Check presence of supplier drivers.
>   * @dev: Consumer device.
> @@ -964,21 +989,13 @@ int device_links_check_suppliers(struct device *dev)
>         struct device_link *link;
>         int ret = 0;
> 
> -       /*
> -        * Device waiting for supplier to become available is not allowed to
> -        * probe.
> -        */
> -       mutex_lock(&fwnode_link_lock);
> -       if (dev->fwnode && !list_empty(&dev->fwnode->suppliers) &&
> -           !fw_devlink_is_permissive()) {
> +       if (fw_devlink_check_suppliers(dev)) {
>                 dev_dbg(dev, "probe deferral - wait for supplier %pfwP\n",
>                         list_first_entry(&dev->fwnode->suppliers,
>                         struct fwnode_link,
>                         c_hook)->supplier);
> -               mutex_unlock(&fwnode_link_lock);
>                 return -EPROBE_DEFER;
>         }
> -       mutex_unlock(&fwnode_link_lock);
> 
>         device_links_write_lock();
> 
> 
> 
>>
>> Cheers,
>> ta
>>
>> [1] https://lore.kernel.org/lkml/20210203154332.470587-1-tudor.ambarus@microchip.com/


^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH] clk: Mark fwnodes when their clock provider is added
  2021-02-05 22:26   ` [PATCH v4 8/8] clk: Mark fwnodes when their clock provider " Saravana Kannan
  2021-02-08 15:38     ` Rob Herring
@ 2021-02-10 11:44     ` Tudor Ambarus
  2021-02-10 11:44       ` Tudor Ambarus
                         ` (2 more replies)
       [not found]     ` <161317679292.1254594.15797939257637374295@swboyd.mtv.corp.google.com>
  2 siblings, 3 replies; 55+ messages in thread
From: Tudor Ambarus @ 2021-02-10 11:44 UTC (permalink / raw)
  To: corbet, gregkh, rafael, khilman, ulf.hansson, len.brown, lenb,
	pavel, mturquette, sboyd, robh+dt, frowand.list, maz, tglx,
	saravanak
  Cc: nicolas.ferre, claudiu.beznea, linux-doc, linux-kernel, linux-pm,
	linux-clk, devicetree, linux-acpi, m.szyprowski, geert,
	kernel-team, Tudor Ambarus

This is a follow-up for:
commit 3c9ea42802a1 ("clk: Mark fwnodes when their clock provider is added/removed")

The above commit updated the deprecated of_clk_add_provider(),
but missed to update the preferred of_clk_add_hw_provider().
Update it now.

Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
---
 drivers/clk/clk.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
index 27ff90eacb1f..9370e4dfecae 100644
--- a/drivers/clk/clk.c
+++ b/drivers/clk/clk.c
@@ -4594,6 +4594,8 @@ int of_clk_add_hw_provider(struct device_node *np,
 	if (ret < 0)
 		of_clk_del_provider(np);
 
+	fwnode_dev_initialized(&np->fwnode, true);
+
 	return ret;
 }
 EXPORT_SYMBOL_GPL(of_clk_add_hw_provider);
-- 
2.25.1


^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH] clk: Mark fwnodes when their clock provider is added
  2021-02-10 11:44     ` [PATCH] clk: Mark fwnodes when their clock provider is added Tudor Ambarus
@ 2021-02-10 11:44       ` Tudor Ambarus
  2021-02-11 13:00         ` Greg KH
                           ` (2 more replies)
  2021-02-10 19:13       ` Saravana Kannan
  2021-03-30 15:42       ` Guenter Roeck
  2 siblings, 3 replies; 55+ messages in thread
From: Tudor Ambarus @ 2021-02-10 11:44 UTC (permalink / raw)
  To: corbet, gregkh, rafael, khilman, ulf.hansson, len.brown, lenb,
	pavel, mturquette, sboyd, robh+dt, frowand.list, maz, tglx,
	saravanak
  Cc: nicolas.ferre, claudiu.beznea, linux-doc, linux-kernel, linux-pm,
	linux-clk, devicetree, linux-acpi, m.szyprowski, geert,
	kernel-team, Tudor Ambarus

This is a follow-up for:
commit 3c9ea42802a1 ("clk: Mark fwnodes when their clock provider is added/removed")

The above commit updated the deprecated of_clk_add_provider(),
but missed to update the preferred of_clk_add_hw_provider().
Update it now.

Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
---
 drivers/clk/clk.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
index 27ff90eacb1f..9370e4dfecae 100644
--- a/drivers/clk/clk.c
+++ b/drivers/clk/clk.c
@@ -4594,6 +4594,8 @@ int of_clk_add_hw_provider(struct device_node *np,
 	if (ret < 0)
 		of_clk_del_provider(np);
 
+	fwnode_dev_initialized(&np->fwnode, true);
+
 	return ret;
 }
 EXPORT_SYMBOL_GPL(of_clk_add_hw_provider);
-- 
2.25.1


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] clk: Mark fwnodes when their clock provider is added
  2021-02-10 11:44     ` [PATCH] clk: Mark fwnodes when their clock provider is added Tudor Ambarus
  2021-02-10 11:44       ` Tudor Ambarus
@ 2021-02-10 19:13       ` Saravana Kannan
  2021-03-30 15:42       ` Guenter Roeck
  2 siblings, 0 replies; 55+ messages in thread
From: Saravana Kannan @ 2021-02-10 19:13 UTC (permalink / raw)
  To: Tudor Ambarus
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Brown, Len, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, Nicolas Ferre, Claudiu Beznea,
	Linux Doc Mailing List, LKML, Linux PM, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Geert Uytterhoeven,
	Android Kernel Team

On Wed, Feb 10, 2021 at 3:44 AM Tudor Ambarus
<tudor.ambarus@microchip.com> wrote:
>
> This is a follow-up for:
> commit 3c9ea42802a1 ("clk: Mark fwnodes when their clock provider is added/removed")
>
> The above commit updated the deprecated of_clk_add_provider(),
> but missed to update the preferred of_clk_add_hw_provider().
> Update it now.

Thanks Tudor! Good catch!

I checked to make sure the deregistration path undoes this one. So, it
looks good to me.

Reviewed-by: Saravana Kannan <saravanak@google.com>

-Saravana

>
> Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
> ---
>  drivers/clk/clk.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> index 27ff90eacb1f..9370e4dfecae 100644
> --- a/drivers/clk/clk.c
> +++ b/drivers/clk/clk.c
> @@ -4594,6 +4594,8 @@ int of_clk_add_hw_provider(struct device_node *np,
>         if (ret < 0)
>                 of_clk_del_provider(np);
>
> +       fwnode_dev_initialized(&np->fwnode, true);
> +
>         return ret;
>  }
>  EXPORT_SYMBOL_GPL(of_clk_add_hw_provider);
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-10 10:02       ` Tudor.Ambarus
@ 2021-02-10 19:14         ` Saravana Kannan
  0 siblings, 0 replies; 55+ messages in thread
From: Saravana Kannan @ 2021-02-10 19:14 UTC (permalink / raw)
  To: Tudor Ambarus
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Brown, Len, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, Linux Doc Mailing List, LKML,
	Linux PM, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Geert Uytterhoeven,
	Android Kernel Team

On Wed, Feb 10, 2021 at 2:02 AM <Tudor.Ambarus@microchip.com> wrote:
>
> On 2/10/21 10:54 AM, Saravana Kannan wrote:
> > EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
> >
> > On Wed, Feb 10, 2021 at 12:19 AM <Tudor.Ambarus@microchip.com> wrote:
> >>
> >> Hi, Saravana,
> >>
> >> On 2/6/21 12:26 AM, Saravana Kannan wrote:
> >>> There are a lot of devices/drivers where they never have a struct device
> >>> created for them or the driver initializes the hardware without ever
> >>> binding to the struct device.
> >>>
> >>> This series is intended to avoid any boot regressions due to such
> >>> devices/drivers when fw_devlink=on and also address the handling of
> >>> optional suppliers.
> >>>
> >>> Patch 1 and 2 addresses the issue of firmware nodes that look like
> >>> they'll have struct devices created for them, but will never actually
> >>> have struct devices added for them. For example, DT nodes with a
> >>> compatible property that don't have devices added for them.
> >>>
> >>> Patch 3 and 4 allow for handling optional DT bindings.
> >>>
> >>> Patch 5 sets up a generic API to handle drivers that never bind with
> >>> their devices.
> >>>
> >>> Patch 6 through 8 update different frameworks to use the new API.
> >>>
> >>> Thanks,
> >>> Saravana
> >>>
> >>> Saravana Kannan (8):
> >>>   driver core: fw_devlink: Detect supplier devices that will never be
> >>>     added
> >>>   of: property: Don't add links to absent suppliers
> >>>   driver core: Add fw_devlink.strict kernel param
> >>>   of: property: Add fw_devlink support for optional properties
> >>>   driver core: fw_devlink: Handle suppliers that don't use driver core
> >>>   irqdomain: Mark fwnodes when their irqdomain is added/removed
> >>>   PM: domains: Mark fwnodes when their powerdomain is added/removed
> >>>   clk: Mark fwnodes when their clock provider is added/removed
> >>>
> >>>  .../admin-guide/kernel-parameters.txt         |  5 ++
> >>>  drivers/base/core.c                           | 58 ++++++++++++++++++-
> >>>  drivers/base/power/domain.c                   |  2 +
> >>>  drivers/clk/clk.c                             |  3 +
> >>>  drivers/of/property.c                         | 16 +++--
> >>>  include/linux/fwnode.h                        | 20 ++++++-
> >>>  kernel/irq/irqdomain.c                        |  2 +
> >>>  7 files changed, 98 insertions(+), 8 deletions(-)
> >>>
> >>
> >> Even with this patch set applied, sama5d2_xplained can not boot.
> >> Patch at [1] makes sama5d2_xplained boot again. Stephen applied it
> >> to clk-next.
> >
> > I'm glad you won't actually have any boot issues in 5.12, but the fact
> > you need [1] with this series doesn't make a lot of sense to me
> > because:
> >
> > 1. The FWNODE_FLAG_INITIALIZED flag will be set for the clock fwnode
> > in question way before any consumer devices are added.
>
> Looks like in my case FWNODE_FLAG_INITIALIZED is not set, because
> drivers/clk/at91/sama5d2.c uses of_clk_add_hw_provider().

Ah, that explains it.

> > 2. Any consumer device added after (1) will stop trying to link to the
> > clock device.
> >
> > Are you somehow adding a consumer to the clock fwnode before (1)?
> >
> > Can you try this patch without your clk fix? I was trying to avoid
> > looping through a list, but looks like your case might somehow need
> > it?
> >
>
> I tried it, didn't solve my boot problem.

Thanks! I should stop coding past midnight!

> The following patch makes the
> sama5d2_xplained boot again, even without the patch from [1]:

Great! I gave a reviewed-by.

-Saravana

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-05 22:26 ` [PATCH v4 0/8] Make fw_devlink=on more forgiving Saravana Kannan
                     ` (11 preceding siblings ...)
  2021-02-10  8:19   ` Tudor.Ambarus
@ 2021-02-11 13:00   ` Geert Uytterhoeven
  2021-02-11 13:05     ` Geert Uytterhoeven
                       ` (2 more replies)
  12 siblings, 3 replies; 55+ messages in thread
From: Geert Uytterhoeven @ 2021-02-11 13:00 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Linux PM list, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Android Kernel Team,
	Linux-Renesas

Hi Saravana,

On Fri, Feb 5, 2021 at 11:26 PM Saravana Kannan <saravanak@google.com> wrote:
> There are a lot of devices/drivers where they never have a struct device
> created for them or the driver initializes the hardware without ever
> binding to the struct device.
>
> This series is intended to avoid any boot regressions due to such
> devices/drivers when fw_devlink=on and also address the handling of
> optional suppliers.
>
> Patch 1 and 2 addresses the issue of firmware nodes that look like
> they'll have struct devices created for them, but will never actually
> have struct devices added for them. For example, DT nodes with a
> compatible property that don't have devices added for them.
>
> Patch 3 and 4 allow for handling optional DT bindings.
>
> Patch 5 sets up a generic API to handle drivers that never bind with
> their devices.
>
> Patch 6 through 8 update different frameworks to use the new API.
>
> Thanks,
> Saravana
>
> Saravana Kannan (8):
>   driver core: fw_devlink: Detect supplier devices that will never be
>     added
>   of: property: Don't add links to absent suppliers
>   driver core: Add fw_devlink.strict kernel param
>   of: property: Add fw_devlink support for optional properties
>   driver core: fw_devlink: Handle suppliers that don't use driver core
>   irqdomain: Mark fwnodes when their irqdomain is added/removed
>   PM: domains: Mark fwnodes when their powerdomain is added/removed
>   clk: Mark fwnodes when their clock provider is added/removed

Thanks for your series, which is now part of driver-core-next.
I gave driver-core-next + [1] a try on various Renesas boards.
Test results are below.
In general, the result looks much better than before.

[1] - https://lore.kernel.org/lkml/20210210114435.122242-1-tudor.ambarus@microchip.com/

  1. R-Car Gen2 (Koelsch), R-Car Gen3 (Salvator-X(S), Ebisu).

      - Commit 2dfc564bda4a31bc ("soc: renesas: rcar-sysc: Mark device
        node OF_POPULATED after init") is no longer needed (but already
        queued for v5.12 anyway)

      - Some devices are reprobed, despite their drivers returning
        a real error code, and not -EPROBE_DEFER:

            renesas_wdt e6020000.watchdog: Watchdog blacklisted on r8a7791 ES1.*
            (rwdt_probe() returns -ENODEV)

            sh-pfc e6060000.pinctrl: pin GP_7_23 already requested by
ee090000.pci; cannot claim for e6590000.usb
            sh-pfc e6060000.pinctrl: pin-247 (e6590000.usb) status -22
            sh-pfc e6060000.pinctrl: could not request pin 247
(GP_7_23) from group usb0  on device sh-pfc
            renesas_usbhs e6590000.usb: Error applying setting,
reverse things back
            renesas_usbhs: probe of e6590000.usb failed with error -22

            rcar-pcie fe000000.pcie: host bridge /soc/pcie@fe000000 ranges:
            rcar-pcie fe000000.pcie:       IO
0x00fe100000..0x00fe1fffff -> 0x0000000000
            rcar-pcie fe000000.pcie:      MEM
0x00fe200000..0x00fe3fffff -> 0x00fe200000
            rcar-pcie fe000000.pcie:      MEM
0x0030000000..0x0037ffffff -> 0x0030000000
            rcar-pcie fe000000.pcie:      MEM
0x0038000000..0x003fffffff -> 0x0038000000
            rcar-pcie fe000000.pcie:   IB MEM
0x0040000000..0x00bfffffff -> 0x0040000000
            rcar-pcie fe000000.pcie:   IB MEM
0x0200000000..0x02ffffffff -> 0x0200000000
            rcar-pcie fe000000.pcie: PCIe link down
            (rcar_pcie_probe() returns -ENODEV)

            xhci-hcd ee000000.usb: xHCI Host Controller
            xhci-hcd ee000000.usb: new USB bus registered, assigned bus number 7
            xhci-hcd ee000000.usb: Direct firmware load for
r8a779x_usb3_v3.dlmem failed with error -2
            xhci-hcd ee000000.usb: can't setup: -2
            xhci-hcd ee000000.usb: USB bus 7 deregistered
            xhci-hcd: probe of ee000000.usb failed with error -2

      - The PCI reprobing leads to a memory leak, for which I've sent a fix
        "[PATCH] PCI: Fix memory leak in pci_register_io_range()"
        https://lore.kernel.org/linux-pci/20210202100332.829047-1-geert+renesas@glider.be/

      - I2C on R-Car Gen3 does not seem to use DMA, according to
        /sys/kernel/debug/dmaengine/summary:

            -dma4chan0    | e66d8000.i2c:tx
            -dma4chan1    | e66d8000.i2c:rx
            -dma5chan0    | e6510000.i2c:tx

      - Disabling CONFIG_IPMMU_VMSA (IOMMU) now works, good!

           ignoring dependency for device, assuming no driver

      - Disabling CONFIG_RCAR_DMAC works for most devices, except for
        sound:

            -rcar_sound ec500000.sound: probed

             ALSA device list:
            -  #0: rcar-sound
            +  No soundcards found.

            # cat  /sys/kernel/debug/devices_deferred
            2-0010
            sound
            ec500000.sound

            platform e6510000.i2c: Linked as a sync state only
consumer to ec500000.sound
            platform ec500000.sound: Linked as a consumer to e6060000.pinctrl
            platform ec500000.sound: Linked as a consumer to
e6150000.clock-controller
            i2c 2-0010: Linked as a consumer to ec500000.sound
            platform ec500000.sound: Linked as a consumer to 2-004f
            cs2000-cp 2-004f: revision - C1
            i2c-rcar e6510000.i2c: probed
            i2c-rcar e6510000.i2c: Dropping the link to ec500000.sound
            i2c 2-0010: probe deferral - supplier ec500000.sound not ready

        With CONFIG_RCAR_DMAC=y, ec500000.sound is probed quite early.

            arch/arm64/boot/dts/renesas/r8a77951-salvator-xs.dts

            ak4613: codec@10 {
                    clocks = <&rcar_sound 3>;

                    port {
                            ak4613_endpoint: endpoint {
                                    remote-endpoint = <&rsnd_endpoint0>;
                            };
                    };
            };

            sound_card: sound {
                    dais = <&rsnd_port0     /* ak4613 */
                            &rsnd_port1     /* HDMI0  */
                            &rsnd_port2>;   /* HDMI1  */
            };

            rcar_sound: sound@ec500000 {
                    ports {
                            rsnd_port0: port@0 {
                                    rsnd_endpoint0: endpoint {
                                            remote-endpoint =
<&ak4613_endpoint>;
                                    }
                            }
                    }
            };


  2. SH/R-Mobile AG5 (kzm9g), APE6 (ape6evm), A1 (armadillo800-eva)

      - "PATCH] soc: renesas: rmobile-sysc: Set OF_POPULATED and absorb
        reset handling" is no longer needed
        https://lore.kernel.org/linux-arm-kernel/20210205133319.1921108-1-geert+renesas@glider.be/

      - On R-Mobile A1, I get a BUG and a memory leak:

            BUG: spinlock bad magic on CPU#0, swapper/1
             lock: lcdc0_device+0x10c/0x308, .magic: 00000000, .owner:
<none>/-1, .owner_cpu: 0
            CPU: 0 PID: 1 Comm: swapper Not tainted
5.11.0-rc5-armadillo-00032-gf0a85c26907e #266
            Hardware name: Generic R8A7740 (Flattened Device Tree)
            [<c010c3c8>] (unwind_backtrace) from [<c010a49c>]
(show_stack+0x10/0x14)
            [<c010a49c>] (show_stack) from [<c0159534>]
(do_raw_spin_lock+0x20/0x94)
            [<c0159534>] (do_raw_spin_lock) from [<c04089d8>]
(dev_pm_get_subsys_data+0x30/0xa0)
            [<c04089d8>] (dev_pm_get_subsys_data) from [<c0413698>]
(genpd_add_device+0x34/0x1c0)
            [<c0413698>] (genpd_add_device) from [<c041389c>]
(of_genpd_add_device+0x34/0x4c)
            [<c041389c>] (of_genpd_add_device) from [<c0a1e9bc>]
(board_staging_register_device+0xf8/0x118)
            [<c0a1e9bc>] (board_staging_register_device) from
[<c0a1ea00>] (board_staging_register_devices+0x24/0x28)
            [<c0a1ea00>] (board_staging_register_devices) from
[<c0a1ea30>] (runtime_board_check+0x2c/0x40)
            [<c0a1ea30>] (runtime_board_check) from [<c0101fac>]
(do_one_initcall+0xe0/0x278)
            [<c0101fac>] (do_one_initcall) from [<c0a01034>]
(kernel_init_freeable+0x174/0x1c0)
            [<c0a01034>] (kernel_init_freeable) from [<c05fd568>]
(kernel_init+0x8/0x118)
            [<c05fd568>] (kernel_init) from [<c010011c>]
(ret_from_fork+0x14/0x38)
            Exception stack(0xc19c9fb0 to 0xc19c9ff8)
            9fa0:                                     00000000
00000000 00000000 00000000
            9fc0: 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000
            9fe0: 00000000 00000000 00000000 00000000 00000013 00000000

            unreferenced object 0xc4134e00 (size 512):
              comm "swapper", pid 1, jiffies 4294937296 (age 3541.930s)
              hex dump (first 32 bytes):
                00 4e 13 c4 00 4e 13 c4 ff ff ff 7f ff ff ff 7f
.N...N..........
                ff ff ff 7f 02 00 00 00 00 5f 13 c4 1c 4e 13 c4
........._...N..
              backtrace:
                [<de1a3c34>] dev_pm_qos_constraints_allocate+0x10/0xcc
                [<d21cf6e4>] dev_pm_qos_add_notifier+0x6c/0xd0
                [<e04bbc90>] genpd_add_device+0x178/0x1c0
                [<95067303>] of_genpd_add_device+0x34/0x4c
                [<c334b97a>] board_staging_register_device+0xf8/0x118
                [<01bd495a>] board_staging_register_devices+0x24/0x28
                [<fb25a5d8>] runtime_board_check+0x2c/0x40
                [<65aed679>] do_one_initcall+0xe0/0x278
                [<97e3f4f7>] kernel_init_freeable+0x174/0x1c0
                [<63c8fed0>] kernel_init+0x8/0x118
                [<f704d96c>] ret_from_fork+0x14/0x38
                [<00000000>] 0x0

  3. RZ/A1 and RZ/A2: No issues.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] clk: Mark fwnodes when their clock provider is added
  2021-02-10 11:44       ` Tudor Ambarus
@ 2021-02-11 13:00         ` Greg KH
  2021-02-13  0:37           ` Stephen Boyd
       [not found]         ` <CGME20210325133159eucas1p297b769beb681743fb32d362a86cc6e3e@eucas1p2.samsung.com>
  2021-04-21  3:26         ` Guenter Roeck
  2 siblings, 1 reply; 55+ messages in thread
From: Greg KH @ 2021-02-11 13:00 UTC (permalink / raw)
  To: Tudor Ambarus
  Cc: corbet, rafael, khilman, ulf.hansson, len.brown, lenb, pavel,
	mturquette, sboyd, robh+dt, frowand.list, maz, tglx, saravanak,
	nicolas.ferre, claudiu.beznea, linux-doc, linux-kernel, linux-pm,
	linux-clk, devicetree, linux-acpi, m.szyprowski, geert,
	kernel-team

On Wed, Feb 10, 2021 at 01:44:35PM +0200, Tudor Ambarus wrote:
> This is a follow-up for:
> commit 3c9ea42802a1 ("clk: Mark fwnodes when their clock provider is added/removed")
> 
> The above commit updated the deprecated of_clk_add_provider(),
> but missed to update the preferred of_clk_add_hw_provider().
> Update it now.
> 
> Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
> ---
>  drivers/clk/clk.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> index 27ff90eacb1f..9370e4dfecae 100644
> --- a/drivers/clk/clk.c
> +++ b/drivers/clk/clk.c
> @@ -4594,6 +4594,8 @@ int of_clk_add_hw_provider(struct device_node *np,
>  	if (ret < 0)
>  		of_clk_del_provider(np);
>  
> +	fwnode_dev_initialized(&np->fwnode, true);
> +
>  	return ret;
>  }
>  EXPORT_SYMBOL_GPL(of_clk_add_hw_provider);
> -- 
> 2.25.1
> 

Any objection for me taking this in my tree as well?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-11 13:00   ` Geert Uytterhoeven
@ 2021-02-11 13:05     ` Geert Uytterhoeven
  2021-02-12  2:59     ` Saravana Kannan
  2021-02-15 11:19     ` Geert Uytterhoeven
  2 siblings, 0 replies; 55+ messages in thread
From: Geert Uytterhoeven @ 2021-02-11 13:05 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Linux PM list, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Android Kernel Team,
	Linux-Renesas

Hi Saravana,

On Thu, Feb 11, 2021 at 2:00 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>       - Disabling CONFIG_RCAR_DMAC works for most devices, except for
>         sound:

Please ignore.  DMA is mandatory for sound, and thus fails in the same
way on v5.11-rc5.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-11 13:00   ` Geert Uytterhoeven
  2021-02-11 13:05     ` Geert Uytterhoeven
@ 2021-02-12  2:59     ` Saravana Kannan
  2021-02-12  8:14       ` Geert Uytterhoeven
                         ` (2 more replies)
  2021-02-15 11:19     ` Geert Uytterhoeven
  2 siblings, 3 replies; 55+ messages in thread
From: Saravana Kannan @ 2021-02-12  2:59 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Linux PM list, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Android Kernel Team,
	Linux-Renesas

On Thu, Feb 11, 2021 at 5:00 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Hi Saravana,
>
> On Fri, Feb 5, 2021 at 11:26 PM Saravana Kannan <saravanak@google.com> wrote:
> > There are a lot of devices/drivers where they never have a struct device
> > created for them or the driver initializes the hardware without ever
> > binding to the struct device.
> >
> > This series is intended to avoid any boot regressions due to such
> > devices/drivers when fw_devlink=on and also address the handling of
> > optional suppliers.
> >
> > Patch 1 and 2 addresses the issue of firmware nodes that look like
> > they'll have struct devices created for them, but will never actually
> > have struct devices added for them. For example, DT nodes with a
> > compatible property that don't have devices added for them.
> >
> > Patch 3 and 4 allow for handling optional DT bindings.
> >
> > Patch 5 sets up a generic API to handle drivers that never bind with
> > their devices.
> >
> > Patch 6 through 8 update different frameworks to use the new API.
> >
> > Thanks,
> > Saravana
> >
> > Saravana Kannan (8):
> >   driver core: fw_devlink: Detect supplier devices that will never be
> >     added
> >   of: property: Don't add links to absent suppliers
> >   driver core: Add fw_devlink.strict kernel param
> >   of: property: Add fw_devlink support for optional properties
> >   driver core: fw_devlink: Handle suppliers that don't use driver core
> >   irqdomain: Mark fwnodes when their irqdomain is added/removed
> >   PM: domains: Mark fwnodes when their powerdomain is added/removed
> >   clk: Mark fwnodes when their clock provider is added/removed
>
> Thanks for your series, which is now part of driver-core-next.
> I gave driver-core-next + [1] a try on various Renesas boards.

Thanks!

> Test results are below.
> In general, the result looks much better than before.

Ah, good to hear this.

> [1] - https://lore.kernel.org/lkml/20210210114435.122242-1-tudor.ambarus@microchip.com/
>
>   1. R-Car Gen2 (Koelsch), R-Car Gen3 (Salvator-X(S), Ebisu).
>
>       - Commit 2dfc564bda4a31bc ("soc: renesas: rcar-sysc: Mark device
>         node OF_POPULATED after init") is no longer needed (but already
>         queued for v5.12 anyway)

Rob doesn't like the proliferation of OF_POPULATED and we don't need
it anymore, so maybe work it out with him? It's a balance between some
wasted memory (struct device(s)) vs not proliferating OF_POPULATED.

>       - Some devices are reprobed, despite their drivers returning
>         a real error code, and not -EPROBE_DEFER:

Sorry, it's not obvious from the logs below where "reprobing" is
happening. Can you give more pointers please?

Also, thinking more about this, the only way I could see this happen is:
1. Device fails with error that's not -EPROBE_DEFER
2. It somehow gets added to a device link (with AUTOPROBE_CONSUMER
flag) where it's a consumer.
3. The supplier probes and the device gets added to the deferred probe
list again.

But I can't see how this sequence can happen. Device links are created
only when a device is added. And is the supplier isn't added yet, the
consumer wouldn't have probed in the first place.

Other than "annoying waste of time" is this causing any other problems?

>             renesas_wdt e6020000.watchdog: Watchdog blacklisted on r8a7791 ES1.*
>             (rwdt_probe() returns -ENODEV)
>
>             sh-pfc e6060000.pinctrl: pin GP_7_23 already requested by
> ee090000.pci; cannot claim for e6590000.usb
>             sh-pfc e6060000.pinctrl: pin-247 (e6590000.usb) status -22
>             sh-pfc e6060000.pinctrl: could not request pin 247
> (GP_7_23) from group usb0  on device sh-pfc
>             renesas_usbhs e6590000.usb: Error applying setting,
> reverse things back
>             renesas_usbhs: probe of e6590000.usb failed with error -22
>
>             rcar-pcie fe000000.pcie: host bridge /soc/pcie@fe000000 ranges:
>             rcar-pcie fe000000.pcie:       IO
> 0x00fe100000..0x00fe1fffff -> 0x0000000000
>             rcar-pcie fe000000.pcie:      MEM
> 0x00fe200000..0x00fe3fffff -> 0x00fe200000
>             rcar-pcie fe000000.pcie:      MEM
> 0x0030000000..0x0037ffffff -> 0x0030000000
>             rcar-pcie fe000000.pcie:      MEM
> 0x0038000000..0x003fffffff -> 0x0038000000
>             rcar-pcie fe000000.pcie:   IB MEM
> 0x0040000000..0x00bfffffff -> 0x0040000000
>             rcar-pcie fe000000.pcie:   IB MEM
> 0x0200000000..0x02ffffffff -> 0x0200000000
>             rcar-pcie fe000000.pcie: PCIe link down
>             (rcar_pcie_probe() returns -ENODEV)
>
>             xhci-hcd ee000000.usb: xHCI Host Controller
>             xhci-hcd ee000000.usb: new USB bus registered, assigned bus number 7
>             xhci-hcd ee000000.usb: Direct firmware load for
> r8a779x_usb3_v3.dlmem failed with error -2
>             xhci-hcd ee000000.usb: can't setup: -2
>             xhci-hcd ee000000.usb: USB bus 7 deregistered
>             xhci-hcd: probe of ee000000.usb failed with error -2
>
>       - The PCI reprobing leads to a memory leak, for which I've sent a fix
>         "[PATCH] PCI: Fix memory leak in pci_register_io_range()"
>         https://lore.kernel.org/linux-pci/20210202100332.829047-1-geert+renesas@glider.be/

Wrt PCI reprobing,
1. Is this PCI never expected to probe, but it's being reattempted
despite the NOT EPROBE_DEFER error? Or
2. The PCI was deferred probe when it should have probed and then when
it's finally reattemped and it could succeed, we are hitting this mem
leak issue?

I'm basically trying to distinguish between "this stuff should never
be retried" vs "this/it's suppliers got probe deferred with
fw_devlink=on vs but didn't get probe deferred with
fw_devlink=permissive and that's causing issues"

>       - I2C on R-Car Gen3 does not seem to use DMA, according to
>         /sys/kernel/debug/dmaengine/summary:
>
>             -dma4chan0    | e66d8000.i2c:tx
>             -dma4chan1    | e66d8000.i2c:rx
>             -dma5chan0    | e6510000.i2c:tx

I think I need more context on the problem before I can try to fix it.
I'm also very unfamiliar with that file. With fw_devlink=permissive,
I2C was using DMA? If so, the next step is to see if the I2C relative
probe order with DMA is getting changed and if so, why.

>       - Disabling CONFIG_IPMMU_VMSA (IOMMU) now works, good!
>
>            ignoring dependency for device, assuming no driver
>
>       - Disabling CONFIG_RCAR_DMAC works for most devices, except for
>         sound:
>
>             -rcar_sound ec500000.sound: probed
>
>              ALSA device list:
>             -  #0: rcar-sound
>             +  No soundcards found.
>
>             # cat  /sys/kernel/debug/devices_deferred
>             2-0010
>             sound
>             ec500000.sound
>
>             platform e6510000.i2c: Linked as a sync state only
> consumer to ec500000.sound
>             platform ec500000.sound: Linked as a consumer to e6060000.pinctrl
>             platform ec500000.sound: Linked as a consumer to
> e6150000.clock-controller
>             i2c 2-0010: Linked as a consumer to ec500000.sound
>             platform ec500000.sound: Linked as a consumer to 2-004f
>             cs2000-cp 2-004f: revision - C1
>             i2c-rcar e6510000.i2c: probed
>             i2c-rcar e6510000.i2c: Dropping the link to ec500000.sound
>             i2c 2-0010: probe deferral - supplier ec500000.sound not ready
>
>         With CONFIG_RCAR_DMAC=y, ec500000.sound is probed quite early.

I saw your other reply, so I'll ignore this sound/DMA issue.

>
>             arch/arm64/boot/dts/renesas/r8a77951-salvator-xs.dts
>
>             ak4613: codec@10 {
>                     clocks = <&rcar_sound 3>;
>
>                     port {
>                             ak4613_endpoint: endpoint {
>                                     remote-endpoint = <&rsnd_endpoint0>;
>                             };
>                     };
>             };
>
>             sound_card: sound {
>                     dais = <&rsnd_port0     /* ak4613 */
>                             &rsnd_port1     /* HDMI0  */
>                             &rsnd_port2>;   /* HDMI1  */
>             };
>
>             rcar_sound: sound@ec500000 {
>                     ports {
>                             rsnd_port0: port@0 {
>                                     rsnd_endpoint0: endpoint {
>                                             remote-endpoint =
> <&ak4613_endpoint>;
>                                     }
>                             }
>                     }
>             };
>
>
>   2. SH/R-Mobile AG5 (kzm9g), APE6 (ape6evm), A1 (armadillo800-eva)
>
>       - "PATCH] soc: renesas: rmobile-sysc: Set OF_POPULATED and absorb
>         reset handling" is no longer needed
>         https://lore.kernel.org/linux-arm-kernel/20210205133319.1921108-1-geert+renesas@glider.be/

Good to see more evidence that this series is fixing things at a more
generic level.

>       - On R-Mobile A1, I get a BUG and a memory leak:
>
>             BUG: spinlock bad magic on CPU#0, swapper/1
>              lock: lcdc0_device+0x10c/0x308, .magic: 00000000, .owner:
> <none>/-1, .owner_cpu: 0
>             CPU: 0 PID: 1 Comm: swapper Not tainted
> 5.11.0-rc5-armadillo-00032-gf0a85c26907e #266
>             Hardware name: Generic R8A7740 (Flattened Device Tree)
>             [<c010c3c8>] (unwind_backtrace) from [<c010a49c>]
> (show_stack+0x10/0x14)
>             [<c010a49c>] (show_stack) from [<c0159534>]
> (do_raw_spin_lock+0x20/0x94)
>             [<c0159534>] (do_raw_spin_lock) from [<c04089d8>]
> (dev_pm_get_subsys_data+0x30/0xa0)
>             [<c04089d8>] (dev_pm_get_subsys_data) from [<c0413698>]
> (genpd_add_device+0x34/0x1c0)
>             [<c0413698>] (genpd_add_device) from [<c041389c>]
> (of_genpd_add_device+0x34/0x4c)
>             [<c041389c>] (of_genpd_add_device) from [<c0a1e9bc>]
> (board_staging_register_device+0xf8/0x118)
>             [<c0a1e9bc>] (board_staging_register_device) from
> [<c0a1ea00>] (board_staging_register_devices+0x24/0x28)
>             [<c0a1ea00>] (board_staging_register_devices) from
> [<c0a1ea30>] (runtime_board_check+0x2c/0x40)
>             [<c0a1ea30>] (runtime_board_check) from [<c0101fac>]
> (do_one_initcall+0xe0/0x278)
>             [<c0101fac>] (do_one_initcall) from [<c0a01034>]
> (kernel_init_freeable+0x174/0x1c0)
>             [<c0a01034>] (kernel_init_freeable) from [<c05fd568>]
> (kernel_init+0x8/0x118)
>             [<c05fd568>] (kernel_init) from [<c010011c>]
> (ret_from_fork+0x14/0x38)
>             Exception stack(0xc19c9fb0 to 0xc19c9ff8)
>             9fa0:                                     00000000
> 00000000 00000000 00000000
>             9fc0: 00000000 00000000 00000000 00000000 00000000
> 00000000 00000000 00000000
>             9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
>
>             unreferenced object 0xc4134e00 (size 512):
>               comm "swapper", pid 1, jiffies 4294937296 (age 3541.930s)
>               hex dump (first 32 bytes):
>                 00 4e 13 c4 00 4e 13 c4 ff ff ff 7f ff ff ff 7f
> .N...N..........
>                 ff ff ff 7f 02 00 00 00 00 5f 13 c4 1c 4e 13 c4
> ........._...N..
>               backtrace:
>                 [<de1a3c34>] dev_pm_qos_constraints_allocate+0x10/0xcc
>                 [<d21cf6e4>] dev_pm_qos_add_notifier+0x6c/0xd0
>                 [<e04bbc90>] genpd_add_device+0x178/0x1c0
>                 [<95067303>] of_genpd_add_device+0x34/0x4c
>                 [<c334b97a>] board_staging_register_device+0xf8/0x118
>                 [<01bd495a>] board_staging_register_devices+0x24/0x28
>                 [<fb25a5d8>] runtime_board_check+0x2c/0x40
>                 [<65aed679>] do_one_initcall+0xe0/0x278
>                 [<97e3f4f7>] kernel_init_freeable+0x174/0x1c0
>                 [<63c8fed0>] kernel_init+0x8/0x118
>                 [<f704d96c>] ret_from_fork+0x14/0x38
>                 [<00000000>] 0x0

Hmm... I looked at this in bits and pieces throughout the day. At
least spent an hour looking at this. This doesn't make a lot of sense
to me. I don't even touch anything in this code path AFAICT.  Are
modules/kernel mixed up somehow? I need more info before I can help.
Does reverting my pm domain change make any difference (assume it
boots this far without it).

>
>   3. RZ/A1 and RZ/A2: No issues.

Great!

-Saravana

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-12  2:59     ` Saravana Kannan
@ 2021-02-12  8:14       ` Geert Uytterhoeven
  2021-02-12 20:57         ` Saravana Kannan
  2021-02-15 12:38       ` Geert Uytterhoeven
  2021-02-15 15:16       ` Geert Uytterhoeven
  2 siblings, 1 reply; 55+ messages in thread
From: Geert Uytterhoeven @ 2021-02-12  8:14 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Linux PM list, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Android Kernel Team,
	Linux-Renesas

Hi Saravana,

On Fri, Feb 12, 2021 at 4:00 AM Saravana Kannan <saravanak@google.com> wrote:
> On Thu, Feb 11, 2021 at 5:00 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >   1. R-Car Gen2 (Koelsch), R-Car Gen3 (Salvator-X(S), Ebisu).
> >
> >       - Commit 2dfc564bda4a31bc ("soc: renesas: rcar-sysc: Mark device
> >         node OF_POPULATED after init") is no longer needed (but already
> >         queued for v5.12 anyway)
>
> Rob doesn't like the proliferation of OF_POPULATED and we don't need
> it anymore, so maybe work it out with him? It's a balance between some
> wasted memory (struct device(s)) vs not proliferating OF_POPULATED.

Rob: should it be reverted?  For v5.13?
I guess other similar "fixes" went in in the mean time.

> >       - Some devices are reprobed, despite their drivers returning
> >         a real error code, and not -EPROBE_DEFER:
>
> Sorry, it's not obvious from the logs below where "reprobing" is
> happening. Can you give more pointers please?

My log was indeed not a full log, but just the reprobes happening.
I'll send you a full log by private email.

> Also, thinking more about this, the only way I could see this happen is:
> 1. Device fails with error that's not -EPROBE_DEFER
> 2. It somehow gets added to a device link (with AUTOPROBE_CONSUMER
> flag) where it's a consumer.
> 3. The supplier probes and the device gets added to the deferred probe
> list again.
>
> But I can't see how this sequence can happen. Device links are created
> only when a device is added. And is the supplier isn't added yet, the
> consumer wouldn't have probed in the first place.

The full log doesn't show any evidence of the device being added
to a list in between the two probes.

> Other than "annoying waste of time" is this causing any other problems?

Probably not.  But see below.

> >       - The PCI reprobing leads to a memory leak, for which I've sent a fix
> >         "[PATCH] PCI: Fix memory leak in pci_register_io_range()"
> >         https://lore.kernel.org/linux-pci/20210202100332.829047-1-geert+renesas@glider.be/
>
> Wrt PCI reprobing,
> 1. Is this PCI never expected to probe, but it's being reattempted
> despite the NOT EPROBE_DEFER error? Or

There is no PCIe card present, so the failure is expected.
Later it is reprobed, which of course fails again.

> 2. The PCI was deferred probe when it should have probed and then when
> it's finally reattemped and it could succeed, we are hitting this mem
> leak issue?

I think the leak has always been there, but it was just exposed by
this unneeded reprobe.  I don't think a reprobe after that specific
error path had ever happened before.

> I'm basically trying to distinguish between "this stuff should never
> be retried" vs "this/it's suppliers got probe deferred with
> fw_devlink=on vs but didn't get probe deferred with
> fw_devlink=permissive and that's causing issues"

There should not be a probe deferral, as no -EPROBE_DEFER was
returned.

> >       - I2C on R-Car Gen3 does not seem to use DMA, according to
> >         /sys/kernel/debug/dmaengine/summary:
> >
> >             -dma4chan0    | e66d8000.i2c:tx
> >             -dma4chan1    | e66d8000.i2c:rx
> >             -dma5chan0    | e6510000.i2c:tx
>
> I think I need more context on the problem before I can try to fix it.
> I'm also very unfamiliar with that file. With fw_devlink=permissive,
> I2C was using DMA? If so, the next step is to see if the I2C relative
> probe order with DMA is getting changed and if so, why.

Yes, I plan to dig deeper to see what really happens...

> >       - On R-Mobile A1, I get a BUG and a memory leak:
> >
> >             BUG: spinlock bad magic on CPU#0, swapper/1

>
> Hmm... I looked at this in bits and pieces throughout the day. At
> least spent an hour looking at this. This doesn't make a lot of sense
> to me. I don't even touch anything in this code path AFAICT.  Are
> modules/kernel mixed up somehow? I need more info before I can help.
> Does reverting my pm domain change make any difference (assume it
> boots this far without it).

I plan to dig deeper to see what really happens...

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-12  8:14       ` Geert Uytterhoeven
@ 2021-02-12 20:57         ` Saravana Kannan
  0 siblings, 0 replies; 55+ messages in thread
From: Saravana Kannan @ 2021-02-12 20:57 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Linux PM list, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Android Kernel Team,
	Linux-Renesas

On Fri, Feb 12, 2021 at 12:15 AM Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
>
> Hi Saravana,
>
> On Fri, Feb 12, 2021 at 4:00 AM Saravana Kannan <saravanak@google.com> wrote:
> > On Thu, Feb 11, 2021 at 5:00 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > >   1. R-Car Gen2 (Koelsch), R-Car Gen3 (Salvator-X(S), Ebisu).
> > >
> > >       - Commit 2dfc564bda4a31bc ("soc: renesas: rcar-sysc: Mark device
> > >         node OF_POPULATED after init") is no longer needed (but already
> > >         queued for v5.12 anyway)
> >
> > Rob doesn't like the proliferation of OF_POPULATED and we don't need
> > it anymore, so maybe work it out with him? It's a balance between some
> > wasted memory (struct device(s)) vs not proliferating OF_POPULATED.
>
> Rob: should it be reverted?  For v5.13?
> I guess other similar "fixes" went in in the mean time.
>
> > >       - Some devices are reprobed, despite their drivers returning
> > >         a real error code, and not -EPROBE_DEFER:
> >
> > Sorry, it's not obvious from the logs below where "reprobing" is
> > happening. Can you give more pointers please?
>
> My log was indeed not a full log, but just the reprobes happening.
> I'll send you a full log by private email.
>
> > Also, thinking more about this, the only way I could see this happen is:
> > 1. Device fails with error that's not -EPROBE_DEFER
> > 2. It somehow gets added to a device link (with AUTOPROBE_CONSUMER
> > flag) where it's a consumer.
> > 3. The supplier probes and the device gets added to the deferred probe
> > list again.
> >
> > But I can't see how this sequence can happen. Device links are created
> > only when a device is added. And is the supplier isn't added yet, the
> > consumer wouldn't have probed in the first place.
>
> The full log doesn't show any evidence of the device being added
> to a list in between the two probes.
>
> > Other than "annoying waste of time" is this causing any other problems?
>
> Probably not.  But see below.
>
> > >       - The PCI reprobing leads to a memory leak, for which I've sent a fix
> > >         "[PATCH] PCI: Fix memory leak in pci_register_io_range()"
> > >         https://lore.kernel.org/linux-pci/20210202100332.829047-1-geert+renesas@glider.be/
> >
> > Wrt PCI reprobing,
> > 1. Is this PCI never expected to probe, but it's being reattempted
> > despite the NOT EPROBE_DEFER error? Or
>
> There is no PCIe card present, so the failure is expected.
> Later it is reprobed, which of course fails again.
>
> > 2. The PCI was deferred probe when it should have probed and then when
> > it's finally reattemped and it could succeed, we are hitting this mem
> > leak issue?
>
> I think the leak has always been there, but it was just exposed by
> this unneeded reprobe.  I don't think a reprobe after that specific
> error path had ever happened before.
>
> > I'm basically trying to distinguish between "this stuff should never
> > be retried" vs "this/it's suppliers got probe deferred with
> > fw_devlink=on vs but didn't get probe deferred with
> > fw_devlink=permissive and that's causing issues"
>
> There should not be a probe deferral, as no -EPROBE_DEFER was
> returned.
>
> > >       - I2C on R-Car Gen3 does not seem to use DMA, according to
> > >         /sys/kernel/debug/dmaengine/summary:
> > >
> > >             -dma4chan0    | e66d8000.i2c:tx
> > >             -dma4chan1    | e66d8000.i2c:rx
> > >             -dma5chan0    | e6510000.i2c:tx
> >
> > I think I need more context on the problem before I can try to fix it.
> > I'm also very unfamiliar with that file. With fw_devlink=permissive,
> > I2C was using DMA? If so, the next step is to see if the I2C relative
> > probe order with DMA is getting changed and if so, why.
>
> Yes, I plan to dig deeper to see what really happens...

Try fw_devlink.strict (you'll need IOMMU enabled too). If that fixes
it and you also don't see this issue with fw_devlink=permissive, then
it means there's probably some unnecessary probe deferral that we
should try to avoid. At least, that's my hunch right now.

Thanks,
Saravana

>
> > >       - On R-Mobile A1, I get a BUG and a memory leak:
> > >
> > >             BUG: spinlock bad magic on CPU#0, swapper/1
>
> >
> > Hmm... I looked at this in bits and pieces throughout the day. At
> > least spent an hour looking at this. This doesn't make a lot of sense
> > to me. I don't even touch anything in this code path AFAICT.  Are
> > modules/kernel mixed up somehow? I need more info before I can help.
> > Does reverting my pm domain change make any difference (assume it
> > boots this far without it).
>
> I plan to dig deeper to see what really happens...
>
> Gr{oetje,eeting}s,
>
>                         Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] clk: Mark fwnodes when their clock provider is added
  2021-02-11 13:00         ` Greg KH
@ 2021-02-13  0:37           ` Stephen Boyd
  0 siblings, 0 replies; 55+ messages in thread
From: Stephen Boyd @ 2021-02-13  0:37 UTC (permalink / raw)
  To: Greg KH, Tudor Ambarus
  Cc: corbet, rafael, khilman, ulf.hansson, len.brown, lenb, pavel,
	mturquette, robh+dt, frowand.list, maz, tglx, saravanak,
	nicolas.ferre, claudiu.beznea, linux-doc, linux-kernel, linux-pm,
	linux-clk, devicetree, linux-acpi, m.szyprowski, geert,
	kernel-team

Quoting Greg KH (2021-02-11 05:00:51)
> On Wed, Feb 10, 2021 at 01:44:35PM +0200, Tudor Ambarus wrote:
> > This is a follow-up for:
> > commit 3c9ea42802a1 ("clk: Mark fwnodes when their clock provider is added/removed")
> > 
> > The above commit updated the deprecated of_clk_add_provider(),
> > but missed to update the preferred of_clk_add_hw_provider().
> > Update it now.
> > 
> > Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
> > ---

Acked-by: Stephen Boyd <sboyd@kernel.org>

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 8/8] clk: Mark fwnodes when their clock provider is added/removed
       [not found]     ` <161317679292.1254594.15797939257637374295@swboyd.mtv.corp.google.com>
@ 2021-02-14 21:15       ` Saravana Kannan
  0 siblings, 0 replies; 55+ messages in thread
From: Saravana Kannan @ 2021-02-14 21:15 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Frank Rowand, Greg Kroah-Hartman, Jonathan Corbet, Kevin Hilman,
	Len Brown, Len Brown, Marc Zyngier, Michael Turquette,
	Pavel Machek, Rafael J. Wysocki, Rob Herring, Thomas Gleixner,
	Ulf Hansson, Linux Doc Mailing List, LKML, Linux PM, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Geert Uytterhoeven,
	Android Kernel Team

On Fri, Feb 12, 2021 at 4:39 PM Stephen Boyd <sboyd@kernel.org> wrote:
>
> Quoting Saravana Kannan (2021-02-05 14:26:44)
> > This allows fw_devlink to recognize clock provider drivers that don't
> > use the device-driver model to initialize the device. fw_devlink will
> > use this information to make sure consumers of such clock providers
> > aren't indefinitely blocked from probing, waiting for the power domain
> > device to appear and bind to a driver.
>
> The "power domain" part of this commit text doesn't make any sense. Is
> it copy/pasted from some other patch? Should probably say "waiting for
> the clk providing device"?

Yeah, copy-pasta.

>
> >
> > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > ---
>
> Acked-by: Stephen Boyd <sboyd@kernel.org>

Thanks,
Saravana

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-11 13:00   ` Geert Uytterhoeven
  2021-02-11 13:05     ` Geert Uytterhoeven
  2021-02-12  2:59     ` Saravana Kannan
@ 2021-02-15 11:19     ` Geert Uytterhoeven
  2 siblings, 0 replies; 55+ messages in thread
From: Geert Uytterhoeven @ 2021-02-15 11:19 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Linux PM list, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Android Kernel Team,
	Linux-Renesas

On Thu, Feb 11, 2021 at 2:00 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> On Fri, Feb 5, 2021 at 11:26 PM Saravana Kannan <saravanak@google.com> wrote:
> > There are a lot of devices/drivers where they never have a struct device
> > created for them or the driver initializes the hardware without ever
> > binding to the struct device.
> >
> > This series is intended to avoid any boot regressions due to such
> > devices/drivers when fw_devlink=on and also address the handling of
> > optional suppliers.

>       - Some devices are reprobed, despite their drivers returning
>         a real error code, and not -EPROBE_DEFER:
>
>             renesas_wdt e6020000.watchdog: Watchdog blacklisted on r8a7791 ES1.*
>             (rwdt_probe() returns -ENODEV)
>
>             sh-pfc e6060000.pinctrl: pin GP_7_23 already requested by
> ee090000.pci; cannot claim for e6590000.usb
>             sh-pfc e6060000.pinctrl: pin-247 (e6590000.usb) status -22
>             sh-pfc e6060000.pinctrl: could not request pin 247
> (GP_7_23) from group usb0  on device sh-pfc
>             renesas_usbhs e6590000.usb: Error applying setting,
> reverse things back
>             renesas_usbhs: probe of e6590000.usb failed with error -22
>
>             rcar-pcie fe000000.pcie: host bridge /soc/pcie@fe000000 ranges:
>             rcar-pcie fe000000.pcie:       IO
> 0x00fe100000..0x00fe1fffff -> 0x0000000000
>             rcar-pcie fe000000.pcie:      MEM
> 0x00fe200000..0x00fe3fffff -> 0x00fe200000
>             rcar-pcie fe000000.pcie:      MEM
> 0x0030000000..0x0037ffffff -> 0x0030000000
>             rcar-pcie fe000000.pcie:      MEM
> 0x0038000000..0x003fffffff -> 0x0038000000
>             rcar-pcie fe000000.pcie:   IB MEM
> 0x0040000000..0x00bfffffff -> 0x0040000000
>             rcar-pcie fe000000.pcie:   IB MEM
> 0x0200000000..0x02ffffffff -> 0x0200000000
>             rcar-pcie fe000000.pcie: PCIe link down
>             (rcar_pcie_probe() returns -ENODEV)
>
>             xhci-hcd ee000000.usb: xHCI Host Controller
>             xhci-hcd ee000000.usb: new USB bus registered, assigned bus number 7
>             xhci-hcd ee000000.usb: Direct firmware load for
> r8a779x_usb3_v3.dlmem failed with error -2
>             xhci-hcd ee000000.usb: can't setup: -2
>             xhci-hcd ee000000.usb: USB bus 7 deregistered
>             xhci-hcd: probe of ee000000.usb failed with error -2

Consumers are added to the deferred probe pending list before
they are probed, but not removed on probe failure.
Patch sent
"[PATCH] driver core: Fix double failed probing with fw_devlink=on"
https://lore.kernel.org/linux-renesas-soc/20210215111619.2385030-1-geert+renesas@glider.be/

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-12  2:59     ` Saravana Kannan
  2021-02-12  8:14       ` Geert Uytterhoeven
@ 2021-02-15 12:38       ` Geert Uytterhoeven
  2021-02-15 21:26         ` Saravana Kannan
  2021-02-15 15:16       ` Geert Uytterhoeven
  2 siblings, 1 reply; 55+ messages in thread
From: Geert Uytterhoeven @ 2021-02-15 12:38 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Linux PM list, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Android Kernel Team,
	Linux-Renesas

Hi Saravana,

On Fri, Feb 12, 2021 at 4:00 AM Saravana Kannan <saravanak@google.com> wrote:
> On Thu, Feb 11, 2021 at 5:00 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >       - I2C on R-Car Gen3 does not seem to use DMA, according to
> >         /sys/kernel/debug/dmaengine/summary:
> >
> >             -dma4chan0    | e66d8000.i2c:tx
> >             -dma4chan1    | e66d8000.i2c:rx
> >             -dma5chan0    | e6510000.i2c:tx
>
> I think I need more context on the problem before I can try to fix it.
> I'm also very unfamiliar with that file. With fw_devlink=permissive,
> I2C was using DMA? If so, the next step is to see if the I2C relative
> probe order with DMA is getting changed and if so, why.

More detailed log:

    platform e66d8000.i2c: Linked as a consumer to e6150000.clock-controller
    platform e66d8000.i2c: Linked as a sync state only consumer to e6055400.gpio

Why is e66d8000.i2c not linked as a consumer to e6700000.dma-controller?

    platform e6700000.dma-controller: Linked as a consumer to
e6150000.clock-controller
    platform e66d8000.i2c: Added to deferred list
    platform e6700000.dma-controller: Added to deferred list

    bus: 'platform': driver_probe_device: matched device
e6700000.dma-controller with driver rcar-dmac
    bus: 'platform': really_probe: probing driver rcar-dmac with
device e6700000.dma-controller
    platform e6700000.dma-controller: Driver rcar-dmac requests probe deferral

    bus: 'platform': driver_probe_device: matched device e66d8000.i2c
with driver i2c-rcar
    bus: 'platform': really_probe: probing driver i2c-rcar with device
e66d8000.i2c

I2C becomes available...

    i2c-rcar e66d8000.i2c: request_channel failed for tx (-517)
    [...]

but DMA is not available yet, so the driver falls back to PIO.

    driver: 'i2c-rcar': driver_bound: bound to device 'e66d8000.i2c'
    bus: 'platform': really_probe: bound device e66d8000.i2c to driver i2c-rcar

    platform e6700000.dma-controller: Retrying from deferred list
    bus: 'platform': driver_probe_device: matched device
e6700000.dma-controller with driver rcar-dmac
    bus: 'platform': really_probe: probing driver rcar-dmac with
device e6700000.dma-controller
    platform e6700000.dma-controller: Driver rcar-dmac requests probe deferral
    platform e6700000.dma-controller: Added to deferred list
    platform e6700000.dma-controller: Retrying from deferred list
    bus: 'platform': driver_probe_device: matched device
e6700000.dma-controller with driver rcar-dmac
    bus: 'platform': really_probe: probing driver rcar-dmac with
device e6700000.dma-controller
    driver: 'rcar-dmac': driver_bound: bound to device 'e6700000.dma-controller'
    bus: 'platform': really_probe: bound device
e6700000.dma-controller to driver rcar-dmac

DMA becomes available.

Here userspace is entered. /sys/kernel/debug/dmaengine/summary shows
that the I2C controllers do not have DMA channels allocated, as the
kernel has performed no more I2C transfers after DMA became available.

Using i2cdetect shows that DMA is used, which is good:

    i2c-rcar e66d8000.i2c: got DMA channel for rx

With permissive devlinks, the clock controller consumers are not added
to the deferred probing list, and probe order is slightly different.
The I2C controllers are still probed before the DMA controllers.
But DMA becomes available a bit earlier, before the probing of the last
I2C slave driver.  Hence /sys/kernel/debug/dmaengine/summary shows that
some I2C transfers did use DMA.

So the real issue is that e66d8000.i2c not linked as a consumer to
e6700000.dma-controller.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-12  2:59     ` Saravana Kannan
  2021-02-12  8:14       ` Geert Uytterhoeven
  2021-02-15 12:38       ` Geert Uytterhoeven
@ 2021-02-15 15:16       ` Geert Uytterhoeven
  2021-02-15 21:57         ` Saravana Kannan
  2 siblings, 1 reply; 55+ messages in thread
From: Geert Uytterhoeven @ 2021-02-15 15:16 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Linux PM list, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Android Kernel Team,
	Linux-Renesas

Hi Saravana,

On Fri, Feb 12, 2021 at 4:00 AM Saravana Kannan <saravanak@google.com> wrote:
> On Thu, Feb 11, 2021 at 5:00 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >   1. R-Car Gen2 (Koelsch), R-Car Gen3 (Salvator-X(S), Ebisu).
> >
> >       - Commit 2dfc564bda4a31bc ("soc: renesas: rcar-sysc: Mark device
> >         node OF_POPULATED after init") is no longer needed (but already
> >         queued for v5.12 anyway)
>
> Rob doesn't like the proliferation of OF_POPULATED and we don't need
> it anymore, so maybe work it out with him? It's a balance between some
> wasted memory (struct device(s)) vs not proliferating OF_POPULATED.

> >   2. SH/R-Mobile AG5 (kzm9g), APE6 (ape6evm), A1 (armadillo800-eva)
> >
> >       - "PATCH] soc: renesas: rmobile-sysc: Set OF_POPULATED and absorb
> >         reset handling" is no longer needed
> >         https://lore.kernel.org/linux-arm-kernel/20210205133319.1921108-1-geert+renesas@glider.be/
>
> Good to see more evidence that this series is fixing things at a more
> generic level.

I spoke too soon: if CONFIG_POWER_RESET_RMOBILE=n,
booting fails again, as everything is waiting on the system controller,
which never becomes available.
Rcar-sysc doesn't suffer from this problem, cfr. above.
Perhaps because the rmobile-sysc bindings use a hierarchical instead
of a linear PM domain description, and thus consumers point to the
children of the system controller node?
Cfr. system-controller@e6180000 in arch/arm/boot/dts/r8a7740.dtsi.

> >       - On R-Mobile A1, I get a BUG and a memory leak:
> >
> >             BUG: spinlock bad magic on CPU#0, swapper/1
> >              lock: lcdc0_device+0x10c/0x308, .magic: 00000000, .owner:
> > <none>/-1, .owner_cpu: 0
> >             CPU: 0 PID: 1 Comm: swapper Not tainted
> > 5.11.0-rc5-armadillo-00032-gf0a85c26907e #266
> >             Hardware name: Generic R8A7740 (Flattened Device Tree)
> >             [<c010c3c8>] (unwind_backtrace) from [<c010a49c>]
> > (show_stack+0x10/0x14)
> >             [<c010a49c>] (show_stack) from [<c0159534>]
> > (do_raw_spin_lock+0x20/0x94)
> >             [<c0159534>] (do_raw_spin_lock) from [<c04089d8>]
> > (dev_pm_get_subsys_data+0x30/0xa0)
> >             [<c04089d8>] (dev_pm_get_subsys_data) from [<c0413698>]
> > (genpd_add_device+0x34/0x1c0)
> >             [<c0413698>] (genpd_add_device) from [<c041389c>]
> > (of_genpd_add_device+0x34/0x4c)
> >             [<c041389c>] (of_genpd_add_device) from [<c0a1e9bc>]
> > (board_staging_register_device+0xf8/0x118)
> >             [<c0a1e9bc>] (board_staging_register_device) from

This is indeed a pre-existing problem.
of_genpd_add_device() is called before platform_device_register(),
as it needs to attach the genpd before the device is probed.
But the spinlock is only initialized when the device is registered.
This was masked before due to an unrelated wait context check failure,
which disabled any further spinlock checks, and exposed by fw_devlinks
changing probe order.
Patch sent.
"[PATCH] staging: board: Fix uninitialized spinlock when attaching genpd"
https://lore.kernel.org/r/20210215151405.2551143-1-geert+renesas@glider.be



Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-15 12:38       ` Geert Uytterhoeven
@ 2021-02-15 21:26         ` Saravana Kannan
  2021-02-16  8:05           ` Geert Uytterhoeven
  0 siblings, 1 reply; 55+ messages in thread
From: Saravana Kannan @ 2021-02-15 21:26 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Linux PM list, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Android Kernel Team,
	Linux-Renesas

On Mon, Feb 15, 2021 at 4:38 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Hi Saravana,
>
> On Fri, Feb 12, 2021 at 4:00 AM Saravana Kannan <saravanak@google.com> wrote:
> > On Thu, Feb 11, 2021 at 5:00 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > >       - I2C on R-Car Gen3 does not seem to use DMA, according to
> > >         /sys/kernel/debug/dmaengine/summary:
> > >
> > >             -dma4chan0    | e66d8000.i2c:tx
> > >             -dma4chan1    | e66d8000.i2c:rx
> > >             -dma5chan0    | e6510000.i2c:tx
> >
> > I think I need more context on the problem before I can try to fix it.
> > I'm also very unfamiliar with that file. With fw_devlink=permissive,
> > I2C was using DMA? If so, the next step is to see if the I2C relative
> > probe order with DMA is getting changed and if so, why.
>
> More detailed log:
>
>     platform e66d8000.i2c: Linked as a consumer to e6150000.clock-controller
>     platform e66d8000.i2c: Linked as a sync state only consumer to e6055400.gpio
>
> Why is e66d8000.i2c not linked as a consumer to e6700000.dma-controller?

Because fw_devlink.strict=1 is not set and dma/iommu is considered an
"optional"/"driver decides" dependency.

>     platform e6700000.dma-controller: Linked as a consumer to
> e6150000.clock-controller

Is this the only supplier of dma-controller?

>     platform e66d8000.i2c: Added to deferred list
>     platform e6700000.dma-controller: Added to deferred list
>
>     bus: 'platform': driver_probe_device: matched device
> e6700000.dma-controller with driver rcar-dmac
>     bus: 'platform': really_probe: probing driver rcar-dmac with
> device e6700000.dma-controller
>     platform e6700000.dma-controller: Driver rcar-dmac requests probe deferral
>
>     bus: 'platform': driver_probe_device: matched device e66d8000.i2c
> with driver i2c-rcar
>     bus: 'platform': really_probe: probing driver i2c-rcar with device
> e66d8000.i2c
>
> I2C becomes available...
>
>     i2c-rcar e66d8000.i2c: request_channel failed for tx (-517)
>     [...]
>
> but DMA is not available yet, so the driver falls back to PIO.
>
>     driver: 'i2c-rcar': driver_bound: bound to device 'e66d8000.i2c'
>     bus: 'platform': really_probe: bound device e66d8000.i2c to driver i2c-rcar
>
>     platform e6700000.dma-controller: Retrying from deferred list
>     bus: 'platform': driver_probe_device: matched device
> e6700000.dma-controller with driver rcar-dmac
>     bus: 'platform': really_probe: probing driver rcar-dmac with
> device e6700000.dma-controller
>     platform e6700000.dma-controller: Driver rcar-dmac requests probe deferral
>     platform e6700000.dma-controller: Added to deferred list
>     platform e6700000.dma-controller: Retrying from deferred list
>     bus: 'platform': driver_probe_device: matched device
> e6700000.dma-controller with driver rcar-dmac
>     bus: 'platform': really_probe: probing driver rcar-dmac with
> device e6700000.dma-controller
>     driver: 'rcar-dmac': driver_bound: bound to device 'e6700000.dma-controller'
>     bus: 'platform': really_probe: bound device
> e6700000.dma-controller to driver rcar-dmac
>
> DMA becomes available.
>
> Here userspace is entered. /sys/kernel/debug/dmaengine/summary shows
> that the I2C controllers do not have DMA channels allocated, as the
> kernel has performed no more I2C transfers after DMA became available.
>
> Using i2cdetect shows that DMA is used, which is good:
>
>     i2c-rcar e66d8000.i2c: got DMA channel for rx
>
> With permissive devlinks, the clock controller consumers are not added
> to the deferred probing list, and probe order is slightly different.
> The I2C controllers are still probed before the DMA controllers.
> But DMA becomes available a bit earlier, before the probing of the last
> I2C slave driver.

This seems like a race? I'm guessing it's two different threads
probing those two devices? And it just happens to work for
"permissive" assuming the boot timing doesn't change?

> Hence /sys/kernel/debug/dmaengine/summary shows that
> some I2C transfers did use DMA.
>
> So the real issue is that e66d8000.i2c not linked as a consumer to
> e6700000.dma-controller.

That's because fw_devlink.strict=1 isn't set. If you need DMA to be
treated as a mandatory supplier, you'll need to set the flag.

Is fw_devlink=on really breaking anything here? It just seems like
"permissive" got lucky with the timing and it could break at any point
in the future. Thought?

-Saravana

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-15 15:16       ` Geert Uytterhoeven
@ 2021-02-15 21:57         ` Saravana Kannan
  2021-02-16 12:58           ` Geert Uytterhoeven
  0 siblings, 1 reply; 55+ messages in thread
From: Saravana Kannan @ 2021-02-15 21:57 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Linux PM list, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Android Kernel Team,
	Linux-Renesas

Hi Geert,

On Mon, Feb 15, 2021 at 7:16 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Hi Saravana,
>
> On Fri, Feb 12, 2021 at 4:00 AM Saravana Kannan <saravanak@google.com> wrote:
> > On Thu, Feb 11, 2021 at 5:00 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > >   1. R-Car Gen2 (Koelsch), R-Car Gen3 (Salvator-X(S), Ebisu).
> > >
> > >       - Commit 2dfc564bda4a31bc ("soc: renesas: rcar-sysc: Mark device
> > >         node OF_POPULATED after init") is no longer needed (but already
> > >         queued for v5.12 anyway)
> >
> > Rob doesn't like the proliferation of OF_POPULATED and we don't need
> > it anymore, so maybe work it out with him? It's a balance between some
> > wasted memory (struct device(s)) vs not proliferating OF_POPULATED.
>
> > >   2. SH/R-Mobile AG5 (kzm9g), APE6 (ape6evm), A1 (armadillo800-eva)
> > >
> > >       - "PATCH] soc: renesas: rmobile-sysc: Set OF_POPULATED and absorb
> > >         reset handling" is no longer needed
> > >         https://lore.kernel.org/linux-arm-kernel/20210205133319.1921108-1-geert+renesas@glider.be/
> >
> > Good to see more evidence that this series is fixing things at a more
> > generic level.
>
> I spoke too soon: if CONFIG_POWER_RESET_RMOBILE=n,
> booting fails again, as everything is waiting on the system controller,
> which never becomes available.
> Rcar-sysc doesn't suffer from this problem, cfr. above.
> Perhaps because the rmobile-sysc bindings use a hierarchical instead
> of a linear PM domain description, and thus consumers point to the
> children of the system controller node?
> Cfr. system-controller@e6180000 in arch/arm/boot/dts/r8a7740.dtsi.

Ok, I see what's going on. The problem is that the "power domain"
fwnode being registered is not the node that contains the "compatible"
property and becomes a device. So this patch[1] is not helping here.
Fix is to do something like this (to avoid using OF_POPULATED flag and
breaking reset):

diff --git a/drivers/soc/renesas/rmobile-sysc.c
b/drivers/soc/renesas/rmobile-sysc.c
index 9046b8c933cb..b7e66139ef7d 100644
--- a/drivers/soc/renesas/rmobile-sysc.c
+++ b/drivers/soc/renesas/rmobile-sysc.c
@@ -344,6 +344,7 @@ static int __init rmobile_init_pm_domains(void)
                        of_node_put(np);
                        break;
                }
+               fwnode_dev_initialized(&np->fwnode, true);
        }

        put_special_pds();

Can you give it a shot?

[1] - https://lore.kernel.org/lkml/20210205222644.2357303-8-saravanak@google.com/

> > >       - On R-Mobile A1, I get a BUG and a memory leak:
> > >
> > >             BUG: spinlock bad magic on CPU#0, swapper/1
> > >              lock: lcdc0_device+0x10c/0x308, .magic: 00000000, .owner:
> > > <none>/-1, .owner_cpu: 0
> > >             CPU: 0 PID: 1 Comm: swapper Not tainted
> > > 5.11.0-rc5-armadillo-00032-gf0a85c26907e #266
> > >             Hardware name: Generic R8A7740 (Flattened Device Tree)
> > >             [<c010c3c8>] (unwind_backtrace) from [<c010a49c>]
> > > (show_stack+0x10/0x14)
> > >             [<c010a49c>] (show_stack) from [<c0159534>]
> > > (do_raw_spin_lock+0x20/0x94)
> > >             [<c0159534>] (do_raw_spin_lock) from [<c04089d8>]
> > > (dev_pm_get_subsys_data+0x30/0xa0)
> > >             [<c04089d8>] (dev_pm_get_subsys_data) from [<c0413698>]
> > > (genpd_add_device+0x34/0x1c0)
> > >             [<c0413698>] (genpd_add_device) from [<c041389c>]
> > > (of_genpd_add_device+0x34/0x4c)
> > >             [<c041389c>] (of_genpd_add_device) from [<c0a1e9bc>]
> > > (board_staging_register_device+0xf8/0x118)
> > >             [<c0a1e9bc>] (board_staging_register_device) from
>
> This is indeed a pre-existing problem.
> of_genpd_add_device() is called before platform_device_register(),
> as it needs to attach the genpd before the device is probed.
> But the spinlock is only initialized when the device is registered.
> This was masked before due to an unrelated wait context check failure,
> which disabled any further spinlock checks, and exposed by fw_devlinks
> changing probe order.
> Patch sent.
> "[PATCH] staging: board: Fix uninitialized spinlock when attaching genpd"
> https://lore.kernel.org/r/20210215151405.2551143-1-geert+renesas@glider.be
>

Great!

Thanks,
Saravana

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-15 21:26         ` Saravana Kannan
@ 2021-02-16  8:05           ` Geert Uytterhoeven
  2021-02-16 18:48             ` Saravana Kannan
  0 siblings, 1 reply; 55+ messages in thread
From: Geert Uytterhoeven @ 2021-02-16  8:05 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Linux PM list, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Android Kernel Team,
	Linux-Renesas

Hi Saravana,

On Mon, Feb 15, 2021 at 10:27 PM Saravana Kannan <saravanak@google.com> wrote:
> On Mon, Feb 15, 2021 at 4:38 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > On Fri, Feb 12, 2021 at 4:00 AM Saravana Kannan <saravanak@google.com> wrote:
> > > On Thu, Feb 11, 2021 at 5:00 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > >       - I2C on R-Car Gen3 does not seem to use DMA, according to
> > > >         /sys/kernel/debug/dmaengine/summary:
> > > >
> > > >             -dma4chan0    | e66d8000.i2c:tx
> > > >             -dma4chan1    | e66d8000.i2c:rx
> > > >             -dma5chan0    | e6510000.i2c:tx
> > >
> > > I think I need more context on the problem before I can try to fix it.
> > > I'm also very unfamiliar with that file. With fw_devlink=permissive,
> > > I2C was using DMA? If so, the next step is to see if the I2C relative
> > > probe order with DMA is getting changed and if so, why.
> >
> > More detailed log:
> >
> >     platform e66d8000.i2c: Linked as a consumer to e6150000.clock-controller
> >     platform e66d8000.i2c: Linked as a sync state only consumer to e6055400.gpio
> >
> > Why is e66d8000.i2c not linked as a consumer to e6700000.dma-controller?
>
> Because fw_devlink.strict=1 is not set and dma/iommu is considered an
> "optional"/"driver decides" dependency.

Oh, I thought dma/iommu were considered mandatory initially,
but dropped as dependencies in the late boot process?

>
> >     platform e6700000.dma-controller: Linked as a consumer to
> > e6150000.clock-controller
>
> Is this the only supplier of dma-controller?

No, e6180000.system-controller is also a supplier.

> >     platform e66d8000.i2c: Added to deferred list
> >     platform e6700000.dma-controller: Added to deferred list
> >
> >     bus: 'platform': driver_probe_device: matched device
> > e6700000.dma-controller with driver rcar-dmac
> >     bus: 'platform': really_probe: probing driver rcar-dmac with
> > device e6700000.dma-controller
> >     platform e6700000.dma-controller: Driver rcar-dmac requests probe deferral
> >
> >     bus: 'platform': driver_probe_device: matched device e66d8000.i2c
> > with driver i2c-rcar
> >     bus: 'platform': really_probe: probing driver i2c-rcar with device
> > e66d8000.i2c
> >
> > I2C becomes available...
> >
> >     i2c-rcar e66d8000.i2c: request_channel failed for tx (-517)
> >     [...]
> >
> > but DMA is not available yet, so the driver falls back to PIO.
> >
> >     driver: 'i2c-rcar': driver_bound: bound to device 'e66d8000.i2c'
> >     bus: 'platform': really_probe: bound device e66d8000.i2c to driver i2c-rcar
> >
> >     platform e6700000.dma-controller: Retrying from deferred list
> >     bus: 'platform': driver_probe_device: matched device
> > e6700000.dma-controller with driver rcar-dmac
> >     bus: 'platform': really_probe: probing driver rcar-dmac with
> > device e6700000.dma-controller
> >     platform e6700000.dma-controller: Driver rcar-dmac requests probe deferral
> >     platform e6700000.dma-controller: Added to deferred list
> >     platform e6700000.dma-controller: Retrying from deferred list
> >     bus: 'platform': driver_probe_device: matched device
> > e6700000.dma-controller with driver rcar-dmac
> >     bus: 'platform': really_probe: probing driver rcar-dmac with
> > device e6700000.dma-controller
> >     driver: 'rcar-dmac': driver_bound: bound to device 'e6700000.dma-controller'
> >     bus: 'platform': really_probe: bound device
> > e6700000.dma-controller to driver rcar-dmac
> >
> > DMA becomes available.
> >
> > Here userspace is entered. /sys/kernel/debug/dmaengine/summary shows
> > that the I2C controllers do not have DMA channels allocated, as the
> > kernel has performed no more I2C transfers after DMA became available.
> >
> > Using i2cdetect shows that DMA is used, which is good:
> >
> >     i2c-rcar e66d8000.i2c: got DMA channel for rx
> >
> > With permissive devlinks, the clock controller consumers are not added
> > to the deferred probing list, and probe order is slightly different.
> > The I2C controllers are still probed before the DMA controllers.
> > But DMA becomes available a bit earlier, before the probing of the last
> > I2C slave driver.
>
> This seems like a race? I'm guessing it's two different threads
> probing those two devices? And it just happens to work for
> "permissive" assuming the boot timing doesn't change?
>
> > Hence /sys/kernel/debug/dmaengine/summary shows that
> > some I2C transfers did use DMA.
> >
> > So the real issue is that e66d8000.i2c not linked as a consumer to
> > e6700000.dma-controller.
>
> That's because fw_devlink.strict=1 isn't set. If you need DMA to be
> treated as a mandatory supplier, you'll need to set the flag.
>
> Is fw_devlink=on really breaking anything here? It just seems like
> "permissive" got lucky with the timing and it could break at any point
> in the future. Thought?

I don't think there is a race.  fw_devlinks calling driver_deferred_probe_add()
on all consumers has a big impact on probe order.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-15 21:57         ` Saravana Kannan
@ 2021-02-16 12:58           ` Geert Uytterhoeven
  0 siblings, 0 replies; 55+ messages in thread
From: Geert Uytterhoeven @ 2021-02-16 12:58 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Linux PM list, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Android Kernel Team,
	Linux-Renesas

Hi Saravana,

On Mon, Feb 15, 2021 at 10:57 PM Saravana Kannan <saravanak@google.com> wrote:
> On Mon, Feb 15, 2021 at 7:16 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > On Fri, Feb 12, 2021 at 4:00 AM Saravana Kannan <saravanak@google.com> wrote:
> > > On Thu, Feb 11, 2021 at 5:00 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > >   1. R-Car Gen2 (Koelsch), R-Car Gen3 (Salvator-X(S), Ebisu).
> > > >
> > > >       - Commit 2dfc564bda4a31bc ("soc: renesas: rcar-sysc: Mark device
> > > >         node OF_POPULATED after init") is no longer needed (but already
> > > >         queued for v5.12 anyway)
> > >
> > > Rob doesn't like the proliferation of OF_POPULATED and we don't need
> > > it anymore, so maybe work it out with him? It's a balance between some
> > > wasted memory (struct device(s)) vs not proliferating OF_POPULATED.
> >
> > > >   2. SH/R-Mobile AG5 (kzm9g), APE6 (ape6evm), A1 (armadillo800-eva)
> > > >
> > > >       - "PATCH] soc: renesas: rmobile-sysc: Set OF_POPULATED and absorb
> > > >         reset handling" is no longer needed
> > > >         https://lore.kernel.org/linux-arm-kernel/20210205133319.1921108-1-geert+renesas@glider.be/
> > >
> > > Good to see more evidence that this series is fixing things at a more
> > > generic level.
> >
> > I spoke too soon: if CONFIG_POWER_RESET_RMOBILE=n,
> > booting fails again, as everything is waiting on the system controller,
> > which never becomes available.
> > Rcar-sysc doesn't suffer from this problem, cfr. above.
> > Perhaps because the rmobile-sysc bindings use a hierarchical instead
> > of a linear PM domain description, and thus consumers point to the
> > children of the system controller node?
> > Cfr. system-controller@e6180000 in arch/arm/boot/dts/r8a7740.dtsi.
>
> Ok, I see what's going on. The problem is that the "power domain"
> fwnode being registered is not the node that contains the "compatible"
> property and becomes a device. So this patch[1] is not helping here.
> Fix is to do something like this (to avoid using OF_POPULATED flag and
> breaking reset):
>
> diff --git a/drivers/soc/renesas/rmobile-sysc.c
> b/drivers/soc/renesas/rmobile-sysc.c
> index 9046b8c933cb..b7e66139ef7d 100644
> --- a/drivers/soc/renesas/rmobile-sysc.c
> +++ b/drivers/soc/renesas/rmobile-sysc.c
> @@ -344,6 +344,7 @@ static int __init rmobile_init_pm_domains(void)
>                         of_node_put(np);
>                         break;
>                 }
> +               fwnode_dev_initialized(&np->fwnode, true);
>         }
>
>         put_special_pds();
>
> Can you give it a shot?

Thanks, works.  Patch sent
"[PATCH v2] soc: renesas: rmobile-sysc: Mark fwnode when PM domain is added"
https://lore.kernel.org/linux-arm-kernel/20210216123958.3180014-1-geert+renesas@glider.be/

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-16  8:05           ` Geert Uytterhoeven
@ 2021-02-16 18:48             ` Saravana Kannan
  2021-02-16 20:31               ` Geert Uytterhoeven
  0 siblings, 1 reply; 55+ messages in thread
From: Saravana Kannan @ 2021-02-16 18:48 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Linux PM list, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Android Kernel Team,
	Linux-Renesas

On Tue, Feb 16, 2021 at 12:05 AM Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
>
> Hi Saravana,
>
> On Mon, Feb 15, 2021 at 10:27 PM Saravana Kannan <saravanak@google.com> wrote:
> > On Mon, Feb 15, 2021 at 4:38 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > On Fri, Feb 12, 2021 at 4:00 AM Saravana Kannan <saravanak@google.com> wrote:
> > > > On Thu, Feb 11, 2021 at 5:00 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > > >       - I2C on R-Car Gen3 does not seem to use DMA, according to
> > > > >         /sys/kernel/debug/dmaengine/summary:
> > > > >
> > > > >             -dma4chan0    | e66d8000.i2c:tx
> > > > >             -dma4chan1    | e66d8000.i2c:rx
> > > > >             -dma5chan0    | e6510000.i2c:tx
> > > >
> > > > I think I need more context on the problem before I can try to fix it.
> > > > I'm also very unfamiliar with that file. With fw_devlink=permissive,
> > > > I2C was using DMA? If so, the next step is to see if the I2C relative
> > > > probe order with DMA is getting changed and if so, why.
> > >
> > > More detailed log:
> > >
> > >     platform e66d8000.i2c: Linked as a consumer to e6150000.clock-controller
> > >     platform e66d8000.i2c: Linked as a sync state only consumer to e6055400.gpio
> > >
> > > Why is e66d8000.i2c not linked as a consumer to e6700000.dma-controller?
> >
> > Because fw_devlink.strict=1 is not set and dma/iommu is considered an
> > "optional"/"driver decides" dependency.
>
> Oh, I thought dma/iommu were considered mandatory initially,
> but dropped as dependencies in the late boot process?

No, I didn't do that in case the drivers that didn't need the
IOMMU/DMA were sensitive to probe order.

My goal was for fw_devlink=on to not affect probe order for devices
that currently don't need to defer probe. But see below...

>
> >
> > >     platform e6700000.dma-controller: Linked as a consumer to
> > > e6150000.clock-controller
> >
> > Is this the only supplier of dma-controller?
>
> No, e6180000.system-controller is also a supplier.
>
> > >     platform e66d8000.i2c: Added to deferred list
> > >     platform e6700000.dma-controller: Added to deferred list
> > >
> > >     bus: 'platform': driver_probe_device: matched device
> > > e6700000.dma-controller with driver rcar-dmac
> > >     bus: 'platform': really_probe: probing driver rcar-dmac with
> > > device e6700000.dma-controller
> > >     platform e6700000.dma-controller: Driver rcar-dmac requests probe deferral
> > >
> > >     bus: 'platform': driver_probe_device: matched device e66d8000.i2c
> > > with driver i2c-rcar
> > >     bus: 'platform': really_probe: probing driver i2c-rcar with device
> > > e66d8000.i2c
> > >
> > > I2C becomes available...
> > >
> > >     i2c-rcar e66d8000.i2c: request_channel failed for tx (-517)
> > >     [...]
> > >
> > > but DMA is not available yet, so the driver falls back to PIO.
> > >
> > >     driver: 'i2c-rcar': driver_bound: bound to device 'e66d8000.i2c'
> > >     bus: 'platform': really_probe: bound device e66d8000.i2c to driver i2c-rcar
> > >
> > >     platform e6700000.dma-controller: Retrying from deferred list
> > >     bus: 'platform': driver_probe_device: matched device
> > > e6700000.dma-controller with driver rcar-dmac
> > >     bus: 'platform': really_probe: probing driver rcar-dmac with
> > > device e6700000.dma-controller
> > >     platform e6700000.dma-controller: Driver rcar-dmac requests probe deferral
> > >     platform e6700000.dma-controller: Added to deferred list
> > >     platform e6700000.dma-controller: Retrying from deferred list
> > >     bus: 'platform': driver_probe_device: matched device
> > > e6700000.dma-controller with driver rcar-dmac
> > >     bus: 'platform': really_probe: probing driver rcar-dmac with
> > > device e6700000.dma-controller
> > >     driver: 'rcar-dmac': driver_bound: bound to device 'e6700000.dma-controller'
> > >     bus: 'platform': really_probe: bound device
> > > e6700000.dma-controller to driver rcar-dmac
> > >
> > > DMA becomes available.
> > >
> > > Here userspace is entered. /sys/kernel/debug/dmaengine/summary shows
> > > that the I2C controllers do not have DMA channels allocated, as the
> > > kernel has performed no more I2C transfers after DMA became available.
> > >
> > > Using i2cdetect shows that DMA is used, which is good:
> > >
> > >     i2c-rcar e66d8000.i2c: got DMA channel for rx
> > >
> > > With permissive devlinks, the clock controller consumers are not added
> > > to the deferred probing list, and probe order is slightly different.
> > > The I2C controllers are still probed before the DMA controllers.
> > > But DMA becomes available a bit earlier, before the probing of the last
> > > I2C slave driver.
> >
> > This seems like a race? I'm guessing it's two different threads
> > probing those two devices? And it just happens to work for
> > "permissive" assuming the boot timing doesn't change?
> >
> > > Hence /sys/kernel/debug/dmaengine/summary shows that
> > > some I2C transfers did use DMA.
> > >
> > > So the real issue is that e66d8000.i2c not linked as a consumer to
> > > e6700000.dma-controller.
> >
> > That's because fw_devlink.strict=1 isn't set. If you need DMA to be
> > treated as a mandatory supplier, you'll need to set the flag.
> >
> > Is fw_devlink=on really breaking anything here? It just seems like
> > "permissive" got lucky with the timing and it could break at any point
> > in the future. Thought?
>
> I don't think there is a race.

Can you explain more please? This below makes it sound like DMA just
sneaks in at the last minute.

> > > The I2C controllers are still probed before the DMA controllers.
> > > But DMA becomes available a bit earlier, before the probing of the last
> > > I2C slave driver.

>  fw_devlinks calling driver_deferred_probe_add()
> on all consumers has a big impact on probe order.

Ugh... yeah. That's the real issue. This is really a device links
issue that fw_devlink is exposing. I already have a bunch of things in
my TODO list to improve deferred probing and probe ordering. Since
this is not causing boot issues (only DMA issue) with fw_devlink=on,
can we treat this as not a blocking item for fw_devlink=on? Once I go
through my TODO list, it should be fixed (by not changing probe
ordering unnecessarily). And if not, I can help find out a different
solution at that point.

Also, if you have IOMMU drivers, then fw_devlink.strict is also
another solution that's available. On a separate note (not a final
fix), I was wondering if we should have a config for fw_devlink.strict
default value and then have it selected when IOMMU drivers configs are
enabled.

-Saravana

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-16 18:48             ` Saravana Kannan
@ 2021-02-16 20:31               ` Geert Uytterhoeven
  2021-02-17 23:57                 ` Saravana Kannan
  0 siblings, 1 reply; 55+ messages in thread
From: Geert Uytterhoeven @ 2021-02-16 20:31 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Linux PM list, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Android Kernel Team,
	Linux-Renesas

Hi Saravana,

On Tue, Feb 16, 2021 at 7:49 PM Saravana Kannan <saravanak@google.com> wrote:
> On Tue, Feb 16, 2021 at 12:05 AM Geert Uytterhoeven
> <geert@linux-m68k.org> wrote:
> > On Mon, Feb 15, 2021 at 10:27 PM Saravana Kannan <saravanak@google.com> wrote:
> > > On Mon, Feb 15, 2021 at 4:38 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > > On Fri, Feb 12, 2021 at 4:00 AM Saravana Kannan <saravanak@google.com> wrote:
> > > > > On Thu, Feb 11, 2021 at 5:00 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > > > >       - I2C on R-Car Gen3 does not seem to use DMA, according to
> > > > > >         /sys/kernel/debug/dmaengine/summary:
> > > > > >
> > > > > >             -dma4chan0    | e66d8000.i2c:tx
> > > > > >             -dma4chan1    | e66d8000.i2c:rx
> > > > > >             -dma5chan0    | e6510000.i2c:tx
> > > > >
> > > > > I think I need more context on the problem before I can try to fix it.
> > > > > I'm also very unfamiliar with that file. With fw_devlink=permissive,
> > > > > I2C was using DMA? If so, the next step is to see if the I2C relative
> > > > > probe order with DMA is getting changed and if so, why.
> > > >
> > > > More detailed log:
> > > >
> > > >     platform e66d8000.i2c: Linked as a consumer to e6150000.clock-controller
> > > >     platform e66d8000.i2c: Linked as a sync state only consumer to e6055400.gpio
> > > >
> > > > Why is e66d8000.i2c not linked as a consumer to e6700000.dma-controller?
> > >
> > > Because fw_devlink.strict=1 is not set and dma/iommu is considered an
> > > "optional"/"driver decides" dependency.
> >
> > Oh, I thought dma/iommu were considered mandatory initially,
> > but dropped as dependencies in the late boot process?
>
> No, I didn't do that in case the drivers that didn't need the
> IOMMU/DMA were sensitive to probe order.
>
> My goal was for fw_devlink=on to not affect probe order for devices
> that currently don't need to defer probe. But see below...
>
> >
> > >
> > > >     platform e6700000.dma-controller: Linked as a consumer to
> > > > e6150000.clock-controller
> > >
> > > Is this the only supplier of dma-controller?
> >
> > No, e6180000.system-controller is also a supplier.
> >
> > > >     platform e66d8000.i2c: Added to deferred list
> > > >     platform e6700000.dma-controller: Added to deferred list
> > > >
> > > >     bus: 'platform': driver_probe_device: matched device
> > > > e6700000.dma-controller with driver rcar-dmac
> > > >     bus: 'platform': really_probe: probing driver rcar-dmac with
> > > > device e6700000.dma-controller
> > > >     platform e6700000.dma-controller: Driver rcar-dmac requests probe deferral
> > > >
> > > >     bus: 'platform': driver_probe_device: matched device e66d8000.i2c
> > > > with driver i2c-rcar
> > > >     bus: 'platform': really_probe: probing driver i2c-rcar with device
> > > > e66d8000.i2c
> > > >
> > > > I2C becomes available...
> > > >
> > > >     i2c-rcar e66d8000.i2c: request_channel failed for tx (-517)
> > > >     [...]
> > > >
> > > > but DMA is not available yet, so the driver falls back to PIO.
> > > >
> > > >     driver: 'i2c-rcar': driver_bound: bound to device 'e66d8000.i2c'
> > > >     bus: 'platform': really_probe: bound device e66d8000.i2c to driver i2c-rcar
> > > >
> > > >     platform e6700000.dma-controller: Retrying from deferred list
> > > >     bus: 'platform': driver_probe_device: matched device
> > > > e6700000.dma-controller with driver rcar-dmac
> > > >     bus: 'platform': really_probe: probing driver rcar-dmac with
> > > > device e6700000.dma-controller
> > > >     platform e6700000.dma-controller: Driver rcar-dmac requests probe deferral
> > > >     platform e6700000.dma-controller: Added to deferred list
> > > >     platform e6700000.dma-controller: Retrying from deferred list
> > > >     bus: 'platform': driver_probe_device: matched device
> > > > e6700000.dma-controller with driver rcar-dmac
> > > >     bus: 'platform': really_probe: probing driver rcar-dmac with
> > > > device e6700000.dma-controller
> > > >     driver: 'rcar-dmac': driver_bound: bound to device 'e6700000.dma-controller'
> > > >     bus: 'platform': really_probe: bound device
> > > > e6700000.dma-controller to driver rcar-dmac
> > > >
> > > > DMA becomes available.
> > > >
> > > > Here userspace is entered. /sys/kernel/debug/dmaengine/summary shows
> > > > that the I2C controllers do not have DMA channels allocated, as the
> > > > kernel has performed no more I2C transfers after DMA became available.
> > > >
> > > > Using i2cdetect shows that DMA is used, which is good:
> > > >
> > > >     i2c-rcar e66d8000.i2c: got DMA channel for rx
> > > >
> > > > With permissive devlinks, the clock controller consumers are not added
> > > > to the deferred probing list, and probe order is slightly different.
> > > > The I2C controllers are still probed before the DMA controllers.
> > > > But DMA becomes available a bit earlier, before the probing of the last
> > > > I2C slave driver.
> > >
> > > This seems like a race? I'm guessing it's two different threads
> > > probing those two devices? And it just happens to work for
> > > "permissive" assuming the boot timing doesn't change?
> > >
> > > > Hence /sys/kernel/debug/dmaengine/summary shows that
> > > > some I2C transfers did use DMA.
> > > >
> > > > So the real issue is that e66d8000.i2c not linked as a consumer to
> > > > e6700000.dma-controller.
> > >
> > > That's because fw_devlink.strict=1 isn't set. If you need DMA to be
> > > treated as a mandatory supplier, you'll need to set the flag.
> > >
> > > Is fw_devlink=on really breaking anything here? It just seems like
> > > "permissive" got lucky with the timing and it could break at any point
> > > in the future. Thought?
> >
> > I don't think there is a race.
>
> Can you explain more please? This below makes it sound like DMA just
> sneaks in at the last minute.

Yes it does, as the DMAC also has a consumer link to the IOMMU.
If you ignore the consumer link from I2C to DMAC, the I2C device has
less dependencies than the DMAC, so the I2C device, and the
devices on the I2C bus, are probed much earlier than the DMAC.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-16 20:31               ` Geert Uytterhoeven
@ 2021-02-17 23:57                 ` Saravana Kannan
  2021-02-25  9:21                   ` Geert Uytterhoeven
  0 siblings, 1 reply; 55+ messages in thread
From: Saravana Kannan @ 2021-02-17 23:57 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Linux PM list, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Android Kernel Team,
	Linux-Renesas

On Tue, Feb 16, 2021 at 12:31 PM Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
>
> Hi Saravana,
>
> On Tue, Feb 16, 2021 at 7:49 PM Saravana Kannan <saravanak@google.com> wrote:
> > On Tue, Feb 16, 2021 at 12:05 AM Geert Uytterhoeven
> > <geert@linux-m68k.org> wrote:
> > > On Mon, Feb 15, 2021 at 10:27 PM Saravana Kannan <saravanak@google.com> wrote:
> > > > On Mon, Feb 15, 2021 at 4:38 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > > > On Fri, Feb 12, 2021 at 4:00 AM Saravana Kannan <saravanak@google.com> wrote:
> > > > > > On Thu, Feb 11, 2021 at 5:00 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > > > > >       - I2C on R-Car Gen3 does not seem to use DMA, according to
> > > > > > >         /sys/kernel/debug/dmaengine/summary:
> > > > > > >
> > > > > > >             -dma4chan0    | e66d8000.i2c:tx
> > > > > > >             -dma4chan1    | e66d8000.i2c:rx
> > > > > > >             -dma5chan0    | e6510000.i2c:tx
> > > > > >
> > > > > > I think I need more context on the problem before I can try to fix it.
> > > > > > I'm also very unfamiliar with that file. With fw_devlink=permissive,
> > > > > > I2C was using DMA? If so, the next step is to see if the I2C relative
> > > > > > probe order with DMA is getting changed and if so, why.
> > > > >
> > > > > More detailed log:
> > > > >
> > > > >     platform e66d8000.i2c: Linked as a consumer to e6150000.clock-controller
> > > > >     platform e66d8000.i2c: Linked as a sync state only consumer to e6055400.gpio
> > > > >
> > > > > Why is e66d8000.i2c not linked as a consumer to e6700000.dma-controller?
> > > >
> > > > Because fw_devlink.strict=1 is not set and dma/iommu is considered an
> > > > "optional"/"driver decides" dependency.
> > >
> > > Oh, I thought dma/iommu were considered mandatory initially,
> > > but dropped as dependencies in the late boot process?
> >
> > No, I didn't do that in case the drivers that didn't need the
> > IOMMU/DMA were sensitive to probe order.
> >
> > My goal was for fw_devlink=on to not affect probe order for devices
> > that currently don't need to defer probe. But see below...
> >
> > >
> > > >
> > > > >     platform e6700000.dma-controller: Linked as a consumer to
> > > > > e6150000.clock-controller
> > > >
> > > > Is this the only supplier of dma-controller?
> > >
> > > No, e6180000.system-controller is also a supplier.
> > >
> > > > >     platform e66d8000.i2c: Added to deferred list
> > > > >     platform e6700000.dma-controller: Added to deferred list
> > > > >
> > > > >     bus: 'platform': driver_probe_device: matched device
> > > > > e6700000.dma-controller with driver rcar-dmac
> > > > >     bus: 'platform': really_probe: probing driver rcar-dmac with
> > > > > device e6700000.dma-controller
> > > > >     platform e6700000.dma-controller: Driver rcar-dmac requests probe deferral
> > > > >
> > > > >     bus: 'platform': driver_probe_device: matched device e66d8000.i2c
> > > > > with driver i2c-rcar
> > > > >     bus: 'platform': really_probe: probing driver i2c-rcar with device
> > > > > e66d8000.i2c
> > > > >
> > > > > I2C becomes available...
> > > > >
> > > > >     i2c-rcar e66d8000.i2c: request_channel failed for tx (-517)
> > > > >     [...]
> > > > >
> > > > > but DMA is not available yet, so the driver falls back to PIO.
> > > > >
> > > > >     driver: 'i2c-rcar': driver_bound: bound to device 'e66d8000.i2c'
> > > > >     bus: 'platform': really_probe: bound device e66d8000.i2c to driver i2c-rcar
> > > > >
> > > > >     platform e6700000.dma-controller: Retrying from deferred list
> > > > >     bus: 'platform': driver_probe_device: matched device
> > > > > e6700000.dma-controller with driver rcar-dmac
> > > > >     bus: 'platform': really_probe: probing driver rcar-dmac with
> > > > > device e6700000.dma-controller
> > > > >     platform e6700000.dma-controller: Driver rcar-dmac requests probe deferral
> > > > >     platform e6700000.dma-controller: Added to deferred list
> > > > >     platform e6700000.dma-controller: Retrying from deferred list
> > > > >     bus: 'platform': driver_probe_device: matched device
> > > > > e6700000.dma-controller with driver rcar-dmac
> > > > >     bus: 'platform': really_probe: probing driver rcar-dmac with
> > > > > device e6700000.dma-controller
> > > > >     driver: 'rcar-dmac': driver_bound: bound to device 'e6700000.dma-controller'
> > > > >     bus: 'platform': really_probe: bound device
> > > > > e6700000.dma-controller to driver rcar-dmac
> > > > >
> > > > > DMA becomes available.
> > > > >
> > > > > Here userspace is entered. /sys/kernel/debug/dmaengine/summary shows
> > > > > that the I2C controllers do not have DMA channels allocated, as the
> > > > > kernel has performed no more I2C transfers after DMA became available.
> > > > >
> > > > > Using i2cdetect shows that DMA is used, which is good:
> > > > >
> > > > >     i2c-rcar e66d8000.i2c: got DMA channel for rx
> > > > >
> > > > > With permissive devlinks, the clock controller consumers are not added
> > > > > to the deferred probing list, and probe order is slightly different.
> > > > > The I2C controllers are still probed before the DMA controllers.
> > > > > But DMA becomes available a bit earlier, before the probing of the last
> > > > > I2C slave driver.
> > > >
> > > > This seems like a race? I'm guessing it's two different threads
> > > > probing those two devices? And it just happens to work for
> > > > "permissive" assuming the boot timing doesn't change?
> > > >
> > > > > Hence /sys/kernel/debug/dmaengine/summary shows that
> > > > > some I2C transfers did use DMA.
> > > > >
> > > > > So the real issue is that e66d8000.i2c not linked as a consumer to
> > > > > e6700000.dma-controller.
> > > >
> > > > That's because fw_devlink.strict=1 isn't set. If you need DMA to be
> > > > treated as a mandatory supplier, you'll need to set the flag.
> > > >
> > > > Is fw_devlink=on really breaking anything here? It just seems like
> > > > "permissive" got lucky with the timing and it could break at any point
> > > > in the future. Thought?
> > >
> > > I don't think there is a race.
> >
> > Can you explain more please? This below makes it sound like DMA just
> > sneaks in at the last minute.
>
> Yes it does, as the DMAC also has a consumer link to the IOMMU.
> If you ignore the consumer link from I2C to DMAC, the I2C device has
> less dependencies than the DMAC, so the I2C device, and the
> devices on the I2C bus, are probed much earlier than the DMAC.

Can you give this a shot?
https://lore.kernel.org/lkml/20210217235130.1744843-1-saravanak@google.com/T/#u

It should make sure fw_devlink doesn't add a device to the deferred
probe list too soon and change the probe ordering unnecessarily.

-Saravana

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving
  2021-02-17 23:57                 ` Saravana Kannan
@ 2021-02-25  9:21                   ` Geert Uytterhoeven
  0 siblings, 0 replies; 55+ messages in thread
From: Geert Uytterhoeven @ 2021-02-25  9:21 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Linux PM list, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Android Kernel Team,
	Linux-Renesas

Hi Saravana,

On Thu, Feb 18, 2021 at 12:57 AM Saravana Kannan <saravanak@google.com> wrote:
> On Tue, Feb 16, 2021 at 12:31 PM Geert Uytterhoeven
> <geert@linux-m68k.org> wrote:
> > On Tue, Feb 16, 2021 at 7:49 PM Saravana Kannan <saravanak@google.com> wrote:
> > > On Tue, Feb 16, 2021 at 12:05 AM Geert Uytterhoeven
> > > <geert@linux-m68k.org> wrote:
> > > > On Mon, Feb 15, 2021 at 10:27 PM Saravana Kannan <saravanak@google.com> wrote:
> > > > > On Mon, Feb 15, 2021 at 4:38 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > > > > On Fri, Feb 12, 2021 at 4:00 AM Saravana Kannan <saravanak@google.com> wrote:
> > > > > > > On Thu, Feb 11, 2021 at 5:00 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > > > > > >       - I2C on R-Car Gen3 does not seem to use DMA, according to
> > > > > > > >         /sys/kernel/debug/dmaengine/summary:
> > > > > > > >
> > > > > > > >             -dma4chan0    | e66d8000.i2c:tx
> > > > > > > >             -dma4chan1    | e66d8000.i2c:rx
> > > > > > > >             -dma5chan0    | e6510000.i2c:tx
> > > > > > >
> > > > > > > I think I need more context on the problem before I can try to fix it.
> > > > > > > I'm also very unfamiliar with that file. With fw_devlink=permissive,
> > > > > > > I2C was using DMA? If so, the next step is to see if the I2C relative
> > > > > > > probe order with DMA is getting changed and if so, why.
> > > > > >
> > > > > > More detailed log:
> > > > > >
> > > > > >     platform e66d8000.i2c: Linked as a consumer to e6150000.clock-controller
> > > > > >     platform e66d8000.i2c: Linked as a sync state only consumer to e6055400.gpio
> > > > > >
> > > > > > Why is e66d8000.i2c not linked as a consumer to e6700000.dma-controller?
> > > > >
> > > > > Because fw_devlink.strict=1 is not set and dma/iommu is considered an
> > > > > "optional"/"driver decides" dependency.
> > > >
> > > > Oh, I thought dma/iommu were considered mandatory initially,
> > > > but dropped as dependencies in the late boot process?
> > >
> > > No, I didn't do that in case the drivers that didn't need the
> > > IOMMU/DMA were sensitive to probe order.
> > >
> > > My goal was for fw_devlink=on to not affect probe order for devices
> > > that currently don't need to defer probe. But see below...
> > >
> > > >
> > > > >
> > > > > >     platform e6700000.dma-controller: Linked as a consumer to
> > > > > > e6150000.clock-controller
> > > > >
> > > > > Is this the only supplier of dma-controller?
> > > >
> > > > No, e6180000.system-controller is also a supplier.
> > > >
> > > > > >     platform e66d8000.i2c: Added to deferred list
> > > > > >     platform e6700000.dma-controller: Added to deferred list
> > > > > >
> > > > > >     bus: 'platform': driver_probe_device: matched device
> > > > > > e6700000.dma-controller with driver rcar-dmac
> > > > > >     bus: 'platform': really_probe: probing driver rcar-dmac with
> > > > > > device e6700000.dma-controller
> > > > > >     platform e6700000.dma-controller: Driver rcar-dmac requests probe deferral
> > > > > >
> > > > > >     bus: 'platform': driver_probe_device: matched device e66d8000.i2c
> > > > > > with driver i2c-rcar
> > > > > >     bus: 'platform': really_probe: probing driver i2c-rcar with device
> > > > > > e66d8000.i2c
> > > > > >
> > > > > > I2C becomes available...
> > > > > >
> > > > > >     i2c-rcar e66d8000.i2c: request_channel failed for tx (-517)
> > > > > >     [...]
> > > > > >
> > > > > > but DMA is not available yet, so the driver falls back to PIO.
> > > > > >
> > > > > >     driver: 'i2c-rcar': driver_bound: bound to device 'e66d8000.i2c'
> > > > > >     bus: 'platform': really_probe: bound device e66d8000.i2c to driver i2c-rcar
> > > > > >
> > > > > >     platform e6700000.dma-controller: Retrying from deferred list
> > > > > >     bus: 'platform': driver_probe_device: matched device
> > > > > > e6700000.dma-controller with driver rcar-dmac
> > > > > >     bus: 'platform': really_probe: probing driver rcar-dmac with
> > > > > > device e6700000.dma-controller
> > > > > >     platform e6700000.dma-controller: Driver rcar-dmac requests probe deferral
> > > > > >     platform e6700000.dma-controller: Added to deferred list
> > > > > >     platform e6700000.dma-controller: Retrying from deferred list
> > > > > >     bus: 'platform': driver_probe_device: matched device
> > > > > > e6700000.dma-controller with driver rcar-dmac
> > > > > >     bus: 'platform': really_probe: probing driver rcar-dmac with
> > > > > > device e6700000.dma-controller
> > > > > >     driver: 'rcar-dmac': driver_bound: bound to device 'e6700000.dma-controller'
> > > > > >     bus: 'platform': really_probe: bound device
> > > > > > e6700000.dma-controller to driver rcar-dmac
> > > > > >
> > > > > > DMA becomes available.
> > > > > >
> > > > > > Here userspace is entered. /sys/kernel/debug/dmaengine/summary shows
> > > > > > that the I2C controllers do not have DMA channels allocated, as the
> > > > > > kernel has performed no more I2C transfers after DMA became available.
> > > > > >
> > > > > > Using i2cdetect shows that DMA is used, which is good:
> > > > > >
> > > > > >     i2c-rcar e66d8000.i2c: got DMA channel for rx
> > > > > >
> > > > > > With permissive devlinks, the clock controller consumers are not added
> > > > > > to the deferred probing list, and probe order is slightly different.
> > > > > > The I2C controllers are still probed before the DMA controllers.
> > > > > > But DMA becomes available a bit earlier, before the probing of the last
> > > > > > I2C slave driver.
> > > > >
> > > > > This seems like a race? I'm guessing it's two different threads
> > > > > probing those two devices? And it just happens to work for
> > > > > "permissive" assuming the boot timing doesn't change?
> > > > >
> > > > > > Hence /sys/kernel/debug/dmaengine/summary shows that
> > > > > > some I2C transfers did use DMA.
> > > > > >
> > > > > > So the real issue is that e66d8000.i2c not linked as a consumer to
> > > > > > e6700000.dma-controller.
> > > > >
> > > > > That's because fw_devlink.strict=1 isn't set. If you need DMA to be
> > > > > treated as a mandatory supplier, you'll need to set the flag.
> > > > >
> > > > > Is fw_devlink=on really breaking anything here? It just seems like
> > > > > "permissive" got lucky with the timing and it could break at any point
> > > > > in the future. Thought?
> > > >
> > > > I don't think there is a race.
> > >
> > > Can you explain more please? This below makes it sound like DMA just
> > > sneaks in at the last minute.
> >
> > Yes it does, as the DMAC also has a consumer link to the IOMMU.
> > If you ignore the consumer link from I2C to DMAC, the I2C device has
> > less dependencies than the DMAC, so the I2C device, and the
> > devices on the I2C bus, are probed much earlier than the DMAC.
>
> Can you give this a shot?
> https://lore.kernel.org/lkml/20210217235130.1744843-1-saravanak@google.com/T/#u
>
> It should make sure fw_devlink doesn't add a device to the deferred
> probe list too soon and change the probe ordering unnecessarily.

(FTR, to keep all info in this thread)
Yes, this makes I2C use DMA again on Salvator-XS during kernel boot-up.
I haven't run any more elaborate tests on other platforms.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] clk: Mark fwnodes when their clock provider is added
       [not found]         ` <CGME20210325133159eucas1p297b769beb681743fb32d362a86cc6e3e@eucas1p2.samsung.com>
@ 2021-03-25 13:31           ` Marek Szyprowski
  2021-03-25 15:47             ` Geert Uytterhoeven
  2021-03-25 18:25             ` Nicolas Saenz Julienne
  0 siblings, 2 replies; 55+ messages in thread
From: Marek Szyprowski @ 2021-03-25 13:31 UTC (permalink / raw)
  To: Tudor Ambarus, corbet, gregkh, rafael, khilman, ulf.hansson,
	len.brown, lenb, pavel, mturquette, sboyd, robh+dt, frowand.list,
	maz, tglx, saravanak
  Cc: nicolas.ferre, claudiu.beznea, linux-doc, linux-kernel, linux-pm,
	linux-clk, devicetree, linux-acpi, geert, kernel-team,
	linux-rpi-kernel, Nicolas Saenz Julienne

Hi

On 10.02.2021 12:44, Tudor Ambarus wrote:
> This is a follow-up for:
> commit 3c9ea42802a1 ("clk: Mark fwnodes when their clock provider is added/removed")
>
> The above commit updated the deprecated of_clk_add_provider(),
> but missed to update the preferred of_clk_add_hw_provider().
> Update it now.
>
> Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>

This patch, which landed in linux-next as commit 6579c8d97ad7 ("clk: 
Mark fwnodes when their clock provider is added") causes the following 
NULL pointer dereference on Raspberry Pi 3b+ boards:

--->8---

raspberrypi-firmware soc:firmware: Attached to firmware from 
2020-01-06T13:05:25
Unable to handle kernel NULL pointer dereference at virtual address 
0000000000000050
Mem abort info:
   ESR = 0x96000004
   EC = 0x25: DABT (current EL), IL = 32 bits
   SET = 0, FnV = 0
   EA = 0, S1PTW = 0
Data abort info:
   ISV = 0, ISS = 0x00000004
   CM = 0, WnR = 0
[0000000000000050] user address but active_mm is swapper
Internal error: Oops: 96000004 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 10 Comm: kworker/0:1 Not tainted 5.12.0-rc4+ #2764
Hardware name: Raspberry Pi 3 Model B (DT)
Workqueue: events deferred_probe_work_func
pstate: 00000005 (nzcv daif -PAN -UAO -TCO BTYPE=--)
pc : of_clk_add_hw_provider+0xac/0xe8
lr : of_clk_add_hw_provider+0x94/0xe8
sp : ffff8000130936b0
x29: ffff8000130936b0 x28: ffff800012494e04
x27: ffff00003b18cb05 x26: ffff00003aa5c010
x25: 0000000000000000 x24: 0000000000000000
x23: ffff00003aa1e380 x22: ffff8000106830d0
x21: ffff80001233f180 x20: 0000000000000018
x19: 0000000000000000 x18: ffff8000124d38b0
x17: 0000000000000013 x16: 0000000000000014
x15: ffff8000125758b0 x14: 00000000000184e0
x13: 000000000000292e x12: ffff80001258dd98
x11: 0000000000000001 x10: 0101010101010101
x9 : ffff80001233f288 x8 : 7f7f7f7f7f7f7f7f
x7 : fefefefeff6c626f x6 : 5d636d8080808080
x5 : 00000000006d635d x4 : 0000000000000000
x3 : 0000000000000000 x2 : 540eb5edae191600
x1 : 0000000000000000 x0 : 0000000000000000
Call trace:
  of_clk_add_hw_provider+0xac/0xe8
  devm_of_clk_add_hw_provider+0x5c/0xb8
  raspberrypi_clk_probe+0x110/0x210
  platform_probe+0x90/0xd8
  really_probe+0x108/0x3c0
  driver_probe_device+0x60/0xc0
  __device_attach_driver+0x9c/0xd0
  bus_for_each_drv+0x70/0xc8
  __device_attach+0xec/0x150
  device_initial_probe+0x10/0x18
  bus_probe_device+0x94/0xa0
  device_add+0x47c/0x780
  platform_device_add+0x110/0x248
  platform_device_register_full+0x120/0x150
  rpi_firmware_probe+0x158/0x1f8
  platform_probe+0x90/0xd8
  really_probe+0x108/0x3c0
  driver_probe_device+0x60/0xc0
  __device_attach_driver+0x9c/0xd0
  bus_for_each_drv+0x70/0xc8
  __device_attach+0xec/0x150
  device_initial_probe+0x10/0x18
  bus_probe_device+0x94/0xa0
  deferred_probe_work_func+0x70/0xa8
  process_one_work+0x2a8/0x718
  worker_thread+0x48/0x460
  kthread+0x134/0x160
  ret_from_fork+0x10/0x18
Code: b1006294 540000c0 b140069f 54000088 (3940e280)
---[ end trace 7ead5ec2f0c51cfe ]---

This patch mainly revealed that clk/bcm/clk-raspberrypi.c driver calls 
devm_of_clk_add_hw_provider(), with a device pointer, which has a NULL 
dev->of_node. I'm not sure if adding a check for a NULL np in 
of_clk_add_hw_provider() is a right fix, though.

> ---
>   drivers/clk/clk.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> index 27ff90eacb1f..9370e4dfecae 100644
> --- a/drivers/clk/clk.c
> +++ b/drivers/clk/clk.c
> @@ -4594,6 +4594,8 @@ int of_clk_add_hw_provider(struct device_node *np,
>   	if (ret < 0)
>   		of_clk_del_provider(np);
>   
> +	fwnode_dev_initialized(&np->fwnode, true);
> +
>   	return ret;
>   }
>   EXPORT_SYMBOL_GPL(of_clk_add_hw_provider);

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] clk: Mark fwnodes when their clock provider is added
  2021-03-25 13:31           ` Marek Szyprowski
@ 2021-03-25 15:47             ` Geert Uytterhoeven
  2021-03-25 18:25             ` Nicolas Saenz Julienne
  1 sibling, 0 replies; 55+ messages in thread
From: Geert Uytterhoeven @ 2021-03-25 15:47 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Tudor Ambarus, Jonathan Corbet, Greg KH, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Len Brown, Len Brown, Pavel Machek,
	Michael Turquette, Stephen Boyd, Rob Herring, Frank Rowand,
	Marc Zyngier, Thomas Gleixner, Saravana Kannan, Nicolas Ferre,
	Claudiu Beznea, open list:DOCUMENTATION,
	Linux Kernel Mailing List, Linux PM list, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Android Kernel Team, linux-rpi-kernel,
	Nicolas Saenz Julienne

Hi Marek,

On Thu, Mar 25, 2021 at 2:32 PM Marek Szyprowski
<m.szyprowski@samsung.com> wrote:
> On 10.02.2021 12:44, Tudor Ambarus wrote:
> > This is a follow-up for:
> > commit 3c9ea42802a1 ("clk: Mark fwnodes when their clock provider is added/removed")
> >
> > The above commit updated the deprecated of_clk_add_provider(),
> > but missed to update the preferred of_clk_add_hw_provider().
> > Update it now.
> >
> > Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
>
> This patch, which landed in linux-next as commit 6579c8d97ad7 ("clk:
> Mark fwnodes when their clock provider is added") causes the following
> NULL pointer dereference on Raspberry Pi 3b+ boards:
>
> --->8---
>
> raspberrypi-firmware soc:firmware: Attached to firmware from
> 2020-01-06T13:05:25
> Unable to handle kernel NULL pointer dereference at virtual address
> 0000000000000050
> Mem abort info:
>    ESR = 0x96000004
>    EC = 0x25: DABT (current EL), IL = 32 bits
>    SET = 0, FnV = 0
>    EA = 0, S1PTW = 0
> Data abort info:
>    ISV = 0, ISS = 0x00000004
>    CM = 0, WnR = 0
> [0000000000000050] user address but active_mm is swapper
> Internal error: Oops: 96000004 [#1] PREEMPT SMP
> Modules linked in:
> CPU: 0 PID: 10 Comm: kworker/0:1 Not tainted 5.12.0-rc4+ #2764
> Hardware name: Raspberry Pi 3 Model B (DT)
> Workqueue: events deferred_probe_work_func
> pstate: 00000005 (nzcv daif -PAN -UAO -TCO BTYPE=--)
> pc : of_clk_add_hw_provider+0xac/0xe8
> lr : of_clk_add_hw_provider+0x94/0xe8
> sp : ffff8000130936b0
> x29: ffff8000130936b0 x28: ffff800012494e04
> x27: ffff00003b18cb05 x26: ffff00003aa5c010
> x25: 0000000000000000 x24: 0000000000000000
> x23: ffff00003aa1e380 x22: ffff8000106830d0
> x21: ffff80001233f180 x20: 0000000000000018
> x19: 0000000000000000 x18: ffff8000124d38b0
> x17: 0000000000000013 x16: 0000000000000014
> x15: ffff8000125758b0 x14: 00000000000184e0
> x13: 000000000000292e x12: ffff80001258dd98
> x11: 0000000000000001 x10: 0101010101010101
> x9 : ffff80001233f288 x8 : 7f7f7f7f7f7f7f7f
> x7 : fefefefeff6c626f x6 : 5d636d8080808080
> x5 : 00000000006d635d x4 : 0000000000000000
> x3 : 0000000000000000 x2 : 540eb5edae191600
> x1 : 0000000000000000 x0 : 0000000000000000
> Call trace:
>   of_clk_add_hw_provider+0xac/0xe8
>   devm_of_clk_add_hw_provider+0x5c/0xb8
>   raspberrypi_clk_probe+0x110/0x210
>   platform_probe+0x90/0xd8
>   really_probe+0x108/0x3c0
>   driver_probe_device+0x60/0xc0
>   __device_attach_driver+0x9c/0xd0
>   bus_for_each_drv+0x70/0xc8
>   __device_attach+0xec/0x150
>   device_initial_probe+0x10/0x18
>   bus_probe_device+0x94/0xa0
>   device_add+0x47c/0x780
>   platform_device_add+0x110/0x248
>   platform_device_register_full+0x120/0x150
>   rpi_firmware_probe+0x158/0x1f8

> This patch mainly revealed that clk/bcm/clk-raspberrypi.c driver calls
> devm_of_clk_add_hw_provider(), with a device pointer, which has a NULL
> dev->of_node. I'm not sure if adding a check for a NULL np in
> of_clk_add_hw_provider() is a right fix, though.

raspberrypi_clk_probe():

        /*
         * We can be probed either through the an old-fashioned
         * platform device registration or through a DT node that is a
         * child of the firmware node. Handle both cases.
         */

So the real issue is rpi_register_clk_driver() creating a platform
device for the firmware clocks if they're missing in DT.

Then, the clock driver calls devm_of_clk_add_hw_provider(),
regardless of a DT node being present or not.
I'm wondering how power consumers are supposed to refer
to these firmware clocks, without a DT node?

> > --- a/drivers/clk/clk.c
> > +++ b/drivers/clk/clk.c
> > @@ -4594,6 +4594,8 @@ int of_clk_add_hw_provider(struct device_node *np,
> >       if (ret < 0)
> >               of_clk_del_provider(np);
> >
> > +     fwnode_dev_initialized(&np->fwnode, true);
> > +
> >       return ret;
> >   }
> >   EXPORT_SYMBOL_GPL(of_clk_add_hw_provider);

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] clk: Mark fwnodes when their clock provider is added
  2021-03-25 13:31           ` Marek Szyprowski
  2021-03-25 15:47             ` Geert Uytterhoeven
@ 2021-03-25 18:25             ` Nicolas Saenz Julienne
  2021-03-26 18:13               ` Stephen Boyd
  1 sibling, 1 reply; 55+ messages in thread
From: Nicolas Saenz Julienne @ 2021-03-25 18:25 UTC (permalink / raw)
  To: Marek Szyprowski, Tudor Ambarus, corbet, gregkh, rafael, khilman,
	ulf.hansson, len.brown, lenb, pavel, mturquette, sboyd, robh+dt,
	frowand.list, maz, tglx, saravanak
  Cc: nicolas.ferre, claudiu.beznea, linux-doc, linux-kernel, linux-pm,
	linux-clk, devicetree, linux-acpi, geert, kernel-team,
	linux-rpi-kernel


[-- Attachment #1: Type: text/plain, Size: 3981 bytes --]

On Thu, 2021-03-25 at 14:31 +0100, Marek Szyprowski wrote:
> Hi
> 
> On 10.02.2021 12:44, Tudor Ambarus wrote:
> > This is a follow-up for:
> > commit 3c9ea42802a1 ("clk: Mark fwnodes when their clock provider is added/removed")
> > 
> > The above commit updated the deprecated of_clk_add_provider(),
> > but missed to update the preferred of_clk_add_hw_provider().
> > Update it now.
> > 
> > Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
> 
> This patch, which landed in linux-next as commit 6579c8d97ad7 ("clk: 
> Mark fwnodes when their clock provider is added") causes the following 
> NULL pointer dereference on Raspberry Pi 3b+ boards:
> 
> --->8---
> 
> raspberrypi-firmware soc:firmware: Attached to firmware from 
> 2020-01-06T13:05:25
> Unable to handle kernel NULL pointer dereference at virtual address 
> 0000000000000050
> Mem abort info:
>    ESR = 0x96000004
>    EC = 0x25: DABT (current EL), IL = 32 bits
>    SET = 0, FnV = 0
>    EA = 0, S1PTW = 0
> Data abort info:
>    ISV = 0, ISS = 0x00000004
>    CM = 0, WnR = 0
> [0000000000000050] user address but active_mm is swapper
> Internal error: Oops: 96000004 [#1] PREEMPT SMP
> Modules linked in:
> CPU: 0 PID: 10 Comm: kworker/0:1 Not tainted 5.12.0-rc4+ #2764
> Hardware name: Raspberry Pi 3 Model B (DT)
> Workqueue: events deferred_probe_work_func
> pstate: 00000005 (nzcv daif -PAN -UAO -TCO BTYPE=--)
> pc : of_clk_add_hw_provider+0xac/0xe8
> lr : of_clk_add_hw_provider+0x94/0xe8
> sp : ffff8000130936b0
> x29: ffff8000130936b0 x28: ffff800012494e04
> x27: ffff00003b18cb05 x26: ffff00003aa5c010
> x25: 0000000000000000 x24: 0000000000000000
> x23: ffff00003aa1e380 x22: ffff8000106830d0
> x21: ffff80001233f180 x20: 0000000000000018
> x19: 0000000000000000 x18: ffff8000124d38b0
> x17: 0000000000000013 x16: 0000000000000014
> x15: ffff8000125758b0 x14: 00000000000184e0
> x13: 000000000000292e x12: ffff80001258dd98
> x11: 0000000000000001 x10: 0101010101010101
> x9 : ffff80001233f288 x8 : 7f7f7f7f7f7f7f7f
> x7 : fefefefeff6c626f x6 : 5d636d8080808080
> x5 : 00000000006d635d x4 : 0000000000000000
> x3 : 0000000000000000 x2 : 540eb5edae191600
> x1 : 0000000000000000 x0 : 0000000000000000
> Call trace:
>   of_clk_add_hw_provider+0xac/0xe8
>   devm_of_clk_add_hw_provider+0x5c/0xb8
>   raspberrypi_clk_probe+0x110/0x210
>   platform_probe+0x90/0xd8
>   really_probe+0x108/0x3c0
>   driver_probe_device+0x60/0xc0
>   __device_attach_driver+0x9c/0xd0
>   bus_for_each_drv+0x70/0xc8
>   __device_attach+0xec/0x150
>   device_initial_probe+0x10/0x18
>   bus_probe_device+0x94/0xa0
>   device_add+0x47c/0x780
>   platform_device_add+0x110/0x248
>   platform_device_register_full+0x120/0x150
>   rpi_firmware_probe+0x158/0x1f8
>   platform_probe+0x90/0xd8
>   really_probe+0x108/0x3c0
>   driver_probe_device+0x60/0xc0
>   __device_attach_driver+0x9c/0xd0
>   bus_for_each_drv+0x70/0xc8
>   __device_attach+0xec/0x150
>   device_initial_probe+0x10/0x18
>   bus_probe_device+0x94/0xa0
>   deferred_probe_work_func+0x70/0xa8
>   process_one_work+0x2a8/0x718
>   worker_thread+0x48/0x460
>   kthread+0x134/0x160
>   ret_from_fork+0x10/0x18
> Code: b1006294 540000c0 b140069f 54000088 (3940e280)
> ---[ end trace 7ead5ec2f0c51cfe ]---
> 
> This patch mainly revealed that clk/bcm/clk-raspberrypi.c driver calls 
> devm_of_clk_add_hw_provider(), with a device pointer, which has a NULL 
> dev->of_node. I'm not sure if adding a check for a NULL np in 
> of_clk_add_hw_provider() is a right fix, though.

I believe the right fix is not to call 'devm_of_clk_add_hw_provider()' if
'pdev->dev.of_node == NULL'. In such case, which is RPi3's, only the CPU clock
is used, and it's defined and queried later through
devm_clk_hw_register_clkdev().

@Marek, I don't mind taking care of it if it's OK with you.

Regards,
Nicolas



[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] clk: Mark fwnodes when their clock provider is added
  2021-03-25 18:25             ` Nicolas Saenz Julienne
@ 2021-03-26 18:13               ` Stephen Boyd
  2021-03-26 18:29                 ` Geert Uytterhoeven
  0 siblings, 1 reply; 55+ messages in thread
From: Stephen Boyd @ 2021-03-26 18:13 UTC (permalink / raw)
  To: Marek Szyprowski, Nicolas Saenz Julienne, Tudor Ambarus, corbet,
	frowand.list, gregkh, khilman, len.brown, lenb, maz, mturquette,
	pavel, rafael, robh+dt, saravanak, tglx, ulf.hansson
  Cc: nicolas.ferre, claudiu.beznea, linux-doc, linux-kernel, linux-pm,
	linux-clk, devicetree, linux-acpi, geert, kernel-team,
	linux-rpi-kernel

Quoting Nicolas Saenz Julienne (2021-03-25 11:25:24)
> On Thu, 2021-03-25 at 14:31 +0100, Marek Szyprowski wrote:
> > Hi
> > 
> > On 10.02.2021 12:44, Tudor Ambarus wrote:
> > > This is a follow-up for:
> > > commit 3c9ea42802a1 ("clk: Mark fwnodes when their clock provider is added/removed")
> > > 
> > > The above commit updated the deprecated of_clk_add_provider(),
> > > but missed to update the preferred of_clk_add_hw_provider().
> > > Update it now.
> > > 
> > > Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
> > 
> > This patch, which landed in linux-next as commit 6579c8d97ad7 ("clk: 
> > Mark fwnodes when their clock provider is added") causes the following 
> > NULL pointer dereference on Raspberry Pi 3b+ boards:
> > 
> > --->8---
> > 
> > raspberrypi-firmware soc:firmware: Attached to firmware from 
> > 2020-01-06T13:05:25
> > Unable to handle kernel NULL pointer dereference at virtual address 
> > 0000000000000050
> > Mem abort info:
> >    ESR = 0x96000004
> >    EC = 0x25: DABT (current EL), IL = 32 bits
> >    SET = 0, FnV = 0
> >    EA = 0, S1PTW = 0
> > Data abort info:
> >    ISV = 0, ISS = 0x00000004
> >    CM = 0, WnR = 0
> > [0000000000000050] user address but active_mm is swapper
> > Internal error: Oops: 96000004 [#1] PREEMPT SMP
> > Modules linked in:
> > CPU: 0 PID: 10 Comm: kworker/0:1 Not tainted 5.12.0-rc4+ #2764
> > Hardware name: Raspberry Pi 3 Model B (DT)
> > Workqueue: events deferred_probe_work_func
> > pstate: 00000005 (nzcv daif -PAN -UAO -TCO BTYPE=--)
> > pc : of_clk_add_hw_provider+0xac/0xe8
> > lr : of_clk_add_hw_provider+0x94/0xe8
> > sp : ffff8000130936b0
> > x29: ffff8000130936b0 x28: ffff800012494e04
> > x27: ffff00003b18cb05 x26: ffff00003aa5c010
> > x25: 0000000000000000 x24: 0000000000000000
> > x23: ffff00003aa1e380 x22: ffff8000106830d0
> > x21: ffff80001233f180 x20: 0000000000000018
> > x19: 0000000000000000 x18: ffff8000124d38b0
> > x17: 0000000000000013 x16: 0000000000000014
> > x15: ffff8000125758b0 x14: 00000000000184e0
> > x13: 000000000000292e x12: ffff80001258dd98
> > x11: 0000000000000001 x10: 0101010101010101
> > x9 : ffff80001233f288 x8 : 7f7f7f7f7f7f7f7f
> > x7 : fefefefeff6c626f x6 : 5d636d8080808080
> > x5 : 00000000006d635d x4 : 0000000000000000
> > x3 : 0000000000000000 x2 : 540eb5edae191600
> > x1 : 0000000000000000 x0 : 0000000000000000
> > Call trace:
> >   of_clk_add_hw_provider+0xac/0xe8
> >   devm_of_clk_add_hw_provider+0x5c/0xb8
> >   raspberrypi_clk_probe+0x110/0x210
> >   platform_probe+0x90/0xd8
> >   really_probe+0x108/0x3c0
> >   driver_probe_device+0x60/0xc0
> >   __device_attach_driver+0x9c/0xd0
> >   bus_for_each_drv+0x70/0xc8
> >   __device_attach+0xec/0x150
> >   device_initial_probe+0x10/0x18
> >   bus_probe_device+0x94/0xa0
> >   device_add+0x47c/0x780
> >   platform_device_add+0x110/0x248
> >   platform_device_register_full+0x120/0x150
> >   rpi_firmware_probe+0x158/0x1f8
> >   platform_probe+0x90/0xd8
> >   really_probe+0x108/0x3c0
> >   driver_probe_device+0x60/0xc0
> >   __device_attach_driver+0x9c/0xd0
> >   bus_for_each_drv+0x70/0xc8
> >   __device_attach+0xec/0x150
> >   device_initial_probe+0x10/0x18
> >   bus_probe_device+0x94/0xa0
> >   deferred_probe_work_func+0x70/0xa8
> >   process_one_work+0x2a8/0x718
> >   worker_thread+0x48/0x460
> >   kthread+0x134/0x160
> >   ret_from_fork+0x10/0x18
> > Code: b1006294 540000c0 b140069f 54000088 (3940e280)
> > ---[ end trace 7ead5ec2f0c51cfe ]---
> > 
> > This patch mainly revealed that clk/bcm/clk-raspberrypi.c driver calls 
> > devm_of_clk_add_hw_provider(), with a device pointer, which has a NULL 
> > dev->of_node. I'm not sure if adding a check for a NULL np in 
> > of_clk_add_hw_provider() is a right fix, though.
> 
> I believe the right fix is not to call 'devm_of_clk_add_hw_provider()' if
> 'pdev->dev.of_node == NULL'. In such case, which is RPi3's, only the CPU clock
> is used, and it's defined and queried later through
> devm_clk_hw_register_clkdev().
> 
> @Marek, I don't mind taking care of it if it's OK with you.
> 

Ah I see this is related to the patch I just reviewed. Can you reference
this in the commit text? And instead of putting the change into the clk
provider let's check for NULL 'np' in of_clk_add_hw_provider() instead
and return 0 if there's nothing to do. That way we don't visit this
problem over and over again.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] clk: Mark fwnodes when their clock provider is added
  2021-03-26 18:13               ` Stephen Boyd
@ 2021-03-26 18:29                 ` Geert Uytterhoeven
  0 siblings, 0 replies; 55+ messages in thread
From: Geert Uytterhoeven @ 2021-03-26 18:29 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Marek Szyprowski, Nicolas Saenz Julienne, Tudor Ambarus,
	Jonathan Corbet, Frank Rowand, Greg KH, Kevin Hilman, Len Brown,
	Len Brown, Marc Zyngier, Michael Turquette, Pavel Machek,
	Rafael J. Wysocki, Rob Herring, Saravana Kannan, Thomas Gleixner,
	Ulf Hansson, Nicolas Ferre, Claudiu Beznea,
	open list:DOCUMENTATION, Linux Kernel Mailing List,
	Linux PM list, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Android Kernel Team, linux-rpi-kernel

Hi Stephen,

On Fri, Mar 26, 2021 at 7:13 PM Stephen Boyd <sboyd@kernel.org> wrote:
> Quoting Nicolas Saenz Julienne (2021-03-25 11:25:24)
> > On Thu, 2021-03-25 at 14:31 +0100, Marek Szyprowski wrote:
> > > On 10.02.2021 12:44, Tudor Ambarus wrote:
> > > > This is a follow-up for:
> > > > commit 3c9ea42802a1 ("clk: Mark fwnodes when their clock provider is added/removed")
> > > >
> > > > The above commit updated the deprecated of_clk_add_provider(),
> > > > but missed to update the preferred of_clk_add_hw_provider().
> > > > Update it now.
> > > >
> > > > Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
> > >
> > > This patch, which landed in linux-next as commit 6579c8d97ad7 ("clk:
> > > Mark fwnodes when their clock provider is added") causes the following
> > > NULL pointer dereference on Raspberry Pi 3b+ boards:
> > >
> > > --->8---
> > >
> > > raspberrypi-firmware soc:firmware: Attached to firmware from
> > > 2020-01-06T13:05:25
> > > Unable to handle kernel NULL pointer dereference at virtual address
> > > 0000000000000050
> > > Mem abort info:
> > >    ESR = 0x96000004
> > >    EC = 0x25: DABT (current EL), IL = 32 bits
> > >    SET = 0, FnV = 0
> > >    EA = 0, S1PTW = 0
> > > Data abort info:
> > >    ISV = 0, ISS = 0x00000004
> > >    CM = 0, WnR = 0
> > > [0000000000000050] user address but active_mm is swapper
> > > Internal error: Oops: 96000004 [#1] PREEMPT SMP
> > > Modules linked in:
> > > CPU: 0 PID: 10 Comm: kworker/0:1 Not tainted 5.12.0-rc4+ #2764
> > > Hardware name: Raspberry Pi 3 Model B (DT)
> > > Workqueue: events deferred_probe_work_func
> > > pstate: 00000005 (nzcv daif -PAN -UAO -TCO BTYPE=--)
> > > pc : of_clk_add_hw_provider+0xac/0xe8
> > > lr : of_clk_add_hw_provider+0x94/0xe8
> > > sp : ffff8000130936b0
> > > x29: ffff8000130936b0 x28: ffff800012494e04
> > > x27: ffff00003b18cb05 x26: ffff00003aa5c010
> > > x25: 0000000000000000 x24: 0000000000000000
> > > x23: ffff00003aa1e380 x22: ffff8000106830d0
> > > x21: ffff80001233f180 x20: 0000000000000018
> > > x19: 0000000000000000 x18: ffff8000124d38b0
> > > x17: 0000000000000013 x16: 0000000000000014
> > > x15: ffff8000125758b0 x14: 00000000000184e0
> > > x13: 000000000000292e x12: ffff80001258dd98
> > > x11: 0000000000000001 x10: 0101010101010101
> > > x9 : ffff80001233f288 x8 : 7f7f7f7f7f7f7f7f
> > > x7 : fefefefeff6c626f x6 : 5d636d8080808080
> > > x5 : 00000000006d635d x4 : 0000000000000000
> > > x3 : 0000000000000000 x2 : 540eb5edae191600
> > > x1 : 0000000000000000 x0 : 0000000000000000
> > > Call trace:
> > >   of_clk_add_hw_provider+0xac/0xe8
> > >   devm_of_clk_add_hw_provider+0x5c/0xb8
> > >   raspberrypi_clk_probe+0x110/0x210
> > >   platform_probe+0x90/0xd8
> > >   really_probe+0x108/0x3c0
> > >   driver_probe_device+0x60/0xc0
> > >   __device_attach_driver+0x9c/0xd0
> > >   bus_for_each_drv+0x70/0xc8
> > >   __device_attach+0xec/0x150
> > >   device_initial_probe+0x10/0x18
> > >   bus_probe_device+0x94/0xa0
> > >   device_add+0x47c/0x780
> > >   platform_device_add+0x110/0x248
> > >   platform_device_register_full+0x120/0x150
> > >   rpi_firmware_probe+0x158/0x1f8
> > >   platform_probe+0x90/0xd8
> > >   really_probe+0x108/0x3c0
> > >   driver_probe_device+0x60/0xc0
> > >   __device_attach_driver+0x9c/0xd0
> > >   bus_for_each_drv+0x70/0xc8
> > >   __device_attach+0xec/0x150
> > >   device_initial_probe+0x10/0x18
> > >   bus_probe_device+0x94/0xa0
> > >   deferred_probe_work_func+0x70/0xa8
> > >   process_one_work+0x2a8/0x718
> > >   worker_thread+0x48/0x460
> > >   kthread+0x134/0x160
> > >   ret_from_fork+0x10/0x18
> > > Code: b1006294 540000c0 b140069f 54000088 (3940e280)
> > > ---[ end trace 7ead5ec2f0c51cfe ]---
> > >
> > > This patch mainly revealed that clk/bcm/clk-raspberrypi.c driver calls
> > > devm_of_clk_add_hw_provider(), with a device pointer, which has a NULL
> > > dev->of_node. I'm not sure if adding a check for a NULL np in
> > > of_clk_add_hw_provider() is a right fix, though.
> >
> > I believe the right fix is not to call 'devm_of_clk_add_hw_provider()' if
> > 'pdev->dev.of_node == NULL'. In such case, which is RPi3's, only the CPU clock
> > is used, and it's defined and queried later through
> > devm_clk_hw_register_clkdev().
> >
> > @Marek, I don't mind taking care of it if it's OK with you.
> >
>
> Ah I see this is related to the patch I just reviewed. Can you reference
> this in the commit text? And instead of putting the change into the clk
> provider let's check for NULL 'np' in of_clk_add_hw_provider() instead
> and return 0 if there's nothing to do. That way we don't visit this
> problem over and over again.

I'm not sure the latter is what we reall want: shouldn't calling
*of*_clk_add_hw_provider() with a NULL np be a bug in the provider?

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] clk: Mark fwnodes when their clock provider is added
  2021-02-10 11:44     ` [PATCH] clk: Mark fwnodes when their clock provider is added Tudor Ambarus
  2021-02-10 11:44       ` Tudor Ambarus
  2021-02-10 19:13       ` Saravana Kannan
@ 2021-03-30 15:42       ` Guenter Roeck
  2021-03-30 16:26         ` Saravana Kannan
  2 siblings, 1 reply; 55+ messages in thread
From: Guenter Roeck @ 2021-03-30 15:42 UTC (permalink / raw)
  To: Tudor Ambarus
  Cc: corbet, gregkh, rafael, khilman, ulf.hansson, len.brown, lenb,
	pavel, mturquette, sboyd, robh+dt, frowand.list, maz, tglx,
	saravanak, nicolas.ferre, claudiu.beznea, linux-doc,
	linux-kernel, linux-pm, linux-clk, devicetree, linux-acpi,
	m.szyprowski, geert, kernel-team

On Wed, Feb 10, 2021 at 01:44:34PM +0200, Tudor Ambarus wrote:
> This is a follow-up for:
> commit 3c9ea42802a1 ("clk: Mark fwnodes when their clock provider is added/removed")
> 
> The above commit updated the deprecated of_clk_add_provider(),
> but missed to update the preferred of_clk_add_hw_provider().
> Update it now.
> 
> Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
> Reviewed-by: Saravana Kannan <saravanak@google.com>
> ---
>  drivers/clk/clk.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> index 27ff90eacb1f..9370e4dfecae 100644
> --- a/drivers/clk/clk.c
> +++ b/drivers/clk/clk.c
> @@ -4594,6 +4594,8 @@ int of_clk_add_hw_provider(struct device_node *np,
>  	if (ret < 0)
>  		of_clk_del_provider(np);
>  
> +	fwnode_dev_initialized(&np->fwnode, true);
> +

This causes a crash when booting raspi2 images in qemu.

[   22.123618] Unable to handle kernel NULL pointer dereference at virtual address 00000028
[   22.123830] pgd = (ptrval)
[   22.123992] [00000028] *pgd=00000000
[   22.124579] Internal error: Oops: 5 [#1] SMP ARM
...
[   22.141624] [<c0a2f4d8>] (of_clk_add_hw_provider) from [<c0a2f54c>] (devm_of_clk_add_hw_provider+0x48/0x80)
[   22.141819] [<c0a2f54c>] (devm_of_clk_add_hw_provider) from [<c0a43ad8>] (raspberrypi_clk_probe+0x25c/0x384)
[   22.141976] [<c0a43ad8>] (raspberrypi_clk_probe) from [<c0c18da0>] (platform_probe+0x5c/0xb8)
[   22.142114] [<c0c18da0>] (platform_probe) from [<c0c16654>] (really_probe+0xf0/0x39c)
[   22.142246] [<c0c16654>] (really_probe) from [<c0c16968>] (driver_probe_device+0x68/0xc0)
[   22.142377] [<c0c16968>] (driver_probe_device) from [<c0c14834>] (bus_for_each_drv+0x84/0xc8)...

np can (and will) be NULL here. See of_clk_set_defaults().

Guenter

---
Bisect log:

# bad: [9d49ed9ca93b8c564033c1d6808017bc9052b5db] Add linux-next specific files for 20210329
# good: [0d02ec6b3136c73c09e7859f0d0e4e2c4c07b49b] Linux 5.12-rc4
git bisect start 'HEAD' 'v5.12-rc4'
# good: [4b15c5b50e5ec7f2d80275d5a150807b7b9a0a0c] Merge remote-tracking branch 'crypto/master'
git bisect good 4b15c5b50e5ec7f2d80275d5a150807b7b9a0a0c
# good: [c92ad2dfc1a3bc31afc3d41af82c38accf2c3783] Merge remote-tracking branch 'ftrace/for-next'
git bisect good c92ad2dfc1a3bc31afc3d41af82c38accf2c3783
# bad: [103c8000962c33e274244a02a0383d20fab734c9] Merge remote-tracking branch 'staging/staging-next'
git bisect bad 103c8000962c33e274244a02a0383d20fab734c9
# bad: [05ee9b17ef7af7d47395caa48db55d0e2dae64a2] Merge remote-tracking branch 'tty/tty-next'
git bisect bad 05ee9b17ef7af7d47395caa48db55d0e2dae64a2
# good: [bd91c862df59c0290c8a2e2ee6467c0ed5fb6432] Merge remote-tracking branch 'workqueues/for-next'
git bisect good bd91c862df59c0290c8a2e2ee6467c0ed5fb6432
# good: [24327c478b2fc17a01b21a4721f35f22a51fe12b] usb: mtu3: drop CONFIG_OF
git bisect good 24327c478b2fc17a01b21a4721f35f22a51fe12b
# bad: [04c49226ac79b3e5bd0f89da69a64ba2e0cccb91] Merge remote-tracking branch 'usb-serial/usb-next'
git bisect bad 04c49226ac79b3e5bd0f89da69a64ba2e0cccb91
# good: [21a292d73a6e69f31c57d36288482c40b355daf0] Merge remote-tracking branch 'drivers-x86/for-next'
git bisect good 21a292d73a6e69f31c57d36288482c40b355daf0
# bad: [b6688015151857ed3f61fa2344c4b220bc9dc4d7] regulator: qcom_spmi-regulator: Clean-up by using managed work init
git bisect bad b6688015151857ed3f61fa2344c4b220bc9dc4d7
# good: [ea718c699055c8566eb64432388a04974c43b2ea] Revert "Revert "driver core: Set fw_devlink=on by default""
git bisect good ea718c699055c8566eb64432388a04974c43b2ea
# bad: [0341ce5443949588e93581b49b934cdde2befbf8] workqueue: Add resource managed version of delayed work init
git bisect bad 0341ce5443949588e93581b49b934cdde2befbf8
# bad: [0b8bf06f67191e6a3184802a690d3f521c6d7e78] device property: Sync descriptions of swnode array and group APIs
git bisect bad 0b8bf06f67191e6a3184802a690d3f521c6d7e78
# bad: [6579c8d97ad7fc5671ee60234f3b8388abee5f77] clk: Mark fwnodes when their clock provider is added
git bisect bad 6579c8d97ad7fc5671ee60234f3b8388abee5f77
# first bad commit: [6579c8d97ad7fc5671ee60234f3b8388abee5f77] clk: Mark fwnodes when their clock provider is added

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] clk: Mark fwnodes when their clock provider is added
  2021-03-30 15:42       ` Guenter Roeck
@ 2021-03-30 16:26         ` Saravana Kannan
  0 siblings, 0 replies; 55+ messages in thread
From: Saravana Kannan @ 2021-03-30 16:26 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Tudor Ambarus, Jonathan Corbet, Greg Kroah-Hartman,
	Rafael J. Wysocki, Kevin Hilman, Ulf Hansson, Brown, Len,
	Len Brown, Pavel Machek, Michael Turquette, Stephen Boyd,
	Rob Herring, Frank Rowand, Marc Zyngier, Thomas Gleixner,
	Nicolas Ferre, Claudiu Beznea, Linux Doc Mailing List, LKML,
	Linux PM, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Geert Uytterhoeven,
	Android Kernel Team

On Tue, Mar 30, 2021 at 8:42 AM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On Wed, Feb 10, 2021 at 01:44:34PM +0200, Tudor Ambarus wrote:
> > This is a follow-up for:
> > commit 3c9ea42802a1 ("clk: Mark fwnodes when their clock provider is added/removed")
> >
> > The above commit updated the deprecated of_clk_add_provider(),
> > but missed to update the preferred of_clk_add_hw_provider().
> > Update it now.
> >
> > Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
> > Reviewed-by: Saravana Kannan <saravanak@google.com>
> > ---
> >  drivers/clk/clk.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> > index 27ff90eacb1f..9370e4dfecae 100644
> > --- a/drivers/clk/clk.c
> > +++ b/drivers/clk/clk.c
> > @@ -4594,6 +4594,8 @@ int of_clk_add_hw_provider(struct device_node *np,
> >       if (ret < 0)
> >               of_clk_del_provider(np);
> >
> > +     fwnode_dev_initialized(&np->fwnode, true);
> > +
>
> This causes a crash when booting raspi2 images in qemu.
>
> [   22.123618] Unable to handle kernel NULL pointer dereference at virtual address 00000028
> [   22.123830] pgd = (ptrval)
> [   22.123992] [00000028] *pgd=00000000
> [   22.124579] Internal error: Oops: 5 [#1] SMP ARM
> ...
> [   22.141624] [<c0a2f4d8>] (of_clk_add_hw_provider) from [<c0a2f54c>] (devm_of_clk_add_hw_provider+0x48/0x80)
> [   22.141819] [<c0a2f54c>] (devm_of_clk_add_hw_provider) from [<c0a43ad8>] (raspberrypi_clk_probe+0x25c/0x384)
> [   22.141976] [<c0a43ad8>] (raspberrypi_clk_probe) from [<c0c18da0>] (platform_probe+0x5c/0xb8)
> [   22.142114] [<c0c18da0>] (platform_probe) from [<c0c16654>] (really_probe+0xf0/0x39c)
> [   22.142246] [<c0c16654>] (really_probe) from [<c0c16968>] (driver_probe_device+0x68/0xc0)
> [   22.142377] [<c0c16968>] (driver_probe_device) from [<c0c14834>] (bus_for_each_drv+0x84/0xc8)...
>
> np can (and will) be NULL here. See of_clk_set_defaults().

Thanks for the report. It was reported earlier by Marek and there's a
discussion going on about it in the thread.

-Saravana

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] clk: Mark fwnodes when their clock provider is added
  2021-02-10 11:44       ` Tudor Ambarus
  2021-02-11 13:00         ` Greg KH
       [not found]         ` <CGME20210325133159eucas1p297b769beb681743fb32d362a86cc6e3e@eucas1p2.samsung.com>
@ 2021-04-21  3:26         ` Guenter Roeck
  2021-04-21  7:00           ` Saravana Kannan
  2 siblings, 1 reply; 55+ messages in thread
From: Guenter Roeck @ 2021-04-21  3:26 UTC (permalink / raw)
  To: Tudor Ambarus
  Cc: corbet, gregkh, rafael, khilman, ulf.hansson, len.brown, lenb,
	pavel, mturquette, sboyd, robh+dt, frowand.list, maz, tglx,
	saravanak, nicolas.ferre, claudiu.beznea, linux-doc,
	linux-kernel, linux-pm, linux-clk, devicetree, linux-acpi,
	m.szyprowski, geert, kernel-team

Hi,

On Wed, Feb 10, 2021 at 01:44:35PM +0200, Tudor Ambarus wrote:
> This is a follow-up for:
> commit 3c9ea42802a1 ("clk: Mark fwnodes when their clock provider is added/removed")
> 
> The above commit updated the deprecated of_clk_add_provider(),
> but missed to update the preferred of_clk_add_hw_provider().
> Update it now.
> 
> Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>

This patch still causes a crash when booting a raspi2 image in linux-next.

[   21.456500] Unable to handle kernel NULL pointer dereference at virtual address 00000028
[   21.456750] pgd = (ptrval)
[   21.456927] [00000028] *pgd=00000000
[   21.457567] Internal error: Oops: 5 [#1] SMP ARM
[   21.457882] Modules linked in:
[   21.458077] CPU: 0 PID: 77 Comm: kworker/u8:10 Not tainted 5.12.0-rc8-next-20210420 #1
[   21.458291] Hardware name: BCM2835
[   21.458525] Workqueue: events_unbound deferred_probe_work_func
[   21.458997] PC is at of_clk_add_hw_provider+0xbc/0xe8
[   21.459176] LR is at of_clk_add_hw_provider+0xa8/0xe8
...
[   21.477603] [<c0a32aec>] (of_clk_add_hw_provider) from [<c0a32b60>] (devm_of_clk_add_hw_provider+0x48/0x80)
[   21.477861] [<c0a32b60>] (devm_of_clk_add_hw_provider) from [<c0a471e4>] (raspberrypi_clk_probe+0x260/0x388)
[   21.478087] [<c0a471e4>] (raspberrypi_clk_probe) from [<c0c1c4d0>] (platform_probe+0x5c/0xb8)
[   21.478287] [<c0c1c4d0>] (platform_probe) from [<c0c19d84>] (really_probe+0xf0/0x39c)
[   21.478471] [<c0c19d84>] (really_probe) from [<c0c1a098>] (driver_probe_device+0x68/0xc0)
[   21.478659] [<c0c1a098>] (driver_probe_device) from [<c0c17f54>] (bus_for_each_drv+0x84/0xc8)
[   21.478860] [<c0c17f54>] (bus_for_each_drv) from [<c0c19c20>] (__device_attach+0xec/0x158)
[   21.479050] [<c0c19c20>] (__device_attach) from [<c0c18de8>] (bus_probe_device+0x88/0x90)
[   21.479236] [<c0c18de8>] (bus_probe_device) from [<c0c16a68>] (device_add+0x398/0x8ac)
[   21.479421] [<c0c16a68>] (device_add) from [<c0c1c1b4>] (platform_device_add+0xf0/0x200)
[   21.479607] [<c0c1c1b4>] (platform_device_add) from [<c0c1ccc0>] (platform_device_register_full+0xd0/0x110)
[   21.479836] [<c0c1ccc0>] (platform_device_register_full) from [<c104c130>] (rpi_firmware_probe+0x1a4/0x20c)
[   21.480061] [<c104c130>] (rpi_firmware_probe) from [<c0c1c4d0>] (platform_probe+0x5c/0xb8)
[   21.480255] [<c0c1c4d0>] (platform_probe) from [<c0c19d84>] (really_probe+0xf0/0x39c)
[   21.480437] [<c0c19d84>] (really_probe) from [<c0c1a098>] (driver_probe_device+0x68/0xc0)
[   21.480626] [<c0c1a098>] (driver_probe_device) from [<c0c17f54>] (bus_for_each_drv+0x84/0xc8)
[   21.480829] [<c0c17f54>] (bus_for_each_drv) from [<c0c19c20>] (__device_attach+0xec/0x158)
[   21.481018] [<c0c19c20>] (__device_attach) from [<c0c18de8>] (bus_probe_device+0x88/0x90)
[   21.481205] [<c0c18de8>] (bus_probe_device) from [<c0c192bc>] (deferred_probe_work_func+0x8c/0xbc)
[   21.481413] [<c0c192bc>] (deferred_probe_work_func) from [<c036802c>] (process_one_work+0x268/0x798)
[   21.481624] [<c036802c>] (process_one_work) from [<c0368774>] (worker_thread+0x218/0x4f4)
[   21.481822] [<c0368774>] (worker_thread) from [<c0370f28>] (kthread+0x140/0x174)
[   21.481999] [<c0370f28>] (kthread) from [<c030017c>] (ret_from_fork+0x14/0x38)
[   21.482185] Exception stack(0xc42b7fb0 to 0xc42b7ff8)

Updated bisect log is attached.

Guenter

---
# bad: [50b8b1d699ac313c0a07a3c185ffb23aecab8abb] Add linux-next specific files for 20210419
# good: [bf05bf16c76bb44ab5156223e1e58e26dfe30a88] Linux 5.12-rc8
git bisect start 'HEAD' 'v5.12-rc8'
# good: [c4bb91fc07e59241cde97f913d7a2fbedc248f0d] Merge remote-tracking branch 'crypto/master'
git bisect good c4bb91fc07e59241cde97f913d7a2fbedc248f0d
# good: [f15bbf170b40b48a43ed7076ce9f8ac9380e5752] Merge remote-tracking branch 'edac/edac-for-next'
git bisect good f15bbf170b40b48a43ed7076ce9f8ac9380e5752
# bad: [550a78090dcc4061e191312a757a127f0b6e6323] Merge remote-tracking branch 'vfio/next'
git bisect bad 550a78090dcc4061e191312a757a127f0b6e6323
# bad: [9f074d2a7bf49b2c9e1609703757b18de7611aef] Merge remote-tracking branch 'usb/usb-next'
git bisect bad 9f074d2a7bf49b2c9e1609703757b18de7611aef
# good: [855b2fdb7c543c94e7623e6ad0b492f04a5317db] Merge remote-tracking branch 'percpu/for-next'
git bisect good 855b2fdb7c543c94e7623e6ad0b492f04a5317db
# good: [1d08ed588c6a85a35a24c82eb4cf0807ec2b366a] usbip: vudc: fix missing unlock on error in usbip_sockfd_store()
git bisect good 1d08ed588c6a85a35a24c82eb4cf0807ec2b366a
# good: [1b7ce8fab5fd0c406dbf165b12d44b301decf589] Merge remote-tracking branch 'ipmi/for-next'
git bisect good 1b7ce8fab5fd0c406dbf165b12d44b301decf589
# good: [fe8e488058c47e9a8a2c85321f7198a0a17b0131] dt-bindings: usb: mtk-xhci: add wakeup interrupt
git bisect good fe8e488058c47e9a8a2c85321f7198a0a17b0131
# bad: [3c652132ce9052e626bf509932fcacfebed1ccb4] platform-msi: fix kernel-doc warnings
git bisect bad 3c652132ce9052e626bf509932fcacfebed1ccb4
# bad: [7f2fac70b729d68a34e5eba8d1fb68eb69b05169] device property: Add test cases for fwnode_property_count_*() APIs
git bisect bad 7f2fac70b729d68a34e5eba8d1fb68eb69b05169
# good: [38f087de8947700d3b06d3d1594490e0f611c5d1] devtmpfs: fix placement of complete() call
git bisect good 38f087de8947700d3b06d3d1594490e0f611c5d1
# good: [b6f617df4fa936c1ab1831c2b23563f6c1add6c4] driver core: Update device link status properly for device_bind_driver()
git bisect good b6f617df4fa936c1ab1831c2b23563f6c1add6c4
# bad: [6579c8d97ad7fc5671ee60234f3b8388abee5f77] clk: Mark fwnodes when their clock provider is added
git bisect bad 6579c8d97ad7fc5671ee60234f3b8388abee5f77
# good: [ea718c699055c8566eb64432388a04974c43b2ea] Revert "Revert "driver core: Set fw_devlink=on by default""
git bisect good ea718c699055c8566eb64432388a04974c43b2ea
# first bad commit: [6579c8d97ad7fc5671ee60234f3b8388abee5f77] clk: Mark fwnodes when their clock provider is added

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] clk: Mark fwnodes when their clock provider is added
  2021-04-21  3:26         ` Guenter Roeck
@ 2021-04-21  7:00           ` Saravana Kannan
  0 siblings, 0 replies; 55+ messages in thread
From: Saravana Kannan @ 2021-04-21  7:00 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Tudor Ambarus, Jonathan Corbet, Greg Kroah-Hartman,
	Rafael J. Wysocki, Kevin Hilman, Ulf Hansson, Brown, Len,
	Len Brown, Pavel Machek, Michael Turquette, Stephen Boyd,
	Rob Herring, Frank Rowand, Marc Zyngier, Thomas Gleixner,
	Nicolas Ferre, Claudiu Beznea, Linux Doc Mailing List, LKML,
	Linux PM, linux-clk,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Marek Szyprowski, Geert Uytterhoeven,
	Android Kernel Team

On Tue, Apr 20, 2021 at 8:27 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> Hi,
>
> On Wed, Feb 10, 2021 at 01:44:35PM +0200, Tudor Ambarus wrote:
> > This is a follow-up for:
> > commit 3c9ea42802a1 ("clk: Mark fwnodes when their clock provider is added/removed")
> >
> > The above commit updated the deprecated of_clk_add_provider(),
> > but missed to update the preferred of_clk_add_hw_provider().
> > Update it now.
> >
> > Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
>
> This patch still causes a crash when booting a raspi2 image in linux-next.

Stephen,

Can we please just pick any one of the proposed fixes? This bug has
been unfixed for so long!

-Saravana

>
> [   21.456500] Unable to handle kernel NULL pointer dereference at virtual address 00000028
> [   21.456750] pgd = (ptrval)
> [   21.456927] [00000028] *pgd=00000000
> [   21.457567] Internal error: Oops: 5 [#1] SMP ARM
> [   21.457882] Modules linked in:
> [   21.458077] CPU: 0 PID: 77 Comm: kworker/u8:10 Not tainted 5.12.0-rc8-next-20210420 #1
> [   21.458291] Hardware name: BCM2835
> [   21.458525] Workqueue: events_unbound deferred_probe_work_func
> [   21.458997] PC is at of_clk_add_hw_provider+0xbc/0xe8
> [   21.459176] LR is at of_clk_add_hw_provider+0xa8/0xe8
> ...
> [   21.477603] [<c0a32aec>] (of_clk_add_hw_provider) from [<c0a32b60>] (devm_of_clk_add_hw_provider+0x48/0x80)
> [   21.477861] [<c0a32b60>] (devm_of_clk_add_hw_provider) from [<c0a471e4>] (raspberrypi_clk_probe+0x260/0x388)
> [   21.478087] [<c0a471e4>] (raspberrypi_clk_probe) from [<c0c1c4d0>] (platform_probe+0x5c/0xb8)
> [   21.478287] [<c0c1c4d0>] (platform_probe) from [<c0c19d84>] (really_probe+0xf0/0x39c)
> [   21.478471] [<c0c19d84>] (really_probe) from [<c0c1a098>] (driver_probe_device+0x68/0xc0)
> [   21.478659] [<c0c1a098>] (driver_probe_device) from [<c0c17f54>] (bus_for_each_drv+0x84/0xc8)
> [   21.478860] [<c0c17f54>] (bus_for_each_drv) from [<c0c19c20>] (__device_attach+0xec/0x158)
> [   21.479050] [<c0c19c20>] (__device_attach) from [<c0c18de8>] (bus_probe_device+0x88/0x90)
> [   21.479236] [<c0c18de8>] (bus_probe_device) from [<c0c16a68>] (device_add+0x398/0x8ac)
> [   21.479421] [<c0c16a68>] (device_add) from [<c0c1c1b4>] (platform_device_add+0xf0/0x200)
> [   21.479607] [<c0c1c1b4>] (platform_device_add) from [<c0c1ccc0>] (platform_device_register_full+0xd0/0x110)
> [   21.479836] [<c0c1ccc0>] (platform_device_register_full) from [<c104c130>] (rpi_firmware_probe+0x1a4/0x20c)
> [   21.480061] [<c104c130>] (rpi_firmware_probe) from [<c0c1c4d0>] (platform_probe+0x5c/0xb8)
> [   21.480255] [<c0c1c4d0>] (platform_probe) from [<c0c19d84>] (really_probe+0xf0/0x39c)
> [   21.480437] [<c0c19d84>] (really_probe) from [<c0c1a098>] (driver_probe_device+0x68/0xc0)
> [   21.480626] [<c0c1a098>] (driver_probe_device) from [<c0c17f54>] (bus_for_each_drv+0x84/0xc8)
> [   21.480829] [<c0c17f54>] (bus_for_each_drv) from [<c0c19c20>] (__device_attach+0xec/0x158)
> [   21.481018] [<c0c19c20>] (__device_attach) from [<c0c18de8>] (bus_probe_device+0x88/0x90)
> [   21.481205] [<c0c18de8>] (bus_probe_device) from [<c0c192bc>] (deferred_probe_work_func+0x8c/0xbc)
> [   21.481413] [<c0c192bc>] (deferred_probe_work_func) from [<c036802c>] (process_one_work+0x268/0x798)
> [   21.481624] [<c036802c>] (process_one_work) from [<c0368774>] (worker_thread+0x218/0x4f4)
> [   21.481822] [<c0368774>] (worker_thread) from [<c0370f28>] (kthread+0x140/0x174)
> [   21.481999] [<c0370f28>] (kthread) from [<c030017c>] (ret_from_fork+0x14/0x38)
> [   21.482185] Exception stack(0xc42b7fb0 to 0xc42b7ff8)
>
> Updated bisect log is attached.
>
> Guenter
>
> ---
> # bad: [50b8b1d699ac313c0a07a3c185ffb23aecab8abb] Add linux-next specific files for 20210419
> # good: [bf05bf16c76bb44ab5156223e1e58e26dfe30a88] Linux 5.12-rc8
> git bisect start 'HEAD' 'v5.12-rc8'
> # good: [c4bb91fc07e59241cde97f913d7a2fbedc248f0d] Merge remote-tracking branch 'crypto/master'
> git bisect good c4bb91fc07e59241cde97f913d7a2fbedc248f0d
> # good: [f15bbf170b40b48a43ed7076ce9f8ac9380e5752] Merge remote-tracking branch 'edac/edac-for-next'
> git bisect good f15bbf170b40b48a43ed7076ce9f8ac9380e5752
> # bad: [550a78090dcc4061e191312a757a127f0b6e6323] Merge remote-tracking branch 'vfio/next'
> git bisect bad 550a78090dcc4061e191312a757a127f0b6e6323
> # bad: [9f074d2a7bf49b2c9e1609703757b18de7611aef] Merge remote-tracking branch 'usb/usb-next'
> git bisect bad 9f074d2a7bf49b2c9e1609703757b18de7611aef
> # good: [855b2fdb7c543c94e7623e6ad0b492f04a5317db] Merge remote-tracking branch 'percpu/for-next'
> git bisect good 855b2fdb7c543c94e7623e6ad0b492f04a5317db
> # good: [1d08ed588c6a85a35a24c82eb4cf0807ec2b366a] usbip: vudc: fix missing unlock on error in usbip_sockfd_store()
> git bisect good 1d08ed588c6a85a35a24c82eb4cf0807ec2b366a
> # good: [1b7ce8fab5fd0c406dbf165b12d44b301decf589] Merge remote-tracking branch 'ipmi/for-next'
> git bisect good 1b7ce8fab5fd0c406dbf165b12d44b301decf589
> # good: [fe8e488058c47e9a8a2c85321f7198a0a17b0131] dt-bindings: usb: mtk-xhci: add wakeup interrupt
> git bisect good fe8e488058c47e9a8a2c85321f7198a0a17b0131
> # bad: [3c652132ce9052e626bf509932fcacfebed1ccb4] platform-msi: fix kernel-doc warnings
> git bisect bad 3c652132ce9052e626bf509932fcacfebed1ccb4
> # bad: [7f2fac70b729d68a34e5eba8d1fb68eb69b05169] device property: Add test cases for fwnode_property_count_*() APIs
> git bisect bad 7f2fac70b729d68a34e5eba8d1fb68eb69b05169
> # good: [38f087de8947700d3b06d3d1594490e0f611c5d1] devtmpfs: fix placement of complete() call
> git bisect good 38f087de8947700d3b06d3d1594490e0f611c5d1
> # good: [b6f617df4fa936c1ab1831c2b23563f6c1add6c4] driver core: Update device link status properly for device_bind_driver()
> git bisect good b6f617df4fa936c1ab1831c2b23563f6c1add6c4
> # bad: [6579c8d97ad7fc5671ee60234f3b8388abee5f77] clk: Mark fwnodes when their clock provider is added
> git bisect bad 6579c8d97ad7fc5671ee60234f3b8388abee5f77
> # good: [ea718c699055c8566eb64432388a04974c43b2ea] Revert "Revert "driver core: Set fw_devlink=on by default""
> git bisect good ea718c699055c8566eb64432388a04974c43b2ea
> # first bad commit: [6579c8d97ad7fc5671ee60234f3b8388abee5f77] clk: Mark fwnodes when their clock provider is added

^ permalink raw reply	[flat|nested] 55+ messages in thread

end of thread, back to index

Thread overview: 55+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CGME20210205222651eucas1p28ef87073dea33c1c5224c14aa203bec5@eucas1p2.samsung.com>
2021-02-05 22:26 ` [PATCH v4 0/8] Make fw_devlink=on more forgiving Saravana Kannan
2021-02-05 22:26   ` [PATCH v4 1/8] driver core: fw_devlink: Detect supplier devices that will never be added Saravana Kannan
2021-02-05 22:26   ` [PATCH v4 2/8] of: property: Don't add links to absent suppliers Saravana Kannan
2021-02-09 21:25     ` Rob Herring
2021-02-05 22:26   ` [PATCH v4 3/8] driver core: Add fw_devlink.strict kernel param Saravana Kannan
2021-02-05 22:26   ` [PATCH v4 4/8] of: property: Add fw_devlink support for optional properties Saravana Kannan
2021-02-09 21:33     ` Rob Herring
2021-02-09 21:54       ` Saravana Kannan
2021-02-10  8:25         ` Geert Uytterhoeven
2021-02-05 22:26   ` [PATCH v4 5/8] driver core: fw_devlink: Handle suppliers that don't use driver core Saravana Kannan
2021-02-05 22:26   ` [PATCH v4 6/8] irqdomain: Mark fwnodes when their irqdomain is added/removed Saravana Kannan
2021-02-05 22:26   ` [PATCH v4 7/8] PM: domains: Mark fwnodes when their powerdomain " Saravana Kannan
2021-02-05 22:26   ` [PATCH v4 8/8] clk: Mark fwnodes when their clock provider " Saravana Kannan
2021-02-08 15:38     ` Rob Herring
2021-02-08 23:34       ` Saravana Kannan
2021-02-10 11:44     ` [PATCH] clk: Mark fwnodes when their clock provider is added Tudor Ambarus
2021-02-10 11:44       ` Tudor Ambarus
2021-02-11 13:00         ` Greg KH
2021-02-13  0:37           ` Stephen Boyd
     [not found]         ` <CGME20210325133159eucas1p297b769beb681743fb32d362a86cc6e3e@eucas1p2.samsung.com>
2021-03-25 13:31           ` Marek Szyprowski
2021-03-25 15:47             ` Geert Uytterhoeven
2021-03-25 18:25             ` Nicolas Saenz Julienne
2021-03-26 18:13               ` Stephen Boyd
2021-03-26 18:29                 ` Geert Uytterhoeven
2021-04-21  3:26         ` Guenter Roeck
2021-04-21  7:00           ` Saravana Kannan
2021-02-10 19:13       ` Saravana Kannan
2021-03-30 15:42       ` Guenter Roeck
2021-03-30 16:26         ` Saravana Kannan
     [not found]     ` <161317679292.1254594.15797939257637374295@swboyd.mtv.corp.google.com>
2021-02-14 21:15       ` [PATCH v4 8/8] clk: Mark fwnodes when their clock provider is added/removed Saravana Kannan
2021-02-06  2:45   ` [PATCH v4 0/8] Make fw_devlink=on more forgiving Saravana Kannan
2021-02-06 19:41   ` Geert Uytterhoeven
2021-02-06 20:47     ` Saravana Kannan
2021-02-08  8:40   ` Marek Szyprowski
2021-02-08 23:57     ` Saravana Kannan
2021-02-10  8:19   ` Tudor.Ambarus
2021-02-10  8:54     ` Saravana Kannan
2021-02-10 10:02       ` Tudor.Ambarus
2021-02-10 19:14         ` Saravana Kannan
2021-02-11 13:00   ` Geert Uytterhoeven
2021-02-11 13:05     ` Geert Uytterhoeven
2021-02-12  2:59     ` Saravana Kannan
2021-02-12  8:14       ` Geert Uytterhoeven
2021-02-12 20:57         ` Saravana Kannan
2021-02-15 12:38       ` Geert Uytterhoeven
2021-02-15 21:26         ` Saravana Kannan
2021-02-16  8:05           ` Geert Uytterhoeven
2021-02-16 18:48             ` Saravana Kannan
2021-02-16 20:31               ` Geert Uytterhoeven
2021-02-17 23:57                 ` Saravana Kannan
2021-02-25  9:21                   ` Geert Uytterhoeven
2021-02-15 15:16       ` Geert Uytterhoeven
2021-02-15 21:57         ` Saravana Kannan
2021-02-16 12:58           ` Geert Uytterhoeven
2021-02-15 11:19     ` Geert Uytterhoeven

Linux-ACPI Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-acpi/0 linux-acpi/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-acpi linux-acpi/ https://lore.kernel.org/linux-acpi \
		linux-acpi@vger.kernel.org
	public-inbox-index linux-acpi

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-acpi


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git