All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/2] of: phandle_cache, fix refcounts, remove stale entry
@ 2018-12-17  7:56 ` frowand.list
  0 siblings, 0 replies; 28+ messages in thread
From: frowand.list @ 2018-12-17  7:56 UTC (permalink / raw)
  To: robh+dt, Michael Bringmann, linuxppc-dev
  Cc: Michael Ellerman, Tyrel Datwyler, Thomas Falcon, Juliet Kim,
	devicetree, linux-kernel

From: Frank Rowand <frank.rowand@sony.com>

Non-overlay dynamic devicetree node removal may leave the node in
the phandle cache.  Subsequent calls to of_find_node_by_phandle()
will incorrectly find the stale entry.  This bug exposed the foloowing
phandle cache refcount bug.

The refcount of phandle_cache entries is not incremented while in
the cache, allowing use after free error after kfree() of the
cached entry.

Changes since v1:
  - make __of_free_phandle_cache() static
  - add WARN_ON(1) for unexpected condition in of_find_node_by_phandle()
  
Frank Rowand (2):
  of: of_node_get()/of_node_put() nodes held in phandle cache
  of: __of_detach_node() - remove node from phandle cache

 drivers/of/base.c       | 100 ++++++++++++++++++++++++++++++++++++------------
 drivers/of/dynamic.c    |   3 ++
 drivers/of/of_private.h |   4 ++
 3 files changed, 82 insertions(+), 25 deletions(-)

-- 
Frank Rowand <frank.rowand@sony.com>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2 0/2] of: phandle_cache, fix refcounts, remove stale entry
@ 2018-12-17  7:56 ` frowand.list
  0 siblings, 0 replies; 28+ messages in thread
From: frowand.list @ 2018-12-17  7:56 UTC (permalink / raw)
  To: robh+dt, Michael Bringmann, linuxppc-dev
  Cc: devicetree, Thomas Falcon, linux-kernel, Juliet Kim, Tyrel Datwyler

From: Frank Rowand <frank.rowand@sony.com>

Non-overlay dynamic devicetree node removal may leave the node in
the phandle cache.  Subsequent calls to of_find_node_by_phandle()
will incorrectly find the stale entry.  This bug exposed the foloowing
phandle cache refcount bug.

The refcount of phandle_cache entries is not incremented while in
the cache, allowing use after free error after kfree() of the
cached entry.

Changes since v1:
  - make __of_free_phandle_cache() static
  - add WARN_ON(1) for unexpected condition in of_find_node_by_phandle()
  
Frank Rowand (2):
  of: of_node_get()/of_node_put() nodes held in phandle cache
  of: __of_detach_node() - remove node from phandle cache

 drivers/of/base.c       | 100 ++++++++++++++++++++++++++++++++++++------------
 drivers/of/dynamic.c    |   3 ++
 drivers/of/of_private.h |   4 ++
 3 files changed, 82 insertions(+), 25 deletions(-)

-- 
Frank Rowand <frank.rowand@sony.com>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2 1/2] of: of_node_get()/of_node_put() nodes held in phandle cache
  2018-12-17  7:56 ` frowand.list
@ 2018-12-17  7:56   ` frowand.list
  -1 siblings, 0 replies; 28+ messages in thread
From: frowand.list @ 2018-12-17  7:56 UTC (permalink / raw)
  To: robh+dt, Michael Bringmann, linuxppc-dev
  Cc: Michael Ellerman, Tyrel Datwyler, Thomas Falcon, Juliet Kim,
	devicetree, linux-kernel

From: Frank Rowand <frank.rowand@sony.com>

The phandle cache contains struct device_node pointers.  The refcount
of the pointers was not incremented while in the cache, allowing use
after free error after kfree() of the node.  Add the proper increment
and decrement of the use count.

Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")

Signed-off-by: Frank Rowand <frank.rowand@sony.com>
---

changes since v1
  - make __of_free_phandle_cache() static

 drivers/of/base.c | 70 ++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 46 insertions(+), 24 deletions(-)

diff --git a/drivers/of/base.c b/drivers/of/base.c
index 09692c9b32a7..6c33d63361b8 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -116,9 +116,6 @@ int __weak of_node_to_nid(struct device_node *np)
 }
 #endif
 
-static struct device_node **phandle_cache;
-static u32 phandle_cache_mask;
-
 /*
  * Assumptions behind phandle_cache implementation:
  *   - phandle property values are in a contiguous range of 1..n
@@ -127,6 +124,44 @@ int __weak of_node_to_nid(struct device_node *np)
  *   - the phandle lookup overhead reduction provided by the cache
  *     will likely be less
  */
+
+static struct device_node **phandle_cache;
+static u32 phandle_cache_mask;
+
+/*
+ * Caller must hold devtree_lock.
+ */
+static void __of_free_phandle_cache(void)
+{
+	u32 cache_entries = phandle_cache_mask + 1;
+	u32 k;
+
+	if (!phandle_cache)
+		return;
+
+	for (k = 0; k < cache_entries; k++)
+		of_node_put(phandle_cache[k]);
+
+	kfree(phandle_cache);
+	phandle_cache = NULL;
+}
+
+int of_free_phandle_cache(void)
+{
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&devtree_lock, flags);
+
+	__of_free_phandle_cache();
+
+	raw_spin_unlock_irqrestore(&devtree_lock, flags);
+
+	return 0;
+}
+#if !defined(CONFIG_MODULES)
+late_initcall_sync(of_free_phandle_cache);
+#endif
+
 void of_populate_phandle_cache(void)
 {
 	unsigned long flags;
@@ -136,8 +171,7 @@ void of_populate_phandle_cache(void)
 
 	raw_spin_lock_irqsave(&devtree_lock, flags);
 
-	kfree(phandle_cache);
-	phandle_cache = NULL;
+	__of_free_phandle_cache();
 
 	for_each_of_allnodes(np)
 		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL)
@@ -155,30 +189,15 @@ void of_populate_phandle_cache(void)
 		goto out;
 
 	for_each_of_allnodes(np)
-		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL)
+		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL) {
+			of_node_get(np);
 			phandle_cache[np->phandle & phandle_cache_mask] = np;
+		}
 
 out:
 	raw_spin_unlock_irqrestore(&devtree_lock, flags);
 }
 
-int of_free_phandle_cache(void)
-{
-	unsigned long flags;
-
-	raw_spin_lock_irqsave(&devtree_lock, flags);
-
-	kfree(phandle_cache);
-	phandle_cache = NULL;
-
-	raw_spin_unlock_irqrestore(&devtree_lock, flags);
-
-	return 0;
-}
-#if !defined(CONFIG_MODULES)
-late_initcall_sync(of_free_phandle_cache);
-#endif
-
 void __init of_core_init(void)
 {
 	struct device_node *np;
@@ -1195,8 +1214,11 @@ struct device_node *of_find_node_by_phandle(phandle handle)
 	if (!np) {
 		for_each_of_allnodes(np)
 			if (np->phandle == handle) {
-				if (phandle_cache)
+				if (phandle_cache) {
+					/* will put when removed from cache */
+					of_node_get(np);
 					phandle_cache[masked_handle] = np;
+				}
 				break;
 			}
 	}
-- 
Frank Rowand <frank.rowand@sony.com>


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 1/2] of: of_node_get()/of_node_put() nodes held in phandle cache
@ 2018-12-17  7:56   ` frowand.list
  0 siblings, 0 replies; 28+ messages in thread
From: frowand.list @ 2018-12-17  7:56 UTC (permalink / raw)
  To: robh+dt, Michael Bringmann, linuxppc-dev
  Cc: devicetree, Thomas Falcon, linux-kernel, Juliet Kim, Tyrel Datwyler

From: Frank Rowand <frank.rowand@sony.com>

The phandle cache contains struct device_node pointers.  The refcount
of the pointers was not incremented while in the cache, allowing use
after free error after kfree() of the node.  Add the proper increment
and decrement of the use count.

Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")

Signed-off-by: Frank Rowand <frank.rowand@sony.com>
---

changes since v1
  - make __of_free_phandle_cache() static

 drivers/of/base.c | 70 ++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 46 insertions(+), 24 deletions(-)

diff --git a/drivers/of/base.c b/drivers/of/base.c
index 09692c9b32a7..6c33d63361b8 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -116,9 +116,6 @@ int __weak of_node_to_nid(struct device_node *np)
 }
 #endif
 
-static struct device_node **phandle_cache;
-static u32 phandle_cache_mask;
-
 /*
  * Assumptions behind phandle_cache implementation:
  *   - phandle property values are in a contiguous range of 1..n
@@ -127,6 +124,44 @@ int __weak of_node_to_nid(struct device_node *np)
  *   - the phandle lookup overhead reduction provided by the cache
  *     will likely be less
  */
+
+static struct device_node **phandle_cache;
+static u32 phandle_cache_mask;
+
+/*
+ * Caller must hold devtree_lock.
+ */
+static void __of_free_phandle_cache(void)
+{
+	u32 cache_entries = phandle_cache_mask + 1;
+	u32 k;
+
+	if (!phandle_cache)
+		return;
+
+	for (k = 0; k < cache_entries; k++)
+		of_node_put(phandle_cache[k]);
+
+	kfree(phandle_cache);
+	phandle_cache = NULL;
+}
+
+int of_free_phandle_cache(void)
+{
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&devtree_lock, flags);
+
+	__of_free_phandle_cache();
+
+	raw_spin_unlock_irqrestore(&devtree_lock, flags);
+
+	return 0;
+}
+#if !defined(CONFIG_MODULES)
+late_initcall_sync(of_free_phandle_cache);
+#endif
+
 void of_populate_phandle_cache(void)
 {
 	unsigned long flags;
@@ -136,8 +171,7 @@ void of_populate_phandle_cache(void)
 
 	raw_spin_lock_irqsave(&devtree_lock, flags);
 
-	kfree(phandle_cache);
-	phandle_cache = NULL;
+	__of_free_phandle_cache();
 
 	for_each_of_allnodes(np)
 		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL)
@@ -155,30 +189,15 @@ void of_populate_phandle_cache(void)
 		goto out;
 
 	for_each_of_allnodes(np)
-		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL)
+		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL) {
+			of_node_get(np);
 			phandle_cache[np->phandle & phandle_cache_mask] = np;
+		}
 
 out:
 	raw_spin_unlock_irqrestore(&devtree_lock, flags);
 }
 
-int of_free_phandle_cache(void)
-{
-	unsigned long flags;
-
-	raw_spin_lock_irqsave(&devtree_lock, flags);
-
-	kfree(phandle_cache);
-	phandle_cache = NULL;
-
-	raw_spin_unlock_irqrestore(&devtree_lock, flags);
-
-	return 0;
-}
-#if !defined(CONFIG_MODULES)
-late_initcall_sync(of_free_phandle_cache);
-#endif
-
 void __init of_core_init(void)
 {
 	struct device_node *np;
@@ -1195,8 +1214,11 @@ struct device_node *of_find_node_by_phandle(phandle handle)
 	if (!np) {
 		for_each_of_allnodes(np)
 			if (np->phandle == handle) {
-				if (phandle_cache)
+				if (phandle_cache) {
+					/* will put when removed from cache */
+					of_node_get(np);
 					phandle_cache[masked_handle] = np;
+				}
 				break;
 			}
 	}
-- 
Frank Rowand <frank.rowand@sony.com>


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 2/2] of: __of_detach_node() - remove node from phandle cache
  2018-12-17  7:56 ` frowand.list
@ 2018-12-17  7:56   ` frowand.list
  -1 siblings, 0 replies; 28+ messages in thread
From: frowand.list @ 2018-12-17  7:56 UTC (permalink / raw)
  To: robh+dt, Michael Bringmann, linuxppc-dev
  Cc: Michael Ellerman, Tyrel Datwyler, Thomas Falcon, Juliet Kim,
	devicetree, linux-kernel

From: Frank Rowand <frank.rowand@sony.com>

Non-overlay dynamic devicetree node removal may leave the node in
the phandle cache.  Subsequent calls to of_find_node_by_phandle()
will incorrectly find the stale entry.  Remove the node from the
cache.

Add paranoia checks in of_find_node_by_phandle() as a second level
of defense (do not return cached node if detached, do not add node
to cache if detached).

Reported-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
---

changes since v1:
  - add WARN_ON(1) for unexpected condition in of_find_node_by_phandle()

 drivers/of/base.c       | 30 +++++++++++++++++++++++++++++-
 drivers/of/dynamic.c    |  3 +++
 drivers/of/of_private.h |  4 ++++
 3 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/drivers/of/base.c b/drivers/of/base.c
index 6c33d63361b8..ad71864cecf5 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -162,6 +162,27 @@ int of_free_phandle_cache(void)
 late_initcall_sync(of_free_phandle_cache);
 #endif
 
+/*
+ * Caller must hold devtree_lock.
+ */
+void __of_free_phandle_cache_entry(phandle handle)
+{
+	phandle masked_handle;
+
+	if (!handle)
+		return;
+
+	masked_handle = handle & phandle_cache_mask;
+
+	if (phandle_cache) {
+		if (phandle_cache[masked_handle] &&
+		    handle == phandle_cache[masked_handle]->phandle) {
+			of_node_put(phandle_cache[masked_handle]);
+			phandle_cache[masked_handle] = NULL;
+		}
+	}
+}
+
 void of_populate_phandle_cache(void)
 {
 	unsigned long flags;
@@ -1209,11 +1230,18 @@ struct device_node *of_find_node_by_phandle(phandle handle)
 		if (phandle_cache[masked_handle] &&
 		    handle == phandle_cache[masked_handle]->phandle)
 			np = phandle_cache[masked_handle];
+		if (np && of_node_check_flag(np, OF_DETACHED)) {
+			WARN_ON(1);
+			of_node_put(np);
+			phandle_cache[masked_handle] = NULL;
+			np = NULL;
+		}
 	}
 
 	if (!np) {
 		for_each_of_allnodes(np)
-			if (np->phandle == handle) {
+			if (np->phandle == handle &&
+			    !of_node_check_flag(np, OF_DETACHED)) {
 				if (phandle_cache) {
 					/* will put when removed from cache */
 					of_node_get(np);
diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c
index f4f8ed9b5454..ecea92f68c87 100644
--- a/drivers/of/dynamic.c
+++ b/drivers/of/dynamic.c
@@ -268,6 +268,9 @@ void __of_detach_node(struct device_node *np)
 	}
 
 	of_node_set_flag(np, OF_DETACHED);
+
+	/* race with of_find_node_by_phandle() prevented by devtree_lock */
+	__of_free_phandle_cache_entry(np->phandle);
 }
 
 /**
diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index 5d1567025358..24786818e32e 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -84,6 +84,10 @@ static inline void __of_detach_node_sysfs(struct device_node *np) {}
 int of_resolve_phandles(struct device_node *tree);
 #endif
 
+#if defined(CONFIG_OF_DYNAMIC)
+void __of_free_phandle_cache_entry(phandle handle);
+#endif
+
 #if defined(CONFIG_OF_OVERLAY)
 void of_overlay_mutex_lock(void);
 void of_overlay_mutex_unlock(void);
-- 
Frank Rowand <frank.rowand@sony.com>


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 2/2] of: __of_detach_node() - remove node from phandle cache
@ 2018-12-17  7:56   ` frowand.list
  0 siblings, 0 replies; 28+ messages in thread
From: frowand.list @ 2018-12-17  7:56 UTC (permalink / raw)
  To: robh+dt, Michael Bringmann, linuxppc-dev
  Cc: devicetree, Thomas Falcon, linux-kernel, Juliet Kim, Tyrel Datwyler

From: Frank Rowand <frank.rowand@sony.com>

Non-overlay dynamic devicetree node removal may leave the node in
the phandle cache.  Subsequent calls to of_find_node_by_phandle()
will incorrectly find the stale entry.  Remove the node from the
cache.

Add paranoia checks in of_find_node_by_phandle() as a second level
of defense (do not return cached node if detached, do not add node
to cache if detached).

Reported-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
---

changes since v1:
  - add WARN_ON(1) for unexpected condition in of_find_node_by_phandle()

 drivers/of/base.c       | 30 +++++++++++++++++++++++++++++-
 drivers/of/dynamic.c    |  3 +++
 drivers/of/of_private.h |  4 ++++
 3 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/drivers/of/base.c b/drivers/of/base.c
index 6c33d63361b8..ad71864cecf5 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -162,6 +162,27 @@ int of_free_phandle_cache(void)
 late_initcall_sync(of_free_phandle_cache);
 #endif
 
+/*
+ * Caller must hold devtree_lock.
+ */
+void __of_free_phandle_cache_entry(phandle handle)
+{
+	phandle masked_handle;
+
+	if (!handle)
+		return;
+
+	masked_handle = handle & phandle_cache_mask;
+
+	if (phandle_cache) {
+		if (phandle_cache[masked_handle] &&
+		    handle == phandle_cache[masked_handle]->phandle) {
+			of_node_put(phandle_cache[masked_handle]);
+			phandle_cache[masked_handle] = NULL;
+		}
+	}
+}
+
 void of_populate_phandle_cache(void)
 {
 	unsigned long flags;
@@ -1209,11 +1230,18 @@ struct device_node *of_find_node_by_phandle(phandle handle)
 		if (phandle_cache[masked_handle] &&
 		    handle == phandle_cache[masked_handle]->phandle)
 			np = phandle_cache[masked_handle];
+		if (np && of_node_check_flag(np, OF_DETACHED)) {
+			WARN_ON(1);
+			of_node_put(np);
+			phandle_cache[masked_handle] = NULL;
+			np = NULL;
+		}
 	}
 
 	if (!np) {
 		for_each_of_allnodes(np)
-			if (np->phandle == handle) {
+			if (np->phandle == handle &&
+			    !of_node_check_flag(np, OF_DETACHED)) {
 				if (phandle_cache) {
 					/* will put when removed from cache */
 					of_node_get(np);
diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c
index f4f8ed9b5454..ecea92f68c87 100644
--- a/drivers/of/dynamic.c
+++ b/drivers/of/dynamic.c
@@ -268,6 +268,9 @@ void __of_detach_node(struct device_node *np)
 	}
 
 	of_node_set_flag(np, OF_DETACHED);
+
+	/* race with of_find_node_by_phandle() prevented by devtree_lock */
+	__of_free_phandle_cache_entry(np->phandle);
 }
 
 /**
diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index 5d1567025358..24786818e32e 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -84,6 +84,10 @@ static inline void __of_detach_node_sysfs(struct device_node *np) {}
 int of_resolve_phandles(struct device_node *tree);
 #endif
 
+#if defined(CONFIG_OF_DYNAMIC)
+void __of_free_phandle_cache_entry(phandle handle);
+#endif
+
 #if defined(CONFIG_OF_OVERLAY)
 void of_overlay_mutex_lock(void);
 void of_overlay_mutex_unlock(void);
-- 
Frank Rowand <frank.rowand@sony.com>


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 1/2] of: of_node_get()/of_node_put() nodes held in phandle cache
  2018-12-17  7:56   ` frowand.list
@ 2018-12-17 10:43     ` Michael Ellerman
  -1 siblings, 0 replies; 28+ messages in thread
From: Michael Ellerman @ 2018-12-17 10:43 UTC (permalink / raw)
  To: frowand.list, robh+dt, Michael Bringmann, linuxppc-dev
  Cc: Tyrel Datwyler, Thomas Falcon, Juliet Kim, devicetree, linux-kernel

Hi Frank,

frowand.list@gmail.com writes:
> From: Frank Rowand <frank.rowand@sony.com>
>
> The phandle cache contains struct device_node pointers.  The refcount
> of the pointers was not incremented while in the cache, allowing use
> after free error after kfree() of the node.  Add the proper increment
> and decrement of the use count.
>
> Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")

Can we also add:

Cc: stable@vger.kernel.org # v4.17+


This and the next patch solve WARN_ONs and other problems for us on some
systems so I think they meet the criteria for a stable backport.

Rest of the patch LGTM, I'm not able to test it unfortunately, I have to
defer to mwb for that.

cheers

> diff --git a/drivers/of/base.c b/drivers/of/base.c
> index 09692c9b32a7..6c33d63361b8 100644
> --- a/drivers/of/base.c
> +++ b/drivers/of/base.c
> @@ -116,9 +116,6 @@ int __weak of_node_to_nid(struct device_node *np)
>  }
>  #endif
>  
> -static struct device_node **phandle_cache;
> -static u32 phandle_cache_mask;
> -
>  /*
>   * Assumptions behind phandle_cache implementation:
>   *   - phandle property values are in a contiguous range of 1..n
> @@ -127,6 +124,44 @@ int __weak of_node_to_nid(struct device_node *np)
>   *   - the phandle lookup overhead reduction provided by the cache
>   *     will likely be less
>   */
> +
> +static struct device_node **phandle_cache;
> +static u32 phandle_cache_mask;
> +
> +/*
> + * Caller must hold devtree_lock.
> + */
> +static void __of_free_phandle_cache(void)
> +{
> +	u32 cache_entries = phandle_cache_mask + 1;
> +	u32 k;
> +
> +	if (!phandle_cache)
> +		return;
> +
> +	for (k = 0; k < cache_entries; k++)
> +		of_node_put(phandle_cache[k]);
> +
> +	kfree(phandle_cache);
> +	phandle_cache = NULL;
> +}
> +
> +int of_free_phandle_cache(void)
> +{
> +	unsigned long flags;
> +
> +	raw_spin_lock_irqsave(&devtree_lock, flags);
> +
> +	__of_free_phandle_cache();
> +
> +	raw_spin_unlock_irqrestore(&devtree_lock, flags);
> +
> +	return 0;
> +}
> +#if !defined(CONFIG_MODULES)
> +late_initcall_sync(of_free_phandle_cache);
> +#endif
> +
>  void of_populate_phandle_cache(void)
>  {
>  	unsigned long flags;
> @@ -136,8 +171,7 @@ void of_populate_phandle_cache(void)
>  
>  	raw_spin_lock_irqsave(&devtree_lock, flags);
>  
> -	kfree(phandle_cache);
> -	phandle_cache = NULL;
> +	__of_free_phandle_cache();
>  
>  	for_each_of_allnodes(np)
>  		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL)
> @@ -155,30 +189,15 @@ void of_populate_phandle_cache(void)
>  		goto out;
>  
>  	for_each_of_allnodes(np)
> -		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL)
> +		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL) {
> +			of_node_get(np);
>  			phandle_cache[np->phandle & phandle_cache_mask] = np;
> +		}
>  
>  out:
>  	raw_spin_unlock_irqrestore(&devtree_lock, flags);
>  }
>  
> -int of_free_phandle_cache(void)
> -{
> -	unsigned long flags;
> -
> -	raw_spin_lock_irqsave(&devtree_lock, flags);
> -
> -	kfree(phandle_cache);
> -	phandle_cache = NULL;
> -
> -	raw_spin_unlock_irqrestore(&devtree_lock, flags);
> -
> -	return 0;
> -}
> -#if !defined(CONFIG_MODULES)
> -late_initcall_sync(of_free_phandle_cache);
> -#endif
> -
>  void __init of_core_init(void)
>  {
>  	struct device_node *np;
> @@ -1195,8 +1214,11 @@ struct device_node *of_find_node_by_phandle(phandle handle)
>  	if (!np) {
>  		for_each_of_allnodes(np)
>  			if (np->phandle == handle) {
> -				if (phandle_cache)
> +				if (phandle_cache) {
> +					/* will put when removed from cache */
> +					of_node_get(np);
>  					phandle_cache[masked_handle] = np;
> +				}
>  				break;
>  			}
>  	}
> -- 
> Frank Rowand <frank.rowand@sony.com>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 1/2] of: of_node_get()/of_node_put() nodes held in phandle cache
@ 2018-12-17 10:43     ` Michael Ellerman
  0 siblings, 0 replies; 28+ messages in thread
From: Michael Ellerman @ 2018-12-17 10:43 UTC (permalink / raw)
  To: frowand.list, robh+dt, Michael Bringmann, linuxppc-dev
  Cc: devicetree, Juliet Kim, Thomas Falcon, Tyrel Datwyler, linux-kernel

Hi Frank,

frowand.list@gmail.com writes:
> From: Frank Rowand <frank.rowand@sony.com>
>
> The phandle cache contains struct device_node pointers.  The refcount
> of the pointers was not incremented while in the cache, allowing use
> after free error after kfree() of the node.  Add the proper increment
> and decrement of the use count.
>
> Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")

Can we also add:

Cc: stable@vger.kernel.org # v4.17+


This and the next patch solve WARN_ONs and other problems for us on some
systems so I think they meet the criteria for a stable backport.

Rest of the patch LGTM, I'm not able to test it unfortunately, I have to
defer to mwb for that.

cheers

> diff --git a/drivers/of/base.c b/drivers/of/base.c
> index 09692c9b32a7..6c33d63361b8 100644
> --- a/drivers/of/base.c
> +++ b/drivers/of/base.c
> @@ -116,9 +116,6 @@ int __weak of_node_to_nid(struct device_node *np)
>  }
>  #endif
>  
> -static struct device_node **phandle_cache;
> -static u32 phandle_cache_mask;
> -
>  /*
>   * Assumptions behind phandle_cache implementation:
>   *   - phandle property values are in a contiguous range of 1..n
> @@ -127,6 +124,44 @@ int __weak of_node_to_nid(struct device_node *np)
>   *   - the phandle lookup overhead reduction provided by the cache
>   *     will likely be less
>   */
> +
> +static struct device_node **phandle_cache;
> +static u32 phandle_cache_mask;
> +
> +/*
> + * Caller must hold devtree_lock.
> + */
> +static void __of_free_phandle_cache(void)
> +{
> +	u32 cache_entries = phandle_cache_mask + 1;
> +	u32 k;
> +
> +	if (!phandle_cache)
> +		return;
> +
> +	for (k = 0; k < cache_entries; k++)
> +		of_node_put(phandle_cache[k]);
> +
> +	kfree(phandle_cache);
> +	phandle_cache = NULL;
> +}
> +
> +int of_free_phandle_cache(void)
> +{
> +	unsigned long flags;
> +
> +	raw_spin_lock_irqsave(&devtree_lock, flags);
> +
> +	__of_free_phandle_cache();
> +
> +	raw_spin_unlock_irqrestore(&devtree_lock, flags);
> +
> +	return 0;
> +}
> +#if !defined(CONFIG_MODULES)
> +late_initcall_sync(of_free_phandle_cache);
> +#endif
> +
>  void of_populate_phandle_cache(void)
>  {
>  	unsigned long flags;
> @@ -136,8 +171,7 @@ void of_populate_phandle_cache(void)
>  
>  	raw_spin_lock_irqsave(&devtree_lock, flags);
>  
> -	kfree(phandle_cache);
> -	phandle_cache = NULL;
> +	__of_free_phandle_cache();
>  
>  	for_each_of_allnodes(np)
>  		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL)
> @@ -155,30 +189,15 @@ void of_populate_phandle_cache(void)
>  		goto out;
>  
>  	for_each_of_allnodes(np)
> -		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL)
> +		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL) {
> +			of_node_get(np);
>  			phandle_cache[np->phandle & phandle_cache_mask] = np;
> +		}
>  
>  out:
>  	raw_spin_unlock_irqrestore(&devtree_lock, flags);
>  }
>  
> -int of_free_phandle_cache(void)
> -{
> -	unsigned long flags;
> -
> -	raw_spin_lock_irqsave(&devtree_lock, flags);
> -
> -	kfree(phandle_cache);
> -	phandle_cache = NULL;
> -
> -	raw_spin_unlock_irqrestore(&devtree_lock, flags);
> -
> -	return 0;
> -}
> -#if !defined(CONFIG_MODULES)
> -late_initcall_sync(of_free_phandle_cache);
> -#endif
> -
>  void __init of_core_init(void)
>  {
>  	struct device_node *np;
> @@ -1195,8 +1214,11 @@ struct device_node *of_find_node_by_phandle(phandle handle)
>  	if (!np) {
>  		for_each_of_allnodes(np)
>  			if (np->phandle == handle) {
> -				if (phandle_cache)
> +				if (phandle_cache) {
> +					/* will put when removed from cache */
> +					of_node_get(np);
>  					phandle_cache[masked_handle] = np;
> +				}
>  				break;
>  			}
>  	}
> -- 
> Frank Rowand <frank.rowand@sony.com>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/2] of: __of_detach_node() - remove node from phandle cache
  2018-12-17  7:56   ` frowand.list
@ 2018-12-17 10:52     ` Michael Ellerman
  -1 siblings, 0 replies; 28+ messages in thread
From: Michael Ellerman @ 2018-12-17 10:52 UTC (permalink / raw)
  To: frowand.list, robh+dt, Michael Bringmann, linuxppc-dev
  Cc: Tyrel Datwyler, Thomas Falcon, Juliet Kim, devicetree, linux-kernel

Hi Frank,

frowand.list@gmail.com writes:
> From: Frank Rowand <frank.rowand@sony.com>
>
> Non-overlay dynamic devicetree node removal may leave the node in
> the phandle cache.  Subsequent calls to of_find_node_by_phandle()
> will incorrectly find the stale entry.  Remove the node from the
> cache.
>
> Add paranoia checks in of_find_node_by_phandle() as a second level
> of defense (do not return cached node if detached, do not add node
> to cache if detached).
>
> Reported-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
> Signed-off-by: Frank Rowand <frank.rowand@sony.com>
> ---

Similarly here can we add:

Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
Cc: stable@vger.kernel.org # v4.17+


Thanks for doing this series.

Some minor comments below.

> diff --git a/drivers/of/base.c b/drivers/of/base.c
> index 6c33d63361b8..ad71864cecf5 100644
> --- a/drivers/of/base.c
> +++ b/drivers/of/base.c
> @@ -162,6 +162,27 @@ int of_free_phandle_cache(void)
>  late_initcall_sync(of_free_phandle_cache);
>  #endif
>  
> +/*
> + * Caller must hold devtree_lock.
> + */
> +void __of_free_phandle_cache_entry(phandle handle)
> +{
> +	phandle masked_handle;
> +
> +	if (!handle)
> +		return;

We could fold the phandle_cache check into that if and return early for
both cases couldn't we?

> +	masked_handle = handle & phandle_cache_mask;
> +
> +	if (phandle_cache) {

Meaning this wouldn't be necessary.

> +		if (phandle_cache[masked_handle] &&
> +		    handle == phandle_cache[masked_handle]->phandle) {
> +			of_node_put(phandle_cache[masked_handle]);
> +			phandle_cache[masked_handle] = NULL;
> +		}

A temporary would help the readability here I think, eg:

	struct device_node *np;
        np = phandle_cache[masked_handle];

	if (np && handle == np->phandle) {
		of_node_put(np);
		phandle_cache[masked_handle] = NULL;
	}

> @@ -1209,11 +1230,18 @@ struct device_node *of_find_node_by_phandle(phandle handle)
>  		if (phandle_cache[masked_handle] &&
>  		    handle == phandle_cache[masked_handle]->phandle)
>  			np = phandle_cache[masked_handle];
> +		if (np && of_node_check_flag(np, OF_DETACHED)) {
> +			WARN_ON(1);
> +			of_node_put(np);

Do we really want to do the put here?

We're here because something has gone wrong, possibly even memory
corruption such that np is not even pointing at a device node anymore.
So it seems like it would be safer to just leave the ref count alone,
possibly leak a small amount of memory, and NULL out the reference.


cheers

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/2] of: __of_detach_node() - remove node from phandle cache
@ 2018-12-17 10:52     ` Michael Ellerman
  0 siblings, 0 replies; 28+ messages in thread
From: Michael Ellerman @ 2018-12-17 10:52 UTC (permalink / raw)
  To: frowand.list, robh+dt, Michael Bringmann, linuxppc-dev
  Cc: devicetree, Juliet Kim, Thomas Falcon, Tyrel Datwyler, linux-kernel

Hi Frank,

frowand.list@gmail.com writes:
> From: Frank Rowand <frank.rowand@sony.com>
>
> Non-overlay dynamic devicetree node removal may leave the node in
> the phandle cache.  Subsequent calls to of_find_node_by_phandle()
> will incorrectly find the stale entry.  Remove the node from the
> cache.
>
> Add paranoia checks in of_find_node_by_phandle() as a second level
> of defense (do not return cached node if detached, do not add node
> to cache if detached).
>
> Reported-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
> Signed-off-by: Frank Rowand <frank.rowand@sony.com>
> ---

Similarly here can we add:

Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
Cc: stable@vger.kernel.org # v4.17+


Thanks for doing this series.

Some minor comments below.

> diff --git a/drivers/of/base.c b/drivers/of/base.c
> index 6c33d63361b8..ad71864cecf5 100644
> --- a/drivers/of/base.c
> +++ b/drivers/of/base.c
> @@ -162,6 +162,27 @@ int of_free_phandle_cache(void)
>  late_initcall_sync(of_free_phandle_cache);
>  #endif
>  
> +/*
> + * Caller must hold devtree_lock.
> + */
> +void __of_free_phandle_cache_entry(phandle handle)
> +{
> +	phandle masked_handle;
> +
> +	if (!handle)
> +		return;

We could fold the phandle_cache check into that if and return early for
both cases couldn't we?

> +	masked_handle = handle & phandle_cache_mask;
> +
> +	if (phandle_cache) {

Meaning this wouldn't be necessary.

> +		if (phandle_cache[masked_handle] &&
> +		    handle == phandle_cache[masked_handle]->phandle) {
> +			of_node_put(phandle_cache[masked_handle]);
> +			phandle_cache[masked_handle] = NULL;
> +		}

A temporary would help the readability here I think, eg:

	struct device_node *np;
        np = phandle_cache[masked_handle];

	if (np && handle == np->phandle) {
		of_node_put(np);
		phandle_cache[masked_handle] = NULL;
	}

> @@ -1209,11 +1230,18 @@ struct device_node *of_find_node_by_phandle(phandle handle)
>  		if (phandle_cache[masked_handle] &&
>  		    handle == phandle_cache[masked_handle]->phandle)
>  			np = phandle_cache[masked_handle];
> +		if (np && of_node_check_flag(np, OF_DETACHED)) {
> +			WARN_ON(1);
> +			of_node_put(np);

Do we really want to do the put here?

We're here because something has gone wrong, possibly even memory
corruption such that np is not even pointing at a device node anymore.
So it seems like it would be safer to just leave the ref count alone,
possibly leak a small amount of memory, and NULL out the reference.


cheers

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/2] of: phandle_cache, fix refcounts, remove stale entry
  2018-12-17  7:56 ` frowand.list
@ 2018-12-18 15:43   ` Rob Herring
  -1 siblings, 0 replies; 28+ messages in thread
From: Rob Herring @ 2018-12-18 15:43 UTC (permalink / raw)
  To: Frank Rowand
  Cc: mwb, linuxppc-dev, Michael Ellerman, Tyrel Datwyler, tlfalcon,
	minkim, devicetree, linux-kernel

On Mon, Dec 17, 2018 at 1:56 AM <frowand.list@gmail.com> wrote:
>
> From: Frank Rowand <frank.rowand@sony.com>
>
> Non-overlay dynamic devicetree node removal may leave the node in
> the phandle cache.  Subsequent calls to of_find_node_by_phandle()
> will incorrectly find the stale entry.  This bug exposed the foloowing
> phandle cache refcount bug.
>
> The refcount of phandle_cache entries is not incremented while in
> the cache, allowing use after free error after kfree() of the
> cached entry.
>
> Changes since v1:
>   - make __of_free_phandle_cache() static
>   - add WARN_ON(1) for unexpected condition in of_find_node_by_phandle()
>
> Frank Rowand (2):
>   of: of_node_get()/of_node_put() nodes held in phandle cache
>   of: __of_detach_node() - remove node from phandle cache

I'll send this to Linus this week if I get a tested by. Otherwise, it
will go in for 4.21.

Rob

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/2] of: phandle_cache, fix refcounts, remove stale entry
@ 2018-12-18 15:43   ` Rob Herring
  0 siblings, 0 replies; 28+ messages in thread
From: Rob Herring @ 2018-12-18 15:43 UTC (permalink / raw)
  To: Frank Rowand
  Cc: devicetree, tlfalcon, linux-kernel, mwb, minkim, Tyrel Datwyler,
	linuxppc-dev

On Mon, Dec 17, 2018 at 1:56 AM <frowand.list@gmail.com> wrote:
>
> From: Frank Rowand <frank.rowand@sony.com>
>
> Non-overlay dynamic devicetree node removal may leave the node in
> the phandle cache.  Subsequent calls to of_find_node_by_phandle()
> will incorrectly find the stale entry.  This bug exposed the foloowing
> phandle cache refcount bug.
>
> The refcount of phandle_cache entries is not incremented while in
> the cache, allowing use after free error after kfree() of the
> cached entry.
>
> Changes since v1:
>   - make __of_free_phandle_cache() static
>   - add WARN_ON(1) for unexpected condition in of_find_node_by_phandle()
>
> Frank Rowand (2):
>   of: of_node_get()/of_node_put() nodes held in phandle cache
>   of: __of_detach_node() - remove node from phandle cache

I'll send this to Linus this week if I get a tested by. Otherwise, it
will go in for 4.21.

Rob

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/2] of: __of_detach_node() - remove node from phandle cache
  2018-12-17 10:52     ` Michael Ellerman
@ 2018-12-18 18:57       ` Frank Rowand
  -1 siblings, 0 replies; 28+ messages in thread
From: Frank Rowand @ 2018-12-18 18:57 UTC (permalink / raw)
  To: Michael Ellerman, robh+dt, Michael Bringmann, linuxppc-dev
  Cc: Tyrel Datwyler, Thomas Falcon, Juliet Kim, devicetree, linux-kernel

On 12/17/18 2:52 AM, Michael Ellerman wrote:
> Hi Frank,
> 
> frowand.list@gmail.com writes:
>> From: Frank Rowand <frank.rowand@sony.com>
>>
>> Non-overlay dynamic devicetree node removal may leave the node in
>> the phandle cache.  Subsequent calls to of_find_node_by_phandle()
>> will incorrectly find the stale entry.  Remove the node from the
>> cache.
>>
>> Add paranoia checks in of_find_node_by_phandle() as a second level
>> of defense (do not return cached node if detached, do not add node
>> to cache if detached).
>>
>> Reported-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
>> Signed-off-by: Frank Rowand <frank.rowand@sony.com>
>> ---
> 
> Similarly here can we add:
> 
> Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")

Yes, thanks.


> Cc: stable@vger.kernel.org # v4.17+

Nope, 0b3ce78e90fc does not belong in stable (it is a feature, not a bug
fix).  So the bug will not be in stable.

I've debated with myself over this, because there is a possibility that
0b3ce78e90fc could somehow be put into a stable despite not being a
bug fix.  We can always explicitly request this patch series be added
to stable in that case.


> Thanks for doing this series.
> 
> Some minor comments below.
> 
>> diff --git a/drivers/of/base.c b/drivers/of/base.c
>> index 6c33d63361b8..ad71864cecf5 100644
>> --- a/drivers/of/base.c
>> +++ b/drivers/of/base.c
>> @@ -162,6 +162,27 @@ int of_free_phandle_cache(void)
>>  late_initcall_sync(of_free_phandle_cache);
>>  #endif
>>  
>> +/*
>> + * Caller must hold devtree_lock.
>> + */
>> +void __of_free_phandle_cache_entry(phandle handle)
>> +{
>> +	phandle masked_handle;
>> +
>> +	if (!handle)
>> +		return;
> 
> We could fold the phandle_cache check into that if and return early for
> both cases couldn't we?

We could, but that would make the reason for checking phandle_cache
less obvious.  I would rather leave that check
> 
>> +	masked_handle = handle & phandle_cache_mask;
>> +
>> +	if (phandle_cache) {
> 
> Meaning this wouldn't be necessary.
> 
>> +		if (phandle_cache[masked_handle] &&
>> +		    handle == phandle_cache[masked_handle]->phandle) {
>> +			of_node_put(phandle_cache[masked_handle]);
>> +			phandle_cache[masked_handle] = NULL;
>> +		}
> 
> A temporary would help the readability here I think, eg:
> 
> 	struct device_node *np;
>         np = phandle_cache[masked_handle];
> 
> 	if (np && handle == np->phandle) {
> 		of_node_put(np);
> 		phandle_cache[masked_handle] = NULL;
> 	}

Yes, much cleaner.


>> @@ -1209,11 +1230,18 @@ struct device_node *of_find_node_by_phandle(phandle handle)
>>  		if (phandle_cache[masked_handle] &&
>>  		    handle == phandle_cache[masked_handle]->phandle)
>>  			np = phandle_cache[masked_handle];
>> +		if (np && of_node_check_flag(np, OF_DETACHED)) {
>> +			WARN_ON(1);
>> +			of_node_put(np);
>
> Do we really want to do the put here?
> 
> We're here because something has gone wrong, possibly even memory
> corruption such that np is not even pointing at a device node anymore.
> So it seems like it would be safer to just leave the ref count alone,
> possibly leak a small amount of memory, and NULL out the reference.

I like the concept of the code being a little bit paranoid.

But the bug that this check is likely to cache is the bug that led
to this series -- removing a devicetree node, but failing to remove
it from the cache as part of the removal.  So I think I'll leave
it as is.

> 
> 
> cheers
> 

Thanks for the thoughts and suggestions!

-Frank


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/2] of: __of_detach_node() - remove node from phandle cache
@ 2018-12-18 18:57       ` Frank Rowand
  0 siblings, 0 replies; 28+ messages in thread
From: Frank Rowand @ 2018-12-18 18:57 UTC (permalink / raw)
  To: Michael Ellerman, robh+dt, Michael Bringmann, linuxppc-dev
  Cc: devicetree, Juliet Kim, Thomas Falcon, Tyrel Datwyler, linux-kernel

On 12/17/18 2:52 AM, Michael Ellerman wrote:
> Hi Frank,
> 
> frowand.list@gmail.com writes:
>> From: Frank Rowand <frank.rowand@sony.com>
>>
>> Non-overlay dynamic devicetree node removal may leave the node in
>> the phandle cache.  Subsequent calls to of_find_node_by_phandle()
>> will incorrectly find the stale entry.  Remove the node from the
>> cache.
>>
>> Add paranoia checks in of_find_node_by_phandle() as a second level
>> of defense (do not return cached node if detached, do not add node
>> to cache if detached).
>>
>> Reported-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
>> Signed-off-by: Frank Rowand <frank.rowand@sony.com>
>> ---
> 
> Similarly here can we add:
> 
> Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")

Yes, thanks.


> Cc: stable@vger.kernel.org # v4.17+

Nope, 0b3ce78e90fc does not belong in stable (it is a feature, not a bug
fix).  So the bug will not be in stable.

I've debated with myself over this, because there is a possibility that
0b3ce78e90fc could somehow be put into a stable despite not being a
bug fix.  We can always explicitly request this patch series be added
to stable in that case.


> Thanks for doing this series.
> 
> Some minor comments below.
> 
>> diff --git a/drivers/of/base.c b/drivers/of/base.c
>> index 6c33d63361b8..ad71864cecf5 100644
>> --- a/drivers/of/base.c
>> +++ b/drivers/of/base.c
>> @@ -162,6 +162,27 @@ int of_free_phandle_cache(void)
>>  late_initcall_sync(of_free_phandle_cache);
>>  #endif
>>  
>> +/*
>> + * Caller must hold devtree_lock.
>> + */
>> +void __of_free_phandle_cache_entry(phandle handle)
>> +{
>> +	phandle masked_handle;
>> +
>> +	if (!handle)
>> +		return;
> 
> We could fold the phandle_cache check into that if and return early for
> both cases couldn't we?

We could, but that would make the reason for checking phandle_cache
less obvious.  I would rather leave that check
> 
>> +	masked_handle = handle & phandle_cache_mask;
>> +
>> +	if (phandle_cache) {
> 
> Meaning this wouldn't be necessary.
> 
>> +		if (phandle_cache[masked_handle] &&
>> +		    handle == phandle_cache[masked_handle]->phandle) {
>> +			of_node_put(phandle_cache[masked_handle]);
>> +			phandle_cache[masked_handle] = NULL;
>> +		}
> 
> A temporary would help the readability here I think, eg:
> 
> 	struct device_node *np;
>         np = phandle_cache[masked_handle];
> 
> 	if (np && handle == np->phandle) {
> 		of_node_put(np);
> 		phandle_cache[masked_handle] = NULL;
> 	}

Yes, much cleaner.


>> @@ -1209,11 +1230,18 @@ struct device_node *of_find_node_by_phandle(phandle handle)
>>  		if (phandle_cache[masked_handle] &&
>>  		    handle == phandle_cache[masked_handle]->phandle)
>>  			np = phandle_cache[masked_handle];
>> +		if (np && of_node_check_flag(np, OF_DETACHED)) {
>> +			WARN_ON(1);
>> +			of_node_put(np);
>
> Do we really want to do the put here?
> 
> We're here because something has gone wrong, possibly even memory
> corruption such that np is not even pointing at a device node anymore.
> So it seems like it would be safer to just leave the ref count alone,
> possibly leak a small amount of memory, and NULL out the reference.

I like the concept of the code being a little bit paranoid.

But the bug that this check is likely to cache is the bug that led
to this series -- removing a devicetree node, but failing to remove
it from the cache as part of the removal.  So I think I'll leave
it as is.

> 
> 
> cheers
> 

Thanks for the thoughts and suggestions!

-Frank


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/2] of: __of_detach_node() - remove node from phandle cache
  2018-12-18 18:57       ` Frank Rowand
@ 2018-12-18 20:01         ` Rob Herring
  -1 siblings, 0 replies; 28+ messages in thread
From: Rob Herring @ 2018-12-18 20:01 UTC (permalink / raw)
  To: Frank Rowand
  Cc: Michael Ellerman, mwb, linuxppc-dev, Tyrel Datwyler, tlfalcon,
	minkim, devicetree, linux-kernel

On Tue, Dec 18, 2018 at 12:57 PM Frank Rowand <frowand.list@gmail.com> wrote:
>
> On 12/17/18 2:52 AM, Michael Ellerman wrote:
> > Hi Frank,
> >
> > frowand.list@gmail.com writes:
> >> From: Frank Rowand <frank.rowand@sony.com>
> >>
> >> Non-overlay dynamic devicetree node removal may leave the node in
> >> the phandle cache.  Subsequent calls to of_find_node_by_phandle()
> >> will incorrectly find the stale entry.  Remove the node from the
> >> cache.
> >>
> >> Add paranoia checks in of_find_node_by_phandle() as a second level
> >> of defense (do not return cached node if detached, do not add node
> >> to cache if detached).
> >>
> >> Reported-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
> >> Signed-off-by: Frank Rowand <frank.rowand@sony.com>
> >> ---
> >
> > Similarly here can we add:
> >
> > Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
>
> Yes, thanks.
>
>
> > Cc: stable@vger.kernel.org # v4.17+
>
> Nope, 0b3ce78e90fc does not belong in stable (it is a feature, not a bug
> fix).  So the bug will not be in stable.

0b3ce78e90fc landed in v4.17, so Michael's line above is correct.
Annotating it with 4.17 only saves Greg from trying and then emailing
us to backport this patch as it wouldn't apply.

Rob

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/2] of: __of_detach_node() - remove node from phandle cache
@ 2018-12-18 20:01         ` Rob Herring
  0 siblings, 0 replies; 28+ messages in thread
From: Rob Herring @ 2018-12-18 20:01 UTC (permalink / raw)
  To: Frank Rowand
  Cc: devicetree, tlfalcon, linux-kernel, mwb, minkim, Tyrel Datwyler,
	linuxppc-dev

On Tue, Dec 18, 2018 at 12:57 PM Frank Rowand <frowand.list@gmail.com> wrote:
>
> On 12/17/18 2:52 AM, Michael Ellerman wrote:
> > Hi Frank,
> >
> > frowand.list@gmail.com writes:
> >> From: Frank Rowand <frank.rowand@sony.com>
> >>
> >> Non-overlay dynamic devicetree node removal may leave the node in
> >> the phandle cache.  Subsequent calls to of_find_node_by_phandle()
> >> will incorrectly find the stale entry.  Remove the node from the
> >> cache.
> >>
> >> Add paranoia checks in of_find_node_by_phandle() as a second level
> >> of defense (do not return cached node if detached, do not add node
> >> to cache if detached).
> >>
> >> Reported-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
> >> Signed-off-by: Frank Rowand <frank.rowand@sony.com>
> >> ---
> >
> > Similarly here can we add:
> >
> > Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
>
> Yes, thanks.
>
>
> > Cc: stable@vger.kernel.org # v4.17+
>
> Nope, 0b3ce78e90fc does not belong in stable (it is a feature, not a bug
> fix).  So the bug will not be in stable.

0b3ce78e90fc landed in v4.17, so Michael's line above is correct.
Annotating it with 4.17 only saves Greg from trying and then emailing
us to backport this patch as it wouldn't apply.

Rob

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/2] of: __of_detach_node() - remove node from phandle cache
  2018-12-18 20:01         ` Rob Herring
@ 2018-12-18 20:09           ` Frank Rowand
  -1 siblings, 0 replies; 28+ messages in thread
From: Frank Rowand @ 2018-12-18 20:09 UTC (permalink / raw)
  To: Rob Herring
  Cc: Michael Ellerman, mwb, linuxppc-dev, Tyrel Datwyler, tlfalcon,
	minkim, devicetree, linux-kernel

On 12/18/18 12:01 PM, Rob Herring wrote:
> On Tue, Dec 18, 2018 at 12:57 PM Frank Rowand <frowand.list@gmail.com> wrote:
>>
>> On 12/17/18 2:52 AM, Michael Ellerman wrote:
>>> Hi Frank,
>>>
>>> frowand.list@gmail.com writes:
>>>> From: Frank Rowand <frank.rowand@sony.com>
>>>>
>>>> Non-overlay dynamic devicetree node removal may leave the node in
>>>> the phandle cache.  Subsequent calls to of_find_node_by_phandle()
>>>> will incorrectly find the stale entry.  Remove the node from the
>>>> cache.
>>>>
>>>> Add paranoia checks in of_find_node_by_phandle() as a second level
>>>> of defense (do not return cached node if detached, do not add node
>>>> to cache if detached).
>>>>
>>>> Reported-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
>>>> Signed-off-by: Frank Rowand <frank.rowand@sony.com>
>>>> ---
>>>
>>> Similarly here can we add:
>>>
>>> Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
>>
>> Yes, thanks.
>>
>>
>>> Cc: stable@vger.kernel.org # v4.17+
>>
>> Nope, 0b3ce78e90fc does not belong in stable (it is a feature, not a bug
>> fix).  So the bug will not be in stable.
> 
> 0b3ce78e90fc landed in v4.17, so Michael's line above is correct.
> Annotating it with 4.17 only saves Greg from trying and then emailing
> us to backport this patch as it wouldn't apply.

Thanks for the correction.  I was both under-thinking and over-thinking,
ending up with an incorrect answer.

Can you add the Cc: to version 3 patch comments (both 1/2 and 2/2) or do
you want me to re-spin?

-Frank

> 
> Rob
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/2] of: __of_detach_node() - remove node from phandle cache
@ 2018-12-18 20:09           ` Frank Rowand
  0 siblings, 0 replies; 28+ messages in thread
From: Frank Rowand @ 2018-12-18 20:09 UTC (permalink / raw)
  To: Rob Herring
  Cc: devicetree, tlfalcon, linux-kernel, mwb, minkim, Tyrel Datwyler,
	linuxppc-dev

On 12/18/18 12:01 PM, Rob Herring wrote:
> On Tue, Dec 18, 2018 at 12:57 PM Frank Rowand <frowand.list@gmail.com> wrote:
>>
>> On 12/17/18 2:52 AM, Michael Ellerman wrote:
>>> Hi Frank,
>>>
>>> frowand.list@gmail.com writes:
>>>> From: Frank Rowand <frank.rowand@sony.com>
>>>>
>>>> Non-overlay dynamic devicetree node removal may leave the node in
>>>> the phandle cache.  Subsequent calls to of_find_node_by_phandle()
>>>> will incorrectly find the stale entry.  Remove the node from the
>>>> cache.
>>>>
>>>> Add paranoia checks in of_find_node_by_phandle() as a second level
>>>> of defense (do not return cached node if detached, do not add node
>>>> to cache if detached).
>>>>
>>>> Reported-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
>>>> Signed-off-by: Frank Rowand <frank.rowand@sony.com>
>>>> ---
>>>
>>> Similarly here can we add:
>>>
>>> Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
>>
>> Yes, thanks.
>>
>>
>>> Cc: stable@vger.kernel.org # v4.17+
>>
>> Nope, 0b3ce78e90fc does not belong in stable (it is a feature, not a bug
>> fix).  So the bug will not be in stable.
> 
> 0b3ce78e90fc landed in v4.17, so Michael's line above is correct.
> Annotating it with 4.17 only saves Greg from trying and then emailing
> us to backport this patch as it wouldn't apply.

Thanks for the correction.  I was both under-thinking and over-thinking,
ending up with an incorrect answer.

Can you add the Cc: to version 3 patch comments (both 1/2 and 2/2) or do
you want me to re-spin?

-Frank

> 
> Rob
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/2] of: __of_detach_node() - remove node from phandle cache
  2018-12-18 20:09           ` Frank Rowand
@ 2018-12-18 20:33             ` Frank Rowand
  -1 siblings, 0 replies; 28+ messages in thread
From: Frank Rowand @ 2018-12-18 20:33 UTC (permalink / raw)
  To: Rob Herring
  Cc: Michael Ellerman, mwb, linuxppc-dev, Tyrel Datwyler, tlfalcon,
	minkim, devicetree, linux-kernel

On 12/18/18 12:09 PM, Frank Rowand wrote:
> On 12/18/18 12:01 PM, Rob Herring wrote:
>> On Tue, Dec 18, 2018 at 12:57 PM Frank Rowand <frowand.list@gmail.com> wrote:
>>>
>>> On 12/17/18 2:52 AM, Michael Ellerman wrote:
>>>> Hi Frank,
>>>>
>>>> frowand.list@gmail.com writes:
>>>>> From: Frank Rowand <frank.rowand@sony.com>
>>>>>
>>>>> Non-overlay dynamic devicetree node removal may leave the node in
>>>>> the phandle cache.  Subsequent calls to of_find_node_by_phandle()
>>>>> will incorrectly find the stale entry.  Remove the node from the
>>>>> cache.
>>>>>
>>>>> Add paranoia checks in of_find_node_by_phandle() as a second level
>>>>> of defense (do not return cached node if detached, do not add node
>>>>> to cache if detached).
>>>>>
>>>>> Reported-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
>>>>> Signed-off-by: Frank Rowand <frank.rowand@sony.com>
>>>>> ---
>>>>
>>>> Similarly here can we add:
>>>>
>>>> Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
>>>
>>> Yes, thanks.
>>>
>>>
>>>> Cc: stable@vger.kernel.org # v4.17+
>>>
>>> Nope, 0b3ce78e90fc does not belong in stable (it is a feature, not a bug
>>> fix).  So the bug will not be in stable.
>>
>> 0b3ce78e90fc landed in v4.17, so Michael's line above is correct.
>> Annotating it with 4.17 only saves Greg from trying and then emailing
>> us to backport this patch as it wouldn't apply.
> 
> Thanks for the correction.  I was both under-thinking and over-thinking,
> ending up with an incorrect answer.
> 
> Can you add the Cc: to version 3 patch comments (both 1/2 and 2/2) or do
> you want me to re-spin?

Now that my thinking has been straightened out, a little bit more checking
for the other pre-requisite patches show:

  v4.18: commit b9952b5218ad ("of: overlay: update phandle cache on overlay apply and remove")
  v4.19: commit e54192b48da7 ("of: fix phandle cache creation for DTs with no phandles")

These can be addressed by changing the "Cc:" to ... # v4.19+
because stable v4.17.* and v4.18.* are end of life.

Or the pre-requisites can be listed:

   # v4.17: b9952b5218ad of: overlay: update phandle cache
   # v4.17: e54192b48da7 of: fix phandle cache creation
   # v4.17

   # v4.18: e54192b48da7 of: fix phandle cache creation
   # v4.18

   # v4.19+

Do you have a preference?

-Frank
> 
> -Frank
> 
>>
>> Rob
>>
> 
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/2] of: __of_detach_node() - remove node from phandle cache
@ 2018-12-18 20:33             ` Frank Rowand
  0 siblings, 0 replies; 28+ messages in thread
From: Frank Rowand @ 2018-12-18 20:33 UTC (permalink / raw)
  To: Rob Herring
  Cc: devicetree, tlfalcon, linux-kernel, mwb, minkim, Tyrel Datwyler,
	linuxppc-dev

On 12/18/18 12:09 PM, Frank Rowand wrote:
> On 12/18/18 12:01 PM, Rob Herring wrote:
>> On Tue, Dec 18, 2018 at 12:57 PM Frank Rowand <frowand.list@gmail.com> wrote:
>>>
>>> On 12/17/18 2:52 AM, Michael Ellerman wrote:
>>>> Hi Frank,
>>>>
>>>> frowand.list@gmail.com writes:
>>>>> From: Frank Rowand <frank.rowand@sony.com>
>>>>>
>>>>> Non-overlay dynamic devicetree node removal may leave the node in
>>>>> the phandle cache.  Subsequent calls to of_find_node_by_phandle()
>>>>> will incorrectly find the stale entry.  Remove the node from the
>>>>> cache.
>>>>>
>>>>> Add paranoia checks in of_find_node_by_phandle() as a second level
>>>>> of defense (do not return cached node if detached, do not add node
>>>>> to cache if detached).
>>>>>
>>>>> Reported-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
>>>>> Signed-off-by: Frank Rowand <frank.rowand@sony.com>
>>>>> ---
>>>>
>>>> Similarly here can we add:
>>>>
>>>> Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
>>>
>>> Yes, thanks.
>>>
>>>
>>>> Cc: stable@vger.kernel.org # v4.17+
>>>
>>> Nope, 0b3ce78e90fc does not belong in stable (it is a feature, not a bug
>>> fix).  So the bug will not be in stable.
>>
>> 0b3ce78e90fc landed in v4.17, so Michael's line above is correct.
>> Annotating it with 4.17 only saves Greg from trying and then emailing
>> us to backport this patch as it wouldn't apply.
> 
> Thanks for the correction.  I was both under-thinking and over-thinking,
> ending up with an incorrect answer.
> 
> Can you add the Cc: to version 3 patch comments (both 1/2 and 2/2) or do
> you want me to re-spin?

Now that my thinking has been straightened out, a little bit more checking
for the other pre-requisite patches show:

  v4.18: commit b9952b5218ad ("of: overlay: update phandle cache on overlay apply and remove")
  v4.19: commit e54192b48da7 ("of: fix phandle cache creation for DTs with no phandles")

These can be addressed by changing the "Cc:" to ... # v4.19+
because stable v4.17.* and v4.18.* are end of life.

Or the pre-requisites can be listed:

   # v4.17: b9952b5218ad of: overlay: update phandle cache
   # v4.17: e54192b48da7 of: fix phandle cache creation
   # v4.17

   # v4.18: e54192b48da7 of: fix phandle cache creation
   # v4.18

   # v4.19+

Do you have a preference?

-Frank
> 
> -Frank
> 
>>
>> Rob
>>
> 
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/2] of: __of_detach_node() - remove node from phandle cache
  2018-12-18 20:33             ` Frank Rowand
@ 2018-12-18 20:58               ` Rob Herring
  -1 siblings, 0 replies; 28+ messages in thread
From: Rob Herring @ 2018-12-18 20:58 UTC (permalink / raw)
  To: Frank Rowand
  Cc: Michael Ellerman, mwb, linuxppc-dev, Tyrel Datwyler, tlfalcon,
	minkim, devicetree, linux-kernel

On Tue, Dec 18, 2018 at 2:33 PM Frank Rowand <frowand.list@gmail.com> wrote:
>
> On 12/18/18 12:09 PM, Frank Rowand wrote:
> > On 12/18/18 12:01 PM, Rob Herring wrote:
> >> On Tue, Dec 18, 2018 at 12:57 PM Frank Rowand <frowand.list@gmail.com> wrote:
> >>>
> >>> On 12/17/18 2:52 AM, Michael Ellerman wrote:
> >>>> Hi Frank,
> >>>>
> >>>> frowand.list@gmail.com writes:
> >>>>> From: Frank Rowand <frank.rowand@sony.com>
> >>>>>
> >>>>> Non-overlay dynamic devicetree node removal may leave the node in
> >>>>> the phandle cache.  Subsequent calls to of_find_node_by_phandle()
> >>>>> will incorrectly find the stale entry.  Remove the node from the
> >>>>> cache.
> >>>>>
> >>>>> Add paranoia checks in of_find_node_by_phandle() as a second level
> >>>>> of defense (do not return cached node if detached, do not add node
> >>>>> to cache if detached).
> >>>>>
> >>>>> Reported-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
> >>>>> Signed-off-by: Frank Rowand <frank.rowand@sony.com>
> >>>>> ---
> >>>>
> >>>> Similarly here can we add:
> >>>>
> >>>> Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
> >>>
> >>> Yes, thanks.
> >>>
> >>>
> >>>> Cc: stable@vger.kernel.org # v4.17+
> >>>
> >>> Nope, 0b3ce78e90fc does not belong in stable (it is a feature, not a bug
> >>> fix).  So the bug will not be in stable.
> >>
> >> 0b3ce78e90fc landed in v4.17, so Michael's line above is correct.
> >> Annotating it with 4.17 only saves Greg from trying and then emailing
> >> us to backport this patch as it wouldn't apply.
> >
> > Thanks for the correction.  I was both under-thinking and over-thinking,
> > ending up with an incorrect answer.
> >
> > Can you add the Cc: to version 3 patch comments (both 1/2 and 2/2) or do
> > you want me to re-spin?
>
> Now that my thinking has been straightened out, a little bit more checking
> for the other pre-requisite patches show:
>
>   v4.18: commit b9952b5218ad ("of: overlay: update phandle cache on overlay apply and remove")
>   v4.19: commit e54192b48da7 ("of: fix phandle cache creation for DTs with no phandles")
>
> These can be addressed by changing the "Cc:" to ... # v4.19+
> because stable v4.17.* and v4.18.* are end of life.

EOL shouldn't factor into it. There's always the possibility someone
else picks any kernel version.

> Or the pre-requisites can be listed:
>
>    # v4.17: b9952b5218ad of: overlay: update phandle cache
>    # v4.17: e54192b48da7 of: fix phandle cache creation
>    # v4.17
>
>    # v4.18: e54192b48da7 of: fix phandle cache creation
>    # v4.18
>
>    # v4.19+
>
> Do you have a preference?

I think we just list v4.17 and be done with it.

Rob

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/2] of: __of_detach_node() - remove node from phandle cache
@ 2018-12-18 20:58               ` Rob Herring
  0 siblings, 0 replies; 28+ messages in thread
From: Rob Herring @ 2018-12-18 20:58 UTC (permalink / raw)
  To: Frank Rowand
  Cc: devicetree, tlfalcon, linux-kernel, mwb, minkim, Tyrel Datwyler,
	linuxppc-dev

On Tue, Dec 18, 2018 at 2:33 PM Frank Rowand <frowand.list@gmail.com> wrote:
>
> On 12/18/18 12:09 PM, Frank Rowand wrote:
> > On 12/18/18 12:01 PM, Rob Herring wrote:
> >> On Tue, Dec 18, 2018 at 12:57 PM Frank Rowand <frowand.list@gmail.com> wrote:
> >>>
> >>> On 12/17/18 2:52 AM, Michael Ellerman wrote:
> >>>> Hi Frank,
> >>>>
> >>>> frowand.list@gmail.com writes:
> >>>>> From: Frank Rowand <frank.rowand@sony.com>
> >>>>>
> >>>>> Non-overlay dynamic devicetree node removal may leave the node in
> >>>>> the phandle cache.  Subsequent calls to of_find_node_by_phandle()
> >>>>> will incorrectly find the stale entry.  Remove the node from the
> >>>>> cache.
> >>>>>
> >>>>> Add paranoia checks in of_find_node_by_phandle() as a second level
> >>>>> of defense (do not return cached node if detached, do not add node
> >>>>> to cache if detached).
> >>>>>
> >>>>> Reported-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
> >>>>> Signed-off-by: Frank Rowand <frank.rowand@sony.com>
> >>>>> ---
> >>>>
> >>>> Similarly here can we add:
> >>>>
> >>>> Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
> >>>
> >>> Yes, thanks.
> >>>
> >>>
> >>>> Cc: stable@vger.kernel.org # v4.17+
> >>>
> >>> Nope, 0b3ce78e90fc does not belong in stable (it is a feature, not a bug
> >>> fix).  So the bug will not be in stable.
> >>
> >> 0b3ce78e90fc landed in v4.17, so Michael's line above is correct.
> >> Annotating it with 4.17 only saves Greg from trying and then emailing
> >> us to backport this patch as it wouldn't apply.
> >
> > Thanks for the correction.  I was both under-thinking and over-thinking,
> > ending up with an incorrect answer.
> >
> > Can you add the Cc: to version 3 patch comments (both 1/2 and 2/2) or do
> > you want me to re-spin?
>
> Now that my thinking has been straightened out, a little bit more checking
> for the other pre-requisite patches show:
>
>   v4.18: commit b9952b5218ad ("of: overlay: update phandle cache on overlay apply and remove")
>   v4.19: commit e54192b48da7 ("of: fix phandle cache creation for DTs with no phandles")
>
> These can be addressed by changing the "Cc:" to ... # v4.19+
> because stable v4.17.* and v4.18.* are end of life.

EOL shouldn't factor into it. There's always the possibility someone
else picks any kernel version.

> Or the pre-requisites can be listed:
>
>    # v4.17: b9952b5218ad of: overlay: update phandle cache
>    # v4.17: e54192b48da7 of: fix phandle cache creation
>    # v4.17
>
>    # v4.18: e54192b48da7 of: fix phandle cache creation
>    # v4.18
>
>    # v4.19+
>
> Do you have a preference?

I think we just list v4.17 and be done with it.

Rob

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/2] of: __of_detach_node() - remove node from phandle cache
  2018-12-18 20:58               ` Rob Herring
  (?)
@ 2018-12-18 23:44                 ` Michael Ellerman
  -1 siblings, 0 replies; 28+ messages in thread
From: Michael Ellerman @ 2018-12-18 23:44 UTC (permalink / raw)
  To: Rob Herring, Frank Rowand
  Cc: mwb, linuxppc-dev, Tyrel Datwyler, tlfalcon, minkim, devicetree,
	linux-kernel

Rob Herring <robh+dt@kernel.org> writes:
> On Tue, Dec 18, 2018 at 2:33 PM Frank Rowand <frowand.list@gmail.com> wrote:
>>
>> On 12/18/18 12:09 PM, Frank Rowand wrote:
>> > On 12/18/18 12:01 PM, Rob Herring wrote:
>> >> On Tue, Dec 18, 2018 at 12:57 PM Frank Rowand <frowand.list@gmail.com> wrote:
>> >>>
>> >>> On 12/17/18 2:52 AM, Michael Ellerman wrote:
>> >>>> Hi Frank,
>> >>>>
>> >>>> frowand.list@gmail.com writes:
>> >>>>> From: Frank Rowand <frank.rowand@sony.com>
>> >>>>>
>> >>>>> Non-overlay dynamic devicetree node removal may leave the node in
>> >>>>> the phandle cache.  Subsequent calls to of_find_node_by_phandle()
>> >>>>> will incorrectly find the stale entry.  Remove the node from the
>> >>>>> cache.
>> >>>>>
>> >>>>> Add paranoia checks in of_find_node_by_phandle() as a second level
>> >>>>> of defense (do not return cached node if detached, do not add node
>> >>>>> to cache if detached).
>> >>>>>
>> >>>>> Reported-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
>> >>>>> Signed-off-by: Frank Rowand <frank.rowand@sony.com>
>> >>>>> ---
>> >>>>
>> >>>> Similarly here can we add:
>> >>>>
>> >>>> Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
>> >>>
>> >>> Yes, thanks.
>> >>>
>> >>>
>> >>>> Cc: stable@vger.kernel.org # v4.17+
>> >>>
>> >>> Nope, 0b3ce78e90fc does not belong in stable (it is a feature, not a bug
>> >>> fix).  So the bug will not be in stable.
>> >>
>> >> 0b3ce78e90fc landed in v4.17, so Michael's line above is correct.
>> >> Annotating it with 4.17 only saves Greg from trying and then emailing
>> >> us to backport this patch as it wouldn't apply.
>> >
>> > Thanks for the correction.  I was both under-thinking and over-thinking,
>> > ending up with an incorrect answer.
>> >
>> > Can you add the Cc: to version 3 patch comments (both 1/2 and 2/2) or do
>> > you want me to re-spin?
>>
>> Now that my thinking has been straightened out, a little bit more checking
>> for the other pre-requisite patches show:
>>
>>   v4.18: commit b9952b5218ad ("of: overlay: update phandle cache on overlay apply and remove")
>>   v4.19: commit e54192b48da7 ("of: fix phandle cache creation for DTs with no phandles")
>>
>> These can be addressed by changing the "Cc:" to ... # v4.19+
>> because stable v4.17.* and v4.18.* are end of life.
>
> EOL shouldn't factor into it. There's always the possibility someone
> else picks any kernel version.

Yeah, there are other stable branches out there, so the tag should point
to the correct version regardless of whether it's currently EOL on
kernel.org.

>> Or the pre-requisites can be listed:
>>
>>    # v4.17: b9952b5218ad of: overlay: update phandle cache
>>    # v4.17: e54192b48da7 of: fix phandle cache creation
>>    # v4.17
>>
>>    # v4.18: e54192b48da7 of: fix phandle cache creation
>>    # v4.18
>>
>>    # v4.19+
>>
>> Do you have a preference?
>
> I think we just list v4.17 and be done with it.

Yep, anyone who wants to backport it can work it out, or ask us.

cheers

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/2] of: __of_detach_node() - remove node from phandle cache
@ 2018-12-18 23:44                 ` Michael Ellerman
  0 siblings, 0 replies; 28+ messages in thread
From: Michael Ellerman @ 2018-12-18 23:44 UTC (permalink / raw)
  To: Rob Herring, Frank Rowand
  Cc: mwb, linuxppc-dev, Tyrel Datwyler, tlfalcon, minkim, devicetree,
	linux-kernel

Rob Herring <robh+dt@kernel.org> writes:
> On Tue, Dec 18, 2018 at 2:33 PM Frank Rowand <frowand.list@gmail.com> wrote:
>>
>> On 12/18/18 12:09 PM, Frank Rowand wrote:
>> > On 12/18/18 12:01 PM, Rob Herring wrote:
>> >> On Tue, Dec 18, 2018 at 12:57 PM Frank Rowand <frowand.list@gmail.com> wrote:
>> >>>
>> >>> On 12/17/18 2:52 AM, Michael Ellerman wrote:
>> >>>> Hi Frank,
>> >>>>
>> >>>> frowand.list@gmail.com writes:
>> >>>>> From: Frank Rowand <frank.rowand@sony.com>
>> >>>>>
>> >>>>> Non-overlay dynamic devicetree node removal may leave the node in
>> >>>>> the phandle cache.  Subsequent calls to of_find_node_by_phandle()
>> >>>>> will incorrectly find the stale entry.  Remove the node from the
>> >>>>> cache.
>> >>>>>
>> >>>>> Add paranoia checks in of_find_node_by_phandle() as a second level
>> >>>>> of defense (do not return cached node if detached, do not add node
>> >>>>> to cache if detached).
>> >>>>>
>> >>>>> Reported-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
>> >>>>> Signed-off-by: Frank Rowand <frank.rowand@sony.com>
>> >>>>> ---
>> >>>>
>> >>>> Similarly here can we add:
>> >>>>
>> >>>> Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
>> >>>
>> >>> Yes, thanks.
>> >>>
>> >>>
>> >>>> Cc: stable@vger.kernel.org # v4.17+
>> >>>
>> >>> Nope, 0b3ce78e90fc does not belong in stable (it is a feature, not a bug
>> >>> fix).  So the bug will not be in stable.
>> >>
>> >> 0b3ce78e90fc landed in v4.17, so Michael's line above is correct.
>> >> Annotating it with 4.17 only saves Greg from trying and then emailing
>> >> us to backport this patch as it wouldn't apply.
>> >
>> > Thanks for the correction.  I was both under-thinking and over-thinking,
>> > ending up with an incorrect answer.
>> >
>> > Can you add the Cc: to version 3 patch comments (both 1/2 and 2/2) or do
>> > you want me to re-spin?
>>
>> Now that my thinking has been straightened out, a little bit more checking
>> for the other pre-requisite patches show:
>>
>>   v4.18: commit b9952b5218ad ("of: overlay: update phandle cache on overlay apply and remove")
>>   v4.19: commit e54192b48da7 ("of: fix phandle cache creation for DTs with no phandles")
>>
>> These can be addressed by changing the "Cc:" to ... # v4.19+
>> because stable v4.17.* and v4.18.* are end of life.
>
> EOL shouldn't factor into it. There's always the possibility someone
> else picks any kernel version.

Yeah, there are other stable branches out there, so the tag should point
to the correct version regardless of whether it's currently EOL on
kernel.org.

>> Or the pre-requisites can be listed:
>>
>>    # v4.17: b9952b5218ad of: overlay: update phandle cache
>>    # v4.17: e54192b48da7 of: fix phandle cache creation
>>    # v4.17
>>
>>    # v4.18: e54192b48da7 of: fix phandle cache creation
>>    # v4.18
>>
>>    # v4.19+
>>
>> Do you have a preference?
>
> I think we just list v4.17 and be done with it.

Yep, anyone who wants to backport it can work it out, or ask us.

cheers

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/2] of: __of_detach_node() - remove node from phandle cache
@ 2018-12-18 23:44                 ` Michael Ellerman
  0 siblings, 0 replies; 28+ messages in thread
From: Michael Ellerman @ 2018-12-18 23:44 UTC (permalink / raw)
  To: Rob Herring, Frank Rowand
  Cc: devicetree, tlfalcon, linux-kernel, mwb, minkim, Tyrel Datwyler,
	linuxppc-dev

Rob Herring <robh+dt@kernel.org> writes:
> On Tue, Dec 18, 2018 at 2:33 PM Frank Rowand <frowand.list@gmail.com> wrote:
>>
>> On 12/18/18 12:09 PM, Frank Rowand wrote:
>> > On 12/18/18 12:01 PM, Rob Herring wrote:
>> >> On Tue, Dec 18, 2018 at 12:57 PM Frank Rowand <frowand.list@gmail.com> wrote:
>> >>>
>> >>> On 12/17/18 2:52 AM, Michael Ellerman wrote:
>> >>>> Hi Frank,
>> >>>>
>> >>>> frowand.list@gmail.com writes:
>> >>>>> From: Frank Rowand <frank.rowand@sony.com>
>> >>>>>
>> >>>>> Non-overlay dynamic devicetree node removal may leave the node in
>> >>>>> the phandle cache.  Subsequent calls to of_find_node_by_phandle()
>> >>>>> will incorrectly find the stale entry.  Remove the node from the
>> >>>>> cache.
>> >>>>>
>> >>>>> Add paranoia checks in of_find_node_by_phandle() as a second level
>> >>>>> of defense (do not return cached node if detached, do not add node
>> >>>>> to cache if detached).
>> >>>>>
>> >>>>> Reported-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
>> >>>>> Signed-off-by: Frank Rowand <frank.rowand@sony.com>
>> >>>>> ---
>> >>>>
>> >>>> Similarly here can we add:
>> >>>>
>> >>>> Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
>> >>>
>> >>> Yes, thanks.
>> >>>
>> >>>
>> >>>> Cc: stable@vger.kernel.org # v4.17+
>> >>>
>> >>> Nope, 0b3ce78e90fc does not belong in stable (it is a feature, not a bug
>> >>> fix).  So the bug will not be in stable.
>> >>
>> >> 0b3ce78e90fc landed in v4.17, so Michael's line above is correct.
>> >> Annotating it with 4.17 only saves Greg from trying and then emailing
>> >> us to backport this patch as it wouldn't apply.
>> >
>> > Thanks for the correction.  I was both under-thinking and over-thinking,
>> > ending up with an incorrect answer.
>> >
>> > Can you add the Cc: to version 3 patch comments (both 1/2 and 2/2) or do
>> > you want me to re-spin?
>>
>> Now that my thinking has been straightened out, a little bit more checking
>> for the other pre-requisite patches show:
>>
>>   v4.18: commit b9952b5218ad ("of: overlay: update phandle cache on overlay apply and remove")
>>   v4.19: commit e54192b48da7 ("of: fix phandle cache creation for DTs with no phandles")
>>
>> These can be addressed by changing the "Cc:" to ... # v4.19+
>> because stable v4.17.* and v4.18.* are end of life.
>
> EOL shouldn't factor into it. There's always the possibility someone
> else picks any kernel version.

Yeah, there are other stable branches out there, so the tag should point
to the correct version regardless of whether it's currently EOL on
kernel.org.

>> Or the pre-requisites can be listed:
>>
>>    # v4.17: b9952b5218ad of: overlay: update phandle cache
>>    # v4.17: e54192b48da7 of: fix phandle cache creation
>>    # v4.17
>>
>>    # v4.18: e54192b48da7 of: fix phandle cache creation
>>    # v4.18
>>
>>    # v4.19+
>>
>> Do you have a preference?
>
> I think we just list v4.17 and be done with it.

Yep, anyone who wants to backport it can work it out, or ask us.

cheers

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/2] of: phandle_cache, fix refcounts, remove stale entry
  2018-12-18 15:43   ` Rob Herring
  (?)
@ 2018-12-18 23:46     ` Michael Ellerman
  -1 siblings, 0 replies; 28+ messages in thread
From: Michael Ellerman @ 2018-12-18 23:46 UTC (permalink / raw)
  To: Rob Herring, Frank Rowand
  Cc: mwb, linuxppc-dev, Tyrel Datwyler, tlfalcon, minkim, devicetree,
	linux-kernel

Rob Herring <robh+dt@kernel.org> writes:
> On Mon, Dec 17, 2018 at 1:56 AM <frowand.list@gmail.com> wrote:
>>
>> From: Frank Rowand <frank.rowand@sony.com>
>>
>> Non-overlay dynamic devicetree node removal may leave the node in
>> the phandle cache.  Subsequent calls to of_find_node_by_phandle()
>> will incorrectly find the stale entry.  This bug exposed the foloowing
>> phandle cache refcount bug.
>>
>> The refcount of phandle_cache entries is not incremented while in
>> the cache, allowing use after free error after kfree() of the
>> cached entry.
>>
>> Changes since v1:
>>   - make __of_free_phandle_cache() static
>>   - add WARN_ON(1) for unexpected condition in of_find_node_by_phandle()
>>
>> Frank Rowand (2):
>>   of: of_node_get()/of_node_put() nodes held in phandle cache
>>   of: __of_detach_node() - remove node from phandle cache
>
> I'll send this to Linus this week if I get a tested by. Otherwise, it
> will go in for 4.21.

I think it can wait to go into 4.21, it's not super critical and it's
not a regression since 4.19.

cheers

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/2] of: phandle_cache, fix refcounts, remove stale entry
@ 2018-12-18 23:46     ` Michael Ellerman
  0 siblings, 0 replies; 28+ messages in thread
From: Michael Ellerman @ 2018-12-18 23:46 UTC (permalink / raw)
  To: Rob Herring, Frank Rowand
  Cc: mwb, linuxppc-dev, Tyrel Datwyler, tlfalcon, minkim, devicetree,
	linux-kernel

Rob Herring <robh+dt@kernel.org> writes:
> On Mon, Dec 17, 2018 at 1:56 AM <frowand.list@gmail.com> wrote:
>>
>> From: Frank Rowand <frank.rowand@sony.com>
>>
>> Non-overlay dynamic devicetree node removal may leave the node in
>> the phandle cache.  Subsequent calls to of_find_node_by_phandle()
>> will incorrectly find the stale entry.  This bug exposed the foloowing
>> phandle cache refcount bug.
>>
>> The refcount of phandle_cache entries is not incremented while in
>> the cache, allowing use after free error after kfree() of the
>> cached entry.
>>
>> Changes since v1:
>>   - make __of_free_phandle_cache() static
>>   - add WARN_ON(1) for unexpected condition in of_find_node_by_phandle()
>>
>> Frank Rowand (2):
>>   of: of_node_get()/of_node_put() nodes held in phandle cache
>>   of: __of_detach_node() - remove node from phandle cache
>
> I'll send this to Linus this week if I get a tested by. Otherwise, it
> will go in for 4.21.

I think it can wait to go into 4.21, it's not super critical and it's
not a regression since 4.19.

cheers

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/2] of: phandle_cache, fix refcounts, remove stale entry
@ 2018-12-18 23:46     ` Michael Ellerman
  0 siblings, 0 replies; 28+ messages in thread
From: Michael Ellerman @ 2018-12-18 23:46 UTC (permalink / raw)
  To: Rob Herring, Frank Rowand
  Cc: devicetree, tlfalcon, linux-kernel, mwb, minkim, Tyrel Datwyler,
	linuxppc-dev

Rob Herring <robh+dt@kernel.org> writes:
> On Mon, Dec 17, 2018 at 1:56 AM <frowand.list@gmail.com> wrote:
>>
>> From: Frank Rowand <frank.rowand@sony.com>
>>
>> Non-overlay dynamic devicetree node removal may leave the node in
>> the phandle cache.  Subsequent calls to of_find_node_by_phandle()
>> will incorrectly find the stale entry.  This bug exposed the foloowing
>> phandle cache refcount bug.
>>
>> The refcount of phandle_cache entries is not incremented while in
>> the cache, allowing use after free error after kfree() of the
>> cached entry.
>>
>> Changes since v1:
>>   - make __of_free_phandle_cache() static
>>   - add WARN_ON(1) for unexpected condition in of_find_node_by_phandle()
>>
>> Frank Rowand (2):
>>   of: of_node_get()/of_node_put() nodes held in phandle cache
>>   of: __of_detach_node() - remove node from phandle cache
>
> I'll send this to Linus this week if I get a tested by. Otherwise, it
> will go in for 4.21.

I think it can wait to go into 4.21, it's not super critical and it's
not a regression since 4.19.

cheers

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2018-12-18 23:49 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-17  7:56 [PATCH v2 0/2] of: phandle_cache, fix refcounts, remove stale entry frowand.list
2018-12-17  7:56 ` frowand.list
2018-12-17  7:56 ` [PATCH v2 1/2] of: of_node_get()/of_node_put() nodes held in phandle cache frowand.list
2018-12-17  7:56   ` frowand.list
2018-12-17 10:43   ` Michael Ellerman
2018-12-17 10:43     ` Michael Ellerman
2018-12-17  7:56 ` [PATCH v2 2/2] of: __of_detach_node() - remove node from " frowand.list
2018-12-17  7:56   ` frowand.list
2018-12-17 10:52   ` Michael Ellerman
2018-12-17 10:52     ` Michael Ellerman
2018-12-18 18:57     ` Frank Rowand
2018-12-18 18:57       ` Frank Rowand
2018-12-18 20:01       ` Rob Herring
2018-12-18 20:01         ` Rob Herring
2018-12-18 20:09         ` Frank Rowand
2018-12-18 20:09           ` Frank Rowand
2018-12-18 20:33           ` Frank Rowand
2018-12-18 20:33             ` Frank Rowand
2018-12-18 20:58             ` Rob Herring
2018-12-18 20:58               ` Rob Herring
2018-12-18 23:44               ` Michael Ellerman
2018-12-18 23:44                 ` Michael Ellerman
2018-12-18 23:44                 ` Michael Ellerman
2018-12-18 15:43 ` [PATCH v2 0/2] of: phandle_cache, fix refcounts, remove stale entry Rob Herring
2018-12-18 15:43   ` Rob Herring
2018-12-18 23:46   ` Michael Ellerman
2018-12-18 23:46     ` Michael Ellerman
2018-12-18 23:46     ` Michael Ellerman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.