[v5,1/3] of: cache phandle nodes to reduce cost of of_find_node_by_phandle()
diff mbox series

Message ID 1520208889-3908-2-git-send-email-frowand.list@gmail.com
State New, archived
Headers show
Series
  • of: cache phandle nodes to reduce cost of of_find_node_by_phandle()
Related show

Commit Message

Frank Rowand March 5, 2018, 12:14 a.m. UTC
From: Frank Rowand <frank.rowand@sony.com>

Create a cache of the nodes that contain a phandle property.  Use this
cache to find the node for a given phandle value instead of scanning
the devicetree to find the node.  If the phandle value is not found
in the cache, of_find_node_by_phandle() will fall back to the tree
scan algorithm.

The cache is initialized in of_core_init().

The cache is freed via a late_initcall_sync() if modules are not
enabled.

If the devicetree is created by the dtc compiler, with all phandle
property values auto generated, then the size required by the cache
could be 4 * (1 + number of phandles) bytes.  This results in an O(1)
node lookup cost for a given phandle value.  Due to a concern that the
phandle property values might not be consistent with what is generated
by the dtc compiler, a mask has been added to the cache lookup algorithm.
To maintain the O(1) node lookup cost, the size of the cache has been
increased by rounding the number of entries up to the next power of
two.

The overhead of finding the devicetree node containing a given phandle
value has been noted by several people in the recent past, in some cases
with a patch to add a hashed index of devicetree nodes, based on the
phandle value of the node.  One concern with this approach is the extra
space added to each node.  This patch takes advantage of the phandle
property values auto generated by the dtc compiler, which begin with
one and monotonically increase by one, resulting in a range of 1..n
for n phandle values.  This implementation should also provide a good
reduction of overhead for any range of phandle values that are mostly
in a monotonic range.

Performance measurements by Chintan Pandya <cpandya@codeaurora.org>
of several implementations of patches that are similar to this one
suggest an expected reduction of boot time by ~400ms for his test
system.  If the cache size was decreased to 64 entries, the boot
time was reduced by ~340 ms.  The measurements were on a 4.9.73 kernel
for arch/arm64/boot/dts/qcom/sda670-mtp.dts, contains 2371 nodes and
814 phandle values.

Reported-by: Chintan Pandya <cpandya@codeaurora.org>
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
---

Changes since v3:
  - of_populate_phandle_cache(): add check for failed memory allocation

Changes since v2:
  - add mask to calculation of phandle cache entry
  - which results in better overhead reduction for devicetrees with
    phandle properties not allocated in the monotonically increasing
    range of 1..n
  - due to mask, number of entries in cache potentially increased to
    next power of two
  - minor fixes as suggested by reviewers
  - no longer using live_tree_max_phandle() so do not move it from
    drivers/of/resolver.c to drivers/of/base.c

Changes since v1:
  - change short description from
    of: cache phandle nodes to reduce cost of of_find_node_by_phandle()
  - rebase on v4.16-rc1
  - reorder new functions in base.c to avoid forward declaration
  - add locking around kfree(phandle_cache) for memory ordering
  - add explicit check for non-null of phandle_cache in
    of_find_node_by_phandle().  There is already a check for !handle,
    which prevents accessing a null phandle_cache, but that dependency
    is not obvious, so this check makes it more apparent.
  - do not free phandle_cache if modules are enabled, so that
    cached phandles will be available when modules are loaded

 drivers/of/base.c       | 86 ++++++++++++++++++++++++++++++++++++++++++++++---
 drivers/of/of_private.h |  3 ++
 drivers/of/resolver.c   |  3 --
 3 files changed, 85 insertions(+), 7 deletions(-)

Comments

Rob Herring March 9, 2018, 11:03 p.m. UTC | #1
On Sun, Mar 04, 2018 at 04:14:47PM -0800, frowand.list@gmail.com wrote:
> From: Frank Rowand <frank.rowand@sony.com>
> 
> Create a cache of the nodes that contain a phandle property.  Use this
> cache to find the node for a given phandle value instead of scanning
> the devicetree to find the node.  If the phandle value is not found
> in the cache, of_find_node_by_phandle() will fall back to the tree
> scan algorithm.
> 
> The cache is initialized in of_core_init().
> 
> The cache is freed via a late_initcall_sync() if modules are not
> enabled.
> 
> If the devicetree is created by the dtc compiler, with all phandle
> property values auto generated, then the size required by the cache
> could be 4 * (1 + number of phandles) bytes.  This results in an O(1)
> node lookup cost for a given phandle value.  Due to a concern that the
> phandle property values might not be consistent with what is generated
> by the dtc compiler, a mask has been added to the cache lookup algorithm.
> To maintain the O(1) node lookup cost, the size of the cache has been
> increased by rounding the number of entries up to the next power of
> two.
> 
> The overhead of finding the devicetree node containing a given phandle
> value has been noted by several people in the recent past, in some cases
> with a patch to add a hashed index of devicetree nodes, based on the
> phandle value of the node.  One concern with this approach is the extra
> space added to each node.  This patch takes advantage of the phandle
> property values auto generated by the dtc compiler, which begin with
> one and monotonically increase by one, resulting in a range of 1..n
> for n phandle values.  This implementation should also provide a good
> reduction of overhead for any range of phandle values that are mostly
> in a monotonic range.
> 
> Performance measurements by Chintan Pandya <cpandya@codeaurora.org>
> of several implementations of patches that are similar to this one
> suggest an expected reduction of boot time by ~400ms for his test
> system.  If the cache size was decreased to 64 entries, the boot
> time was reduced by ~340 ms.  The measurements were on a 4.9.73 kernel
> for arch/arm64/boot/dts/qcom/sda670-mtp.dts, contains 2371 nodes and
> 814 phandle values.
> 
> Reported-by: Chintan Pandya <cpandya@codeaurora.org>
> Signed-off-by: Frank Rowand <frank.rowand@sony.com>
> ---

I've applied this one, but not the others.

Rob
Frank Rowand March 10, 2018, 1:20 a.m. UTC | #2
On 03/09/18 15:03, Rob Herring wrote:
> On Sun, Mar 04, 2018 at 04:14:47PM -0800, frowand.list@gmail.com wrote:
>> From: Frank Rowand <frank.rowand@sony.com>
>>
>> Create a cache of the nodes that contain a phandle property.  Use this
>> cache to find the node for a given phandle value instead of scanning
>> the devicetree to find the node.  If the phandle value is not found
>> in the cache, of_find_node_by_phandle() will fall back to the tree
>> scan algorithm.
>>
>> The cache is initialized in of_core_init().
>>
>> The cache is freed via a late_initcall_sync() if modules are not
>> enabled.
>>
>> If the devicetree is created by the dtc compiler, with all phandle
>> property values auto generated, then the size required by the cache
>> could be 4 * (1 + number of phandles) bytes.  This results in an O(1)
>> node lookup cost for a given phandle value.  Due to a concern that the
>> phandle property values might not be consistent with what is generated
>> by the dtc compiler, a mask has been added to the cache lookup algorithm.
>> To maintain the O(1) node lookup cost, the size of the cache has been
>> increased by rounding the number of entries up to the next power of
>> two.
>>
>> The overhead of finding the devicetree node containing a given phandle
>> value has been noted by several people in the recent past, in some cases
>> with a patch to add a hashed index of devicetree nodes, based on the
>> phandle value of the node.  One concern with this approach is the extra
>> space added to each node.  This patch takes advantage of the phandle
>> property values auto generated by the dtc compiler, which begin with
>> one and monotonically increase by one, resulting in a range of 1..n
>> for n phandle values.  This implementation should also provide a good
>> reduction of overhead for any range of phandle values that are mostly
>> in a monotonic range.
>>
>> Performance measurements by Chintan Pandya <cpandya@codeaurora.org>
>> of several implementations of patches that are similar to this one
>> suggest an expected reduction of boot time by ~400ms for his test
>> system.  If the cache size was decreased to 64 entries, the boot
>> time was reduced by ~340 ms.  The measurements were on a 4.9.73 kernel
>> for arch/arm64/boot/dts/qcom/sda670-mtp.dts, contains 2371 nodes and
>> 814 phandle values.
>>
>> Reported-by: Chintan Pandya <cpandya@codeaurora.org>
>> Signed-off-by: Frank Rowand <frank.rowand@sony.com>
>> ---
> 
> I've applied this one, but not the others.
> 
> Rob
> 

Thank you.  I'll let the other two disappear into a distant memory.

-Frank
Alan Tull June 12, 2018, 6:16 p.m. UTC | #3
On Sun, Mar 4, 2018 at 6:14 PM,  <frowand.list@gmail.com> wrote:

Hi Frank,

I'm investigating a refcount use-after-free warning that happens after
overlays are applied, removed, reapplied a few (typically three) times
(see below).  This is new in v4.17, didn't happen in v4.16.  As I was
investigating I found that rebuilding the phandle_cache after overlays
are applied or removed seems to help.  I exported
of_populate_phandle_cache() and called it after overlays were applied
or removed such as the snippet below, it seemed to fix the problem.
I'll keep digging to understand the problem better.

diff --git a/drivers/of/base.c b/drivers/of/base.c
index 848f549..4184273 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -102,7 +102,7 @@ static u32 phandle_cache_mask;
  *   - the phandle lookup overhead reduction provided by the cache
  *     will likely be less
  */
-static void of_populate_phandle_cache(void)
+void of_populate_phandle_cache(void)
 {
        unsigned long flags;
        u32 cache_entries;
@@ -133,6 +133,7 @@ static void of_populate_phandle_cache(void)
 out:
        raw_spin_unlock_irqrestore(&devtree_lock, flags);
 }
+EXPORT_SYMBOL_GPL(of_populate_phandle_cache);

 #ifndef CONFIG_MODULES
 static int __init of_free_phandle_cache(void)
diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
index 7baa53e..55caf42 100644
--- a/drivers/of/overlay.c
+++ b/drivers/of/overlay.c
@@ -885,6 +885,8 @@ int of_overlay_fdt_apply(const void *overlay_fdt,
u32 overlay_fdt_size,
                goto out;
        }

+       of_populate_phandle_cache();
+
        return 0;


@@ -1070,6 +1072,7 @@ int of_overlay_remove(int *ovcs_id)
        }

        free_overlay_changeset(ovcs);
+       of_populate_phandle_cache();

 out_unlock:
        mutex_unlock(&of_mutex);
diff --git a/include/linux/of.h b/include/linux/of.h
index 4d25e4f..a662d4e 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -1342,6 +1342,9 @@ static inline int
of_reconfig_get_state_change(unsigned long action,
 }
 #endif /* CONFIG_OF_DYNAMIC */

+//todo locate this more correctly, just testing for now
+void of_populate_phandle_cache(void);
+
 /**
  * of_device_is_system_power_controller - Tells if
system-power-controller is found for device_node
  * @np: Pointer to the given device_node


Dump (with my added pr_err's in of_node_get) without the above snippet
to help out.

[  226.115974] OF: Got from phandle_cache np=/soc/base_fpga_region/
[  226.121956] OF: about to get np=/soc/base_fpga_region/
[  226.127073] ------------[ cut here ]------------
[  226.131682] WARNING: CPU: 1 PID: 1529 at
/home/atull/repos/linux-socfpga/lib/refcount.c:153
refcount_inc+0x4c/0x50
[  226.141988] refcount_t: increment on 0; use-after-free.
[  226.147191] Modules linked in:
[  226.150241] CPU: 1 PID: 1529 Comm: python Not tainted
4.17.0-00134-gb6fd158 #10
[  226.157521] Hardware name: Altera SOCFPGA Arria10
[  226.162223] [<c01132d8>] (unwind_backtrace) from [<c010defc>]
(show_stack+0x20/0x24)
[  226.169943] [<c010defc>] (show_stack) from [<c07be4e0>]
(dump_stack+0x8c/0xa0)
[  226.177146] [<c07be4e0>] (dump_stack) from [<c0123620>] (__warn+0x104/0x11c)
[  226.184170] [<c0123620>] (__warn) from [<c012368c>]
(warn_slowpath_fmt+0x54/0x70)
[  226.191631] [<c012368c>] (warn_slowpath_fmt) from [<c0481ac8>]
(refcount_inc+0x4c/0x50)
[  226.199613] [<c0481ac8>] (refcount_inc) from [<c07c2ff4>]
(kobject_get+0x2c/0x5c)
[  226.207073] [<c07c2ff4>] (kobject_get) from [<c06485bc>]
(of_node_get.part.0+0x30/0x44)
[  226.215050] [<c06485bc>] (of_node_get.part.0) from [<c06485f8>]
(of_node_get+0x28/0x2c)
[  226.223028] [<c06485f8>] (of_node_get) from [<c06434c8>]
(of_find_node_by_phandle+0xa0/0xec)
[  226.231438] [<c06434c8>] (of_find_node_by_phandle) from
[<c06435c4>] (of_phandle_iterator_next+0xb0/0x178)
[  226.241057] [<c06435c4>] (of_phandle_iterator_next) from
[<c0643f48>] (__of_parse_phandle_with_args+0x50/0xf8)
[  226.251022] [<c0643f48>] (__of_parse_phandle_with_args) from
[<c0644038>] (of_parse_phandle+0x48/0x78)
[  226.260298] [<c0644038>] (of_parse_phandle) from [<c0657424>]
(of_fpga_region_get_bridges+0x140/0x1d0)
[  226.269572] [<c0657424>] (of_fpga_region_get_bridges) from
[<c0657070>] (fpga_region_program_fpga+0x98/0x184)
[  226.279450] [<c0657070>] (fpga_region_program_fpga) from
[<c06578b4>] (of_fpga_region_notify+0x2c4/0x340)
[  226.288984] [<c06578b4>] (of_fpga_region_notify) from [<c01472f4>]
(notifier_call_chain+0x54/0x94)
[  226.297913] [<c01472f4>] (notifier_call_chain) from [<c01476d8>]
(__blocking_notifier_call_chain+0x58/0x70)
[  226.307619] [<c01476d8>] (__blocking_notifier_call_chain) from
[<c0147718>] (blocking_notifier_call_chain+0x28/0x30)
[  226.318105] [<c0147718>] (blocking_notifier_call_chain) from
[<c064e930>] (overlay_notify+0x8c/0xec)
[  226.327206] [<c064e930>] (overlay_notify) from [<c064f2f4>]
(of_overlay_fdt_apply+0x41c/0x714)
[  226.335791] [<c064f2f4>] (of_overlay_fdt_apply) from [<c06481dc>]
(cfs_overlay_item_dtbo_write+0x68/0xbc)
[  226.345327] [<c06481dc>] (cfs_overlay_item_dtbo_write) from
[<c02e3d38>] (configfs_release_bin_file+0x6c/0xa0)
[  226.355294] [<c02e3d38>] (configfs_release_bin_file) from
[<c026b920>] (__fput+0x94/0x1e4)
[  226.363530] [<c026b920>] (__fput) from [<c026bae0>] (____fput+0x18/0x1c)
[  226.370207] [<c026bae0>] (____fput) from [<c0143aa8>]
(task_work_run+0xb4/0xd8)
[  226.377492] [<c0143aa8>] (task_work_run) from [<c010d438>]
(do_work_pending+0xac/0xcc)
[  226.385382] [<c010d438>] (do_work_pending) from [<c010106c>]
(slow_work_pending+0xc/0x20)

Alan

> From: Frank Rowand <frank.rowand@sony.com>
>
> Create a cache of the nodes that contain a phandle property.  Use this
> cache to find the node for a given phandle value instead of scanning
> the devicetree to find the node.  If the phandle value is not found
> in the cache, of_find_node_by_phandle() will fall back to the tree
> scan algorithm.
>
> The cache is initialized in of_core_init().
>
> The cache is freed via a late_initcall_sync() if modules are not
> enabled.
>
> If the devicetree is created by the dtc compiler, with all phandle
> property values auto generated, then the size required by the cache
> could be 4 * (1 + number of phandles) bytes.  This results in an O(1)
> node lookup cost for a given phandle value.  Due to a concern that the
> phandle property values might not be consistent with what is generated
> by the dtc compiler, a mask has been added to the cache lookup algorithm.
> To maintain the O(1) node lookup cost, the size of the cache has been
> increased by rounding the number of entries up to the next power of
> two.
>
> The overhead of finding the devicetree node containing a given phandle
> value has been noted by several people in the recent past, in some cases
> with a patch to add a hashed index of devicetree nodes, based on the
> phandle value of the node.  One concern with this approach is the extra
> space added to each node.  This patch takes advantage of the phandle
> property values auto generated by the dtc compiler, which begin with
> one and monotonically increase by one, resulting in a range of 1..n
> for n phandle values.  This implementation should also provide a good
> reduction of overhead for any range of phandle values that are mostly
> in a monotonic range.
>
> Performance measurements by Chintan Pandya <cpandya@codeaurora.org>
> of several implementations of patches that are similar to this one
> suggest an expected reduction of boot time by ~400ms for his test
> system.  If the cache size was decreased to 64 entries, the boot
> time was reduced by ~340 ms.  The measurements were on a 4.9.73 kernel
> for arch/arm64/boot/dts/qcom/sda670-mtp.dts, contains 2371 nodes and
> 814 phandle values.
>
> Reported-by: Chintan Pandya <cpandya@codeaurora.org>
> Signed-off-by: Frank Rowand <frank.rowand@sony.com>
> ---
>
> Changes since v3:
>   - of_populate_phandle_cache(): add check for failed memory allocation
>
> Changes since v2:
>   - add mask to calculation of phandle cache entry
>   - which results in better overhead reduction for devicetrees with
>     phandle properties not allocated in the monotonically increasing
>     range of 1..n
>   - due to mask, number of entries in cache potentially increased to
>     next power of two
>   - minor fixes as suggested by reviewers
>   - no longer using live_tree_max_phandle() so do not move it from
>     drivers/of/resolver.c to drivers/of/base.c
>
> Changes since v1:
>   - change short description from
>     of: cache phandle nodes to reduce cost of of_find_node_by_phandle()
>   - rebase on v4.16-rc1
>   - reorder new functions in base.c to avoid forward declaration
>   - add locking around kfree(phandle_cache) for memory ordering
>   - add explicit check for non-null of phandle_cache in
>     of_find_node_by_phandle().  There is already a check for !handle,
>     which prevents accessing a null phandle_cache, but that dependency
>     is not obvious, so this check makes it more apparent.
>   - do not free phandle_cache if modules are enabled, so that
>     cached phandles will be available when modules are loaded
>
>  drivers/of/base.c       | 86 ++++++++++++++++++++++++++++++++++++++++++++++---
>  drivers/of/of_private.h |  3 ++
>  drivers/of/resolver.c   |  3 --
>  3 files changed, 85 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/of/base.c b/drivers/of/base.c
> index ad28de96e13f..e71d157d7149 100644
> --- a/drivers/of/base.c
> +++ b/drivers/of/base.c
> @@ -91,10 +91,72 @@ int __weak of_node_to_nid(struct device_node *np)
>  }
>  #endif
>
> +static struct device_node **phandle_cache;
> +static u32 phandle_cache_mask;
> +
> +/*
> + * Assumptions behind phandle_cache implementation:
> + *   - phandle property values are in a contiguous range of 1..n
> + *
> + * If the assumptions do not hold, then
> + *   - the phandle lookup overhead reduction provided by the cache
> + *     will likely be less
> + */
> +static void of_populate_phandle_cache(void)
> +{
> +       unsigned long flags;
> +       u32 cache_entries;
> +       struct device_node *np;
> +       u32 phandles = 0;
> +
> +       raw_spin_lock_irqsave(&devtree_lock, flags);
> +
> +       kfree(phandle_cache);
> +       phandle_cache = NULL;
> +
> +       for_each_of_allnodes(np)
> +               if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL)
> +                       phandles++;
> +
> +       cache_entries = roundup_pow_of_two(phandles);
> +       phandle_cache_mask = cache_entries - 1;
> +
> +       phandle_cache = kcalloc(cache_entries, sizeof(*phandle_cache),
> +                               GFP_ATOMIC);
> +       if (!phandle_cache)
> +               goto out;
> +
> +       for_each_of_allnodes(np)
> +               if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL)
> +                       phandle_cache[np->phandle & phandle_cache_mask] = np;
> +
> +out:
> +       raw_spin_unlock_irqrestore(&devtree_lock, flags);
> +}
> +
> +#ifndef CONFIG_MODULES
> +static int __init of_free_phandle_cache(void)
> +{
> +       unsigned long flags;
> +
> +       raw_spin_lock_irqsave(&devtree_lock, flags);
> +
> +       kfree(phandle_cache);
> +       phandle_cache = NULL;
> +
> +       raw_spin_unlock_irqrestore(&devtree_lock, flags);
> +
> +       return 0;
> +}
> +late_initcall_sync(of_free_phandle_cache);
> +#endif
> +
>  void __init of_core_init(void)
>  {
>         struct device_node *np;
>
> +       of_populate_phandle_cache();
> +
>         /* Create the kset, and register existing nodes */
>         mutex_lock(&of_mutex);
>         of_kset = kset_create_and_add("devicetree", NULL, firmware_kobj);
> @@ -1021,16 +1083,32 @@ int of_modalias_node(struct device_node *node, char *modalias, int len)
>   */
>  struct device_node *of_find_node_by_phandle(phandle handle)
>  {
> -       struct device_node *np;
> +       struct device_node *np = NULL;
>         unsigned long flags;
> +       phandle masked_handle;
>
>         if (!handle)
>                 return NULL;
>
>         raw_spin_lock_irqsave(&devtree_lock, flags);
> -       for_each_of_allnodes(np)
> -               if (np->phandle == handle)
> -                       break;
> +
> +       masked_handle = handle & phandle_cache_mask;
> +
> +       if (phandle_cache) {
> +               if (phandle_cache[masked_handle] &&
> +                   handle == phandle_cache[masked_handle]->phandle)
> +                       np = phandle_cache[masked_handle];
> +       }
> +
> +       if (!np) {
> +               for_each_of_allnodes(np)
> +                       if (np->phandle == handle) {
> +                               if (phandle_cache)
> +                                       phandle_cache[masked_handle] = np;
> +                               break;
> +                       }
> +       }
> +
>         of_node_get(np);
>         raw_spin_unlock_irqrestore(&devtree_lock, flags);
>         return np;
> diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
> index 0c609e7d0334..fa70650136b4 100644
> --- a/drivers/of/of_private.h
> +++ b/drivers/of/of_private.h
> @@ -131,6 +131,9 @@ extern void __of_update_property_sysfs(struct device_node *np,
>  extern void __of_sysfs_remove_bin_file(struct device_node *np,
>                                        struct property *prop);
>
> +/* illegal phandle value (set when unresolved) */
> +#define OF_PHANDLE_ILLEGAL     0xdeadbeef
> +
>  /* iterators for transactions, used for overlays */
>  /* forward iterator */
>  #define for_each_transaction_entry(_oft, _te) \
> diff --git a/drivers/of/resolver.c b/drivers/of/resolver.c
> index 740d19bde601..b2ca8185c8c6 100644
> --- a/drivers/of/resolver.c
> +++ b/drivers/of/resolver.c
> @@ -19,9 +19,6 @@
>
>  #include "of_private.h"
>
> -/* illegal phandle value (set when unresolved) */
> -#define OF_PHANDLE_ILLEGAL     0xdeadbeef
> -
>  static phandle live_tree_max_phandle(void)
>  {
>         struct device_node *node;
> --
> Frank Rowand <frank.rowand@sony.com>
>
Alan Tull June 13, 2018, 2:42 p.m. UTC | #4
On Tue, Jun 12, 2018 at 1:16 PM, Alan Tull <atull@kernel.org> wrote:
> On Sun, Mar 4, 2018 at 6:14 PM,  <frowand.list@gmail.com> wrote:
>
> Hi Frank,
>
> I'm investigating a refcount use-after-free warning that happens after
> overlays are applied, removed, reapplied a few (typically three) times
> (see below).  This is new in v4.17, didn't happen in v4.16.  As I was
> investigating I found that rebuilding the phandle_cache after overlays
> are applied or removed seems to help.

I was probably wrong about this.  The more I look at the phandle_cache code,
the more it looks looks good and straightforward.  Probably disabling
phandle_cache is 'fixing' things through some weird side effect.  I'll
keep investigating.  Sorry for the noise.

Alan
Frank Rowand June 13, 2018, 9:47 p.m. UTC | #5
On 06/13/18 07:42, Alan Tull wrote:
> On Tue, Jun 12, 2018 at 1:16 PM, Alan Tull <atull@kernel.org> wrote:
>> On Sun, Mar 4, 2018 at 6:14 PM,  <frowand.list@gmail.com> wrote:
>>
>> Hi Frank,
>>
>> I'm investigating a refcount use-after-free warning that happens after
>> overlays are applied, removed, reapplied a few (typically three) times
>> (see below).  This is new in v4.17, didn't happen in v4.16.  As I was
>> investigating I found that rebuilding the phandle_cache after overlays
>> are applied or removed seems to help.
> 
> I was probably wrong about this.  The more I look at the phandle_cache code,
> the more it looks looks good and straightforward.  Probably disabling
> phandle_cache is 'fixing' things through some weird side effect.  I'll
> keep investigating.  Sorry for the noise.

I suspect that you have found an issue, even if it is not the cause of
the refcount issue.  I noted in a reply to v4 of the patch:

  >> +static void of_populate_phandle_cache(void)
  >> +{
  >> +Â Â Â  unsigned long flags;
  >> +Â Â Â  u32 cache_entries;
  >> +Â Â Â  struct device_node *np;
  >> +Â Â Â  u32 phandles = 0;
  >> +
  >> +Â Â Â  raw_spin_lock_irqsave(&devtree_lock, flags);
  >> +
  >> +Â Â Â  kfree(phandle_cache);
  > 
  > I couldn't understood this. Everything else looks good to me.

  I will be adding a call to of_populate_phandle_cache() from the
  devicetree overlay code.  I put the kfree here so that the previous
  cache memory is freed when a new cache is created.
  
  Adding the call from the overlay code is not done in this
  series because I have a patch series modifying overlays and
  I do not want to create a conflict or ordering between that
  series and that patch.  The lack of the call from overlay
  code means that overlay code will gain some of the overhead
  reduction from this patch, but possibly not the entire reduction.

Sorry I'm not giving a link to the archive of this message - I have
a class I have to go to so I don't have enough time to find it.  The
email was

  Subject: Re: [PATCH v3] of: cache phandle nodes to reduce cost of
  of_find_node_by_phandle()
  Date: Fri, 16 Feb 2018 14:20:22 -0800
  Message-ID: <46d5fc76-33e3-d54a-26b8-e9bb8332924d@gmail.com>

Quickly looking at the current code, I don't see the overlay patch
that I mentioned.  I have to dig into what happened to that.

Leaving a phandle from an overlay in the phandle cache after the
overlay is removed would clearly be a bug.

-Frank
Alan Tull June 14, 2018, 8:59 p.m. UTC | #6
On Wed, Jun 13, 2018 at 4:47 PM, Frank Rowand <frowand.list@gmail.com> wrote:
> On 06/13/18 07:42, Alan Tull wrote:
>> On Tue, Jun 12, 2018 at 1:16 PM, Alan Tull <atull@kernel.org> wrote:
>>> On Sun, Mar 4, 2018 at 6:14 PM,  <frowand.list@gmail.com> wrote:
>>>
>>> Hi Frank,
>>>
>>> I'm investigating a refcount use-after-free warning that happens after
>>> overlays are applied, removed, reapplied a few (typically three) times
>>> (see below).  This is new in v4.17, didn't happen in v4.16.  As I was
>>> investigating I found that rebuilding the phandle_cache after overlays
>>> are applied or removed seems to help.
>>
>> I was probably wrong about this.  The more I look at the phandle_cache code,
>> the more it looks looks good and straightforward.  Probably disabling
>> phandle_cache is 'fixing' things through some weird side effect.  I'll
>> keep investigating.  Sorry for the noise.
>
> I suspect that you have found an issue, even if it is not the cause of
> the refcount issue.  I noted in a reply to v4 of the patch:
>
>   >> +static void of_populate_phandle_cache(void)
>   >> +{
>   >> +Â Â Â  unsigned long flags;
>   >> +Â Â Â  u32 cache_entries;
>   >> +Â Â Â  struct device_node *np;
>   >> +Â Â Â  u32 phandles = 0;
>   >> +
>   >> +Â Â Â  raw_spin_lock_irqsave(&devtree_lock, flags);
>   >> +
>   >> +Â Â Â  kfree(phandle_cache);
>   >
>   > I couldn't understood this. Everything else looks good to me.
>
>   I will be adding a call to of_populate_phandle_cache() from the
>   devicetree overlay code.  I put the kfree here so that the previous
>   cache memory is freed when a new cache is created.
>
>   Adding the call from the overlay code is not done in this
>   series because I have a patch series modifying overlays and
>   I do not want to create a conflict or ordering between that
>   series and that patch.  The lack of the call from overlay
>   code means that overlay code will gain some of the overhead
>   reduction from this patch, but possibly not the entire reduction.
>
> Sorry I'm not giving a link to the archive of this message - I have
> a class I have to go to so I don't have enough time to find it.

I understand, that's cool.  I've found the email chain, thanks for
pointing me to it!

>  The
> email was
>
>   Subject: Re: [PATCH v3] of: cache phandle nodes to reduce cost of
>   of_find_node_by_phandle()
>   Date: Fri, 16 Feb 2018 14:20:22 -0800
>   Message-ID: <46d5fc76-33e3-d54a-26b8-e9bb8332924d@gmail.com>
>
> Quickly looking at the current code, I don't see the overlay patch
> that I mentioned.  I have to dig into what happened to that.
>
> Leaving a phandle from an overlay in the phandle cache after the
> overlay is removed would clearly be a bug.

Yes that's totally it, when I get a match of a phandle and the np
address is stale, then comes the kernel warning.

I added more debug code to print out the results of looking up dynamic
nodes only (nodes > the highest node added when the cache was
originally built).  Also I added debug code to also force the lookup
code to go back and look up dynamic nodes the slow, uncached way so I
could see if it got the correct np address.  I'm unloading and
reloading an overlay, so sometimes the new np (after the overlay was
removed and reapplied) is at the same address, but sometimes it isn't
and then I get the kernel warning.  Just for entertainment value, here
is my prints of the np from the cache and the correct np that' at a
different address.  The '88' is the phandle, after that is the np
address, then the np=<full_name>.  Note also the corruption in the
cached node's full_name

OF: of_find_node_by_phandle line 1108 found dynamic node in cache 88
ee518680 np=/soc/base_fpga_region/\300\252~\356ze_controller@0x100000450

OF: of_find_node_by_phandle line 1119 found the slow way: 88 eeaf2d00
np=/soc/base_fpga_region/freeze_controller@0x100000450

I'm just scratching my head and wondering: should we be zeroing out
some things in memory after an overlay is removed?  Then this bug
would led to a pointer that would have thrown an oops.

If it's useful for you, next week I can send a cleaned up version of
my patch that rebuilds the cache after overlays are applied or
removed.

Alan

>
> -Frank
Finn Thain Aug. 30, 2018, 12:44 a.m. UTC | #7
Hi Frank,

Linux v4.17 and later will no longer boot on a G3 PowerMac. The boot hangs 
very early, before any video driver loads.

Stan and I were able to bisect the regression between v4.16 and v4.17 and 
arrived at commit 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of 
of_find_node_by_phandle()").

I don't see any obvious bug in 0b3ce78e90fc or b9952b5218ad. But if you 
revert these from v4.18 (which is also affected) that certainly resolves 
the issue.

I did see this in the kernel messages:

Duplicate name in PowerPC,750, renamed to "l2-cache#1"
Duplicate name in mac-io, renamed to "ide#1"
Duplicate name in ide#1, renamed to "atapi-disk#1"
Duplicate name in multifunc-device, renamed to "pci1799,1#1"

No idea whether that's relevant; I haven't done any further investigation. 
Complete dmesg output is attached. Please let me know if there's any more 
information you need to help find the bug.

Thanks.
Rob Herring Aug. 30, 2018, 1:05 a.m. UTC | #8
On Wed, Aug 29, 2018 at 7:44 PM Finn Thain <fthain@telegraphics.com.au> wrote:
>
> Hi Frank,
>
> Linux v4.17 and later will no longer boot on a G3 PowerMac. The boot hangs
> very early, before any video driver loads.
>
> Stan and I were able to bisect the regression between v4.16 and v4.17 and
> arrived at commit 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of
> of_find_node_by_phandle()").
>
> I don't see any obvious bug in 0b3ce78e90fc or b9952b5218ad. But if you
> revert these from v4.18 (which is also affected) that certainly resolves
> the issue.

Perhaps a bad assumption on phandle values causing a problem. Can you
provide a dump of all the phandle or linux,phandle values from
/proc/device-tree.

Rob
Benjamin Herrenschmidt Aug. 31, 2018, 4:35 a.m. UTC | #9
On Thu, 2018-08-30 at 20:39 -0600, Mac User wrote:
> On 8/29/18 7:05 PM, Rob Herring wrote:
> 
> > On Wed, Aug 29, 2018 at 7:44 PM Finn Thain <fthain@telegraphics.com.au> wrote:
> > > Hi Frank,
> > > 
> > > Linux v4.17 and later will no longer boot on a G3 PowerMac. The boot hangs
> > > very early, before any video driver loads.
> > > 
> > > Stan and I were able to bisect the regression between v4.16 and v4.17 and
> > > arrived at commit 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of
> > > of_find_node_by_phandle()").
> > > 
> > > I don't see any obvious bug in 0b3ce78e90fc or b9952b5218ad. But if you
> > > revert these from v4.18 (which is also affected) that certainly resolves
> > > the issue.
> > 
> > Perhaps a bad assumption on phandle values causing a problem. Can you
> > provide a dump of all the phandle or linux,phandle values from
> > /proc/device-tree.
> > 
> > Rob
> 
> Rob,
> 
> As suggested by Finn, I installed device-tree-compiler and
> powerpc-ibm-utils.
>  
> Running "dtc -I fs -H both /sys/firmware/devicetree/base"
> resulted in the following errors:
> 
> DTC: fs->dts on file "/sys/firmware/devicetree/base"
> ERROR (name_properties): "name" property in
> /pci/multifunc-device/pci1799,1#1 is incorrect ("pci1799,1" instead of
> base node name)
> ERROR (name_properties): "name" property in /pci/mac-io/ide#1 is
> incorrect ("ide" instead of base node name)
> ERROR (name_properties): "name" property in
> /pci/mac-io/ide#1/atapi-disk#1 is incorrect ("atapi-disk" instead of
> base node name)
> ERROR (name_properties): "name" property in /cpus/PowerPC,750/l2-cache#1
> is incorrect ("l2-cache" instead of base node name)
> ERROR: Input tree has errors, aborting (use -f to force output)
> 
> If I force output with "-f", the resulting file has no occurrences 
> of "phandle".

Are you booting with BootX or Open Firmware ?

> Running "lsprop /proc/device-tree | grep -i phandle" results in no 
> output.
> 
> Please let me know if there's some other way to get information that 
> would be helpful.
> 
> thanks
> 
> -Stan
Frank Rowand Aug. 31, 2018, 4:36 a.m. UTC | #10
Hi Finn,

On 08/29/18 17:44, Finn Thain wrote:
> Hi Frank,
> 
> Linux v4.17 and later will no longer boot on a G3 PowerMac. The boot hangs 
> very early, before any video driver loads.
> 
> Stan and I were able to bisect the regression between v4.16 and v4.17 and 
> arrived at commit 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of 
> of_find_node_by_phandle()").
> 
> I don't see any obvious bug in 0b3ce78e90fc or b9952b5218ad. But if you 
> revert these from v4.18 (which is also affected) that certainly resolves 
> the issue.
> 
> I did see this in the kernel messages:
> 
> Duplicate name in PowerPC,750, renamed to "l2-cache#1"
> Duplicate name in mac-io, renamed to "ide#1"
> Duplicate name in ide#1, renamed to "atapi-disk#1"
> Duplicate name in multifunc-device, renamed to "pci1799,1#1"
> 
> No idea whether that's relevant; I haven't done any further investigation. 
> Complete dmesg output is attached. Please let me know if there's any more 
> information you need to help find the bug.
> 
> Thanks.

I don't have any useful answers yet, but I am following the thread and have
also quickly scanned the two commits for any obvious cause.  I will look
into this some more, but have a few other tasks that I need to complete
first.

A long shot, but something to consider, is that I failed to cover the
cases of dynamic devicetree updates (removing nodes that contain a
phandle) in ways other than overlays.  Michael Ellerman has reported
such a problem for powerpc/mobility with of_detach_node().  A patch to
fix that is one of the tasks I need to complete.

-Frank
Benjamin Herrenschmidt Aug. 31, 2018, 4:49 a.m. UTC | #11
On Fri, 2018-08-31 at 14:35 +1000, Benjamin Herrenschmidt wrote:
> 
> > If I force output with "-f", the resulting file has no occurrences 
> > of "phandle".
> 
> Are you booting with BootX or Open Firmware ?

Assuming you are using BootX (or miBoot), can you try this patch ?

--- a/arch/powerpc/platforms/powermac/bootx_init.c
+++ b/arch/powerpc/platforms/powermac/bootx_init.c
@@ -37,6 +37,7 @@ static unsigned long __initdata bootx_dt_strend;
 static unsigned long __initdata bootx_node_chosen;
 static boot_infos_t * __initdata bootx_info;
 static char __initdata bootx_disp_path[256];
+static int __initdata bootx_phandle;
 
 /* Is boot-info compatible ? */
 #define BOOT_INFO_IS_COMPATIBLE(bi) \
@@ -258,6 +259,8 @@ static void __init bootx_scan_dt_build_strings(unsigned long base,
                namep = pp->name ? (char *)(base + pp->name) : NULL;
                if (namep == NULL || strcmp(namep, "name") == 0)
                        goto next;
+               if (!strcmp(namep, "phandle") || !strcmp(namep, "linux,phandle"))
+                       bootx_phandle = -1;
                /* get/create string entry */
                soff = bootx_dt_find_string(namep);
                if (soff == 0)
@@ -310,6 +313,7 @@ static void __init bootx_scan_dt_build_struct(unsigned long base,
        *mem_end = _ALIGN_UP((unsigned long)lp + 1, 4);
 
        /* get and store all properties */
+       has_phandle = false;
        while (*ppp) {
                struct bootx_dt_prop *pp =
                        (struct bootx_dt_prop *)(base + *ppp);
@@ -330,6 +334,12 @@ static void __init bootx_scan_dt_build_struct(unsigned long base,
                ppp = &pp->next;
        }
 
+       /* add a phandle */
+       if (bootx_phandle > 0) {
+               bootx_dt_add_prop("phandle", &bootx_phandle, 4, mem_end);
+               bootx_phandle++;
+       }
+
        if (node == bootx_node_chosen) {
                bootx_add_chosen_props(base, mem_end);
                if (bootx_info->dispDeviceRegEntryOffset == 0)
@@ -385,6 +395,8 @@ static unsigned long __init bootx_flatten_dt(unsigned long start)
        bootx_dt_add_string("linux,bootx-height", &mem_end);
        bootx_dt_add_string("linux,bootx-linebytes", &mem_end);
        bootx_dt_add_string("linux,bootx-addr", &mem_end);
+       if (bootx_phandle > 0)
+               bootx_dt_add_string("phandle", &mem_end);
        /* Wrap up strings */
        hdr->off_dt_strings = bootx_dt_strbase - mem_start;
        hdr->dt_strings_size = bootx_dt_strend - bootx_dt_strbase;
@@ -482,6 +494,7 @@ void __init bootx_init(unsigned long r3, unsigned long r4)
        bootx_dt_strbase = bootx_dt_strend = 0;
        bootx_node_chosen = 0;
        bootx_disp_path[0] = 0;
+       bootx_phandle = 1;
 
        if (!BOOT_INFO_IS_V2_COMPATIBLE(bi))
                bi->logicalDisplayBase = bi->dispDeviceBase;
Benjamin Herrenschmidt Aug. 31, 2018, 4:49 a.m. UTC | #12
On Fri, 2018-08-31 at 14:35 +1000, Benjamin Herrenschmidt wrote:
> 
> > If I force output with "-f", the resulting file has no occurrences 
> > of "phandle".
> 
> Are you booting with BootX or Open Firmware ?

Assuming you are using BootX (or miBoot), can you try this patch ?

--- a/arch/powerpc/platforms/powermac/bootx_init.c
+++ b/arch/powerpc/platforms/powermac/bootx_init.c
@@ -37,6 +37,7 @@ static unsigned long __initdata bootx_dt_strend;
 static unsigned long __initdata bootx_node_chosen;
 static boot_infos_t * __initdata bootx_info;
 static char __initdata bootx_disp_path[256];
+static int __initdata bootx_phandle;
 
 /* Is boot-info compatible ? */
 #define BOOT_INFO_IS_COMPATIBLE(bi) \
@@ -258,6 +259,8 @@ static void __init bootx_scan_dt_build_strings(unsigned long base,
                namep = pp->name ? (char *)(base + pp->name) : NULL;
                if (namep == NULL || strcmp(namep, "name") == 0)
                        goto next;
+               if (!strcmp(namep, "phandle") || !strcmp(namep, "linux,phandle"))
+                       bootx_phandle = -1;
                /* get/create string entry */
                soff = bootx_dt_find_string(namep);
                if (soff == 0)
@@ -330,6 +333,12 @@ static void __init bootx_scan_dt_build_struct(unsigned long base,
                ppp = &pp->next;
        }
 
+       /* add a phandle */
+       if (bootx_phandle > 0) {
+               bootx_dt_add_prop("phandle", &bootx_phandle, 4, mem_end);
+               bootx_phandle++;
+       }
+
        if (node == bootx_node_chosen) {
                bootx_add_chosen_props(base, mem_end);
                if (bootx_info->dispDeviceRegEntryOffset == 0)
@@ -385,6 +394,8 @@ static unsigned long __init bootx_flatten_dt(unsigned long start)
        bootx_dt_add_string("linux,bootx-height", &mem_end);
        bootx_dt_add_string("linux,bootx-linebytes", &mem_end);
        bootx_dt_add_string("linux,bootx-addr", &mem_end);
+       if (bootx_phandle > 0)
+               bootx_dt_add_string("phandle", &mem_end);
        /* Wrap up strings */
        hdr->off_dt_strings = bootx_dt_strbase - mem_start;
        hdr->dt_strings_size = bootx_dt_strend - bootx_dt_strbase;
@@ -482,6 +493,7 @@ void __init bootx_init(unsigned long r3, unsigned long r4)
        bootx_dt_strbase = bootx_dt_strend = 0;
        bootx_node_chosen = 0;
        bootx_disp_path[0] = 0;
+       bootx_phandle = 1;
 
        if (!BOOT_INFO_IS_V2_COMPATIBLE(bi))
                bi->logicalDisplayBase = bi->dispDeviceBase;
Benjamin Herrenschmidt Aug. 31, 2018, 4:58 a.m. UTC | #13
On Thu, 2018-08-30 at 21:36 -0700, Frank Rowand wrote:
> > No idea whether that's relevant; I haven't done any further investigation. 
> > Complete dmesg output is attached. Please let me know if there's any more 
> > information you need to help find the bug.
> > 
> > Thanks.
> 
> I don't have any useful answers yet, but I am following the thread and have
> also quickly scanned the two commits for any obvious cause.  I will look
> into this some more, but have a few other tasks that I need to complete
> first.
> 
> A long shot, but something to consider, is that I failed to cover the
> cases of dynamic devicetree updates (removing nodes that contain a
> phandle) in ways other than overlays.  Michael Ellerman has reported
> such a problem for powerpc/mobility with of_detach_node().  A patch to
> fix that is one of the tasks I need to complete.

The only thing I can think of is booting via the BootX bootloader on
those ancient macs results in a DT with no phandles. I didn't see an
obvious reason why that would cause that patch to break though.

Cheers,
Ben.
Benjamin Herrenschmidt Sept. 9, 2018, 5:04 p.m. UTC | #14
On Fri, 2018-08-31 at 14:58 +1000, Benjamin Herrenschmidt wrote:
> 
> > A long shot, but something to consider, is that I failed to cover the
> > cases of dynamic devicetree updates (removing nodes that contain a
> > phandle) in ways other than overlays.  Michael Ellerman has reported
> > such a problem for powerpc/mobility with of_detach_node().  A patch to
> > fix that is one of the tasks I need to complete.
> 
> The only thing I can think of is booting via the BootX bootloader on
> those ancient macs results in a DT with no phandles. I didn't see an
> obvious reason why that would cause that patch to break though.

Guys, we still don't have a fix for this one on its way upstream...

My test patch just creates phandle properties for all nodes, that was
not intended as a fix, more a way to check if the problem was related
to the lack of phandles.

I don't actually know why the new code causes things to fail when
phandles are absent. This needs to be looked at.

I'm travelling at the moment and generally caught up with other things,
I haven't had a chance to dig, so just a heads up. I don't intend to
submit my patch since it's just a band aid. We need to figure out what
the actual problem is.

Cheers,
Ben.
Frank Rowand Sept. 9, 2018, 11:52 p.m. UTC | #15
On 09/09/18 10:04, Benjamin Herrenschmidt wrote:
> On Fri, 2018-08-31 at 14:58 +1000, Benjamin Herrenschmidt wrote:
>>
>>> A long shot, but something to consider, is that I failed to cover the
>>> cases of dynamic devicetree updates (removing nodes that contain a
>>> phandle) in ways other than overlays.  Michael Ellerman has reported
>>> such a problem for powerpc/mobility with of_detach_node().  A patch to
>>> fix that is one of the tasks I need to complete.
>>
>> The only thing I can think of is booting via the BootX bootloader on
>> those ancient macs results in a DT with no phandles. I didn't see an
>> obvious reason why that would cause that patch to break though.
> 
> Guys, we still don't have a fix for this one on its way upstream...
> 
> My test patch just creates phandle properties for all nodes, that was
> not intended as a fix, more a way to check if the problem was related
> to the lack of phandles.
> 
> I don't actually know why the new code causes things to fail when
> phandles are absent. This needs to be looked at.
> 
> I'm travelling at the moment and generally caught up with other things,
> I haven't had a chance to dig, so just a heads up. I don't intend to
> submit my patch since it's just a band aid. We need to figure out what
> the actual problem is.
> 
> Cheers,
> Ben.

Thanks for chasing after this, and the current heads up.

I agree that we need to understand what is going on and do a real
fix.

-Frank
Rob Herring Sept. 10, 2018, 12:53 p.m. UTC | #16
On Sun, Sep 09, 2018 at 07:04:25PM +0200, Benjamin Herrenschmidt wrote:
> On Fri, 2018-08-31 at 14:58 +1000, Benjamin Herrenschmidt wrote:
> > 
> > > A long shot, but something to consider, is that I failed to cover the
> > > cases of dynamic devicetree updates (removing nodes that contain a
> > > phandle) in ways other than overlays.  Michael Ellerman has reported
> > > such a problem for powerpc/mobility with of_detach_node().  A patch to
> > > fix that is one of the tasks I need to complete.
> > 
> > The only thing I can think of is booting via the BootX bootloader on
> > those ancient macs results in a DT with no phandles. I didn't see an
> > obvious reason why that would cause that patch to break though.
> 
> Guys, we still don't have a fix for this one on its way upstream...
> 
> My test patch just creates phandle properties for all nodes, that was
> not intended as a fix, more a way to check if the problem was related
> to the lack of phandles.
> 
> I don't actually know why the new code causes things to fail when
> phandles are absent. This needs to be looked at.
> 
> I'm travelling at the moment and generally caught up with other things,
> I haven't had a chance to dig, so just a heads up. I don't intend to
> submit my patch since it's just a band aid. We need to figure out what
> the actual problem is.

Can you try this patch (w/o Ben's patch). I think the problem is if 
there are no phandles, then roundup_pow_of_two is passed 0 which is 
documented as undefined result.

Though, if a DT has no properties with phandles, then why are we doing a 
lookup in the first place?


8<----------------------------------------------------------------------

diff --git a/drivers/of/base.c b/drivers/of/base.c
index 9095b8290150..74eaedd5b860 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -140,6 +140,9 @@ void of_populate_phandle_cache(void)
 		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL)
 			phandles++;
 
+	if (!phandles)
+		goto out;
+
 	cache_entries = roundup_pow_of_two(phandles);
 	phandle_cache_mask = cache_entries - 1;
Frank Rowand Sept. 11, 2018, 3:53 p.m. UTC | #17
On 09/10/18 05:53, Rob Herring wrote:
> On Sun, Sep 09, 2018 at 07:04:25PM +0200, Benjamin Herrenschmidt wrote:
>> On Fri, 2018-08-31 at 14:58 +1000, Benjamin Herrenschmidt wrote:
>>>
>>>> A long shot, but something to consider, is that I failed to cover the
>>>> cases of dynamic devicetree updates (removing nodes that contain a
>>>> phandle) in ways other than overlays.  Michael Ellerman has reported
>>>> such a problem for powerpc/mobility with of_detach_node().  A patch to
>>>> fix that is one of the tasks I need to complete.
>>>
>>> The only thing I can think of is booting via the BootX bootloader on
>>> those ancient macs results in a DT with no phandles. I didn't see an
>>> obvious reason why that would cause that patch to break though.
>>
>> Guys, we still don't have a fix for this one on its way upstream...
>>
>> My test patch just creates phandle properties for all nodes, that was
>> not intended as a fix, more a way to check if the problem was related
>> to the lack of phandles.
>>
>> I don't actually know why the new code causes things to fail when
>> phandles are absent. This needs to be looked at.
>>
>> I'm travelling at the moment and generally caught up with other things,
>> I haven't had a chance to dig, so just a heads up. I don't intend to
>> submit my patch since it's just a band aid. We need to figure out what
>> the actual problem is.
> 
> Can you try this patch (w/o Ben's patch). I think the problem is if 
> there are no phandles, then roundup_pow_of_two is passed 0 which is 
> documented as undefined result.
> 
> Though, if a DT has no properties with phandles, then why are we doing a 
> lookup in the first place?
> 
> 
> 8<----------------------------------------------------------------------
> 
> diff --git a/drivers/of/base.c b/drivers/of/base.c
> index 9095b8290150..74eaedd5b860 100644
> --- a/drivers/of/base.c
> +++ b/drivers/of/base.c
> @@ -140,6 +140,9 @@ void of_populate_phandle_cache(void)
>  		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL)
>  			phandles++;
>  
> +	if (!phandles)
> +		goto out;
> +
>  	cache_entries = roundup_pow_of_two(phandles);
>  	phandle_cache_mask = cache_entries - 1;
>  
> 

Thanks Rob!  That fix makes sense, and the test results look
promising.

Reviewed-by: Frank Rowand <frank.rowand@sony.com>
Finn Thain Sept. 12, 2018, 12:15 a.m. UTC | #18
[The Cc list got pruned so I'm forwarding Stan's reply for the benefit of 
the list archives and any other interested parties.]

On Mon, 10 Sep 2018, Stan Johnson wrote:

> On 9/10/18 6:53 AM, Rob Herring wrote:
> 
> > ...
> > Can you try this patch (w/o Ben's patch). I think the problem is if 
> > there are no phandles, then roundup_pow_of_two is passed 0 which is 
> > documented as undefined result.
> >
> > Though, if a DT has no properties with phandles, then why are we doing a 
> > lookup in the first place?
> >
> >
> > 8<----------------------------------------------------------------------
> >
> > diff --git a/drivers/of/base.c b/drivers/of/base.c
> > index 9095b8290150..74eaedd5b860 100644
> > --- a/drivers/of/base.c
> > +++ b/drivers/of/base.c
> > @@ -140,6 +140,9 @@ void of_populate_phandle_cache(void)
> >  		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL)
> >  			phandles++;
> >  
> > +	if (!phandles)
> > +		goto out;
> > +
> >  	cache_entries = roundup_pow_of_two(phandles);
> >  	phandle_cache_mask = cache_entries - 1;
> >  
> 
> Using the attached .config file, I first compiled the stock 4.18
> kernel from kernel.org; it hung at the Mac OS background screen
> on my Beige G3 Desktop using the BootX extension and the BootX.app
> from within MacOS.  I then applied the above patch, and it booted
> without any problems from both the BootX extension and using the
> BootX.app from within MacOS.
> 
> -Stan
> 
>

Patch
diff mbox series

diff --git a/drivers/of/base.c b/drivers/of/base.c
index ad28de96e13f..e71d157d7149 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -91,10 +91,72 @@  int __weak of_node_to_nid(struct device_node *np)
 }
 #endif
 
+static struct device_node **phandle_cache;
+static u32 phandle_cache_mask;
+
+/*
+ * Assumptions behind phandle_cache implementation:
+ *   - phandle property values are in a contiguous range of 1..n
+ *
+ * If the assumptions do not hold, then
+ *   - the phandle lookup overhead reduction provided by the cache
+ *     will likely be less
+ */
+static void of_populate_phandle_cache(void)
+{
+	unsigned long flags;
+	u32 cache_entries;
+	struct device_node *np;
+	u32 phandles = 0;
+
+	raw_spin_lock_irqsave(&devtree_lock, flags);
+
+	kfree(phandle_cache);
+	phandle_cache = NULL;
+
+	for_each_of_allnodes(np)
+		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL)
+			phandles++;
+
+	cache_entries = roundup_pow_of_two(phandles);
+	phandle_cache_mask = cache_entries - 1;
+
+	phandle_cache = kcalloc(cache_entries, sizeof(*phandle_cache),
+				GFP_ATOMIC);
+	if (!phandle_cache)
+		goto out;
+
+	for_each_of_allnodes(np)
+		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL)
+			phandle_cache[np->phandle & phandle_cache_mask] = np;
+
+out:
+	raw_spin_unlock_irqrestore(&devtree_lock, flags);
+}
+
+#ifndef CONFIG_MODULES
+static int __init of_free_phandle_cache(void)
+{
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&devtree_lock, flags);
+
+	kfree(phandle_cache);
+	phandle_cache = NULL;
+
+	raw_spin_unlock_irqrestore(&devtree_lock, flags);
+
+	return 0;
+}
+late_initcall_sync(of_free_phandle_cache);
+#endif
+
 void __init of_core_init(void)
 {
 	struct device_node *np;
 
+	of_populate_phandle_cache();
+
 	/* Create the kset, and register existing nodes */
 	mutex_lock(&of_mutex);
 	of_kset = kset_create_and_add("devicetree", NULL, firmware_kobj);
@@ -1021,16 +1083,32 @@  int of_modalias_node(struct device_node *node, char *modalias, int len)
  */
 struct device_node *of_find_node_by_phandle(phandle handle)
 {
-	struct device_node *np;
+	struct device_node *np = NULL;
 	unsigned long flags;
+	phandle masked_handle;
 
 	if (!handle)
 		return NULL;
 
 	raw_spin_lock_irqsave(&devtree_lock, flags);
-	for_each_of_allnodes(np)
-		if (np->phandle == handle)
-			break;
+
+	masked_handle = handle & phandle_cache_mask;
+
+	if (phandle_cache) {
+		if (phandle_cache[masked_handle] &&
+		    handle == phandle_cache[masked_handle]->phandle)
+			np = phandle_cache[masked_handle];
+	}
+
+	if (!np) {
+		for_each_of_allnodes(np)
+			if (np->phandle == handle) {
+				if (phandle_cache)
+					phandle_cache[masked_handle] = np;
+				break;
+			}
+	}
+
 	of_node_get(np);
 	raw_spin_unlock_irqrestore(&devtree_lock, flags);
 	return np;
diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index 0c609e7d0334..fa70650136b4 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -131,6 +131,9 @@  extern void __of_update_property_sysfs(struct device_node *np,
 extern void __of_sysfs_remove_bin_file(struct device_node *np,
 				       struct property *prop);
 
+/* illegal phandle value (set when unresolved) */
+#define OF_PHANDLE_ILLEGAL	0xdeadbeef
+
 /* iterators for transactions, used for overlays */
 /* forward iterator */
 #define for_each_transaction_entry(_oft, _te) \
diff --git a/drivers/of/resolver.c b/drivers/of/resolver.c
index 740d19bde601..b2ca8185c8c6 100644
--- a/drivers/of/resolver.c
+++ b/drivers/of/resolver.c
@@ -19,9 +19,6 @@ 
 
 #include "of_private.h"
 
-/* illegal phandle value (set when unresolved) */
-#define OF_PHANDLE_ILLEGAL	0xdeadbeef
-
 static phandle live_tree_max_phandle(void)
 {
 	struct device_node *node;