* Re: [PATCH] cacheinfo: Decrement refcount in cache_setup_of_node()
[not found] <20221026185954.991547-1-pierre.gondois@arm.com>
@ 2022-11-15 13:06 ` Geert Uytterhoeven
[not found] ` <CGME20221115193015eucas1p18c4b8ba131b72bc9017a14568a6a54bc@eucas1p1.samsung.com>
0 siblings, 1 reply; 3+ messages in thread
From: Geert Uytterhoeven @ 2022-11-15 13:06 UTC (permalink / raw)
To: Pierre Gondois
Cc: linux-kernel, Rob.Herring, sudeep.holla, Greg Kroah-Hartman,
Rafael J. Wysocki, Linux-Renesas
Hi Pierre,
On Wed, Oct 26, 2022 at 9:03 PM Pierre Gondois <pierre.gondois@arm.com> wrote:
> Refcounts to DT nodes are only incremented in the function
> and never decremented. Decrease the refcounts when necessary.
>
> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Thanks for your patch, which is now commit 3da72e18371c41a6
("cacheinfo: Decrement refcount in cache_setup_of_node()") in
driver-core-next.
I have bisected a refcount underflow during s2ram to this commit:
PM: suspend entry (deep)
Filesystems sync: 0.003 seconds
Freezing user space processes ... (elapsed 0.009 seconds) done.
OOM killer disabled.
Freezing remaining freezable tasks ... (elapsed 0.004 seconds) done.
sd 0:0:0:0: [sda] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Stopping disk
Disabling non-boot CPUs ...
psci: CPU1 killed (polled 0 ms)
------------[ cut here ]------------
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 2 PID: 22 at lib/refcount.c:25
refcount_warn_saturate+0x98/0x140
CPU: 2 PID: 22 Comm: cpuhp/2 Not tainted
6.1.0-rc1-arm64-renesas-00026-g3da72e18371c #1870
Hardware name: Renesas Salvator-X 2nd version board based on r8a77951 (DT)
pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : refcount_warn_saturate+0x98/0x140
lr : refcount_warn_saturate+0x98/0x140
sp : ffff80000a3b3710
x29: ffff80000a3b3710 x28: ffff0a00ffffff05 x27: ffff80000a3b3aa2
x26: ffff8000091c5fe0 x25: ffff800008f2ade8 x24: ffff0006ff7c1bf0
x23: ffff80000a3b3a88 x22: 0000000000000000 x21: ffff80000a3b3aa7
x20: ffff0000ffffff00 x19: ffff0006ff7c1c58 x18: 0000000000000010
x17: 0000000000000004 x16: 0000000000000000 x15: ffff0004c028ca00
x14: 00000000000001e6 x13: ffff0004c028ca00 x12: 00000000ffffffea
x11: 00000000ffffefff x10: 00000000ffffefff x9 : ffff8000095cc850
x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 000000000000bff4
x5 : ffff0006ff704b60 x4 : 0000000000000000 x3 : 0000000000000027
x2 : 0000000000000023 x1 : 0000000000000000 x0 : 0000000000000000
Call trace:
refcount_warn_saturate+0x98/0x140
kobject_get+0x94/0x98
of_node_get+0x1c/0x30
of_fwnode_get+0x30/0x50
fwnode_handle_get+0x2c/0x48
fwnode_get_nth_parent+0x70/0x80
fwnode_full_name_string+0x44/0xb8
device_node_string+0x318/0x3f0
pointer+0x248/0x350
vsnprintf+0x1fc/0x720
vprintk_store+0x104/0x4c8
vprintk_emit+0x104/0x340
vprintk_default+0x34/0x40
vprintk+0xc4/0xe8
_printk+0x5c/0x80
of_node_release+0xc0/0xe0
kobject_put+0xa4/0xf0
of_node_put+0x14/0x20
free_cache_attributes.part.0+0x138/0x190
cacheinfo_cpu_pre_down+0x70/0x88
cpuhp_invoke_callback+0x1d8/0xbc0
cpuhp_thread_fun+0x1b0/0x218
smpboot_thread_fn+0x1b4/0x270
kthread+0x10c/0x118
ret_from_fork+0x10/0x20
irq event stamp: 438
hardirqs last enabled at (437): [<ffff80000827c668>]
put_cpu_partial+0x1e8/0x258
hardirqs last disabled at (438): [<ffff8000080ff52c>]
vprintk_store+0x394/0x4c8
softirqs last enabled at (0): [<ffff8000080820dc>]
copy_process+0x6bc/0x1888
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace 0000000000000000 ]---
------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 2 PID: 22 at lib/refcount.c:28
refcount_warn_saturate+0xec/0x140
CPU: 2 PID: 22 Comm: cpuhp/2 Tainted: G W
6.1.0-rc1-arm64-renesas-00026-g3da72e18371c #1870
Hardware name: Renesas Salvator-X 2nd version board based on r8a77951 (DT)
pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : refcount_warn_saturate+0xec/0x140
lr : refcount_warn_saturate+0xec/0x140
sp : ffff80000a3b3720
x29: ffff80000a3b3720 x28: ffff0a00ffffff05 x27: ffff80000a3b3aa2
x26: ffff8000091c5fe0 x25: ffff800008f2ade8 x24: ffff0006ff7c1bf0
x23: ffff80000a3b3a88 x22: 00000000ffffffff x21: ffff80000a3b3aba
x20: ffff0000ffffff00 x19: ffff0006ff7c1c58 x18: 0000000000000010
x17: 0000000000000004 x16: 0000000000000000 x15: ffff0004c028ca00
x14: 0000000000000219 x13: ffff0004c028ca00 x12: 00000000ffffffea
x11: 00000000ffffefff x10: 00000000ffffefff x9 : ffff8000095cc850
x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 000000000000bff4
x5 : ffff0006ff704b60 x4 : 0000000000000000 x3 : 0000000000000027
x2 : 0000000000000023 x1 : 0000000000000000 x0 : 0000000000000000
Call trace:
refcount_warn_saturate+0xec/0x140
kobject_put+0xe4/0xf0
of_node_put+0x14/0x20
of_fwnode_put+0x30/0x40
fwnode_handle_put.part.0+0x1c/0x28
fwnode_handle_put+0x18/0x28
fwnode_full_name_string+0x94/0xb8
device_node_string+0x318/0x3f0
pointer+0x248/0x350
vsnprintf+0x1fc/0x720
vprintk_store+0x104/0x4c8
vprintk_emit+0x104/0x340
vprintk_default+0x34/0x40
vprintk+0xc4/0xe8
_printk+0x5c/0x80
of_node_release+0xc0/0xe0
kobject_put+0xa4/0xf0
of_node_put+0x14/0x20
free_cache_attributes.part.0+0x138/0x190
cacheinfo_cpu_pre_down+0x70/0x88
cpuhp_invoke_callback+0x1d8/0xbc0
cpuhp_thread_fun+0x1b0/0x218
smpboot_thread_fn+0x1b4/0x270
kthread+0x10c/0x118
ret_from_fork+0x10/0x20
irq event stamp: 438
hardirqs last enabled at (437): [<ffff80000827c668>]
put_cpu_partial+0x1e8/0x258
hardirqs last disabled at (438): [<ffff8000080ff52c>]
vprintk_store+0x394/0x4c8
softirqs last enabled at (0): [<ffff8000080820dc>]
copy_process+0x6bc/0x1888
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace 0000000000000000 ]---
OF: ERROR: Bad of_node_put() on /cpus/cache-controller-0
------------[ cut here ]------------
refcount_t: saturated; leaking memory.
WARNING: CPU: 2 PID: 22 at lib/refcount.c:22
refcount_warn_saturate+0x6c/0x140
CPU: 2 PID: 22 Comm: cpuhp/2 Tainted: G W
6.1.0-rc1-arm64-renesas-00026-g3da72e18371c #1870
Hardware name: Renesas Salvator-X 2nd version board based on r8a77951 (DT)
pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : refcount_warn_saturate+0x6c/0x140
lr : refcount_warn_saturate+0x6c/0x140
sp : ffff80000a3b3670
x29: ffff80000a3b3670 x28: ffff0a00ffffff05 x27: ffff80000a1f8bfa
x26: ffff8000091c5fe0 x25: ffff800008f2ade8 x24: ffff0006ff7c1bf0
x23: ffff80000a1f8c14 x22: 0000000000000000 x21: ffff80000a1f8bff
x20: ffff0000ffffff00 x19: ffff0006ff7c1c58 x18: 0000000000000010
x17: 0000000000000004 x16: 0000000000000000 x15: ffff0004c028ca00
x14: 000000000000024a x13: ffff0004c028ca00 x12: 00000000ffffffea
x11: 00000000ffffefff x10: 00000000ffffefff x9 : ffff8000095cc850
x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 000000000000bff4
x5 : ffff0006ff704b60 x4 : 0000000000000000 x3 : 0000000000000027
x2 : 0000000000000023 x1 : 0000000000000000 x0 : 0000000000000000
Call trace:
refcount_warn_saturate+0x6c/0x140
kobject_get+0x7c/0x98
of_node_get+0x1c/0x30
of_fwnode_get+0x30/0x50
fwnode_handle_get+0x2c/0x48
fwnode_get_nth_parent+0x70/0x80
fwnode_full_name_string+0x44/0xb8
device_node_string+0x318/0x3f0
pointer+0x248/0x350
vsnprintf+0x1fc/0x720
vscnprintf+0x30/0x50
printk_sprint+0x3c/0x1a8
vprintk_store+0x414/0x4c8
vprintk_emit+0x104/0x340
vprintk_default+0x34/0x40
vprintk+0xc4/0xe8
_printk+0x5c/0x80
of_node_release+0xc0/0xe0
kobject_put+0xa4/0xf0
of_node_put+0x14/0x20
free_cache_attributes.part.0+0x138/0x190
cacheinfo_cpu_pre_down+0x70/0x88
cpuhp_invoke_callback+0x1d8/0xbc0
cpuhp_thread_fun+0x1b0/0x218
smpboot_thread_fn+0x1b4/0x270
kthread+0x10c/0x118
ret_from_fork+0x10/0x20
irq event stamp: 438
hardirqs last enabled at (437): [<ffff80000827c668>]
put_cpu_partial+0x1e8/0x258
hardirqs last disabled at (438): [<ffff8000080ff52c>]
vprintk_store+0x394/0x4c8
softirqs last enabled at (0): [<ffff8000080820dc>]
copy_process+0x6bc/0x1888
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace 0000000000000000 ]---
CPU: 2 PID: 22 Comm: cpuhp/2 Tainted: G W
6.1.0-rc1-arm64-renesas-00026-g3da72e18371c #1870
Hardware name: Renesas Salvator-X 2nd version board based on r8a77951 (DT)
Call trace:
dump_backtrace+0xcc/0xd8
show_stack+0x14/0x38
dump_stack_lvl+0x88/0xb0
dump_stack+0x14/0x2c
of_node_release+0xc4/0xe0
kobject_put+0xa4/0xf0
of_node_put+0x14/0x20
free_cache_attributes.part.0+0x138/0x190
cacheinfo_cpu_pre_down+0x70/0x88
cpuhp_invoke_callback+0x1d8/0xbc0
cpuhp_thread_fun+0x1b0/0x218
smpboot_thread_fn+0x1b4/0x270
kthread+0x10c/0x118
ret_from_fork+0x10/0x20
psci: CPU2 killed (polled 0 ms)
psci: CPU3 killed (polled 0 ms)
psci: CPU4 killed (polled 4 ms)
OF: ERROR: Bad of_node_put() on /cpus/cache-controller-1
CPU: 5 PID: 37 Comm: cpuhp/5 Tainted: G W
6.1.0-rc1-arm64-renesas-00026-g3da72e18371c #1870
Hardware name: Renesas Salvator-X 2nd version board based on r8a77951 (DT)
Call trace:
dump_backtrace+0xcc/0xd8
show_stack+0x14/0x38
dump_stack_lvl+0x88/0xb0
dump_stack+0x14/0x2c
of_node_release+0xc4/0xe0
kobject_put+0xa4/0xf0
of_node_put+0x14/0x20
free_cache_attributes.part.0+0x138/0x190
cacheinfo_cpu_pre_down+0x70/0x88
cpuhp_invoke_callback+0x1d8/0xbc0
cpuhp_thread_fun+0x1b0/0x218
smpboot_thread_fn+0x1b4/0x270
kthread+0x10c/0x118
ret_from_fork+0x10/0x20
psci: CPU5 killed (polled 0 ms)
psci: CPU6 killed (polled 0 ms)
psci: CPU7 killed (polled 0 ms)
Reverting this commit fixes the issue.
> --- a/drivers/base/cacheinfo.c
> +++ b/drivers/base/cacheinfo.c
> @@ -196,7 +196,7 @@ static void cache_of_set_props(struct cacheinfo *this_leaf,
>
> static int cache_setup_of_node(unsigned int cpu)
> {
> - struct device_node *np;
> + struct device_node *np, *prev;
> struct cacheinfo *this_leaf;
> unsigned int index = 0;
>
> @@ -206,19 +206,24 @@ static int cache_setup_of_node(unsigned int cpu)
> return -ENOENT;
> }
>
> + prev = np;
> +
> while (index < cache_leaves(cpu)) {
> this_leaf = per_cpu_cacheinfo_idx(cpu, index);
> - if (this_leaf->level != 1)
> + if (this_leaf->level != 1) {
> np = of_find_next_cache_node(np);
> - else
> - np = of_node_get(np);/* cpu node itself */
> - if (!np)
> - break;
> + of_node_put(prev);
> + prev = np;
> + if (!np)
> + break;
> + }
> cache_of_set_props(this_leaf, np);
> this_leaf->fw_token = np;
> index++;
> }
>
> + of_node_put(np);
> +
> if (index != cache_leaves(cpu)) /* not all OF nodes populated */
> return -ENOENT;
>
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] cacheinfo: Decrement refcount in cache_setup_of_node()
[not found] ` <CGME20221115193015eucas1p18c4b8ba131b72bc9017a14568a6a54bc@eucas1p1.samsung.com>
@ 2022-11-15 19:30 ` Marek Szyprowski
2022-11-15 22:10 ` Pierre Gondois
0 siblings, 1 reply; 3+ messages in thread
From: Marek Szyprowski @ 2022-11-15 19:30 UTC (permalink / raw)
To: Geert Uytterhoeven, Pierre Gondois
Cc: linux-kernel, Rob.Herring, sudeep.holla, Greg Kroah-Hartman,
Rafael J. Wysocki, Linux-Renesas
Hi Geert,
On 15.11.2022 14:06, Geert Uytterhoeven wrote:
> On Wed, Oct 26, 2022 at 9:03 PM Pierre Gondois <pierre.gondois@arm.com> wrote:
>> Refcounts to DT nodes are only incremented in the function
>> and never decremented. Decrease the refcounts when necessary.
>>
>> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
> Thanks for your patch, which is now commit 3da72e18371c41a6
> ("cacheinfo: Decrement refcount in cache_setup_of_node()") in
> driver-core-next.
>
> I have bisected a refcount underflow during s2ram to this commit:
Similar issue can be reproduced with qemu/arm64 'virt' machine during boot:
cacheinfo: Unable to detect cache hierarchy for CPU 0
------------[ cut here ]------------
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 0 PID: 1 at lib/refcount.c:25
refcount_warn_saturate+0xa0/0x144
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rc1+ #13076
Hardware name: linux,dummy-virt (DT)
pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : refcount_warn_saturate+0xa0/0x144
lr : refcount_warn_saturate+0xa0/0x144
...
Call trace:
refcount_warn_saturate+0xa0/0x144
kobject_get+0xbc/0xc0
of_node_get+0x20/0x34
of_fwnode_get+0x34/0x54
fwnode_handle_get+0x30/0x4c
fwnode_get_nth_parent+0xec/0x104
fwnode_full_name_string+0x48/0xc0
device_node_string+0x278/0x530
pointer+0x3a0/0x4bc
vsnprintf+0x23c/0x750
vprintk_store+0xf8/0x4a0
vprintk_emit+0x110/0x360
vprintk_default+0x38/0x44
vprintk+0xc0/0xe4
_printk+0x5c/0x84
of_node_release+0x140/0x150
kobject_put+0xa4/0x120
of_node_put+0x18/0x2c
free_cache_attributes+0x13c/0x1e0
detect_cache_attributes+0x4f4/0x580
update_siblings_masks+0x28/0x300
store_cpu_topology+0x58/0x80
smp_prepare_cpus+0x38/0x120
kernel_init_freeable+0x1b8/0x350
kernel_init+0x24/0x130
ret_from_fork+0x10/0x20
irq event stamp: 74
hardirqs last enabled at (73): [<ffffb81addf20ac0>]
__up_console_sem+0x6c/0x70
hardirqs last disabled at (74): [<ffffb81addf22658>]
vprintk_store+0x298/0x4a0
softirqs last enabled at (10): [<ffffb81adde1046c>] _stext+0x46c/0x5d8
softirqs last disabled at (3): [<ffffb81adde166d0>] ____do_softirq+0x10/0x20
---[ end trace 0000000000000000 ]---
I've also observed it during the manual CPU hot plug test (the log comes
from OdroidC4 board):
# for i in /sys/devices/system/cpu/cpu[1-8]; do echo 0 >$i/online;
psci: CPU1 killed (polled 4 ms)
------------[ cut here ]------------
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 2 PID: 23 at lib/refcount.c:25
refcount_warn_saturate+0xa0/0x144
Modules linked in: ipv6 meson_gxl dw_hdmi_i2s_audio dwmac_generic
rc_odroid axg_audio sclk_div clk_phase mdio_mux_meson_g12a meson_ir
snd_soc_meson_axg_tdmout snd
son_g12a_tohdmitx snd_soc_meson_codec_glue crct10dif_ce
snd_soc_meson_axg_fifo meson_gxbb_wdt reset_meson_audio_arb pwm_meson
meson_dw_hdmi meson_rng rtc_meson_vr
_platform snd_soc_meson_axg_sound_card drm_display_helper rng_core
snd_soc_meson_card_utils stmmac panfrost meson_drm drm_shmem_helper
snd_soc_meson_axg_tdm_inter
anvas nvmem_meson_efuse snd_soc_meson_axg_tdm_formatter display_connector
CPU: 2 PID: 23 Comm: cpuhp/2 Not tainted 6.1.0-rc5-next-20221115+ #5935
Hardware name: Hardkernel ODROID-C4 (DT)
pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : refcount_warn_saturate+0xa0/0x144
lr : refcount_warn_saturate+0xa0/0x144
...
Call trace:
refcount_warn_saturate+0xa0/0x144
kobject_get+0xbc/0xc0
of_node_get+0x20/0x34
of_fwnode_get+0x34/0x54
fwnode_handle_get+0x30/0x4c
fwnode_get_nth_parent+0xec/0x104
fwnode_full_name_string+0x48/0xc0
device_node_string+0x278/0x530
pointer+0x3a0/0x4bc
vsnprintf+0x23c/0x750
vprintk_store+0xf8/0x4a0
vprintk_emit+0x10c/0x35c
vprintk_default+0x38/0x44
vprintk+0xc0/0xe4
_printk+0x5c/0x84
of_node_release+0x140/0x150
kobject_put+0xa4/0x120
of_node_put+0x18/0x2c
free_cache_attributes+0x13c/0x1e0
cacheinfo_cpu_pre_down+0x80/0x90
cpuhp_invoke_callback+0x16c/0x2b0
cpuhp_thread_fun+0x18c/0x250
smpboot_thread_fn+0x1f4/0x2a0
kthread+0x118/0x11c
ret_from_fork+0x10/0x20
irq event stamp: 204
hardirqs last enabled at (203): [<ffff800008330260>]
put_cpu_partial+0x1bc/0x1fc
hardirqs last disabled at (204): [<ffff8000081247ac>]
vprintk_store+0x298/0x4a0
softirqs last enabled at (16): [<ffff80000801046c>] _stext+0x46c/0x5d8
softirqs last disabled at (3): [<ffff800008016700>] ____do_softirq+0x10/0x20
---[ end trace 0000000000000000 ]---
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] cacheinfo: Decrement refcount in cache_setup_of_node()
2022-11-15 19:30 ` Marek Szyprowski
@ 2022-11-15 22:10 ` Pierre Gondois
0 siblings, 0 replies; 3+ messages in thread
From: Pierre Gondois @ 2022-11-15 22:10 UTC (permalink / raw)
To: Marek Szyprowski, Geert Uytterhoeven
Cc: linux-kernel, Rob.Herring, sudeep.holla, Greg Kroah-Hartman,
Rafael J. Wysocki, Linux-Renesas
On 11/15/22 20:30, Marek Szyprowski wrote:
> Hi Geert,
>
> On 15.11.2022 14:06, Geert Uytterhoeven wrote:
>> On Wed, Oct 26, 2022 at 9:03 PM Pierre Gondois <pierre.gondois@arm.com> wrote:
>>> Refcounts to DT nodes are only incremented in the function
>>> and never decremented. Decrease the refcounts when necessary.
>>>
>>> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
>> Thanks for your patch, which is now commit 3da72e18371c41a6
>> ("cacheinfo: Decrement refcount in cache_setup_of_node()") in
>> driver-core-next.
>>
>> I have bisected a refcount underflow during s2ram to this commit:
>
> Similar issue can be reproduced with qemu/arm64 'virt' machine during boot:
>
Hello Geert, Marek,
Thanks for reporting the issue. The patch at:
https://lore.kernel.org/all/20221115220520.2076189-1-pierre.gondois@arm.com/
should fix it,
Regards,
Pierre
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2022-11-15 22:10 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <20221026185954.991547-1-pierre.gondois@arm.com>
2022-11-15 13:06 ` [PATCH] cacheinfo: Decrement refcount in cache_setup_of_node() Geert Uytterhoeven
[not found] ` <CGME20221115193015eucas1p18c4b8ba131b72bc9017a14568a6a54bc@eucas1p1.samsung.com>
2022-11-15 19:30 ` Marek Szyprowski
2022-11-15 22:10 ` Pierre Gondois
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).