All of lore.kernel.org
 help / color / mirror / Atom feed
* BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140
@ 2020-03-05 15:57 Naresh Kamboju
  2020-03-05 19:18 ` Andy Shevchenko
  0 siblings, 1 reply; 19+ messages in thread
From: Naresh Kamboju @ 2020-03-05 15:57 UTC (permalink / raw)
  To: open list, open list:KERNEL SELFTEST FRAMEWORK
  Cc: rostedt, sergey.senozhatsky, Andy Shevchenko, Shuah Khan,
	Anders Roxell, lkft-triage, Rasmus Villemoes

Regression reported on Linux next 5.6.0-rc4-next-20200305 on x86_64,
i386, arm and arm64. The steps to reproduce is running kselftests lib
printf.sh test case.
Which is doing modprobe operations.

BTW, there are few RCU warnings from the boot log.
Please refer below link for more details.

Steps reproduce by using kselftests,

          - lsmod || true
          - cd /opt/kselftests/default-in-kernel/lib/
          - export PATH=/opt/kselftests/default-in-kernel/kselftest:$PATH
          - ./printf.sh || true
          - ./bitmap.sh || true
          - ./prime_numbers.sh || true
          - ./strscpy.sh || true

x86_64 kernel BUG dump.
+ ./printf.sh
[   32.594369] test_printf: loaded.
[   32.599859] BUG: kernel NULL pointer dereference, address: 00000000
[   32.606143] #PF: supervisor read access in kernel mode
[   32.611280] #PF: error_code(0x0000) - not-present page
[   32.616419] *pde = 00000000
[   32.619306] Oops: 0000 [#1] SMP
[   32.622452] CPU: 0 PID: 387 Comm: modprobe Not tainted
5.6.0-rc4-next-20200305 #1
[   32.629928] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
2.2 05/23/2018
[   32.637314] EIP: ida_free+0x61/0x130
[   32.640891] Code: 00 c7 45 e8 00 00 00 00 c7 45 ec 00 00 00 00 0f
88 c4 00 00 00 89 d3 e8 9d ee 01 00 89 c7 8d 45 d8 e8 e3 18 01 00 a8
01 75 3f <0f> a3 30 72 72 8b 45 d8 89 fa e8 70 f0 01 00 53 68 d8 61 01
d3 e8
[   32.659628] EAX: 00000000 EBX: 00000000 ECX: ffffffff EDX: 00000000
[   32.665887] ESI: 00000000 EDI: 00000246 EBP: f21a7cc8 ESP: f21a7c9c
[   32.672153] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010046
[   32.678928] CR0: 80050033 CR2: 00000000 CR3: 32ff3000 CR4: 003406d0
[   32.685186] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[   32.691442] DR6: fffe0ff0 DR7: 00000400
[   32.695273] Call Trace:
[   32.697721]  software_node_release+0x21/0x80
[   32.701990]  kobject_put+0xa8/0x1c0
[   32.705476]  kobject_del+0x40/0x60
[   32.708881]  kobject_put+0x92/0x1c0
[   32.712380]  fwnode_remove_software_node+0x30/0x50
[   32.717201]  software_node_unregister_nodes+0x2b/0x50
[   32.722246]  test_printf_init+0xe00/0x1d29 [test_printf]
[   32.727563]  ? find_held_lock+0x27/0xa0
[   32.731423]  ? test_hashed+0x54/0x54 [test_printf]
[   32.736207]  ? test_hashed+0x54/0x54 [test_printf]
[   32.741001]  do_one_initcall+0x54/0x2e0
[   32.744841]  ? rcu_read_lock_sched_held+0x47/0x70
[   32.749547]  ? kmem_cache_alloc_trace+0x285/0x2b0
[   32.754249]  ? do_init_module+0x21/0x1ef
[   32.758166]  ? do_init_module+0x21/0x1ef
[   32.762087]  do_init_module+0x50/0x1ef
[   32.765837]  load_module+0x1e32/0x2540
[   32.769586]  ? kernel_read_file+0x295/0x2d0
[   32.773771]  sys_finit_module+0x8a/0xe0
[   32.777611]  do_fast_syscall_32+0x8e/0x340
[   32.781709]  entry_SYSENTER_32+0xaa/0x102
[   32.785719] EIP: 0xb7f75ce1
[   32.788508] Code: 5e 5d c3 8d b6 00 00 00 00 b8 40 42 0f 00 eb c1
8b 04 24 c3 8b 1c 24 c3 8b 34 24 c3 8b 3c 24 c3 90 51 52 55 89 e5 0f
34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90
8d 76
[   32.807246] EAX: ffffffda EBX: 00000003 ECX: 0806233a EDX: 00000000
[   32.813502] ESI: 085fb600 EDI: 085fb550 EBP: 085fb6e0 ESP: bffe839c
[   32.819760] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000292
[   32.826542] Modules linked in: test_printf(+) x86_pkg_temp_thermal fuse
[   32.833150] CR2: 0000000000000000
[   32.836461] ---[ end trace 69388c972b4562b8 ]---
[   32.841072] EIP: ida_free+0x61/0x130
[   32.844642] Code: 00 c7 45 e8 00 00 00 00 c7 45 ec 00 00 00 00 0f
88 c4 00 00 00 89 d3 e8 9d ee 01 00 89 c7 8d 45 d8 e8 e3 18 01 00 a8
01 75 3f <0f> a3 30 72 72 8b 45 d8 89 fa e8 70 f0 01 00 53 68 d8 61 01
d3 e8
[   32.863420] EAX: 00000000 EBX: 00000000 ECX: ffffffff EDX: 00000000
[   32.869679] ESI: 00000000 EDI: 00000246 EBP: f21a7cc8 ESP: f21a7c9c
[   32.875936] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010046
[   32.882713] CR0: 80050033 CR2: 00000000 CR3: 32ff3000 CR4: 003406d0
[   32.888970] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[   32.895227] DR6: fffe0ff0 DR7: 00000400
[   32.899060] BUG: sleeping function called from invalid context at
/usr/src/kernel/include/linux/percpu-rwsem.h:49
[   32.909303] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid:
387, name: modprobe
[   32.917207] INFO: lockdep is turned off.
[   32.921124] irq event stamp: 5120
[   32.924434] hardirqs last  enabled at (5119): [<d20dea05>] kfree+0x135/0x270
[   32.931472] hardirqs last disabled at (5120): [<d2b9ab04>]
_raw_spin_lock_irqsave+0x14/0x40
[   32.939809] softirqs last  enabled at (4978): [<d2b9f165>]
__do_softirq+0x2c5/0x3c3
[   32.947454] softirqs last disabled at (4969): [<d1ea67d5>]
call_on_stack+0x45/0x50
[   32.955009] CPU: 0 PID: 387 Comm: modprobe Tainted: G      D
   5.6.0-rc4-next-20200305 #1
[   32.963867] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
2.2 05/23/2018
[   32.971250] Call Trace:
[   32.973695]  dump_stack+0x6e/0x96
[   32.977009]  ___might_sleep+0x14d/0x240
[   32.980846]  __might_sleep+0x33/0x80
[   32.984419]  exit_signals+0x2a/0x2d0
[   32.987997]  do_exit+0x8e/0xb00
[   32.991143]  ? do_fast_syscall_32+0x8e/0x340
[   32.995421]  rewind_stack_do_exit+0x11/0x13
[   32.999600] EIP: 0xb7f75ce1
[   33.002416] Code: 5e 5d c3 8d b6 00 00 00 00 b8 40 42 0f 00 eb c1
8b 04 24 c3 8b 1c 24 c3 8b 34 24 c3 8b 3c 24 c3 90 51 52 55 89 e5 0f
34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90
8d 76
[   33.021153] EAX: ffffffda EBX: 00000003 ECX: 0806233a EDX: 00000000
[   33.027425] ESI: 085fb600 EDI: 085fb550 EBP: 085fb6e0 ESP: bffe839c
[   33.033685] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000292
[   33.040465] note: modprobe[387] exited with preempt_count 1
./../kselftest/module.sh: line 56:   387 Killed
$modprobe -q $module $args
printf:  [FAIL]


metadata:
  git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
  git branch: master
  git describe: next-20200305
  kernel-config:
http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/intel-core2-32/lkft/linux-next/719/config

ref:
https://lkft.validation.linaro.org/scheduler/job/1267112#L1254
https://lkft.validation.linaro.org/scheduler/job/1267041#L9795

-- 
Linaro LKFT
https://lkft.linaro.org

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140
  2020-03-05 15:57 BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140 Naresh Kamboju
@ 2020-03-05 19:18 ` Andy Shevchenko
  2020-03-05 19:40   ` Brendan Higgins
  0 siblings, 1 reply; 19+ messages in thread
From: Andy Shevchenko @ 2020-03-05 19:18 UTC (permalink / raw)
  To: Naresh Kamboju, Sakari Ailus
  Cc: open list, open list:KERNEL SELFTEST FRAMEWORK, Steven Rostedt,
	Sergey Senozhatsky, Andy Shevchenko, Shuah Khan, Anders Roxell,
	lkft-triage, Rasmus Villemoes

+Cc: Sakari

On Thu, Mar 5, 2020 at 6:00 PM Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
>
> Regression reported on Linux next 5.6.0-rc4-next-20200305 on x86_64,
> i386, arm and arm64. The steps to reproduce is running kselftests lib
> printf.sh test case.
> Which is doing modprobe operations.
>
> BTW, there are few RCU warnings from the boot log.
> Please refer below link for more details.
>
> Steps reproduce by using kselftests,
>
>           - lsmod || true
>           - cd /opt/kselftests/default-in-kernel/lib/
>           - export PATH=/opt/kselftests/default-in-kernel/kselftest:$PATH
>           - ./printf.sh || true
>           - ./bitmap.sh || true
>           - ./prime_numbers.sh || true
>           - ./strscpy.sh || true
>
> x86_64 kernel BUG dump.
> + ./printf.sh
> [   32.594369] test_printf: loaded.
> [   32.599859] BUG: kernel NULL pointer dereference, address: 00000000
> [   32.606143] #PF: supervisor read access in kernel mode
> [   32.611280] #PF: error_code(0x0000) - not-present page
> [   32.616419] *pde = 00000000
> [   32.619306] Oops: 0000 [#1] SMP
> [   32.622452] CPU: 0 PID: 387 Comm: modprobe Not tainted
> 5.6.0-rc4-next-20200305 #1
> [   32.629928] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> 2.2 05/23/2018
> [   32.637314] EIP: ida_free+0x61/0x130
> [   32.640891] Code: 00 c7 45 e8 00 00 00 00 c7 45 ec 00 00 00 00 0f
> 88 c4 00 00 00 89 d3 e8 9d ee 01 00 89 c7 8d 45 d8 e8 e3 18 01 00 a8
> 01 75 3f <0f> a3 30 72 72 8b 45 d8 89 fa e8 70 f0 01 00 53 68 d8 61 01
> d3 e8
> [   32.659628] EAX: 00000000 EBX: 00000000 ECX: ffffffff EDX: 00000000
> [   32.665887] ESI: 00000000 EDI: 00000246 EBP: f21a7cc8 ESP: f21a7c9c
> [   32.672153] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010046
> [   32.678928] CR0: 80050033 CR2: 00000000 CR3: 32ff3000 CR4: 003406d0
> [   32.685186] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> [   32.691442] DR6: fffe0ff0 DR7: 00000400
> [   32.695273] Call Trace:
> [   32.697721]  software_node_release+0x21/0x80
> [   32.701990]  kobject_put+0xa8/0x1c0
> [   32.705476]  kobject_del+0x40/0x60
> [   32.708881]  kobject_put+0x92/0x1c0
> [   32.712380]  fwnode_remove_software_node+0x30/0x50
> [   32.717201]  software_node_unregister_nodes+0x2b/0x50
> [   32.722246]  test_printf_init+0xe00/0x1d29 [test_printf]
> [   32.727563]  ? find_held_lock+0x27/0xa0
> [   32.731423]  ? test_hashed+0x54/0x54 [test_printf]
> [   32.736207]  ? test_hashed+0x54/0x54 [test_printf]
> [   32.741001]  do_one_initcall+0x54/0x2e0
> [   32.744841]  ? rcu_read_lock_sched_held+0x47/0x70
> [   32.749547]  ? kmem_cache_alloc_trace+0x285/0x2b0
> [   32.754249]  ? do_init_module+0x21/0x1ef
> [   32.758166]  ? do_init_module+0x21/0x1ef
> [   32.762087]  do_init_module+0x50/0x1ef
> [   32.765837]  load_module+0x1e32/0x2540
> [   32.769586]  ? kernel_read_file+0x295/0x2d0
> [   32.773771]  sys_finit_module+0x8a/0xe0
> [   32.777611]  do_fast_syscall_32+0x8e/0x340
> [   32.781709]  entry_SYSENTER_32+0xaa/0x102
> [   32.785719] EIP: 0xb7f75ce1
> [   32.788508] Code: 5e 5d c3 8d b6 00 00 00 00 b8 40 42 0f 00 eb c1
> 8b 04 24 c3 8b 1c 24 c3 8b 34 24 c3 8b 3c 24 c3 90 51 52 55 89 e5 0f
> 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90
> 8d 76
> [   32.807246] EAX: ffffffda EBX: 00000003 ECX: 0806233a EDX: 00000000
> [   32.813502] ESI: 085fb600 EDI: 085fb550 EBP: 085fb6e0 ESP: bffe839c
> [   32.819760] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000292
> [   32.826542] Modules linked in: test_printf(+) x86_pkg_temp_thermal fuse
> [   32.833150] CR2: 0000000000000000
> [   32.836461] ---[ end trace 69388c972b4562b8 ]---
> [   32.841072] EIP: ida_free+0x61/0x130
> [   32.844642] Code: 00 c7 45 e8 00 00 00 00 c7 45 ec 00 00 00 00 0f
> 88 c4 00 00 00 89 d3 e8 9d ee 01 00 89 c7 8d 45 d8 e8 e3 18 01 00 a8
> 01 75 3f <0f> a3 30 72 72 8b 45 d8 89 fa e8 70 f0 01 00 53 68 d8 61 01
> d3 e8
> [   32.863420] EAX: 00000000 EBX: 00000000 ECX: ffffffff EDX: 00000000
> [   32.869679] ESI: 00000000 EDI: 00000246 EBP: f21a7cc8 ESP: f21a7c9c
> [   32.875936] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010046
> [   32.882713] CR0: 80050033 CR2: 00000000 CR3: 32ff3000 CR4: 003406d0
> [   32.888970] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> [   32.895227] DR6: fffe0ff0 DR7: 00000400
> [   32.899060] BUG: sleeping function called from invalid context at
> /usr/src/kernel/include/linux/percpu-rwsem.h:49
> [   32.909303] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid:
> 387, name: modprobe
> [   32.917207] INFO: lockdep is turned off.
> [   32.921124] irq event stamp: 5120
> [   32.924434] hardirqs last  enabled at (5119): [<d20dea05>] kfree+0x135/0x270
> [   32.931472] hardirqs last disabled at (5120): [<d2b9ab04>]
> _raw_spin_lock_irqsave+0x14/0x40
> [   32.939809] softirqs last  enabled at (4978): [<d2b9f165>]
> __do_softirq+0x2c5/0x3c3
> [   32.947454] softirqs last disabled at (4969): [<d1ea67d5>]
> call_on_stack+0x45/0x50
> [   32.955009] CPU: 0 PID: 387 Comm: modprobe Tainted: G      D
>    5.6.0-rc4-next-20200305 #1
> [   32.963867] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> 2.2 05/23/2018
> [   32.971250] Call Trace:
> [   32.973695]  dump_stack+0x6e/0x96
> [   32.977009]  ___might_sleep+0x14d/0x240
> [   32.980846]  __might_sleep+0x33/0x80
> [   32.984419]  exit_signals+0x2a/0x2d0
> [   32.987997]  do_exit+0x8e/0xb00
> [   32.991143]  ? do_fast_syscall_32+0x8e/0x340
> [   32.995421]  rewind_stack_do_exit+0x11/0x13
> [   32.999600] EIP: 0xb7f75ce1
> [   33.002416] Code: 5e 5d c3 8d b6 00 00 00 00 b8 40 42 0f 00 eb c1
> 8b 04 24 c3 8b 1c 24 c3 8b 34 24 c3 8b 3c 24 c3 90 51 52 55 89 e5 0f
> 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90
> 8d 76
> [   33.021153] EAX: ffffffda EBX: 00000003 ECX: 0806233a EDX: 00000000
> [   33.027425] ESI: 085fb600 EDI: 085fb550 EBP: 085fb6e0 ESP: bffe839c
> [   33.033685] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000292
> [   33.040465] note: modprobe[387] exited with preempt_count 1
> ./../kselftest/module.sh: line 56:   387 Killed
> $modprobe -q $module $args
> printf:  [FAIL]
>
>
> metadata:
>   git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
>   git branch: master
>   git describe: next-20200305
>   kernel-config:
> http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/intel-core2-32/lkft/linux-next/719/config
>
> ref:
> https://lkft.validation.linaro.org/scheduler/job/1267112#L1254
> https://lkft.validation.linaro.org/scheduler/job/1267041#L9795
>
> --
> Linaro LKFT
> https://lkft.linaro.org



-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140
  2020-03-05 19:18 ` Andy Shevchenko
@ 2020-03-05 19:40   ` Brendan Higgins
  2020-03-05 19:51     ` Brendan Higgins
  0 siblings, 1 reply; 19+ messages in thread
From: Brendan Higgins @ 2020-03-05 19:40 UTC (permalink / raw)
  To: Andy Shevchenko, heikki.krogerus, hdegoede, rafael.j.wysocki
  Cc: Naresh Kamboju, Sakari Ailus, open list,
	open list:KERNEL SELFTEST FRAMEWORK, Steven Rostedt,
	Sergey Senozhatsky, Andy Shevchenko, Shuah Khan, Anders Roxell,
	lkft-triage, Rasmus Villemoes

On Thu, Mar 5, 2020 at 11:18 AM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
>
> +Cc: Sakari
>
> On Thu, Mar 5, 2020 at 6:00 PM Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> >
> > Regression reported on Linux next 5.6.0-rc4-next-20200305 on x86_64,
> > i386, arm and arm64. The steps to reproduce is running kselftests lib
> > printf.sh test case.
> > Which is doing modprobe operations.
> >
> > BTW, there are few RCU warnings from the boot log.
> > Please refer below link for more details.
> >
> > Steps reproduce by using kselftests,
> >
> >           - lsmod || true
> >           - cd /opt/kselftests/default-in-kernel/lib/
> >           - export PATH=/opt/kselftests/default-in-kernel/kselftest:$PATH
> >           - ./printf.sh || true
> >           - ./bitmap.sh || true
> >           - ./prime_numbers.sh || true
> >           - ./strscpy.sh || true
> >
> > x86_64 kernel BUG dump.
> > + ./printf.sh

Oops, I am wondering if I broke this with my change "Revert "software
node: Simplify software_node_release() function"":

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=d1c19322388d6935b534b494a2c223dd089e30dd

I am still investigating, will update later.

> > [   32.594369] test_printf: loaded.
> > [   32.599859] BUG: kernel NULL pointer dereference, address: 00000000
> > [   32.606143] #PF: supervisor read access in kernel mode
> > [   32.611280] #PF: error_code(0x0000) - not-present page
> > [   32.616419] *pde = 00000000
> > [   32.619306] Oops: 0000 [#1] SMP
> > [   32.622452] CPU: 0 PID: 387 Comm: modprobe Not tainted
> > 5.6.0-rc4-next-20200305 #1
> > [   32.629928] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> > 2.2 05/23/2018
> > [   32.637314] EIP: ida_free+0x61/0x130
> > [   32.640891] Code: 00 c7 45 e8 00 00 00 00 c7 45 ec 00 00 00 00 0f
> > 88 c4 00 00 00 89 d3 e8 9d ee 01 00 89 c7 8d 45 d8 e8 e3 18 01 00 a8
> > 01 75 3f <0f> a3 30 72 72 8b 45 d8 89 fa e8 70 f0 01 00 53 68 d8 61 01
> > d3 e8
> > [   32.659628] EAX: 00000000 EBX: 00000000 ECX: ffffffff EDX: 00000000
> > [   32.665887] ESI: 00000000 EDI: 00000246 EBP: f21a7cc8 ESP: f21a7c9c
> > [   32.672153] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010046
> > [   32.678928] CR0: 80050033 CR2: 00000000 CR3: 32ff3000 CR4: 003406d0
> > [   32.685186] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > [   32.691442] DR6: fffe0ff0 DR7: 00000400
> > [   32.695273] Call Trace:
> > [   32.697721]  software_node_release+0x21/0x80
> > [   32.701990]  kobject_put+0xa8/0x1c0
> > [   32.705476]  kobject_del+0x40/0x60
> > [   32.708881]  kobject_put+0x92/0x1c0
> > [   32.712380]  fwnode_remove_software_node+0x30/0x50
> > [   32.717201]  software_node_unregister_nodes+0x2b/0x50
> > [   32.722246]  test_printf_init+0xe00/0x1d29 [test_printf]
> > [   32.727563]  ? find_held_lock+0x27/0xa0
> > [   32.731423]  ? test_hashed+0x54/0x54 [test_printf]
> > [   32.736207]  ? test_hashed+0x54/0x54 [test_printf]
> > [   32.741001]  do_one_initcall+0x54/0x2e0
> > [   32.744841]  ? rcu_read_lock_sched_held+0x47/0x70
> > [   32.749547]  ? kmem_cache_alloc_trace+0x285/0x2b0
> > [   32.754249]  ? do_init_module+0x21/0x1ef
> > [   32.758166]  ? do_init_module+0x21/0x1ef
> > [   32.762087]  do_init_module+0x50/0x1ef
> > [   32.765837]  load_module+0x1e32/0x2540
> > [   32.769586]  ? kernel_read_file+0x295/0x2d0
> > [   32.773771]  sys_finit_module+0x8a/0xe0
> > [   32.777611]  do_fast_syscall_32+0x8e/0x340
> > [   32.781709]  entry_SYSENTER_32+0xaa/0x102
> > [   32.785719] EIP: 0xb7f75ce1
> > [   32.788508] Code: 5e 5d c3 8d b6 00 00 00 00 b8 40 42 0f 00 eb c1
> > 8b 04 24 c3 8b 1c 24 c3 8b 34 24 c3 8b 3c 24 c3 90 51 52 55 89 e5 0f
> > 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90
> > 8d 76
> > [   32.807246] EAX: ffffffda EBX: 00000003 ECX: 0806233a EDX: 00000000
> > [   32.813502] ESI: 085fb600 EDI: 085fb550 EBP: 085fb6e0 ESP: bffe839c
> > [   32.819760] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000292
> > [   32.826542] Modules linked in: test_printf(+) x86_pkg_temp_thermal fuse
> > [   32.833150] CR2: 0000000000000000
> > [   32.836461] ---[ end trace 69388c972b4562b8 ]---
> > [   32.841072] EIP: ida_free+0x61/0x130
> > [   32.844642] Code: 00 c7 45 e8 00 00 00 00 c7 45 ec 00 00 00 00 0f
> > 88 c4 00 00 00 89 d3 e8 9d ee 01 00 89 c7 8d 45 d8 e8 e3 18 01 00 a8
> > 01 75 3f <0f> a3 30 72 72 8b 45 d8 89 fa e8 70 f0 01 00 53 68 d8 61 01
> > d3 e8
> > [   32.863420] EAX: 00000000 EBX: 00000000 ECX: ffffffff EDX: 00000000
> > [   32.869679] ESI: 00000000 EDI: 00000246 EBP: f21a7cc8 ESP: f21a7c9c
> > [   32.875936] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010046
> > [   32.882713] CR0: 80050033 CR2: 00000000 CR3: 32ff3000 CR4: 003406d0
> > [   32.888970] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > [   32.895227] DR6: fffe0ff0 DR7: 00000400
> > [   32.899060] BUG: sleeping function called from invalid context at
> > /usr/src/kernel/include/linux/percpu-rwsem.h:49
> > [   32.909303] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid:
> > 387, name: modprobe
> > [   32.917207] INFO: lockdep is turned off.
> > [   32.921124] irq event stamp: 5120
> > [   32.924434] hardirqs last  enabled at (5119): [<d20dea05>] kfree+0x135/0x270
> > [   32.931472] hardirqs last disabled at (5120): [<d2b9ab04>]
> > _raw_spin_lock_irqsave+0x14/0x40
> > [   32.939809] softirqs last  enabled at (4978): [<d2b9f165>]
> > __do_softirq+0x2c5/0x3c3
> > [   32.947454] softirqs last disabled at (4969): [<d1ea67d5>]
> > call_on_stack+0x45/0x50
> > [   32.955009] CPU: 0 PID: 387 Comm: modprobe Tainted: G      D
> >    5.6.0-rc4-next-20200305 #1
> > [   32.963867] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> > 2.2 05/23/2018
> > [   32.971250] Call Trace:
> > [   32.973695]  dump_stack+0x6e/0x96
> > [   32.977009]  ___might_sleep+0x14d/0x240
> > [   32.980846]  __might_sleep+0x33/0x80
> > [   32.984419]  exit_signals+0x2a/0x2d0
> > [   32.987997]  do_exit+0x8e/0xb00
> > [   32.991143]  ? do_fast_syscall_32+0x8e/0x340
> > [   32.995421]  rewind_stack_do_exit+0x11/0x13
> > [   32.999600] EIP: 0xb7f75ce1
> > [   33.002416] Code: 5e 5d c3 8d b6 00 00 00 00 b8 40 42 0f 00 eb c1
> > 8b 04 24 c3 8b 1c 24 c3 8b 34 24 c3 8b 3c 24 c3 90 51 52 55 89 e5 0f
> > 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90
> > 8d 76
> > [   33.021153] EAX: ffffffda EBX: 00000003 ECX: 0806233a EDX: 00000000
> > [   33.027425] ESI: 085fb600 EDI: 085fb550 EBP: 085fb6e0 ESP: bffe839c
> > [   33.033685] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000292
> > [   33.040465] note: modprobe[387] exited with preempt_count 1
> > ./../kselftest/module.sh: line 56:   387 Killed
> > $modprobe -q $module $args
> > printf:  [FAIL]
> >
> >
> > metadata:
> >   git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> >   git branch: master
> >   git describe: next-20200305
> >   kernel-config:
> > http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/intel-core2-32/lkft/linux-next/719/config
> >
> > ref:
> > https://lkft.validation.linaro.org/scheduler/job/1267112#L1254
> > https://lkft.validation.linaro.org/scheduler/job/1267041#L9795
> >
> > --
> > Linaro LKFT
> > https://lkft.linaro.org
>
>
>
> --
> With Best Regards,
> Andy Shevchenko

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140
  2020-03-05 19:40   ` Brendan Higgins
@ 2020-03-05 19:51     ` Brendan Higgins
  2020-03-05 22:33       ` Sakari Ailus
  0 siblings, 1 reply; 19+ messages in thread
From: Brendan Higgins @ 2020-03-05 19:51 UTC (permalink / raw)
  To: Andy Shevchenko, heikki.krogerus, hdegoede, rafael.j.wysocki
  Cc: Naresh Kamboju, Sakari Ailus, open list,
	open list:KERNEL SELFTEST FRAMEWORK, Steven Rostedt,
	Sergey Senozhatsky, Andy Shevchenko, Shuah Khan, Anders Roxell,
	lkft-triage, Rasmus Villemoes

On Thu, Mar 5, 2020 at 11:40 AM Brendan Higgins
<brendanhiggins@google.com> wrote:
>
> On Thu, Mar 5, 2020 at 11:18 AM Andy Shevchenko
> <andy.shevchenko@gmail.com> wrote:
> >
> > +Cc: Sakari
> >
> > On Thu, Mar 5, 2020 at 6:00 PM Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > >
> > > Regression reported on Linux next 5.6.0-rc4-next-20200305 on x86_64,
> > > i386, arm and arm64. The steps to reproduce is running kselftests lib
> > > printf.sh test case.
> > > Which is doing modprobe operations.
> > >
> > > BTW, there are few RCU warnings from the boot log.
> > > Please refer below link for more details.
> > >
> > > Steps reproduce by using kselftests,
> > >
> > >           - lsmod || true
> > >           - cd /opt/kselftests/default-in-kernel/lib/
> > >           - export PATH=/opt/kselftests/default-in-kernel/kselftest:$PATH
> > >           - ./printf.sh || true
> > >           - ./bitmap.sh || true
> > >           - ./prime_numbers.sh || true
> > >           - ./strscpy.sh || true
> > >
> > > x86_64 kernel BUG dump.
> > > + ./printf.sh
>
> Oops, I am wondering if I broke this with my change "Revert "software
> node: Simplify software_node_release() function"":
>
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=d1c19322388d6935b534b494a2c223dd089e30dd
>
> I am still investigating, will update later.

Okay, yeah, I am pretty sure I caused the breakage. I got an email
from kernel test robot a couple days ago that I didn't see:

https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/N3ZN5XH7HK24JVEJ5WSQD2SK6YCDRILR/

It shows the same breakage after applying this change.

I am still investigating how my change broke it, nevertheless.

> > > [   32.594369] test_printf: loaded.
> > > [   32.599859] BUG: kernel NULL pointer dereference, address: 00000000
> > > [   32.606143] #PF: supervisor read access in kernel mode
> > > [   32.611280] #PF: error_code(0x0000) - not-present page
> > > [   32.616419] *pde = 00000000
> > > [   32.619306] Oops: 0000 [#1] SMP
> > > [   32.622452] CPU: 0 PID: 387 Comm: modprobe Not tainted
> > > 5.6.0-rc4-next-20200305 #1
> > > [   32.629928] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> > > 2.2 05/23/2018
> > > [   32.637314] EIP: ida_free+0x61/0x130
> > > [   32.640891] Code: 00 c7 45 e8 00 00 00 00 c7 45 ec 00 00 00 00 0f
> > > 88 c4 00 00 00 89 d3 e8 9d ee 01 00 89 c7 8d 45 d8 e8 e3 18 01 00 a8
> > > 01 75 3f <0f> a3 30 72 72 8b 45 d8 89 fa e8 70 f0 01 00 53 68 d8 61 01
> > > d3 e8
> > > [   32.659628] EAX: 00000000 EBX: 00000000 ECX: ffffffff EDX: 00000000
> > > [   32.665887] ESI: 00000000 EDI: 00000246 EBP: f21a7cc8 ESP: f21a7c9c
> > > [   32.672153] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010046
> > > [   32.678928] CR0: 80050033 CR2: 00000000 CR3: 32ff3000 CR4: 003406d0
> > > [   32.685186] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > > [   32.691442] DR6: fffe0ff0 DR7: 00000400
> > > [   32.695273] Call Trace:
> > > [   32.697721]  software_node_release+0x21/0x80
> > > [   32.701990]  kobject_put+0xa8/0x1c0
> > > [   32.705476]  kobject_del+0x40/0x60
> > > [   32.708881]  kobject_put+0x92/0x1c0
> > > [   32.712380]  fwnode_remove_software_node+0x30/0x50
> > > [   32.717201]  software_node_unregister_nodes+0x2b/0x50
> > > [   32.722246]  test_printf_init+0xe00/0x1d29 [test_printf]
> > > [   32.727563]  ? find_held_lock+0x27/0xa0
> > > [   32.731423]  ? test_hashed+0x54/0x54 [test_printf]
> > > [   32.736207]  ? test_hashed+0x54/0x54 [test_printf]
> > > [   32.741001]  do_one_initcall+0x54/0x2e0
> > > [   32.744841]  ? rcu_read_lock_sched_held+0x47/0x70
> > > [   32.749547]  ? kmem_cache_alloc_trace+0x285/0x2b0
> > > [   32.754249]  ? do_init_module+0x21/0x1ef
> > > [   32.758166]  ? do_init_module+0x21/0x1ef
> > > [   32.762087]  do_init_module+0x50/0x1ef
> > > [   32.765837]  load_module+0x1e32/0x2540
> > > [   32.769586]  ? kernel_read_file+0x295/0x2d0
> > > [   32.773771]  sys_finit_module+0x8a/0xe0
> > > [   32.777611]  do_fast_syscall_32+0x8e/0x340
> > > [   32.781709]  entry_SYSENTER_32+0xaa/0x102
> > > [   32.785719] EIP: 0xb7f75ce1
> > > [   32.788508] Code: 5e 5d c3 8d b6 00 00 00 00 b8 40 42 0f 00 eb c1
> > > 8b 04 24 c3 8b 1c 24 c3 8b 34 24 c3 8b 3c 24 c3 90 51 52 55 89 e5 0f
> > > 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90
> > > 8d 76
> > > [   32.807246] EAX: ffffffda EBX: 00000003 ECX: 0806233a EDX: 00000000
> > > [   32.813502] ESI: 085fb600 EDI: 085fb550 EBP: 085fb6e0 ESP: bffe839c
> > > [   32.819760] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000292
> > > [   32.826542] Modules linked in: test_printf(+) x86_pkg_temp_thermal fuse
> > > [   32.833150] CR2: 0000000000000000
> > > [   32.836461] ---[ end trace 69388c972b4562b8 ]---
> > > [   32.841072] EIP: ida_free+0x61/0x130
> > > [   32.844642] Code: 00 c7 45 e8 00 00 00 00 c7 45 ec 00 00 00 00 0f
> > > 88 c4 00 00 00 89 d3 e8 9d ee 01 00 89 c7 8d 45 d8 e8 e3 18 01 00 a8
> > > 01 75 3f <0f> a3 30 72 72 8b 45 d8 89 fa e8 70 f0 01 00 53 68 d8 61 01
> > > d3 e8
> > > [   32.863420] EAX: 00000000 EBX: 00000000 ECX: ffffffff EDX: 00000000
> > > [   32.869679] ESI: 00000000 EDI: 00000246 EBP: f21a7cc8 ESP: f21a7c9c
> > > [   32.875936] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010046
> > > [   32.882713] CR0: 80050033 CR2: 00000000 CR3: 32ff3000 CR4: 003406d0
> > > [   32.888970] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > > [   32.895227] DR6: fffe0ff0 DR7: 00000400
> > > [   32.899060] BUG: sleeping function called from invalid context at
> > > /usr/src/kernel/include/linux/percpu-rwsem.h:49
> > > [   32.909303] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid:
> > > 387, name: modprobe
> > > [   32.917207] INFO: lockdep is turned off.
> > > [   32.921124] irq event stamp: 5120
> > > [   32.924434] hardirqs last  enabled at (5119): [<d20dea05>] kfree+0x135/0x270
> > > [   32.931472] hardirqs last disabled at (5120): [<d2b9ab04>]
> > > _raw_spin_lock_irqsave+0x14/0x40
> > > [   32.939809] softirqs last  enabled at (4978): [<d2b9f165>]
> > > __do_softirq+0x2c5/0x3c3
> > > [   32.947454] softirqs last disabled at (4969): [<d1ea67d5>]
> > > call_on_stack+0x45/0x50
> > > [   32.955009] CPU: 0 PID: 387 Comm: modprobe Tainted: G      D
> > >    5.6.0-rc4-next-20200305 #1
> > > [   32.963867] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> > > 2.2 05/23/2018
> > > [   32.971250] Call Trace:
> > > [   32.973695]  dump_stack+0x6e/0x96
> > > [   32.977009]  ___might_sleep+0x14d/0x240
> > > [   32.980846]  __might_sleep+0x33/0x80
> > > [   32.984419]  exit_signals+0x2a/0x2d0
> > > [   32.987997]  do_exit+0x8e/0xb00
> > > [   32.991143]  ? do_fast_syscall_32+0x8e/0x340
> > > [   32.995421]  rewind_stack_do_exit+0x11/0x13
> > > [   32.999600] EIP: 0xb7f75ce1
> > > [   33.002416] Code: 5e 5d c3 8d b6 00 00 00 00 b8 40 42 0f 00 eb c1
> > > 8b 04 24 c3 8b 1c 24 c3 8b 34 24 c3 8b 3c 24 c3 90 51 52 55 89 e5 0f
> > > 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90
> > > 8d 76
> > > [   33.021153] EAX: ffffffda EBX: 00000003 ECX: 0806233a EDX: 00000000
> > > [   33.027425] ESI: 085fb600 EDI: 085fb550 EBP: 085fb6e0 ESP: bffe839c
> > > [   33.033685] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000292
> > > [   33.040465] note: modprobe[387] exited with preempt_count 1
> > > ./../kselftest/module.sh: line 56:   387 Killed
> > > $modprobe -q $module $args
> > > printf:  [FAIL]
> > >
> > >
> > > metadata:
> > >   git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> > >   git branch: master
> > >   git describe: next-20200305
> > >   kernel-config:
> > > http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/intel-core2-32/lkft/linux-next/719/config
> > >
> > > ref:
> > > https://lkft.validation.linaro.org/scheduler/job/1267112#L1254
> > > https://lkft.validation.linaro.org/scheduler/job/1267041#L9795
> > >
> > > --
> > > Linaro LKFT
> > > https://lkft.linaro.org
> >
> >
> >
> > --
> > With Best Regards,
> > Andy Shevchenko

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140
  2020-03-05 19:51     ` Brendan Higgins
@ 2020-03-05 22:33       ` Sakari Ailus
  2020-03-06 12:05         ` Heikki Krogerus
  0 siblings, 1 reply; 19+ messages in thread
From: Sakari Ailus @ 2020-03-05 22:33 UTC (permalink / raw)
  To: Brendan Higgins
  Cc: Andy Shevchenko, heikki.krogerus, hdegoede, rafael.j.wysocki,
	Naresh Kamboju, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Steven Rostedt, Sergey Senozhatsky, Andy Shevchenko, Shuah Khan,
	Anders Roxell, lkft-triage, Rasmus Villemoes

Hi Brendan,

On Thu, Mar 05, 2020 at 11:51:20AM -0800, Brendan Higgins wrote:
> On Thu, Mar 5, 2020 at 11:40 AM Brendan Higgins
> <brendanhiggins@google.com> wrote:
> >
> > On Thu, Mar 5, 2020 at 11:18 AM Andy Shevchenko
> > <andy.shevchenko@gmail.com> wrote:
> > >
> > > +Cc: Sakari
> > >
> > > On Thu, Mar 5, 2020 at 6:00 PM Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > > >
> > > > Regression reported on Linux next 5.6.0-rc4-next-20200305 on x86_64,
> > > > i386, arm and arm64. The steps to reproduce is running kselftests lib
> > > > printf.sh test case.
> > > > Which is doing modprobe operations.
> > > >
> > > > BTW, there are few RCU warnings from the boot log.
> > > > Please refer below link for more details.
> > > >
> > > > Steps reproduce by using kselftests,
> > > >
> > > >           - lsmod || true
> > > >           - cd /opt/kselftests/default-in-kernel/lib/
> > > >           - export PATH=/opt/kselftests/default-in-kernel/kselftest:$PATH
> > > >           - ./printf.sh || true
> > > >           - ./bitmap.sh || true
> > > >           - ./prime_numbers.sh || true
> > > >           - ./strscpy.sh || true
> > > >
> > > > x86_64 kernel BUG dump.
> > > > + ./printf.sh
> >
> > Oops, I am wondering if I broke this with my change "Revert "software
> > node: Simplify software_node_release() function"":
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=d1c19322388d6935b534b494a2c223dd089e30dd
> >
> > I am still investigating, will update later.
> 
> Okay, yeah, I am pretty sure I caused the breakage. I got an email
> from kernel test robot a couple days ago that I didn't see:
> 
> https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/N3ZN5XH7HK24JVEJ5WSQD2SK6YCDRILR/
> 
> It shows the same breakage after applying this change.
> 
> I am still investigating how my change broke it, nevertheless.

As nodes in the tree are being removed, the code before the patch that
"simplified" the software_node_release() function accessed the node's parent
in its release function.

And if CONFIG_DEBUG_KOBJECT_RELEASE is defined, the release functions are no
longer necessarily called in order, leading to referencing released memory.
Oops!

So Heikki's patch actually fixed a bug. :-)

> 
> > > > [   32.594369] test_printf: loaded.
> > > > [   32.599859] BUG: kernel NULL pointer dereference, address: 00000000
> > > > [   32.606143] #PF: supervisor read access in kernel mode
> > > > [   32.611280] #PF: error_code(0x0000) - not-present page
> > > > [   32.616419] *pde = 00000000
> > > > [   32.619306] Oops: 0000 [#1] SMP
> > > > [   32.622452] CPU: 0 PID: 387 Comm: modprobe Not tainted
> > > > 5.6.0-rc4-next-20200305 #1
> > > > [   32.629928] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> > > > 2.2 05/23/2018
> > > > [   32.637314] EIP: ida_free+0x61/0x130
> > > > [   32.640891] Code: 00 c7 45 e8 00 00 00 00 c7 45 ec 00 00 00 00 0f
> > > > 88 c4 00 00 00 89 d3 e8 9d ee 01 00 89 c7 8d 45 d8 e8 e3 18 01 00 a8
> > > > 01 75 3f <0f> a3 30 72 72 8b 45 d8 89 fa e8 70 f0 01 00 53 68 d8 61 01
> > > > d3 e8
> > > > [   32.659628] EAX: 00000000 EBX: 00000000 ECX: ffffffff EDX: 00000000
> > > > [   32.665887] ESI: 00000000 EDI: 00000246 EBP: f21a7cc8 ESP: f21a7c9c
> > > > [   32.672153] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010046
> > > > [   32.678928] CR0: 80050033 CR2: 00000000 CR3: 32ff3000 CR4: 003406d0
> > > > [   32.685186] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > > > [   32.691442] DR6: fffe0ff0 DR7: 00000400
> > > > [   32.695273] Call Trace:
> > > > [   32.697721]  software_node_release+0x21/0x80
> > > > [   32.701990]  kobject_put+0xa8/0x1c0
> > > > [   32.705476]  kobject_del+0x40/0x60
> > > > [   32.708881]  kobject_put+0x92/0x1c0
> > > > [   32.712380]  fwnode_remove_software_node+0x30/0x50
> > > > [   32.717201]  software_node_unregister_nodes+0x2b/0x50
> > > > [   32.722246]  test_printf_init+0xe00/0x1d29 [test_printf]
> > > > [   32.727563]  ? find_held_lock+0x27/0xa0
> > > > [   32.731423]  ? test_hashed+0x54/0x54 [test_printf]
> > > > [   32.736207]  ? test_hashed+0x54/0x54 [test_printf]
> > > > [   32.741001]  do_one_initcall+0x54/0x2e0
> > > > [   32.744841]  ? rcu_read_lock_sched_held+0x47/0x70
> > > > [   32.749547]  ? kmem_cache_alloc_trace+0x285/0x2b0
> > > > [   32.754249]  ? do_init_module+0x21/0x1ef
> > > > [   32.758166]  ? do_init_module+0x21/0x1ef
> > > > [   32.762087]  do_init_module+0x50/0x1ef
> > > > [   32.765837]  load_module+0x1e32/0x2540
> > > > [   32.769586]  ? kernel_read_file+0x295/0x2d0
> > > > [   32.773771]  sys_finit_module+0x8a/0xe0
> > > > [   32.777611]  do_fast_syscall_32+0x8e/0x340
> > > > [   32.781709]  entry_SYSENTER_32+0xaa/0x102
> > > > [   32.785719] EIP: 0xb7f75ce1
> > > > [   32.788508] Code: 5e 5d c3 8d b6 00 00 00 00 b8 40 42 0f 00 eb c1
> > > > 8b 04 24 c3 8b 1c 24 c3 8b 34 24 c3 8b 3c 24 c3 90 51 52 55 89 e5 0f
> > > > 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90
> > > > 8d 76
> > > > [   32.807246] EAX: ffffffda EBX: 00000003 ECX: 0806233a EDX: 00000000
> > > > [   32.813502] ESI: 085fb600 EDI: 085fb550 EBP: 085fb6e0 ESP: bffe839c
> > > > [   32.819760] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000292
> > > > [   32.826542] Modules linked in: test_printf(+) x86_pkg_temp_thermal fuse
> > > > [   32.833150] CR2: 0000000000000000
> > > > [   32.836461] ---[ end trace 69388c972b4562b8 ]---
> > > > [   32.841072] EIP: ida_free+0x61/0x130
> > > > [   32.844642] Code: 00 c7 45 e8 00 00 00 00 c7 45 ec 00 00 00 00 0f
> > > > 88 c4 00 00 00 89 d3 e8 9d ee 01 00 89 c7 8d 45 d8 e8 e3 18 01 00 a8
> > > > 01 75 3f <0f> a3 30 72 72 8b 45 d8 89 fa e8 70 f0 01 00 53 68 d8 61 01
> > > > d3 e8
> > > > [   32.863420] EAX: 00000000 EBX: 00000000 ECX: ffffffff EDX: 00000000
> > > > [   32.869679] ESI: 00000000 EDI: 00000246 EBP: f21a7cc8 ESP: f21a7c9c
> > > > [   32.875936] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010046
> > > > [   32.882713] CR0: 80050033 CR2: 00000000 CR3: 32ff3000 CR4: 003406d0
> > > > [   32.888970] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > > > [   32.895227] DR6: fffe0ff0 DR7: 00000400
> > > > [   32.899060] BUG: sleeping function called from invalid context at
> > > > /usr/src/kernel/include/linux/percpu-rwsem.h:49
> > > > [   32.909303] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid:
> > > > 387, name: modprobe
> > > > [   32.917207] INFO: lockdep is turned off.
> > > > [   32.921124] irq event stamp: 5120
> > > > [   32.924434] hardirqs last  enabled at (5119): [<d20dea05>] kfree+0x135/0x270
> > > > [   32.931472] hardirqs last disabled at (5120): [<d2b9ab04>]
> > > > _raw_spin_lock_irqsave+0x14/0x40
> > > > [   32.939809] softirqs last  enabled at (4978): [<d2b9f165>]
> > > > __do_softirq+0x2c5/0x3c3
> > > > [   32.947454] softirqs last disabled at (4969): [<d1ea67d5>]
> > > > call_on_stack+0x45/0x50
> > > > [   32.955009] CPU: 0 PID: 387 Comm: modprobe Tainted: G      D
> > > >    5.6.0-rc4-next-20200305 #1
> > > > [   32.963867] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> > > > 2.2 05/23/2018
> > > > [   32.971250] Call Trace:
> > > > [   32.973695]  dump_stack+0x6e/0x96
> > > > [   32.977009]  ___might_sleep+0x14d/0x240
> > > > [   32.980846]  __might_sleep+0x33/0x80
> > > > [   32.984419]  exit_signals+0x2a/0x2d0
> > > > [   32.987997]  do_exit+0x8e/0xb00
> > > > [   32.991143]  ? do_fast_syscall_32+0x8e/0x340
> > > > [   32.995421]  rewind_stack_do_exit+0x11/0x13
> > > > [   32.999600] EIP: 0xb7f75ce1
> > > > [   33.002416] Code: 5e 5d c3 8d b6 00 00 00 00 b8 40 42 0f 00 eb c1
> > > > 8b 04 24 c3 8b 1c 24 c3 8b 34 24 c3 8b 3c 24 c3 90 51 52 55 89 e5 0f
> > > > 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90
> > > > 8d 76
> > > > [   33.021153] EAX: ffffffda EBX: 00000003 ECX: 0806233a EDX: 00000000
> > > > [   33.027425] ESI: 085fb600 EDI: 085fb550 EBP: 085fb6e0 ESP: bffe839c
> > > > [   33.033685] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000292
> > > > [   33.040465] note: modprobe[387] exited with preempt_count 1
> > > > ./../kselftest/module.sh: line 56:   387 Killed
> > > > $modprobe -q $module $args
> > > > printf:  [FAIL]
> > > >
> > > >
> > > > metadata:
> > > >   git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> > > >   git branch: master
> > > >   git describe: next-20200305
> > > >   kernel-config:
> > > > http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/intel-core2-32/lkft/linux-next/719/config
> > > >
> > > > ref:
> > > > https://lkft.validation.linaro.org/scheduler/job/1267112#L1254
> > > > https://lkft.validation.linaro.org/scheduler/job/1267041#L9795
> > > >
> > > > --
> > > > Linaro LKFT
> > > > https://lkft.linaro.org
> > >
> > >
> > >
> > > --
> > > With Best Regards,
> > > Andy Shevchenko

-- 
Sakari Ailus
sakari.ailus@linux.intel.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140
  2020-03-05 22:33       ` Sakari Ailus
@ 2020-03-06 12:05         ` Heikki Krogerus
  2020-03-09 20:35           ` Brendan Higgins
  0 siblings, 1 reply; 19+ messages in thread
From: Heikki Krogerus @ 2020-03-06 12:05 UTC (permalink / raw)
  To: Sakari Ailus
  Cc: Brendan Higgins, Andy Shevchenko, hdegoede, rafael.j.wysocki,
	Naresh Kamboju, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Steven Rostedt, Sergey Senozhatsky, Andy Shevchenko, Shuah Khan,
	Anders Roxell, lkft-triage, Rasmus Villemoes

On Fri, Mar 06, 2020 at 12:33:50AM +0200, Sakari Ailus wrote:
> Hi Brendan,
> 
> On Thu, Mar 05, 2020 at 11:51:20AM -0800, Brendan Higgins wrote:
> > On Thu, Mar 5, 2020 at 11:40 AM Brendan Higgins
> > <brendanhiggins@google.com> wrote:
> > >
> > > On Thu, Mar 5, 2020 at 11:18 AM Andy Shevchenko
> > > <andy.shevchenko@gmail.com> wrote:
> > > >
> > > > +Cc: Sakari
> > > >
> > > > On Thu, Mar 5, 2020 at 6:00 PM Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > > > >
> > > > > Regression reported on Linux next 5.6.0-rc4-next-20200305 on x86_64,
> > > > > i386, arm and arm64. The steps to reproduce is running kselftests lib
> > > > > printf.sh test case.
> > > > > Which is doing modprobe operations.
> > > > >
> > > > > BTW, there are few RCU warnings from the boot log.
> > > > > Please refer below link for more details.
> > > > >
> > > > > Steps reproduce by using kselftests,
> > > > >
> > > > >           - lsmod || true
> > > > >           - cd /opt/kselftests/default-in-kernel/lib/
> > > > >           - export PATH=/opt/kselftests/default-in-kernel/kselftest:$PATH
> > > > >           - ./printf.sh || true
> > > > >           - ./bitmap.sh || true
> > > > >           - ./prime_numbers.sh || true
> > > > >           - ./strscpy.sh || true
> > > > >
> > > > > x86_64 kernel BUG dump.
> > > > > + ./printf.sh
> > >
> > > Oops, I am wondering if I broke this with my change "Revert "software
> > > node: Simplify software_node_release() function"":
> > >
> > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=d1c19322388d6935b534b494a2c223dd089e30dd
> > >
> > > I am still investigating, will update later.
> > 
> > Okay, yeah, I am pretty sure I caused the breakage. I got an email
> > from kernel test robot a couple days ago that I didn't see:
> > 
> > https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/N3ZN5XH7HK24JVEJ5WSQD2SK6YCDRILR/
> > 
> > It shows the same breakage after applying this change.
> > 
> > I am still investigating how my change broke it, nevertheless.
> 
> As nodes in the tree are being removed, the code before the patch that
> "simplified" the software_node_release() function accessed the node's parent
> in its release function.
> 
> And if CONFIG_DEBUG_KOBJECT_RELEASE is defined, the release functions are no
> longer necessarily called in order, leading to referencing released memory.
> Oops!
> 
> So Heikki's patch actually fixed a bug. :-)

Well, I think it just hid the problem. It looks like the core
(lib/kobject.c) allows the parent kobject to be released before the
last child kobject is released. To be honest, that does not sound
right to me...

I think we can workaround this problem by taking reference to the
parent when the child is added, and then releasing it when the child
is released, and in that way be guaranteed that the parent will not
disappear before the child is fully released, but that still does not
feel right. It feels more like the core is not doing it's job to me.
The parent just should not be released before its children.

Either I'm wrong about that, and we still should take the reference on
the parent, or we revert my patch like Brendan proposed and then fix
the core with something like this (warning, I did not even try to
compile that):

diff --git a/lib/kobject.c b/lib/kobject.c
index 83198cb37d8d..ec5774992337 100644
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -680,6 +680,12 @@ static void kobject_cleanup(struct kobject *kobj)
                kobject_uevent(kobj, KOBJ_REMOVE);
        }

+       if (t && t->release) {
+               pr_debug("kobject: '%s' (%p): calling ktype release\n",
+                        kobject_name(kobj), kobj);
+               t->release(kobj);
+       }
+
        /* remove from sysfs if the caller did not do it */
        if (kobj->state_in_sysfs) {
                pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
@@ -687,12 +693,6 @@ static void kobject_cleanup(struct kobject *kobj)
                kobject_del(kobj);
        }

-       if (t && t->release) {
-               pr_debug("kobject: '%s' (%p): calling ktype release\n",
-                        kobject_name(kobj), kobj);
-               t->release(kobj);
-       }
-
        /* free name if we allocated it */
        if (name) {
                pr_debug("kobject: '%s': free name\n", name);


thanks,

-- 
heikki

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140
  2020-03-06 12:05         ` Heikki Krogerus
@ 2020-03-09 20:35           ` Brendan Higgins
  2020-03-09 21:43             ` Brendan Higgins
  0 siblings, 1 reply; 19+ messages in thread
From: Brendan Higgins @ 2020-03-09 20:35 UTC (permalink / raw)
  To: Heikki Krogerus
  Cc: Sakari Ailus, Andy Shevchenko, hdegoede, rafael.j.wysocki,
	Naresh Kamboju, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Steven Rostedt, Sergey Senozhatsky, Andy Shevchenko, Shuah Khan,
	Anders Roxell, lkft-triage, Rasmus Villemoes

On Fri, Mar 6, 2020 at 4:05 AM Heikki Krogerus
<heikki.krogerus@linux.intel.com> wrote:
>
> On Fri, Mar 06, 2020 at 12:33:50AM +0200, Sakari Ailus wrote:
> > Hi Brendan,
> >
> > On Thu, Mar 05, 2020 at 11:51:20AM -0800, Brendan Higgins wrote:
> > > On Thu, Mar 5, 2020 at 11:40 AM Brendan Higgins
> > > <brendanhiggins@google.com> wrote:
> > > >
> > > > On Thu, Mar 5, 2020 at 11:18 AM Andy Shevchenko
> > > > <andy.shevchenko@gmail.com> wrote:
> > > > >
> > > > > +Cc: Sakari
> > > > >
> > > > > On Thu, Mar 5, 2020 at 6:00 PM Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > > > > >
> > > > > > Regression reported on Linux next 5.6.0-rc4-next-20200305 on x86_64,
> > > > > > i386, arm and arm64. The steps to reproduce is running kselftests lib
> > > > > > printf.sh test case.
> > > > > > Which is doing modprobe operations.
> > > > > >
> > > > > > BTW, there are few RCU warnings from the boot log.
> > > > > > Please refer below link for more details.
> > > > > >
> > > > > > Steps reproduce by using kselftests,
> > > > > >
> > > > > >           - lsmod || true
> > > > > >           - cd /opt/kselftests/default-in-kernel/lib/
> > > > > >           - export PATH=/opt/kselftests/default-in-kernel/kselftest:$PATH
> > > > > >           - ./printf.sh || true
> > > > > >           - ./bitmap.sh || true
> > > > > >           - ./prime_numbers.sh || true
> > > > > >           - ./strscpy.sh || true
> > > > > >
> > > > > > x86_64 kernel BUG dump.
> > > > > > + ./printf.sh
> > > >
> > > > Oops, I am wondering if I broke this with my change "Revert "software
> > > > node: Simplify software_node_release() function"":
> > > >
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=d1c19322388d6935b534b494a2c223dd089e30dd
> > > >
> > > > I am still investigating, will update later.
> > >
> > > Okay, yeah, I am pretty sure I caused the breakage. I got an email
> > > from kernel test robot a couple days ago that I didn't see:
> > >
> > > https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/N3ZN5XH7HK24JVEJ5WSQD2SK6YCDRILR/
> > >
> > > It shows the same breakage after applying this change.
> > >
> > > I am still investigating how my change broke it, nevertheless.
> >
> > As nodes in the tree are being removed, the code before the patch that
> > "simplified" the software_node_release() function accessed the node's parent
> > in its release function.
> >
> > And if CONFIG_DEBUG_KOBJECT_RELEASE is defined, the release functions are no
> > longer necessarily called in order, leading to referencing released memory.
> > Oops!
> >
> > So Heikki's patch actually fixed a bug. :-)
>
> Well, I think it just hid the problem. It looks like the core
> (lib/kobject.c) allows the parent kobject to be released before the
> last child kobject is released. To be honest, that does not sound
> right to me...
>
> I think we can workaround this problem by taking reference to the
> parent when the child is added, and then releasing it when the child
> is released, and in that way be guaranteed that the parent will not
> disappear before the child is fully released, but that still does not
> feel right. It feels more like the core is not doing it's job to me.
> The parent just should not be released before its children.
>
> Either I'm wrong about that, and we still should take the reference on
> the parent, or we revert my patch like Brendan proposed and then fix

Either way, isn't it wrong to release the node ID before deleting the
sysfs entry? I am not sure that my fix was the correct one, but I
believe the bug that Heidi and I found is actually a bug.

> the core with something like this (warning, I did not even try to
> compile that):

I will try it out.

> diff --git a/lib/kobject.c b/lib/kobject.c
> index 83198cb37d8d..ec5774992337 100644
> --- a/lib/kobject.c
> +++ b/lib/kobject.c
> @@ -680,6 +680,12 @@ static void kobject_cleanup(struct kobject *kobj)
>                 kobject_uevent(kobj, KOBJ_REMOVE);
>         }
>
> +       if (t && t->release) {
> +               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> +                        kobject_name(kobj), kobj);
> +               t->release(kobj);
> +       }
> +
>         /* remove from sysfs if the caller did not do it */
>         if (kobj->state_in_sysfs) {
>                 pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> @@ -687,12 +693,6 @@ static void kobject_cleanup(struct kobject *kobj)
>                 kobject_del(kobj);
>         }
>
> -       if (t && t->release) {
> -               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> -                        kobject_name(kobj), kobj);
> -               t->release(kobj);
> -       }
> -
>         /* free name if we allocated it */
>         if (name) {
>                 pr_debug("kobject: '%s': free name\n", name);
>

Thanks!

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140
  2020-03-09 20:35           ` Brendan Higgins
@ 2020-03-09 21:43             ` Brendan Higgins
  2020-03-10 11:18               ` Heikki Krogerus
  0 siblings, 1 reply; 19+ messages in thread
From: Brendan Higgins @ 2020-03-09 21:43 UTC (permalink / raw)
  To: Heikki Krogerus
  Cc: Sakari Ailus, Andy Shevchenko, hdegoede, rafael.j.wysocki,
	Naresh Kamboju, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Steven Rostedt, Sergey Senozhatsky, Andy Shevchenko, Shuah Khan,
	Anders Roxell, lkft-triage, Rasmus Villemoes

On Mon, Mar 9, 2020 at 1:35 PM Brendan Higgins
<brendanhiggins@google.com> wrote:
>
> On Fri, Mar 6, 2020 at 4:05 AM Heikki Krogerus
> <heikki.krogerus@linux.intel.com> wrote:
> >
> > On Fri, Mar 06, 2020 at 12:33:50AM +0200, Sakari Ailus wrote:
> > > Hi Brendan,
> > >
> > > On Thu, Mar 05, 2020 at 11:51:20AM -0800, Brendan Higgins wrote:
> > > > On Thu, Mar 5, 2020 at 11:40 AM Brendan Higgins
> > > > <brendanhiggins@google.com> wrote:
> > > > >
> > > > > On Thu, Mar 5, 2020 at 11:18 AM Andy Shevchenko
> > > > > <andy.shevchenko@gmail.com> wrote:
> > > > > >
> > > > > > +Cc: Sakari
> > > > > >
> > > > > > On Thu, Mar 5, 2020 at 6:00 PM Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > > > > > >
> > > > > > > Regression reported on Linux next 5.6.0-rc4-next-20200305 on x86_64,
> > > > > > > i386, arm and arm64. The steps to reproduce is running kselftests lib
> > > > > > > printf.sh test case.
> > > > > > > Which is doing modprobe operations.
> > > > > > >
> > > > > > > BTW, there are few RCU warnings from the boot log.
> > > > > > > Please refer below link for more details.
> > > > > > >
> > > > > > > Steps reproduce by using kselftests,
> > > > > > >
> > > > > > >           - lsmod || true
> > > > > > >           - cd /opt/kselftests/default-in-kernel/lib/
> > > > > > >           - export PATH=/opt/kselftests/default-in-kernel/kselftest:$PATH
> > > > > > >           - ./printf.sh || true
> > > > > > >           - ./bitmap.sh || true
> > > > > > >           - ./prime_numbers.sh || true
> > > > > > >           - ./strscpy.sh || true
> > > > > > >
> > > > > > > x86_64 kernel BUG dump.
> > > > > > > + ./printf.sh
> > > > >
> > > > > Oops, I am wondering if I broke this with my change "Revert "software
> > > > > node: Simplify software_node_release() function"":
> > > > >
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=d1c19322388d6935b534b494a2c223dd089e30dd
> > > > >
> > > > > I am still investigating, will update later.
> > > >
> > > > Okay, yeah, I am pretty sure I caused the breakage. I got an email
> > > > from kernel test robot a couple days ago that I didn't see:
> > > >
> > > > https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/N3ZN5XH7HK24JVEJ5WSQD2SK6YCDRILR/
> > > >
> > > > It shows the same breakage after applying this change.
> > > >
> > > > I am still investigating how my change broke it, nevertheless.
> > >
> > > As nodes in the tree are being removed, the code before the patch that
> > > "simplified" the software_node_release() function accessed the node's parent
> > > in its release function.
> > >
> > > And if CONFIG_DEBUG_KOBJECT_RELEASE is defined, the release functions are no
> > > longer necessarily called in order, leading to referencing released memory.
> > > Oops!
> > >
> > > So Heikki's patch actually fixed a bug. :-)
> >
> > Well, I think it just hid the problem. It looks like the core
> > (lib/kobject.c) allows the parent kobject to be released before the
> > last child kobject is released. To be honest, that does not sound
> > right to me...
> >
> > I think we can workaround this problem by taking reference to the
> > parent when the child is added, and then releasing it when the child
> > is released, and in that way be guaranteed that the parent will not
> > disappear before the child is fully released, but that still does not
> > feel right. It feels more like the core is not doing it's job to me.
> > The parent just should not be released before its children.
> >
> > Either I'm wrong about that, and we still should take the reference on
> > the parent, or we revert my patch like Brendan proposed and then fix
>
> Either way, isn't it wrong to release the node ID before deleting the
> sysfs entry? I am not sure that my fix was the correct one, but I
> believe the bug that Heidi and I found is actually a bug.
>
> > the core with something like this (warning, I did not even try to
> > compile that):
>
> I will try it out.
>
> > diff --git a/lib/kobject.c b/lib/kobject.c
> > index 83198cb37d8d..ec5774992337 100644
> > --- a/lib/kobject.c
> > +++ b/lib/kobject.c
> > @@ -680,6 +680,12 @@ static void kobject_cleanup(struct kobject *kobj)
> >                 kobject_uevent(kobj, KOBJ_REMOVE);
> >         }
> >
> > +       if (t && t->release) {
> > +               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > +                        kobject_name(kobj), kobj);
> > +               t->release(kobj);
> > +       }
> > +
> >         /* remove from sysfs if the caller did not do it */
> >         if (kobj->state_in_sysfs) {
> >                 pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> > @@ -687,12 +693,6 @@ static void kobject_cleanup(struct kobject *kobj)
> >                 kobject_del(kobj);
> >         }
> >
> > -       if (t && t->release) {
> > -               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > -                        kobject_name(kobj), kobj);
> > -               t->release(kobj);
> > -       }
> > -
> >         /* free name if we allocated it */
> >         if (name) {
> >                 pr_debug("kobject: '%s': free name\n", name);

Alright, so I tried it and it looks like Heikki's suggestion worked.

Is everyone comfortable going this route?

Also, should I send this fix as a separate patch? Or do people want me
to send an updated revision of my revert patch with the fix?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140
  2020-03-09 21:43             ` Brendan Higgins
@ 2020-03-10 11:18               ` Heikki Krogerus
  2020-03-10 20:46                   ` Brendan Higgins
  0 siblings, 1 reply; 19+ messages in thread
From: Heikki Krogerus @ 2020-03-10 11:18 UTC (permalink / raw)
  To: Brendan Higgins
  Cc: Sakari Ailus, Andy Shevchenko, hdegoede, rafael.j.wysocki,
	Naresh Kamboju, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Steven Rostedt, Sergey Senozhatsky, Andy Shevchenko, Shuah Khan,
	Anders Roxell, lkft-triage, Rasmus Villemoes

On Mon, Mar 09, 2020 at 02:43:13PM -0700, Brendan Higgins wrote:
> On Mon, Mar 9, 2020 at 1:35 PM Brendan Higgins
> <brendanhiggins@google.com> wrote:
> >
> > On Fri, Mar 6, 2020 at 4:05 AM Heikki Krogerus
> > <heikki.krogerus@linux.intel.com> wrote:
> > >
> > > On Fri, Mar 06, 2020 at 12:33:50AM +0200, Sakari Ailus wrote:
> > > > Hi Brendan,
> > > >
> > > > On Thu, Mar 05, 2020 at 11:51:20AM -0800, Brendan Higgins wrote:
> > > > > On Thu, Mar 5, 2020 at 11:40 AM Brendan Higgins
> > > > > <brendanhiggins@google.com> wrote:
> > > > > >
> > > > > > On Thu, Mar 5, 2020 at 11:18 AM Andy Shevchenko
> > > > > > <andy.shevchenko@gmail.com> wrote:
> > > > > > >
> > > > > > > +Cc: Sakari
> > > > > > >
> > > > > > > On Thu, Mar 5, 2020 at 6:00 PM Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > > > > > > >
> > > > > > > > Regression reported on Linux next 5.6.0-rc4-next-20200305 on x86_64,
> > > > > > > > i386, arm and arm64. The steps to reproduce is running kselftests lib
> > > > > > > > printf.sh test case.
> > > > > > > > Which is doing modprobe operations.
> > > > > > > >
> > > > > > > > BTW, there are few RCU warnings from the boot log.
> > > > > > > > Please refer below link for more details.
> > > > > > > >
> > > > > > > > Steps reproduce by using kselftests,
> > > > > > > >
> > > > > > > >           - lsmod || true
> > > > > > > >           - cd /opt/kselftests/default-in-kernel/lib/
> > > > > > > >           - export PATH=/opt/kselftests/default-in-kernel/kselftest:$PATH
> > > > > > > >           - ./printf.sh || true
> > > > > > > >           - ./bitmap.sh || true
> > > > > > > >           - ./prime_numbers.sh || true
> > > > > > > >           - ./strscpy.sh || true
> > > > > > > >
> > > > > > > > x86_64 kernel BUG dump.
> > > > > > > > + ./printf.sh
> > > > > >
> > > > > > Oops, I am wondering if I broke this with my change "Revert "software
> > > > > > node: Simplify software_node_release() function"":
> > > > > >
> > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=d1c19322388d6935b534b494a2c223dd089e30dd
> > > > > >
> > > > > > I am still investigating, will update later.
> > > > >
> > > > > Okay, yeah, I am pretty sure I caused the breakage. I got an email
> > > > > from kernel test robot a couple days ago that I didn't see:
> > > > >
> > > > > https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/N3ZN5XH7HK24JVEJ5WSQD2SK6YCDRILR/
> > > > >
> > > > > It shows the same breakage after applying this change.
> > > > >
> > > > > I am still investigating how my change broke it, nevertheless.
> > > >
> > > > As nodes in the tree are being removed, the code before the patch that
> > > > "simplified" the software_node_release() function accessed the node's parent
> > > > in its release function.
> > > >
> > > > And if CONFIG_DEBUG_KOBJECT_RELEASE is defined, the release functions are no
> > > > longer necessarily called in order, leading to referencing released memory.
> > > > Oops!
> > > >
> > > > So Heikki's patch actually fixed a bug. :-)
> > >
> > > Well, I think it just hid the problem. It looks like the core
> > > (lib/kobject.c) allows the parent kobject to be released before the
> > > last child kobject is released. To be honest, that does not sound
> > > right to me...
> > >
> > > I think we can workaround this problem by taking reference to the
> > > parent when the child is added, and then releasing it when the child
> > > is released, and in that way be guaranteed that the parent will not
> > > disappear before the child is fully released, but that still does not
> > > feel right. It feels more like the core is not doing it's job to me.
> > > The parent just should not be released before its children.
> > >
> > > Either I'm wrong about that, and we still should take the reference on
> > > the parent, or we revert my patch like Brendan proposed and then fix
> >
> > Either way, isn't it wrong to release the node ID before deleting the
> > sysfs entry? I am not sure that my fix was the correct one, but I
> > believe the bug that Heidi and I found is actually a bug.

I agree.

> > > the core with something like this (warning, I did not even try to
> > > compile that):
> >
> > I will try it out.
> >
> > > diff --git a/lib/kobject.c b/lib/kobject.c
> > > index 83198cb37d8d..ec5774992337 100644
> > > --- a/lib/kobject.c
> > > +++ b/lib/kobject.c
> > > @@ -680,6 +680,12 @@ static void kobject_cleanup(struct kobject *kobj)
> > >                 kobject_uevent(kobj, KOBJ_REMOVE);
> > >         }
> > >
> > > +       if (t && t->release) {
> > > +               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > +                        kobject_name(kobj), kobj);
> > > +               t->release(kobj);
> > > +       }
> > > +
> > >         /* remove from sysfs if the caller did not do it */
> > >         if (kobj->state_in_sysfs) {
> > >                 pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> > > @@ -687,12 +693,6 @@ static void kobject_cleanup(struct kobject *kobj)
> > >                 kobject_del(kobj);
> > >         }
> > >
> > > -       if (t && t->release) {
> > > -               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > -                        kobject_name(kobj), kobj);
> > > -               t->release(kobj);
> > > -       }
> > > -
> > >         /* free name if we allocated it */
> > >         if (name) {
> > >                 pr_debug("kobject: '%s': free name\n", name);
> 
> Alright, so I tried it and it looks like Heikki's suggestion worked.
> 
> Is everyone comfortable going this route?

Hold on. Another way to fix the problem is to increment the parent's
reference count before that kobject_del(kobj) is called, and then
decrementing it after t->release(kobj) is called. It may be safer to
fix the problem like that.

My example above proofs that there is the problem, but it changes the
order of execution which I think can always have other consequences.

> Also, should I send this fix as a separate patch? Or do people want me
> to send an updated revision of my revert patch with the fix?

This needs to be send in its own separate patch. Ideally it could be
send together with the revert in the same series, but I'm not sure
that's possible anymore. Didn't Greg pick the revert already?


thanks,

-- 
heikki

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140
@ 2020-03-10 20:46                   ` Brendan Higgins
  0 siblings, 0 replies; 19+ messages in thread
From: Brendan Higgins @ 2020-03-10 20:46 UTC (permalink / raw)
  To: Heikki Krogerus
  Cc: Sakari Ailus, Andy Shevchenko, hdegoede, rafael.j.wysocki,
	Naresh Kamboju, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Steven Rostedt, Sergey Senozhatsky, Andy Shevchenko, Shuah Khan,
	Anders Roxell, lkft-triage, Rasmus Villemoes

On Tue, Mar 10, 2020 at 4:18 AM Heikki Krogerus
<heikki.krogerus@linux.intel.com> wrote:
>
> On Mon, Mar 09, 2020 at 02:43:13PM -0700, Brendan Higgins wrote:
> > On Mon, Mar 9, 2020 at 1:35 PM Brendan Higgins
> > <brendanhiggins@google.com> wrote:
> > >
> > > On Fri, Mar 6, 2020 at 4:05 AM Heikki Krogerus
> > > <heikki.krogerus@linux.intel.com> wrote:
> > > >
> > > > On Fri, Mar 06, 2020 at 12:33:50AM +0200, Sakari Ailus wrote:
> > > > > Hi Brendan,
> > > > >
> > > > > On Thu, Mar 05, 2020 at 11:51:20AM -0800, Brendan Higgins wrote:
> > > > > > On Thu, Mar 5, 2020 at 11:40 AM Brendan Higgins
> > > > > > <brendanhiggins@google.com> wrote:
> > > > > > >
> > > > > > > On Thu, Mar 5, 2020 at 11:18 AM Andy Shevchenko
> > > > > > > <andy.shevchenko@gmail.com> wrote:
> > > > > > > >
> > > > > > > > +Cc: Sakari
> > > > > > > >
> > > > > > > > On Thu, Mar 5, 2020 at 6:00 PM Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > > > > > > > >
> > > > > > > > > Regression reported on Linux next 5.6.0-rc4-next-20200305 on x86_64,
> > > > > > > > > i386, arm and arm64. The steps to reproduce is running kselftests lib
> > > > > > > > > printf.sh test case.
> > > > > > > > > Which is doing modprobe operations.
> > > > > > > > >
> > > > > > > > > BTW, there are few RCU warnings from the boot log.
> > > > > > > > > Please refer below link for more details.
> > > > > > > > >
> > > > > > > > > Steps reproduce by using kselftests,
> > > > > > > > >
> > > > > > > > >           - lsmod || true
> > > > > > > > >           - cd /opt/kselftests/default-in-kernel/lib/
> > > > > > > > >           - export PATH=/opt/kselftests/default-in-kernel/kselftest:$PATH
> > > > > > > > >           - ./printf.sh || true
> > > > > > > > >           - ./bitmap.sh || true
> > > > > > > > >           - ./prime_numbers.sh || true
> > > > > > > > >           - ./strscpy.sh || true
> > > > > > > > >
> > > > > > > > > x86_64 kernel BUG dump.
> > > > > > > > > + ./printf.sh
> > > > > > >
> > > > > > > Oops, I am wondering if I broke this with my change "Revert "software
> > > > > > > node: Simplify software_node_release() function"":
> > > > > > >
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=d1c19322388d6935b534b494a2c223dd089e30dd
> > > > > > >
> > > > > > > I am still investigating, will update later.
> > > > > >
> > > > > > Okay, yeah, I am pretty sure I caused the breakage. I got an email
> > > > > > from kernel test robot a couple days ago that I didn't see:
> > > > > >
> > > > > > https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/N3ZN5XH7HK24JVEJ5WSQD2SK6YCDRILR/
> > > > > >
> > > > > > It shows the same breakage after applying this change.
> > > > > >
> > > > > > I am still investigating how my change broke it, nevertheless.
> > > > >
> > > > > As nodes in the tree are being removed, the code before the patch that
> > > > > "simplified" the software_node_release() function accessed the node's parent
> > > > > in its release function.
> > > > >
> > > > > And if CONFIG_DEBUG_KOBJECT_RELEASE is defined, the release functions are no
> > > > > longer necessarily called in order, leading to referencing released memory.
> > > > > Oops!
> > > > >
> > > > > So Heikki's patch actually fixed a bug. :-)
> > > >
> > > > Well, I think it just hid the problem. It looks like the core
> > > > (lib/kobject.c) allows the parent kobject to be released before the
> > > > last child kobject is released. To be honest, that does not sound
> > > > right to me...
> > > >
> > > > I think we can workaround this problem by taking reference to the
> > > > parent when the child is added, and then releasing it when the child
> > > > is released, and in that way be guaranteed that the parent will not
> > > > disappear before the child is fully released, but that still does not
> > > > feel right. It feels more like the core is not doing it's job to me.
> > > > The parent just should not be released before its children.
> > > >
> > > > Either I'm wrong about that, and we still should take the reference on
> > > > the parent, or we revert my patch like Brendan proposed and then fix
> > >
> > > Either way, isn't it wrong to release the node ID before deleting the
> > > sysfs entry? I am not sure that my fix was the correct one, but I
> > > believe the bug that Heidi and I found is actually a bug.
>
> I agree.
>
> > > > the core with something like this (warning, I did not even try to
> > > > compile that):
> > >
> > > I will try it out.
> > >
> > > > diff --git a/lib/kobject.c b/lib/kobject.c
> > > > index 83198cb37d8d..ec5774992337 100644
> > > > --- a/lib/kobject.c
> > > > +++ b/lib/kobject.c
> > > > @@ -680,6 +680,12 @@ static void kobject_cleanup(struct kobject *kobj)
> > > >                 kobject_uevent(kobj, KOBJ_REMOVE);
> > > >         }
> > > >
> > > > +       if (t && t->release) {
> > > > +               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > > +                        kobject_name(kobj), kobj);
> > > > +               t->release(kobj);
> > > > +       }
> > > > +
> > > >         /* remove from sysfs if the caller did not do it */
> > > >         if (kobj->state_in_sysfs) {
> > > >                 pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> > > > @@ -687,12 +693,6 @@ static void kobject_cleanup(struct kobject *kobj)
> > > >                 kobject_del(kobj);
> > > >         }
> > > >
> > > > -       if (t && t->release) {
> > > > -               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > > -                        kobject_name(kobj), kobj);
> > > > -               t->release(kobj);
> > > > -       }
> > > > -
> > > >         /* free name if we allocated it */
> > > >         if (name) {
> > > >                 pr_debug("kobject: '%s': free name\n", name);
> >
> > Alright, so I tried it and it looks like Heikki's suggestion worked.
> >
> > Is everyone comfortable going this route?
>
> Hold on. Another way to fix the problem is to increment the parent's
> reference count before that kobject_del(kobj) is called, and then
> decrementing it after t->release(kobj) is called. It may be safer to
> fix the problem like that.

Right, this was your first suggestion above, right? That actually made
more sense to me, but you seemed skeptical of it due to it being
messier, which makes sense.

Nevertheless, having children take a reference seems like the right
thing to do because the children need to degregister themselves from
the parent. Calling t->release() ahead of kobject_del() seems to
reintroduce the problem that I pointed out, albeit *much* more
briefly. If I understand correctly, it is always wrong to have a sysfs
entry that points to a partially deallocated kobject. Please correct
me if I am wrong.

So I think there are two solutions: Either we have to ensure that each
child is deallocated first so we can preserve the kobject_del() and
then t->release() ordering, or we have to add some sort of "locking"
mechanism to prevent the kobject from being accessed by anything other
than the deallocation code until it is fully deallocated; well, it
would have to prevent any access at all :-). I think it goes without
saying that this "locking" idea is pretty flawed.

The problem with just having children take a reference is that the
kobject children already take a reference to their parent, so it seems
like the kobject should be smart enough to deallocate children rather
than having swnode have to keep a separate tally of children, no?

Sorry if this all seems obvious, I am not an expert on this part of the kernel.

> My example above proofs that there is the problem, but it changes the
> order of execution which I think can always have other consequences.
>
> > Also, should I send this fix as a separate patch? Or do people want me
> > to send an updated revision of my revert patch with the fix?
>
> This needs to be send in its own separate patch. Ideally it could be
> send together with the revert in the same series, but I'm not sure
> that's possible anymore. Didn't Greg pick the revert already?

Sounds good.

I did already let Greg know when he emailed us on backporting the
patch to stable, and he acked saying he removed them. So as long as
these are not in the queue for 5.6 (it is not in Linus' tree yet), we
should be good.

Cheers

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140
@ 2020-03-10 20:46                   ` Brendan Higgins
  0 siblings, 0 replies; 19+ messages in thread
From: Brendan Higgins @ 2020-03-10 20:46 UTC (permalink / raw)
  To: Heikki Krogerus
  Cc: Sakari Ailus, Andy Shevchenko, hdegoede, rafael.j.wysocki,
	Naresh Kamboju, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Steven Rostedt, Sergey Senozhatsky, Andy Shevchenko, Shuah Khan,
	Anders Roxell, lkft-triage, Rasmus Villemoes

On Tue, Mar 10, 2020 at 4:18 AM Heikki Krogerus
<heikki.krogerus@linux.intel.com> wrote:
>
> On Mon, Mar 09, 2020 at 02:43:13PM -0700, Brendan Higgins wrote:
> > On Mon, Mar 9, 2020 at 1:35 PM Brendan Higgins
> > <brendanhiggins@google.com> wrote:
> > >
> > > On Fri, Mar 6, 2020 at 4:05 AM Heikki Krogerus
> > > <heikki.krogerus@linux.intel.com> wrote:
> > > >
> > > > On Fri, Mar 06, 2020 at 12:33:50AM +0200, Sakari Ailus wrote:
> > > > > Hi Brendan,
> > > > >
> > > > > On Thu, Mar 05, 2020 at 11:51:20AM -0800, Brendan Higgins wrote:
> > > > > > On Thu, Mar 5, 2020 at 11:40 AM Brendan Higgins
> > > > > > <brendanhiggins@google.com> wrote:
> > > > > > >
> > > > > > > On Thu, Mar 5, 2020 at 11:18 AM Andy Shevchenko
> > > > > > > <andy.shevchenko@gmail.com> wrote:
> > > > > > > >
> > > > > > > > +Cc: Sakari
> > > > > > > >
> > > > > > > > On Thu, Mar 5, 2020 at 6:00 PM Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > > > > > > > >
> > > > > > > > > Regression reported on Linux next 5.6.0-rc4-next-20200305 on x86_64,
> > > > > > > > > i386, arm and arm64. The steps to reproduce is running kselftests lib
> > > > > > > > > printf.sh test case.
> > > > > > > > > Which is doing modprobe operations.
> > > > > > > > >
> > > > > > > > > BTW, there are few RCU warnings from the boot log.
> > > > > > > > > Please refer below link for more details.
> > > > > > > > >
> > > > > > > > > Steps reproduce by using kselftests,
> > > > > > > > >
> > > > > > > > >           - lsmod || true
> > > > > > > > >           - cd /opt/kselftests/default-in-kernel/lib/
> > > > > > > > >           - export PATH=/opt/kselftests/default-in-kernel/kselftest:$PATH
> > > > > > > > >           - ./printf.sh || true
> > > > > > > > >           - ./bitmap.sh || true
> > > > > > > > >           - ./prime_numbers.sh || true
> > > > > > > > >           - ./strscpy.sh || true
> > > > > > > > >
> > > > > > > > > x86_64 kernel BUG dump.
> > > > > > > > > + ./printf.sh
> > > > > > >
> > > > > > > Oops, I am wondering if I broke this with my change "Revert "software
> > > > > > > node: Simplify software_node_release() function"":
> > > > > > >
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=d1c19322388d6935b534b494a2c223dd089e30dd
> > > > > > >
> > > > > > > I am still investigating, will update later.
> > > > > >
> > > > > > Okay, yeah, I am pretty sure I caused the breakage. I got an email
> > > > > > from kernel test robot a couple days ago that I didn't see:
> > > > > >
> > > > > > https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/N3ZN5XH7HK24JVEJ5WSQD2SK6YCDRILR/
> > > > > >
> > > > > > It shows the same breakage after applying this change.
> > > > > >
> > > > > > I am still investigating how my change broke it, nevertheless.
> > > > >
> > > > > As nodes in the tree are being removed, the code before the patch that
> > > > > "simplified" the software_node_release() function accessed the node's parent
> > > > > in its release function.
> > > > >
> > > > > And if CONFIG_DEBUG_KOBJECT_RELEASE is defined, the release functions are no
> > > > > longer necessarily called in order, leading to referencing released memory.
> > > > > Oops!
> > > > >
> > > > > So Heikki's patch actually fixed a bug. :-)
> > > >
> > > > Well, I think it just hid the problem. It looks like the core
> > > > (lib/kobject.c) allows the parent kobject to be released before the
> > > > last child kobject is released. To be honest, that does not sound
> > > > right to me...
> > > >
> > > > I think we can workaround this problem by taking reference to the
> > > > parent when the child is added, and then releasing it when the child
> > > > is released, and in that way be guaranteed that the parent will not
> > > > disappear before the child is fully released, but that still does not
> > > > feel right. It feels more like the core is not doing it's job to me.
> > > > The parent just should not be released before its children.
> > > >
> > > > Either I'm wrong about that, and we still should take the reference on
> > > > the parent, or we revert my patch like Brendan proposed and then fix
> > >
> > > Either way, isn't it wrong to release the node ID before deleting the
> > > sysfs entry? I am not sure that my fix was the correct one, but I
> > > believe the bug that Heidi and I found is actually a bug.
>
> I agree.
>
> > > > the core with something like this (warning, I did not even try to
> > > > compile that):
> > >
> > > I will try it out.
> > >
> > > > diff --git a/lib/kobject.c b/lib/kobject.c
> > > > index 83198cb37d8d..ec5774992337 100644
> > > > --- a/lib/kobject.c
> > > > +++ b/lib/kobject.c
> > > > @@ -680,6 +680,12 @@ static void kobject_cleanup(struct kobject *kobj)
> > > >                 kobject_uevent(kobj, KOBJ_REMOVE);
> > > >         }
> > > >
> > > > +       if (t && t->release) {
> > > > +               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > > +                        kobject_name(kobj), kobj);
> > > > +               t->release(kobj);
> > > > +       }
> > > > +
> > > >         /* remove from sysfs if the caller did not do it */
> > > >         if (kobj->state_in_sysfs) {
> > > >                 pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> > > > @@ -687,12 +693,6 @@ static void kobject_cleanup(struct kobject *kobj)
> > > >                 kobject_del(kobj);
> > > >         }
> > > >
> > > > -       if (t && t->release) {
> > > > -               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > > -                        kobject_name(kobj), kobj);
> > > > -               t->release(kobj);
> > > > -       }
> > > > -
> > > >         /* free name if we allocated it */
> > > >         if (name) {
> > > >                 pr_debug("kobject: '%s': free name\n", name);
> >
> > Alright, so I tried it and it looks like Heikki's suggestion worked.
> >
> > Is everyone comfortable going this route?
>
> Hold on. Another way to fix the problem is to increment the parent's
> reference count before that kobject_del(kobj) is called, and then
> decrementing it after t->release(kobj) is called. It may be safer to
> fix the problem like that.

Right, this was your first suggestion above, right? That actually made
more sense to me, but you seemed skeptical of it due to it being
messier, which makes sense.

Nevertheless, having children take a reference seems like the right
thing to do because the children need to degregister themselves from
the parent. Calling t->release() ahead of kobject_del() seems to
reintroduce the problem that I pointed out, albeit *much* more
briefly. If I understand correctly, it is always wrong to have a sysfs
entry that points to a partially deallocated kobject. Please correct
me if I am wrong.

So I think there are two solutions: Either we have to ensure that each
child is deallocated first so we can preserve the kobject_del() and
then t->release() ordering, or we have to add some sort of "locking"
mechanism to prevent the kobject from being accessed by anything other
than the deallocation code until it is fully deallocated; well, it
would have to prevent any access at all :-). I think it goes without
saying that this "locking" idea is pretty flawed.

The problem with just having children take a reference is that the
kobject children already take a reference to their parent, so it seems
like the kobject should be smart enough to deallocate children rather
than having swnode have to keep a separate tally of children, no?

Sorry if this all seems obvious, I am not an expert on this part of the kernel.

> My example above proofs that there is the problem, but it changes the
> order of execution which I think can always have other consequences.
>
> > Also, should I send this fix as a separate patch? Or do people want me
> > to send an updated revision of my revert patch with the fix?
>
> This needs to be send in its own separate patch. Ideally it could be
> send together with the revert in the same series, but I'm not sure
> that's possible anymore. Didn't Greg pick the revert already?

Sounds good.

I did already let Greg know when he emailed us on backporting the
patch to stable, and he acked saying he removed them. So as long as
these are not in the queue for 5.6 (it is not in Linus' tree yet), we
should be good.

Cheers


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140
  2020-03-10 20:46                   ` Brendan Higgins
  (?)
@ 2020-04-07  9:25                   ` Naresh Kamboju
  2020-04-07 20:56                     ` Brendan Higgins
  -1 siblings, 1 reply; 19+ messages in thread
From: Naresh Kamboju @ 2020-04-07  9:25 UTC (permalink / raw)
  To: Brendan Higgins
  Cc: Heikki Krogerus, Sakari Ailus, Andy Shevchenko, Hans de Goede,
	rafael.j.wysocki, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Steven Rostedt, Sergey Senozhatsky, Andy Shevchenko, Shuah Khan,
	Anders Roxell, lkft-triage, Rasmus Villemoes

On Wed, 11 Mar 2020 at 02:16, Brendan Higgins <brendanhiggins@google.com> wrote:
> > > > > > > > > > Steps reproduce by using kselftests,
> > > > > > > > > >
> > > > > > > > > >           - lsmod || true
> > > > > > > > > >           - cd /opt/kselftests/default-in-kernel/lib/
> > > > > > > > > >           - export PATH=/opt/kselftests/default-in-kernel/kselftest:$PATH
> > > > > > > > > >           - ./printf.sh || true
> > > > > > > > > >           - ./bitmap.sh || true
> > > > > > > > > >           - ./prime_numbers.sh || true
> > > > > > > > > >           - ./strscpy.sh || true
> > > > > > > > > >
> > > > > > > > > > x86_64 kernel BUG dump.
> > > > > > > > > > + ./printf.sh
> > > > > > > >
> > > > > > > > Oops, I am wondering if I broke this with my change "Revert "software
> > > > > > > > node: Simplify software_node_release() function"":
> > > > > > > >
> > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=d1c19322388d6935b534b494a2c223dd089e30dd
> > > > > > > >
> > > > > > > > I am still investigating, will update later.
> > > > > > >
> > > > > > > Okay, yeah, I am pretty sure I caused the breakage. I got an email
> > > > > > > from kernel test robot a couple days ago that I didn't see:
> > > > > > >
> > > > > > > https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/N3ZN5XH7HK24JVEJ5WSQD2SK6YCDRILR/
> > > > > > >
> > > > > > > It shows the same breakage after applying this change.
> > > > > > >
> > > > > > > I am still investigating how my change broke it, nevertheless.
> > > > > >
> > > > > > As nodes in the tree are being removed, the code before the patch that
> > > > > > "simplified" the software_node_release() function accessed the node's parent
> > > > > > in its release function.
> > > > > >
> > > > > > And if CONFIG_DEBUG_KOBJECT_RELEASE is defined, the release functions are no
> > > > > > longer necessarily called in order, leading to referencing released memory.
> > > > > > Oops!
> > > > > >
> > > > > > So Heikki's patch actually fixed a bug. :-)
> > > > >
> > > > > Well, I think it just hid the problem. It looks like the core
> > > > > (lib/kobject.c) allows the parent kobject to be released before the
> > > > > last child kobject is released. To be honest, that does not sound
> > > > > right to me...
> > > > >
> > > > > I think we can workaround this problem by taking reference to the
> > > > > parent when the child is added, and then releasing it when the child
> > > > > is released, and in that way be guaranteed that the parent will not
> > > > > disappear before the child is fully released, but that still does not
> > > > > feel right. It feels more like the core is not doing it's job to me.
> > > > > The parent just should not be released before its children.
> > > > >
> > > > > Either I'm wrong about that, and we still should take the reference on
> > > > > the parent, or we revert my patch like Brendan proposed and then fix
> > > >
> > > > Either way, isn't it wrong to release the node ID before deleting the
> > > > sysfs entry? I am not sure that my fix was the correct one, but I
> > > > believe the bug that Heidi and I found is actually a bug.
> >
> > I agree.
> >
> > > > > the core with something like this (warning, I did not even try to
> > > > > compile that):
> > > >
> > > > I will try it out.
> > > >
> > > > > diff --git a/lib/kobject.c b/lib/kobject.c
> > > > > index 83198cb37d8d..ec5774992337 100644
> > > > > --- a/lib/kobject.c
> > > > > +++ b/lib/kobject.c
> > > > > @@ -680,6 +680,12 @@ static void kobject_cleanup(struct kobject *kobj)
> > > > >                 kobject_uevent(kobj, KOBJ_REMOVE);
> > > > >         }
> > > > >
> > > > > +       if (t && t->release) {
> > > > > +               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > > > +                        kobject_name(kobj), kobj);
> > > > > +               t->release(kobj);
> > > > > +       }
> > > > > +
> > > > >         /* remove from sysfs if the caller did not do it */
> > > > >         if (kobj->state_in_sysfs) {
> > > > >                 pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> > > > > @@ -687,12 +693,6 @@ static void kobject_cleanup(struct kobject *kobj)
> > > > >                 kobject_del(kobj);
> > > > >         }
> > > > >
> > > > > -       if (t && t->release) {
> > > > > -               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > > > -                        kobject_name(kobj), kobj);
> > > > > -               t->release(kobj);
> > > > > -       }
> > > > > -
> > > > >         /* free name if we allocated it */
> > > > >         if (name) {
> > > > >                 pr_debug("kobject: '%s': free name\n", name);
> > >
> > > Alright, so I tried it and it looks like Heikki's suggestion worked.
> > >
> > > Is everyone comfortable going this route?
> >
> > Hold on. Another way to fix the problem is to increment the parent's
> > reference count before that kobject_del(kobj) is called, and then
> > decrementing it after t->release(kobj) is called. It may be safer to
> > fix the problem like that.
>
> Right, this was your first suggestion above, right? That actually made
> more sense to me, but you seemed skeptical of it due to it being
> messier, which makes sense.
>
> Nevertheless, having children take a reference seems like the right
> thing to do because the children need to degregister themselves from
> the parent. Calling t->release() ahead of kobject_del() seems to
> reintroduce the problem that I pointed out, albeit *much* more
> briefly. If I understand correctly, it is always wrong to have a sysfs
> entry that points to a partially deallocated kobject. Please correct
> me if I am wrong.
>
> So I think there are two solutions: Either we have to ensure that each
> child is deallocated first so we can preserve the kobject_del() and
> then t->release() ordering, or we have to add some sort of "locking"
> mechanism to prevent the kobject from being accessed by anything other
> than the deallocation code until it is fully deallocated; well, it
> would have to prevent any access at all :-). I think it goes without
> saying that this "locking" idea is pretty flawed.
>
> The problem with just having children take a reference is that the
> kobject children already take a reference to their parent, so it seems
> like the kobject should be smart enough to deallocate children rather
> than having swnode have to keep a separate tally of children, no?
>
> Sorry if this all seems obvious, I am not an expert on this part of the kernel.
>
> > My example above proofs that there is the problem, but it changes the
> > order of execution which I think can always have other consequences.
> >
> > > Also, should I send this fix as a separate patch? Or do people want me
> > > to send an updated revision of my revert patch with the fix?
> >
> > This needs to be send in its own separate patch. Ideally it could be
> > send together with the revert in the same series, but I'm not sure
> > that's possible anymore. Didn't Greg pick the revert already?
>
> Sounds good.
>
> I did already let Greg know when he emailed us on backporting the
> patch to stable, and he acked saying he removed them. So as long as
> these are not in the queue for 5.6 (it is not in Linus' tree yet), we
> should be good.

The reported bug is still noticed on Linux mainline master branch
The Kernel BUG noticed on x86_64 and i386 running selftest on Linux
mainline kernel 5.6.0.

Linux version 5.6.0 (oe-user@oe-host) (gcc version 7.3.0 (GCC)) #1 SMP
Mon Apr 6 17:25:26 UTC 2020
<>
[  270.580905] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  270.588978] #PF: supervisor read access in kernel mode
[  270.594114] #PF: error_code(0x0000) - not-present page
[  270.599247] PGD 800000026b340067 P4D 800000026b340067 PUD 26b34e067 PMD 0
[  270.606119] Oops: 0000 [#1] SMP PTI
[  270.609604] CPU: 1 PID: 4688 Comm: modprobe Tainted: G        W
    5.6.0 #1
[  270.616996] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
2.2 05/23/2018
[  270.624381] RIP: 0010:ida_free+0x76/0x140
[  270.628391] Code: 45 d0 00 00 00 00 48 89 45 b0 0f 88 c4 00 00 00
89 f3 e8 0d 51 02 00 48 8d 7d a8 49 89 c6 e8 e1 04 01 00 a8 01 49 89
c5 75 42 <4c> 0f a3 20 72 76 48 8b 7d a8 4c 89 f6 e8 18 53 02 00 89 de
48 c7
[  270.647128] RSP: 0018:ffffbd8841d07a50 EFLAGS: 00010046
[  270.652346] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff978be6ac48c0
[  270.659468] RDX: 0000000000000000 RSI: ffff978beb346780 RDI: ffff978be6ac5138
[  270.666595] RBP: ffffbd8841d07aa8 R08: 0000000000000001 R09: 0000000000000000
[  270.673725] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[  270.680850] R13: 0000000000000000 R14: 0000000000000246 R15: ffffffffc0383b3c
[  270.687973] FS:  00007fc8e92c2740(0000) GS:ffff978befa80000(0000)
knlGS:0000000000000000
[  270.696052] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  270.701790] CR2: 0000000000000000 CR3: 000000021f2b0003 CR4: 00000000003606e0
[  270.708920] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  270.716043] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  270.723169] Call Trace:
[  270.725618]  software_node_release+0x2b/0xb0
[  270.729886]  kobject_put+0xad/0x1c0
[  270.733378]  kobject_del+0x4a/0x60
[  270.736776]  kobject_put+0x92/0x1c0
[  270.740270]  fwnode_remove_software_node+0x32/0x40
[  270.745061]  software_node_unregister_nodes+0x2a/0x50
[  270.750114]  test_printf_init+0xf30/0x1c16 [test_printf]
[  270.755422]  ? test_hashed+0x75/0x75 [test_printf]
[  270.760211]  ? test_hashed+0x75/0x75 [test_printf]
[  270.765003]  do_one_initcall+0x61/0x2f0
[  270.768842]  ? rcu_read_lock_sched_held+0x4f/0x80
[  270.773541]  ? kmem_cache_alloc_trace+0x282/0x2b0
[  270.778247]  do_init_module+0x5f/0x22b
[  270.781999]  load_module+0x24e6/0x2ac0
[  270.785754]  ? security_kernel_post_read_file+0x3f/0x60
[  270.790979]  __do_sys_finit_module+0xfc/0x120
[  270.795334]  ? __do_sys_finit_module+0xfc/0x120
[  270.799862]  __x64_sys_finit_module+0x1a/0x20
[  270.804219]  do_syscall_64+0x55/0x200
[  270.807885]  entry_SYSCALL_64_after_hwframe+0x49/0xb3
[  270.812938] RIP: 0033:0x7fc8e8bcaf59
[  270.816516] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00
00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24
08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0f ff 2b 00 f7 d8 64 89
01 48
[  270.835253] RSP: 002b:00007ffe9a4fb6f8 EFLAGS: 00000202 ORIG_RAX:
0000000000000139
[  270.842810] RAX: ffffffffffffffda RBX: 0000000000cdfaf0 RCX: 00007fc8e8bcaf59
[  270.849934] RDX: 0000000000000000 RSI: 0000000000418cce RDI: 0000000000000005
[  270.857057] RBP: 0000000000418cce R08: 0000000000000000 R09: 0000000000000000
[  270.864180] R10: 0000000000000005 R11: 0000000000000202 R12: 0000000000000000
[  270.871306] R13: 0000000000040000 R14: 0000000000000000 R15: 0000000000000000
[  270.878433] Modules linked in: test_printf(+) cls_bpf sch_fq 8021q
sch_ingress veth algif_hash x86_pkg_temp_thermal fuse [last unloaded:
gpio_mockup]
[  270.891812] CR2: 0000000000000000
[  270.895130] ---[ end trace 151cfe414205f0bc ]---
[  270.899742] RIP: 0010:ida_free+0x76/0x140
[  270.903752] Code: 45 d0 00 00 00 00 48 89 45 b0 0f 88 c4 00 00 00
89 f3 e8 0d 51 02 00 48 8d 7d a8 49 89 c6 e8 e1 04 01 00 a8 01 49 89
c5 75 42 <4c> 0f a3 20 72 76 48 8b 7d a8 4c 89 f6 e8 18 53 02 00 89 de
48 c7
[  270.922489] RSP: 0018:ffffbd8841d07a50 EFLAGS: 00010046
[  270.927706] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff978be6ac48c0
[  270.934831] RDX: 0000000000000000 RSI: ffff978beb346780 RDI: ffff978be6ac5138
[  270.941954] RBP: ffffbd8841d07aa8 R08: 0000000000000001 R09: 0000000000000000
[  270.949078] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[  270.956204] R13: 0000000000000000 R14: 0000000000000246 R15: ffffffffc0383b3c
[  270.963328] FS:  00007fc8e92c2740(0000) GS:ffff978befa80000(0000)
knlGS:0000000000000000
[  270.971414] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  270.977158] CR2: 0000000000000000 CR3: 000000021f2b0003 CR4: 00000000003606e0
[  270.984281] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  270.991407] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  270.998540] BUG: sleeping function called from invalid context at
/usr/src/kernel/include/linux/percpu-rwsem.h:49
[  271.008782] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid:
4688, name: modprobe
[  271.016771] INFO: lockdep is turned off.
[  271.020691] irq event stamp: 4812
[  271.024011] hardirqs last  enabled at (4811): [<ffffffffb5be42f1>]
kfree+0x181/0x2e0
[  271.031748] hardirqs last disabled at (4812): [<ffffffffb67d2b68>]
_raw_spin_lock_irqsave+0x18/0x50
[  271.040778] softirqs last  enabled at (4496): [<ffffffffb6a00348>]
__do_softirq+0x348/0x459
[  271.049117] softirqs last disabled at (4487): [<ffffffffb59be298>]
irq_exit+0xb8/0xc0
[  271.056933] CPU: 1 PID: 4688 Comm: modprobe Tainted: G      D W
    5.6.0 #1
[  271.064317] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
2.2 05/23/2018
[  271.071702] Call Trace:
[  271.074147]  dump_stack+0x7a/0xa5
[  271.077465]  ___might_sleep+0x163/0x250
[  271.081296]  __might_sleep+0x4a/0x80
[  271.084868]  exit_signals+0x33/0x2f0
[  271.088446]  do_exit+0xb6/0xcd0
[  271.091586]  rewind_stack_do_exit+0x17/0x20
[  271.095770] RIP: 0033:0x7fc8e8bcaf59
[  271.099349] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00
00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24
08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0f ff 2b 00 f7 d8 64 89
01 48
[  271.118093] RSP: 002b:00007ffe9a4fb6f8 EFLAGS: 00000202 ORIG_RAX:
0000000000000139
[  271.125649] RAX: ffffffffffffffda RBX: 0000000000cdfaf0 RCX: 00007fc8e8bcaf59
[  271.132775] RDX: 0000000000000000 RSI: 0000000000418cce RDI: 0000000000000005
[  271.139897] RBP: 0000000000418cce R08: 0000000000000000 R09: 0000000000000000
[  271.147022] R10: 0000000000000005 R11: 0000000000000202 R12: 0000000000000000
[  271.154146] R13: 0000000000040000 R14: 0000000000000000 R15: 0000000000000000
[  271.161274] note: modprobe[4688] exited with preempt_count 1

Full test log:
https://lkft.validation.linaro.org/scheduler/job/1341945#L8386
https://lkft.validation.linaro.org/scheduler/job/1341969#L8023


- Naresh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140
  2020-04-07  9:25                   ` Naresh Kamboju
@ 2020-04-07 20:56                     ` Brendan Higgins
  2020-04-13 19:10                       ` Naresh Kamboju
  2020-04-14  8:15                       ` Heikki Krogerus
  0 siblings, 2 replies; 19+ messages in thread
From: Brendan Higgins @ 2020-04-07 20:56 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: Heikki Krogerus, Sakari Ailus, Andy Shevchenko, Hans de Goede,
	rafael.j.wysocki, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Steven Rostedt, Sergey Senozhatsky, Andy Shevchenko, Shuah Khan,
	Anders Roxell, lkft-triage, Rasmus Villemoes

On Tue, Apr 7, 2020 at 2:25 AM Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
>
> On Wed, 11 Mar 2020 at 02:16, Brendan Higgins <brendanhiggins@google.com> wrote:
> > > > > > > > > > > Steps reproduce by using kselftests,
> > > > > > > > > > >
> > > > > > > > > > >           - lsmod || true
> > > > > > > > > > >           - cd /opt/kselftests/default-in-kernel/lib/
> > > > > > > > > > >           - export PATH=/opt/kselftests/default-in-kernel/kselftest:$PATH
> > > > > > > > > > >           - ./printf.sh || true
> > > > > > > > > > >           - ./bitmap.sh || true
> > > > > > > > > > >           - ./prime_numbers.sh || true
> > > > > > > > > > >           - ./strscpy.sh || true
> > > > > > > > > > >
> > > > > > > > > > > x86_64 kernel BUG dump.
> > > > > > > > > > > + ./printf.sh
> > > > > > > > >
> > > > > > > > > Oops, I am wondering if I broke this with my change "Revert "software
> > > > > > > > > node: Simplify software_node_release() function"":
> > > > > > > > >
> > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=d1c19322388d6935b534b494a2c223dd089e30dd
> > > > > > > > >
> > > > > > > > > I am still investigating, will update later.
> > > > > > > >
> > > > > > > > Okay, yeah, I am pretty sure I caused the breakage. I got an email
> > > > > > > > from kernel test robot a couple days ago that I didn't see:
> > > > > > > >
> > > > > > > > https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/N3ZN5XH7HK24JVEJ5WSQD2SK6YCDRILR/
> > > > > > > >
> > > > > > > > It shows the same breakage after applying this change.
> > > > > > > >
> > > > > > > > I am still investigating how my change broke it, nevertheless.
> > > > > > >
> > > > > > > As nodes in the tree are being removed, the code before the patch that
> > > > > > > "simplified" the software_node_release() function accessed the node's parent
> > > > > > > in its release function.
> > > > > > >
> > > > > > > And if CONFIG_DEBUG_KOBJECT_RELEASE is defined, the release functions are no
> > > > > > > longer necessarily called in order, leading to referencing released memory.
> > > > > > > Oops!
> > > > > > >
> > > > > > > So Heikki's patch actually fixed a bug. :-)
> > > > > >
> > > > > > Well, I think it just hid the problem. It looks like the core
> > > > > > (lib/kobject.c) allows the parent kobject to be released before the
> > > > > > last child kobject is released. To be honest, that does not sound
> > > > > > right to me...
> > > > > >
> > > > > > I think we can workaround this problem by taking reference to the
> > > > > > parent when the child is added, and then releasing it when the child
> > > > > > is released, and in that way be guaranteed that the parent will not
> > > > > > disappear before the child is fully released, but that still does not
> > > > > > feel right. It feels more like the core is not doing it's job to me.
> > > > > > The parent just should not be released before its children.
> > > > > >
> > > > > > Either I'm wrong about that, and we still should take the reference on
> > > > > > the parent, or we revert my patch like Brendan proposed and then fix
> > > > >
> > > > > Either way, isn't it wrong to release the node ID before deleting the
> > > > > sysfs entry? I am not sure that my fix was the correct one, but I
> > > > > believe the bug that Heidi and I found is actually a bug.
> > >
> > > I agree.
> > >
> > > > > > the core with something like this (warning, I did not even try to
> > > > > > compile that):
> > > > >
> > > > > I will try it out.
> > > > >
> > > > > > diff --git a/lib/kobject.c b/lib/kobject.c
> > > > > > index 83198cb37d8d..ec5774992337 100644
> > > > > > --- a/lib/kobject.c
> > > > > > +++ b/lib/kobject.c
> > > > > > @@ -680,6 +680,12 @@ static void kobject_cleanup(struct kobject *kobj)
> > > > > >                 kobject_uevent(kobj, KOBJ_REMOVE);
> > > > > >         }
> > > > > >
> > > > > > +       if (t && t->release) {
> > > > > > +               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > > > > +                        kobject_name(kobj), kobj);
> > > > > > +               t->release(kobj);
> > > > > > +       }
> > > > > > +
> > > > > >         /* remove from sysfs if the caller did not do it */
> > > > > >         if (kobj->state_in_sysfs) {
> > > > > >                 pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> > > > > > @@ -687,12 +693,6 @@ static void kobject_cleanup(struct kobject *kobj)
> > > > > >                 kobject_del(kobj);
> > > > > >         }
> > > > > >
> > > > > > -       if (t && t->release) {
> > > > > > -               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > > > > -                        kobject_name(kobj), kobj);
> > > > > > -               t->release(kobj);
> > > > > > -       }
> > > > > > -
> > > > > >         /* free name if we allocated it */
> > > > > >         if (name) {
> > > > > >                 pr_debug("kobject: '%s': free name\n", name);
> > > >
> > > > Alright, so I tried it and it looks like Heikki's suggestion worked.
> > > >
> > > > Is everyone comfortable going this route?
> > >
> > > Hold on. Another way to fix the problem is to increment the parent's
> > > reference count before that kobject_del(kobj) is called, and then
> > > decrementing it after t->release(kobj) is called. It may be safer to
> > > fix the problem like that.
> >
> > Right, this was your first suggestion above, right? That actually made
> > more sense to me, but you seemed skeptical of it due to it being
> > messier, which makes sense.
> >
> > Nevertheless, having children take a reference seems like the right
> > thing to do because the children need to degregister themselves from
> > the parent. Calling t->release() ahead of kobject_del() seems to
> > reintroduce the problem that I pointed out, albeit *much* more
> > briefly. If I understand correctly, it is always wrong to have a sysfs
> > entry that points to a partially deallocated kobject. Please correct
> > me if I am wrong.
> >
> > So I think there are two solutions: Either we have to ensure that each
> > child is deallocated first so we can preserve the kobject_del() and
> > then t->release() ordering, or we have to add some sort of "locking"
> > mechanism to prevent the kobject from being accessed by anything other
> > than the deallocation code until it is fully deallocated; well, it
> > would have to prevent any access at all :-). I think it goes without
> > saying that this "locking" idea is pretty flawed.
> >
> > The problem with just having children take a reference is that the
> > kobject children already take a reference to their parent, so it seems
> > like the kobject should be smart enough to deallocate children rather
> > than having swnode have to keep a separate tally of children, no?
> >
> > Sorry if this all seems obvious, I am not an expert on this part of the kernel.
> >
> > > My example above proofs that there is the problem, but it changes the
> > > order of execution which I think can always have other consequences.
> > >
> > > > Also, should I send this fix as a separate patch? Or do people want me
> > > > to send an updated revision of my revert patch with the fix?
> > >
> > > This needs to be send in its own separate patch. Ideally it could be
> > > send together with the revert in the same series, but I'm not sure
> > > that's possible anymore. Didn't Greg pick the revert already?
> >
> > Sounds good.
> >
> > I did already let Greg know when he emailed us on backporting the
> > patch to stable, and he acked saying he removed them. So as long as
> > these are not in the queue for 5.6 (it is not in Linus' tree yet), we
> > should be good.
>
> The reported bug is still noticed on Linux mainline master branch
> The Kernel BUG noticed on x86_64 and i386 running selftest on Linux
> mainline kernel 5.6.0.

Oh sorry, I thought that this patch was dropped from the maintainer's
for-next branch.

Heikki, what do you think about my suggestion of having kobject
deallocate its children?

In the meantime, are people cool with the patch that Heikki proposed
as a temporary mitigation? I think my solution might be a bit more
involved. If I don't hear anything back, I will send out Heikki's
suggestion as a patch.

> Linux version 5.6.0 (oe-user@oe-host) (gcc version 7.3.0 (GCC)) #1 SMP
> Mon Apr 6 17:25:26 UTC 2020
> <>
> [  270.580905] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [  270.588978] #PF: supervisor read access in kernel mode
> [  270.594114] #PF: error_code(0x0000) - not-present page
> [  270.599247] PGD 800000026b340067 P4D 800000026b340067 PUD 26b34e067 PMD 0
> [  270.606119] Oops: 0000 [#1] SMP PTI
> [  270.609604] CPU: 1 PID: 4688 Comm: modprobe Tainted: G        W
>     5.6.0 #1
> [  270.616996] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> 2.2 05/23/2018
> [  270.624381] RIP: 0010:ida_free+0x76/0x140
> [  270.628391] Code: 45 d0 00 00 00 00 48 89 45 b0 0f 88 c4 00 00 00
> 89 f3 e8 0d 51 02 00 48 8d 7d a8 49 89 c6 e8 e1 04 01 00 a8 01 49 89
> c5 75 42 <4c> 0f a3 20 72 76 48 8b 7d a8 4c 89 f6 e8 18 53 02 00 89 de
> 48 c7
> [  270.647128] RSP: 0018:ffffbd8841d07a50 EFLAGS: 00010046
> [  270.652346] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff978be6ac48c0
> [  270.659468] RDX: 0000000000000000 RSI: ffff978beb346780 RDI: ffff978be6ac5138
> [  270.666595] RBP: ffffbd8841d07aa8 R08: 0000000000000001 R09: 0000000000000000
> [  270.673725] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
> [  270.680850] R13: 0000000000000000 R14: 0000000000000246 R15: ffffffffc0383b3c
> [  270.687973] FS:  00007fc8e92c2740(0000) GS:ffff978befa80000(0000)
> knlGS:0000000000000000
> [  270.696052] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  270.701790] CR2: 0000000000000000 CR3: 000000021f2b0003 CR4: 00000000003606e0
> [  270.708920] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  270.716043] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  270.723169] Call Trace:
> [  270.725618]  software_node_release+0x2b/0xb0
> [  270.729886]  kobject_put+0xad/0x1c0
> [  270.733378]  kobject_del+0x4a/0x60
> [  270.736776]  kobject_put+0x92/0x1c0
> [  270.740270]  fwnode_remove_software_node+0x32/0x40
> [  270.745061]  software_node_unregister_nodes+0x2a/0x50
> [  270.750114]  test_printf_init+0xf30/0x1c16 [test_printf]
> [  270.755422]  ? test_hashed+0x75/0x75 [test_printf]
> [  270.760211]  ? test_hashed+0x75/0x75 [test_printf]
> [  270.765003]  do_one_initcall+0x61/0x2f0
> [  270.768842]  ? rcu_read_lock_sched_held+0x4f/0x80
> [  270.773541]  ? kmem_cache_alloc_trace+0x282/0x2b0
> [  270.778247]  do_init_module+0x5f/0x22b
> [  270.781999]  load_module+0x24e6/0x2ac0
> [  270.785754]  ? security_kernel_post_read_file+0x3f/0x60
> [  270.790979]  __do_sys_finit_module+0xfc/0x120
> [  270.795334]  ? __do_sys_finit_module+0xfc/0x120
> [  270.799862]  __x64_sys_finit_module+0x1a/0x20
> [  270.804219]  do_syscall_64+0x55/0x200
> [  270.807885]  entry_SYSCALL_64_after_hwframe+0x49/0xb3
> [  270.812938] RIP: 0033:0x7fc8e8bcaf59
> [  270.816516] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00
> 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24
> 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0f ff 2b 00 f7 d8 64 89
> 01 48
> [  270.835253] RSP: 002b:00007ffe9a4fb6f8 EFLAGS: 00000202 ORIG_RAX:
> 0000000000000139
> [  270.842810] RAX: ffffffffffffffda RBX: 0000000000cdfaf0 RCX: 00007fc8e8bcaf59
> [  270.849934] RDX: 0000000000000000 RSI: 0000000000418cce RDI: 0000000000000005
> [  270.857057] RBP: 0000000000418cce R08: 0000000000000000 R09: 0000000000000000
> [  270.864180] R10: 0000000000000005 R11: 0000000000000202 R12: 0000000000000000
> [  270.871306] R13: 0000000000040000 R14: 0000000000000000 R15: 0000000000000000
> [  270.878433] Modules linked in: test_printf(+) cls_bpf sch_fq 8021q
> sch_ingress veth algif_hash x86_pkg_temp_thermal fuse [last unloaded:
> gpio_mockup]
> [  270.891812] CR2: 0000000000000000
> [  270.895130] ---[ end trace 151cfe414205f0bc ]---
> [  270.899742] RIP: 0010:ida_free+0x76/0x140
> [  270.903752] Code: 45 d0 00 00 00 00 48 89 45 b0 0f 88 c4 00 00 00
> 89 f3 e8 0d 51 02 00 48 8d 7d a8 49 89 c6 e8 e1 04 01 00 a8 01 49 89
> c5 75 42 <4c> 0f a3 20 72 76 48 8b 7d a8 4c 89 f6 e8 18 53 02 00 89 de
> 48 c7
> [  270.922489] RSP: 0018:ffffbd8841d07a50 EFLAGS: 00010046
> [  270.927706] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff978be6ac48c0
> [  270.934831] RDX: 0000000000000000 RSI: ffff978beb346780 RDI: ffff978be6ac5138
> [  270.941954] RBP: ffffbd8841d07aa8 R08: 0000000000000001 R09: 0000000000000000
> [  270.949078] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
> [  270.956204] R13: 0000000000000000 R14: 0000000000000246 R15: ffffffffc0383b3c
> [  270.963328] FS:  00007fc8e92c2740(0000) GS:ffff978befa80000(0000)
> knlGS:0000000000000000
> [  270.971414] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  270.977158] CR2: 0000000000000000 CR3: 000000021f2b0003 CR4: 00000000003606e0
> [  270.984281] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  270.991407] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  270.998540] BUG: sleeping function called from invalid context at
> /usr/src/kernel/include/linux/percpu-rwsem.h:49
> [  271.008782] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid:
> 4688, name: modprobe
> [  271.016771] INFO: lockdep is turned off.
> [  271.020691] irq event stamp: 4812
> [  271.024011] hardirqs last  enabled at (4811): [<ffffffffb5be42f1>]
> kfree+0x181/0x2e0
> [  271.031748] hardirqs last disabled at (4812): [<ffffffffb67d2b68>]
> _raw_spin_lock_irqsave+0x18/0x50
> [  271.040778] softirqs last  enabled at (4496): [<ffffffffb6a00348>]
> __do_softirq+0x348/0x459
> [  271.049117] softirqs last disabled at (4487): [<ffffffffb59be298>]
> irq_exit+0xb8/0xc0
> [  271.056933] CPU: 1 PID: 4688 Comm: modprobe Tainted: G      D W
>     5.6.0 #1
> [  271.064317] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> 2.2 05/23/2018
> [  271.071702] Call Trace:
> [  271.074147]  dump_stack+0x7a/0xa5
> [  271.077465]  ___might_sleep+0x163/0x250
> [  271.081296]  __might_sleep+0x4a/0x80
> [  271.084868]  exit_signals+0x33/0x2f0
> [  271.088446]  do_exit+0xb6/0xcd0
> [  271.091586]  rewind_stack_do_exit+0x17/0x20
> [  271.095770] RIP: 0033:0x7fc8e8bcaf59
> [  271.099349] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00
> 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24
> 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0f ff 2b 00 f7 d8 64 89
> 01 48
> [  271.118093] RSP: 002b:00007ffe9a4fb6f8 EFLAGS: 00000202 ORIG_RAX:
> 0000000000000139
> [  271.125649] RAX: ffffffffffffffda RBX: 0000000000cdfaf0 RCX: 00007fc8e8bcaf59
> [  271.132775] RDX: 0000000000000000 RSI: 0000000000418cce RDI: 0000000000000005
> [  271.139897] RBP: 0000000000418cce R08: 0000000000000000 R09: 0000000000000000
> [  271.147022] R10: 0000000000000005 R11: 0000000000000202 R12: 0000000000000000
> [  271.154146] R13: 0000000000040000 R14: 0000000000000000 R15: 0000000000000000
> [  271.161274] note: modprobe[4688] exited with preempt_count 1
>
> Full test log:
> https://lkft.validation.linaro.org/scheduler/job/1341945#L8386
> https://lkft.validation.linaro.org/scheduler/job/1341969#L8023
>
>
> - Naresh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140
  2020-04-07 20:56                     ` Brendan Higgins
@ 2020-04-13 19:10                       ` Naresh Kamboju
  2020-04-14  8:15                       ` Heikki Krogerus
  1 sibling, 0 replies; 19+ messages in thread
From: Naresh Kamboju @ 2020-04-13 19:10 UTC (permalink / raw)
  To: Brendan Higgins
  Cc: Heikki Krogerus, Sakari Ailus, Andy Shevchenko, Hans de Goede,
	rafael.j.wysocki, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Steven Rostedt, Sergey Senozhatsky, Andy Shevchenko, Shuah Khan,
	Anders Roxell, lkft-triage, Rasmus Villemoes

> > > I did already let Greg know when he emailed us on backporting the
> > > patch to stable, and he acked saying he removed them. So as long as
> > > these are not in the queue for 5.6 (it is not in Linus' tree yet), we
> > > should be good.
> >
> > The reported bug is still noticed on Linux mainline master branch
> > The Kernel BUG noticed on x86_64 and i386 running selftest on Linux
> > mainline kernel 5.6.0.
>
> Oh sorry, I thought that this patch was dropped from the maintainer's
> for-next branch.
>
> Heikki, what do you think about my suggestion of having kobject
> deallocate its children?
>
> In the meantime, are people cool with the patch that Heikki proposed
> as a temporary mitigation? I think my solution might be a bit more
> involved. If I don't hear anything back, I will send out Heikki's
> suggestion as a patch.

This reported problem still noticed on 5.7.0-rc1

Test crash log:
-------------------
[  281.822463] BUG: kernel NULL pointer dereference, address: 00000000
[  281.829543] #PF: supervisor read access in kernel mode
[  281.834680] #PF: error_code(0x0000) - not-present page
[  281.839820] *pde = 00000000
[  281.842706] Oops: 0000 [#1] SMP
[  281.845852] CPU: 1 PID: 3998 Comm: modprobe Tainted: G        W
    5.7.0-rc1 #1
[  281.853590] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
2.0b 07/27/2017
[  281.861063] EIP: ida_free+0x61/0x130
[  281.864641] Code: 00 c7 45 e8 00 00 00 00 c7 45 ec 00 00 00 00 0f
88 c4 00 00 00 89 d3 e8 0d 8e 87 00 89 c7 8d 45 d8 e8 93 1e 01 00 a8
01 75 3f <0f> a3 30 72 72 8b 45 d8 89 fa e8 e0 8f 87 00 53 68 08 ab fd
de e8
[  281.883395] EAX: 00000000 EBX: 00000000 ECX: e422d8c0 EDX: 00000000
[  281.889672] ESI: 00000000 EDI: 00000246 EBP: e5d63cdc ESP: e5d63cb0
[  281.895935] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010046
[  281.902720] CR0: 80050033 CR2: 00000000 CR3: 1db85000 CR4: 003406d0
[  281.908978] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[  281.915243] DR6: fffe0ff0 DR7: 00000400
[  281.919073] Call Trace:
[  281.921521]  software_node_release+0x21/0x80
[  281.925792]  kobject_put+0xa8/0x1c0
[  281.929284]  kobject_del+0x40/0x60
[  281.932680]  kobject_put+0x92/0x1c0
[  281.936165]  fwnode_remove_software_node+0x30/0x50
[  281.940957]  software_node_unregister_nodes+0x2b/0x50
[  281.946002]  test_printf_init+0xe00/0x1d29 [test_printf]
[  281.951320]  ? fs_reclaim_acquire.part.121+0x5/0x30
[  281.956202]  ? test_hashed+0x54/0x54 [test_printf]
[  281.960985]  ? test_hashed+0x54/0x54 [test_printf]
[  281.965770]  do_one_initcall+0x54/0x2e0
[  281.969639]  ? rcu_read_lock_sched_held+0x47/0x70
[  281.974366]  ? kmem_cache_alloc_trace+0x285/0x2b0
[  281.979063]  ? do_init_module+0x21/0x1f7
[  281.982982]  ? do_init_module+0x21/0x1f7
[  281.986899]  do_init_module+0x50/0x1f7
[  281.990659]  load_module+0x1e32/0x2540
[  281.994410]  __ia32_sys_finit_module+0x8f/0xe0
[  281.998854]  do_fast_syscall_32+0x7f/0x330
[  282.002949]  entry_SYSENTER_32+0xaa/0x102
[  282.006952] EIP: 0xb7f07ce1
[  282.009742] Code: 5e 5d c3 8d b6 00 00 00 00 b8 40 42 0f 00 eb c1
8b 04 24 c3 8b 1c 24 c3 8b 34 24 c3 8b 3c 24 c3 90 51 52 55 89 e5 0f
34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90
8d 76
[  282.028480] EAX: ffffffda EBX: 00000005 ECX: 0806233a EDX: 00000000
[  282.034739] ESI: 099fe840 EDI: 099fe570 EBP: 099fe700 ESP: bf97146c
[  282.040994] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000292
[  282.047776] Modules linked in: test_printf(+) cls_bpf sch_fq 8021q
sch_ingress veth algif_hash x86_pkg_temp_thermal fuse [last unloaded:
gpio_mockup]
[  282.061155] CR2: 0000000000000000
[  282.064472] ---[ end trace 09f43fd7981266c9 ]---
[  282.069084] EIP: ida_free+0x61/0x130

https://lkft.validation.linaro.org/scheduler/job/1362555#L7962

- Naresh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140
  2020-04-07 20:56                     ` Brendan Higgins
  2020-04-13 19:10                       ` Naresh Kamboju
@ 2020-04-14  8:15                       ` Heikki Krogerus
  2020-04-14 19:18                         ` Brendan Higgins
  1 sibling, 1 reply; 19+ messages in thread
From: Heikki Krogerus @ 2020-04-14  8:15 UTC (permalink / raw)
  To: Brendan Higgins
  Cc: Naresh Kamboju, Sakari Ailus, Andy Shevchenko, Hans de Goede,
	rafael.j.wysocki, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Steven Rostedt, Sergey Senozhatsky, Andy Shevchenko, Shuah Khan,
	Anders Roxell, lkft-triage, Rasmus Villemoes

Hi,

On Tue, Apr 07, 2020 at 01:56:16PM -0700, Brendan Higgins wrote:
> On Tue, Apr 7, 2020 at 2:25 AM Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> >
> > On Wed, 11 Mar 2020 at 02:16, Brendan Higgins <brendanhiggins@google.com> wrote:
> > > > > > > > > > > > Steps reproduce by using kselftests,
> > > > > > > > > > > >
> > > > > > > > > > > >           - lsmod || true
> > > > > > > > > > > >           - cd /opt/kselftests/default-in-kernel/lib/
> > > > > > > > > > > >           - export PATH=/opt/kselftests/default-in-kernel/kselftest:$PATH
> > > > > > > > > > > >           - ./printf.sh || true
> > > > > > > > > > > >           - ./bitmap.sh || true
> > > > > > > > > > > >           - ./prime_numbers.sh || true
> > > > > > > > > > > >           - ./strscpy.sh || true
> > > > > > > > > > > >
> > > > > > > > > > > > x86_64 kernel BUG dump.
> > > > > > > > > > > > + ./printf.sh
> > > > > > > > > >
> > > > > > > > > > Oops, I am wondering if I broke this with my change "Revert "software
> > > > > > > > > > node: Simplify software_node_release() function"":
> > > > > > > > > >
> > > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=d1c19322388d6935b534b494a2c223dd089e30dd
> > > > > > > > > >
> > > > > > > > > > I am still investigating, will update later.
> > > > > > > > >
> > > > > > > > > Okay, yeah, I am pretty sure I caused the breakage. I got an email
> > > > > > > > > from kernel test robot a couple days ago that I didn't see:
> > > > > > > > >
> > > > > > > > > https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/N3ZN5XH7HK24JVEJ5WSQD2SK6YCDRILR/
> > > > > > > > >
> > > > > > > > > It shows the same breakage after applying this change.
> > > > > > > > >
> > > > > > > > > I am still investigating how my change broke it, nevertheless.
> > > > > > > >
> > > > > > > > As nodes in the tree are being removed, the code before the patch that
> > > > > > > > "simplified" the software_node_release() function accessed the node's parent
> > > > > > > > in its release function.
> > > > > > > >
> > > > > > > > And if CONFIG_DEBUG_KOBJECT_RELEASE is defined, the release functions are no
> > > > > > > > longer necessarily called in order, leading to referencing released memory.
> > > > > > > > Oops!
> > > > > > > >
> > > > > > > > So Heikki's patch actually fixed a bug. :-)
> > > > > > >
> > > > > > > Well, I think it just hid the problem. It looks like the core
> > > > > > > (lib/kobject.c) allows the parent kobject to be released before the
> > > > > > > last child kobject is released. To be honest, that does not sound
> > > > > > > right to me...
> > > > > > >
> > > > > > > I think we can workaround this problem by taking reference to the
> > > > > > > parent when the child is added, and then releasing it when the child
> > > > > > > is released, and in that way be guaranteed that the parent will not
> > > > > > > disappear before the child is fully released, but that still does not
> > > > > > > feel right. It feels more like the core is not doing it's job to me.
> > > > > > > The parent just should not be released before its children.
> > > > > > >
> > > > > > > Either I'm wrong about that, and we still should take the reference on
> > > > > > > the parent, or we revert my patch like Brendan proposed and then fix
> > > > > >
> > > > > > Either way, isn't it wrong to release the node ID before deleting the
> > > > > > sysfs entry? I am not sure that my fix was the correct one, but I
> > > > > > believe the bug that Heidi and I found is actually a bug.
> > > >
> > > > I agree.
> > > >
> > > > > > > the core with something like this (warning, I did not even try to
> > > > > > > compile that):
> > > > > >
> > > > > > I will try it out.
> > > > > >
> > > > > > > diff --git a/lib/kobject.c b/lib/kobject.c
> > > > > > > index 83198cb37d8d..ec5774992337 100644
> > > > > > > --- a/lib/kobject.c
> > > > > > > +++ b/lib/kobject.c
> > > > > > > @@ -680,6 +680,12 @@ static void kobject_cleanup(struct kobject *kobj)
> > > > > > >                 kobject_uevent(kobj, KOBJ_REMOVE);
> > > > > > >         }
> > > > > > >
> > > > > > > +       if (t && t->release) {
> > > > > > > +               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > > > > > +                        kobject_name(kobj), kobj);
> > > > > > > +               t->release(kobj);
> > > > > > > +       }
> > > > > > > +
> > > > > > >         /* remove from sysfs if the caller did not do it */
> > > > > > >         if (kobj->state_in_sysfs) {
> > > > > > >                 pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> > > > > > > @@ -687,12 +693,6 @@ static void kobject_cleanup(struct kobject *kobj)
> > > > > > >                 kobject_del(kobj);
> > > > > > >         }
> > > > > > >
> > > > > > > -       if (t && t->release) {
> > > > > > > -               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > > > > > -                        kobject_name(kobj), kobj);
> > > > > > > -               t->release(kobj);
> > > > > > > -       }
> > > > > > > -
> > > > > > >         /* free name if we allocated it */
> > > > > > >         if (name) {
> > > > > > >                 pr_debug("kobject: '%s': free name\n", name);
> > > > >
> > > > > Alright, so I tried it and it looks like Heikki's suggestion worked.
> > > > >
> > > > > Is everyone comfortable going this route?
> > > >
> > > > Hold on. Another way to fix the problem is to increment the parent's
> > > > reference count before that kobject_del(kobj) is called, and then
> > > > decrementing it after t->release(kobj) is called. It may be safer to
> > > > fix the problem like that.
> > >
> > > Right, this was your first suggestion above, right? That actually made
> > > more sense to me, but you seemed skeptical of it due to it being
> > > messier, which makes sense.
> > >
> > > Nevertheless, having children take a reference seems like the right
> > > thing to do because the children need to degregister themselves from
> > > the parent. Calling t->release() ahead of kobject_del() seems to
> > > reintroduce the problem that I pointed out, albeit *much* more
> > > briefly. If I understand correctly, it is always wrong to have a sysfs
> > > entry that points to a partially deallocated kobject. Please correct
> > > me if I am wrong.
> > >
> > > So I think there are two solutions: Either we have to ensure that each
> > > child is deallocated first so we can preserve the kobject_del() and
> > > then t->release() ordering, or we have to add some sort of "locking"
> > > mechanism to prevent the kobject from being accessed by anything other
> > > than the deallocation code until it is fully deallocated; well, it
> > > would have to prevent any access at all :-). I think it goes without
> > > saying that this "locking" idea is pretty flawed.
> > >
> > > The problem with just having children take a reference is that the
> > > kobject children already take a reference to their parent, so it seems
> > > like the kobject should be smart enough to deallocate children rather
> > > than having swnode have to keep a separate tally of children, no?
> > >
> > > Sorry if this all seems obvious, I am not an expert on this part of the kernel.
> > >
> > > > My example above proofs that there is the problem, but it changes the
> > > > order of execution which I think can always have other consequences.
> > > >
> > > > > Also, should I send this fix as a separate patch? Or do people want me
> > > > > to send an updated revision of my revert patch with the fix?
> > > >
> > > > This needs to be send in its own separate patch. Ideally it could be
> > > > send together with the revert in the same series, but I'm not sure
> > > > that's possible anymore. Didn't Greg pick the revert already?
> > >
> > > Sounds good.
> > >
> > > I did already let Greg know when he emailed us on backporting the
> > > patch to stable, and he acked saying he removed them. So as long as
> > > these are not in the queue for 5.6 (it is not in Linus' tree yet), we
> > > should be good.
> >
> > The reported bug is still noticed on Linux mainline master branch
> > The Kernel BUG noticed on x86_64 and i386 running selftest on Linux
> > mainline kernel 5.6.0.
> 
> Oh sorry, I thought that this patch was dropped from the maintainer's
> for-next branch.
> 
> Heikki, what do you think about my suggestion of having kobject
> deallocate its children?

I'm not sure what was this suggestion?

> In the meantime, are people cool with the patch that Heikki proposed
> as a temporary mitigation? I think my solution might be a bit more
> involved. If I don't hear anything back, I will send out Heikki's
> suggestion as a patch.

Why not just take the reference to the parent like I proposed?

diff --git a/lib/kobject.c b/lib/kobject.c
index 83198cb37d8d..173046c423f8 100644
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -663,6 +663,7 @@ EXPORT_SYMBOL(kobject_get_unless_zero);
  */
 static void kobject_cleanup(struct kobject *kobj)
 {
+       struct kobject *parent = kobj->parent;
        struct kobj_type *t = get_ktype(kobj);
        const char *name = kobj->name;

@@ -680,6 +681,9 @@ static void kobject_cleanup(struct kobject *kobj)
                kobject_uevent(kobj, KOBJ_REMOVE);
        }

+       /* make sure the parent is not released before the (last) child */
+       kobject_get(parent)
+
        /* remove from sysfs if the caller did not do it */
        if (kobj->state_in_sysfs) {
                pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
@@ -693,6 +697,8 @@ static void kobject_cleanup(struct kobject *kobj)
                t->release(kobj);
        }

+       kobject_put(parent);
+
        /* free name if we allocated it */
        if (name) {
                pr_debug("kobject: '%s': free name\n", name);


Br,

-- 
heikki

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140
  2020-04-14  8:15                       ` Heikki Krogerus
@ 2020-04-14 19:18                         ` Brendan Higgins
  2020-04-14 19:27                           ` Brendan Higgins
  2020-04-14 20:44                           ` Brendan Higgins
  0 siblings, 2 replies; 19+ messages in thread
From: Brendan Higgins @ 2020-04-14 19:18 UTC (permalink / raw)
  To: Heikki Krogerus
  Cc: Naresh Kamboju, Sakari Ailus, Andy Shevchenko, Hans de Goede,
	rafael.j.wysocki, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Steven Rostedt, Sergey Senozhatsky, Andy Shevchenko, Shuah Khan,
	Anders Roxell, lkft-triage, Rasmus Villemoes

On Tue, Apr 14, 2020 at 1:15 AM Heikki Krogerus
<heikki.krogerus@linux.intel.com> wrote:
>
> Hi,
>
> On Tue, Apr 07, 2020 at 01:56:16PM -0700, Brendan Higgins wrote:
> > On Tue, Apr 7, 2020 at 2:25 AM Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > >
> > > On Wed, 11 Mar 2020 at 02:16, Brendan Higgins <brendanhiggins@google.com> wrote:
> > > > > > > > > > > > > Steps reproduce by using kselftests,
> > > > > > > > > > > > >
> > > > > > > > > > > > >           - lsmod || true
> > > > > > > > > > > > >           - cd /opt/kselftests/default-in-kernel/lib/
> > > > > > > > > > > > >           - export PATH=/opt/kselftests/default-in-kernel/kselftest:$PATH
> > > > > > > > > > > > >           - ./printf.sh || true
> > > > > > > > > > > > >           - ./bitmap.sh || true
> > > > > > > > > > > > >           - ./prime_numbers.sh || true
> > > > > > > > > > > > >           - ./strscpy.sh || true
> > > > > > > > > > > > >
> > > > > > > > > > > > > x86_64 kernel BUG dump.
> > > > > > > > > > > > > + ./printf.sh
> > > > > > > > > > >
> > > > > > > > > > > Oops, I am wondering if I broke this with my change "Revert "software
> > > > > > > > > > > node: Simplify software_node_release() function"":
> > > > > > > > > > >
> > > > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=d1c19322388d6935b534b494a2c223dd089e30dd
> > > > > > > > > > >
> > > > > > > > > > > I am still investigating, will update later.
> > > > > > > > > >
> > > > > > > > > > Okay, yeah, I am pretty sure I caused the breakage. I got an email
> > > > > > > > > > from kernel test robot a couple days ago that I didn't see:
> > > > > > > > > >
> > > > > > > > > > https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/N3ZN5XH7HK24JVEJ5WSQD2SK6YCDRILR/
> > > > > > > > > >
> > > > > > > > > > It shows the same breakage after applying this change.
> > > > > > > > > >
> > > > > > > > > > I am still investigating how my change broke it, nevertheless.
> > > > > > > > >
> > > > > > > > > As nodes in the tree are being removed, the code before the patch that
> > > > > > > > > "simplified" the software_node_release() function accessed the node's parent
> > > > > > > > > in its release function.
> > > > > > > > >
> > > > > > > > > And if CONFIG_DEBUG_KOBJECT_RELEASE is defined, the release functions are no
> > > > > > > > > longer necessarily called in order, leading to referencing released memory.
> > > > > > > > > Oops!
> > > > > > > > >
> > > > > > > > > So Heikki's patch actually fixed a bug. :-)
> > > > > > > >
> > > > > > > > Well, I think it just hid the problem. It looks like the core
> > > > > > > > (lib/kobject.c) allows the parent kobject to be released before the
> > > > > > > > last child kobject is released. To be honest, that does not sound
> > > > > > > > right to me...
> > > > > > > >
> > > > > > > > I think we can workaround this problem by taking reference to the
> > > > > > > > parent when the child is added, and then releasing it when the child
> > > > > > > > is released, and in that way be guaranteed that the parent will not
> > > > > > > > disappear before the child is fully released, but that still does not
> > > > > > > > feel right. It feels more like the core is not doing it's job to me.
> > > > > > > > The parent just should not be released before its children.
> > > > > > > >
> > > > > > > > Either I'm wrong about that, and we still should take the reference on
> > > > > > > > the parent, or we revert my patch like Brendan proposed and then fix
> > > > > > >
> > > > > > > Either way, isn't it wrong to release the node ID before deleting the
> > > > > > > sysfs entry? I am not sure that my fix was the correct one, but I
> > > > > > > believe the bug that Heidi and I found is actually a bug.
> > > > >
> > > > > I agree.
> > > > >
> > > > > > > > the core with something like this (warning, I did not even try to
> > > > > > > > compile that):
> > > > > > >
> > > > > > > I will try it out.
> > > > > > >
> > > > > > > > diff --git a/lib/kobject.c b/lib/kobject.c
> > > > > > > > index 83198cb37d8d..ec5774992337 100644
> > > > > > > > --- a/lib/kobject.c
> > > > > > > > +++ b/lib/kobject.c
> > > > > > > > @@ -680,6 +680,12 @@ static void kobject_cleanup(struct kobject *kobj)
> > > > > > > >                 kobject_uevent(kobj, KOBJ_REMOVE);
> > > > > > > >         }
> > > > > > > >
> > > > > > > > +       if (t && t->release) {
> > > > > > > > +               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > > > > > > +                        kobject_name(kobj), kobj);
> > > > > > > > +               t->release(kobj);
> > > > > > > > +       }
> > > > > > > > +
> > > > > > > >         /* remove from sysfs if the caller did not do it */
> > > > > > > >         if (kobj->state_in_sysfs) {
> > > > > > > >                 pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> > > > > > > > @@ -687,12 +693,6 @@ static void kobject_cleanup(struct kobject *kobj)
> > > > > > > >                 kobject_del(kobj);
> > > > > > > >         }
> > > > > > > >
> > > > > > > > -       if (t && t->release) {
> > > > > > > > -               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > > > > > > -                        kobject_name(kobj), kobj);
> > > > > > > > -               t->release(kobj);
> > > > > > > > -       }
> > > > > > > > -
> > > > > > > >         /* free name if we allocated it */
> > > > > > > >         if (name) {
> > > > > > > >                 pr_debug("kobject: '%s': free name\n", name);
> > > > > >
> > > > > > Alright, so I tried it and it looks like Heikki's suggestion worked.
> > > > > >
> > > > > > Is everyone comfortable going this route?
> > > > >
> > > > > Hold on. Another way to fix the problem is to increment the parent's
> > > > > reference count before that kobject_del(kobj) is called, and then
> > > > > decrementing it after t->release(kobj) is called. It may be safer to
> > > > > fix the problem like that.
> > > >
> > > > Right, this was your first suggestion above, right? That actually made
> > > > more sense to me, but you seemed skeptical of it due to it being
> > > > messier, which makes sense.
> > > >
> > > > Nevertheless, having children take a reference seems like the right
> > > > thing to do because the children need to degregister themselves from
> > > > the parent. Calling t->release() ahead of kobject_del() seems to
> > > > reintroduce the problem that I pointed out, albeit *much* more
> > > > briefly. If I understand correctly, it is always wrong to have a sysfs
> > > > entry that points to a partially deallocated kobject. Please correct
> > > > me if I am wrong.
> > > >
> > > > So I think there are two solutions: Either we have to ensure that each
> > > > child is deallocated first so we can preserve the kobject_del() and
> > > > then t->release() ordering, or we have to add some sort of "locking"
> > > > mechanism to prevent the kobject from being accessed by anything other
> > > > than the deallocation code until it is fully deallocated; well, it
> > > > would have to prevent any access at all :-). I think it goes without
> > > > saying that this "locking" idea is pretty flawed.
> > > >
> > > > The problem with just having children take a reference is that the
> > > > kobject children already take a reference to their parent, so it seems
> > > > like the kobject should be smart enough to deallocate children rather
> > > > than having swnode have to keep a separate tally of children, no?
> > > >
> > > > Sorry if this all seems obvious, I am not an expert on this part of the kernel.
> > > >
> > > > > My example above proofs that there is the problem, but it changes the
> > > > > order of execution which I think can always have other consequences.
> > > > >
> > > > > > Also, should I send this fix as a separate patch? Or do people want me
> > > > > > to send an updated revision of my revert patch with the fix?
> > > > >
> > > > > This needs to be send in its own separate patch. Ideally it could be
> > > > > send together with the revert in the same series, but I'm not sure
> > > > > that's possible anymore. Didn't Greg pick the revert already?
> > > >
> > > > Sounds good.
> > > >
> > > > I did already let Greg know when he emailed us on backporting the
> > > > patch to stable, and he acked saying he removed them. So as long as
> > > > these are not in the queue for 5.6 (it is not in Linus' tree yet), we
> > > > should be good.
> > >
> > > The reported bug is still noticed on Linux mainline master branch
> > > The Kernel BUG noticed on x86_64 and i386 running selftest on Linux
> > > mainline kernel 5.6.0.
> >
> > Oh sorry, I thought that this patch was dropped from the maintainer's
> > for-next branch.
> >
> > Heikki, what do you think about my suggestion of having kobject
> > deallocate its children?
>
> I'm not sure what was this suggestion?
>
> > In the meantime, are people cool with the patch that Heikki proposed
> > as a temporary mitigation? I think my solution might be a bit more
> > involved. If I don't hear anything back, I will send out Heikki's
> > suggestion as a patch.
>
> Why not just take the reference to the parent like I proposed?
>
> diff --git a/lib/kobject.c b/lib/kobject.c
> index 83198cb37d8d..173046c423f8 100644
> --- a/lib/kobject.c
> +++ b/lib/kobject.c
> @@ -663,6 +663,7 @@ EXPORT_SYMBOL(kobject_get_unless_zero);
>   */
>  static void kobject_cleanup(struct kobject *kobj)
>  {
> +       struct kobject *parent = kobj->parent;
>         struct kobj_type *t = get_ktype(kobj);
>         const char *name = kobj->name;
>
> @@ -680,6 +681,9 @@ static void kobject_cleanup(struct kobject *kobj)
>                 kobject_uevent(kobj, KOBJ_REMOVE);
>         }
>
> +       /* make sure the parent is not released before the (last) child */
> +       kobject_get(parent)
> +
>         /* remove from sysfs if the caller did not do it */
>         if (kobj->state_in_sysfs) {
>                 pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> @@ -693,6 +697,8 @@ static void kobject_cleanup(struct kobject *kobj)
>                 t->release(kobj);
>         }
>
> +       kobject_put(parent);
> +
>         /* free name if we allocated it */
>         if (name) {
>                 pr_debug("kobject: '%s': free name\n", name);

Ah, I think that will work. Sorry, I just thought that was a bit
kludgy, but if you are okay with doing it that way, I think it will
work.

I will try it and send out a patch.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140
  2020-04-14 19:18                         ` Brendan Higgins
@ 2020-04-14 19:27                           ` Brendan Higgins
  2020-04-14 21:06                             ` Brendan Higgins
  2020-04-14 20:44                           ` Brendan Higgins
  1 sibling, 1 reply; 19+ messages in thread
From: Brendan Higgins @ 2020-04-14 19:27 UTC (permalink / raw)
  To: Heikki Krogerus
  Cc: Naresh Kamboju, Sakari Ailus, Andy Shevchenko, Hans de Goede,
	rafael.j.wysocki, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Steven Rostedt, Sergey Senozhatsky, Andy Shevchenko, Shuah Khan,
	Anders Roxell, lkft-triage, Rasmus Villemoes

On Tue, Apr 14, 2020 at 12:18 PM Brendan Higgins
<brendanhiggins@google.com> wrote:
>
> On Tue, Apr 14, 2020 at 1:15 AM Heikki Krogerus
> <heikki.krogerus@linux.intel.com> wrote:
> >
> > Hi,
> >
> > On Tue, Apr 07, 2020 at 01:56:16PM -0700, Brendan Higgins wrote:
> > > On Tue, Apr 7, 2020 at 2:25 AM Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > > >
> > > > On Wed, 11 Mar 2020 at 02:16, Brendan Higgins <brendanhiggins@google.com> wrote:
> > > > > > > > > > > > > > Steps reproduce by using kselftests,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >           - lsmod || true
> > > > > > > > > > > > > >           - cd /opt/kselftests/default-in-kernel/lib/
> > > > > > > > > > > > > >           - export PATH=/opt/kselftests/default-in-kernel/kselftest:$PATH
> > > > > > > > > > > > > >           - ./printf.sh || true
> > > > > > > > > > > > > >           - ./bitmap.sh || true
> > > > > > > > > > > > > >           - ./prime_numbers.sh || true
> > > > > > > > > > > > > >           - ./strscpy.sh || true
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > x86_64 kernel BUG dump.
> > > > > > > > > > > > > > + ./printf.sh
> > > > > > > > > > > >
> > > > > > > > > > > > Oops, I am wondering if I broke this with my change "Revert "software
> > > > > > > > > > > > node: Simplify software_node_release() function"":
> > > > > > > > > > > >
> > > > > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=d1c19322388d6935b534b494a2c223dd089e30dd
> > > > > > > > > > > >
> > > > > > > > > > > > I am still investigating, will update later.
> > > > > > > > > > >
> > > > > > > > > > > Okay, yeah, I am pretty sure I caused the breakage. I got an email
> > > > > > > > > > > from kernel test robot a couple days ago that I didn't see:
> > > > > > > > > > >
> > > > > > > > > > > https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/N3ZN5XH7HK24JVEJ5WSQD2SK6YCDRILR/
> > > > > > > > > > >
> > > > > > > > > > > It shows the same breakage after applying this change.
> > > > > > > > > > >
> > > > > > > > > > > I am still investigating how my change broke it, nevertheless.
> > > > > > > > > >
> > > > > > > > > > As nodes in the tree are being removed, the code before the patch that
> > > > > > > > > > "simplified" the software_node_release() function accessed the node's parent
> > > > > > > > > > in its release function.
> > > > > > > > > >
> > > > > > > > > > And if CONFIG_DEBUG_KOBJECT_RELEASE is defined, the release functions are no
> > > > > > > > > > longer necessarily called in order, leading to referencing released memory.
> > > > > > > > > > Oops!
> > > > > > > > > >
> > > > > > > > > > So Heikki's patch actually fixed a bug. :-)
> > > > > > > > >
> > > > > > > > > Well, I think it just hid the problem. It looks like the core
> > > > > > > > > (lib/kobject.c) allows the parent kobject to be released before the
> > > > > > > > > last child kobject is released. To be honest, that does not sound
> > > > > > > > > right to me...
> > > > > > > > >
> > > > > > > > > I think we can workaround this problem by taking reference to the
> > > > > > > > > parent when the child is added, and then releasing it when the child
> > > > > > > > > is released, and in that way be guaranteed that the parent will not
> > > > > > > > > disappear before the child is fully released, but that still does not
> > > > > > > > > feel right. It feels more like the core is not doing it's job to me.
> > > > > > > > > The parent just should not be released before its children.
> > > > > > > > >
> > > > > > > > > Either I'm wrong about that, and we still should take the reference on
> > > > > > > > > the parent, or we revert my patch like Brendan proposed and then fix
> > > > > > > >
> > > > > > > > Either way, isn't it wrong to release the node ID before deleting the
> > > > > > > > sysfs entry? I am not sure that my fix was the correct one, but I
> > > > > > > > believe the bug that Heidi and I found is actually a bug.
> > > > > >
> > > > > > I agree.
> > > > > >
> > > > > > > > > the core with something like this (warning, I did not even try to
> > > > > > > > > compile that):
> > > > > > > >
> > > > > > > > I will try it out.
> > > > > > > >
> > > > > > > > > diff --git a/lib/kobject.c b/lib/kobject.c
> > > > > > > > > index 83198cb37d8d..ec5774992337 100644
> > > > > > > > > --- a/lib/kobject.c
> > > > > > > > > +++ b/lib/kobject.c
> > > > > > > > > @@ -680,6 +680,12 @@ static void kobject_cleanup(struct kobject *kobj)
> > > > > > > > >                 kobject_uevent(kobj, KOBJ_REMOVE);
> > > > > > > > >         }
> > > > > > > > >
> > > > > > > > > +       if (t && t->release) {
> > > > > > > > > +               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > > > > > > > +                        kobject_name(kobj), kobj);
> > > > > > > > > +               t->release(kobj);
> > > > > > > > > +       }
> > > > > > > > > +
> > > > > > > > >         /* remove from sysfs if the caller did not do it */
> > > > > > > > >         if (kobj->state_in_sysfs) {
> > > > > > > > >                 pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> > > > > > > > > @@ -687,12 +693,6 @@ static void kobject_cleanup(struct kobject *kobj)
> > > > > > > > >                 kobject_del(kobj);
> > > > > > > > >         }
> > > > > > > > >
> > > > > > > > > -       if (t && t->release) {
> > > > > > > > > -               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > > > > > > > -                        kobject_name(kobj), kobj);
> > > > > > > > > -               t->release(kobj);
> > > > > > > > > -       }
> > > > > > > > > -
> > > > > > > > >         /* free name if we allocated it */
> > > > > > > > >         if (name) {
> > > > > > > > >                 pr_debug("kobject: '%s': free name\n", name);
> > > > > > >
> > > > > > > Alright, so I tried it and it looks like Heikki's suggestion worked.
> > > > > > >
> > > > > > > Is everyone comfortable going this route?
> > > > > >
> > > > > > Hold on. Another way to fix the problem is to increment the parent's
> > > > > > reference count before that kobject_del(kobj) is called, and then
> > > > > > decrementing it after t->release(kobj) is called. It may be safer to
> > > > > > fix the problem like that.
> > > > >
> > > > > Right, this was your first suggestion above, right? That actually made
> > > > > more sense to me, but you seemed skeptical of it due to it being
> > > > > messier, which makes sense.
> > > > >
> > > > > Nevertheless, having children take a reference seems like the right
> > > > > thing to do because the children need to degregister themselves from
> > > > > the parent. Calling t->release() ahead of kobject_del() seems to
> > > > > reintroduce the problem that I pointed out, albeit *much* more
> > > > > briefly. If I understand correctly, it is always wrong to have a sysfs
> > > > > entry that points to a partially deallocated kobject. Please correct
> > > > > me if I am wrong.
> > > > >
> > > > > So I think there are two solutions: Either we have to ensure that each
> > > > > child is deallocated first so we can preserve the kobject_del() and
> > > > > then t->release() ordering, or we have to add some sort of "locking"
> > > > > mechanism to prevent the kobject from being accessed by anything other
> > > > > than the deallocation code until it is fully deallocated; well, it
> > > > > would have to prevent any access at all :-). I think it goes without
> > > > > saying that this "locking" idea is pretty flawed.
> > > > >
> > > > > The problem with just having children take a reference is that the
> > > > > kobject children already take a reference to their parent, so it seems
> > > > > like the kobject should be smart enough to deallocate children rather
> > > > > than having swnode have to keep a separate tally of children, no?
> > > > >
> > > > > Sorry if this all seems obvious, I am not an expert on this part of the kernel.
> > > > >
> > > > > > My example above proofs that there is the problem, but it changes the
> > > > > > order of execution which I think can always have other consequences.
> > > > > >
> > > > > > > Also, should I send this fix as a separate patch? Or do people want me
> > > > > > > to send an updated revision of my revert patch with the fix?
> > > > > >
> > > > > > This needs to be send in its own separate patch. Ideally it could be
> > > > > > send together with the revert in the same series, but I'm not sure
> > > > > > that's possible anymore. Didn't Greg pick the revert already?
> > > > >
> > > > > Sounds good.
> > > > >
> > > > > I did already let Greg know when he emailed us on backporting the
> > > > > patch to stable, and he acked saying he removed them. So as long as
> > > > > these are not in the queue for 5.6 (it is not in Linus' tree yet), we
> > > > > should be good.
> > > >
> > > > The reported bug is still noticed on Linux mainline master branch
> > > > The Kernel BUG noticed on x86_64 and i386 running selftest on Linux
> > > > mainline kernel 5.6.0.
> > >
> > > Oh sorry, I thought that this patch was dropped from the maintainer's
> > > for-next branch.
> > >
> > > Heikki, what do you think about my suggestion of having kobject
> > > deallocate its children?
> >
> > I'm not sure what was this suggestion?
> >
> > > In the meantime, are people cool with the patch that Heikki proposed
> > > as a temporary mitigation? I think my solution might be a bit more
> > > involved. If I don't hear anything back, I will send out Heikki's
> > > suggestion as a patch.
> >
> > Why not just take the reference to the parent like I proposed?
> >
> > diff --git a/lib/kobject.c b/lib/kobject.c
> > index 83198cb37d8d..173046c423f8 100644
> > --- a/lib/kobject.c
> > +++ b/lib/kobject.c
> > @@ -663,6 +663,7 @@ EXPORT_SYMBOL(kobject_get_unless_zero);
> >   */
> >  static void kobject_cleanup(struct kobject *kobj)
> >  {
> > +       struct kobject *parent = kobj->parent;
> >         struct kobj_type *t = get_ktype(kobj);
> >         const char *name = kobj->name;
> >
> > @@ -680,6 +681,9 @@ static void kobject_cleanup(struct kobject *kobj)
> >                 kobject_uevent(kobj, KOBJ_REMOVE);
> >         }
> >
> > +       /* make sure the parent is not released before the (last) child */
> > +       kobject_get(parent)
> > +
> >         /* remove from sysfs if the caller did not do it */
> >         if (kobj->state_in_sysfs) {
> >                 pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> > @@ -693,6 +697,8 @@ static void kobject_cleanup(struct kobject *kobj)
> >                 t->release(kobj);
> >         }
> >
> > +       kobject_put(parent);
> > +
> >         /* free name if we allocated it */
> >         if (name) {
> >                 pr_debug("kobject: '%s': free name\n", name);
>
> Ah, I think that will work. Sorry, I just thought that was a bit
> kludgy, but if you are okay with doing it that way, I think it will
> work.
>
> I will try it and send out a patch.

I think I am also going to send out a new test case, in a separate
patch of course, that exercises this logic. Kind of annoying having
the printf test catching this.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140
  2020-04-14 19:18                         ` Brendan Higgins
  2020-04-14 19:27                           ` Brendan Higgins
@ 2020-04-14 20:44                           ` Brendan Higgins
  1 sibling, 0 replies; 19+ messages in thread
From: Brendan Higgins @ 2020-04-14 20:44 UTC (permalink / raw)
  To: Heikki Krogerus
  Cc: Naresh Kamboju, Sakari Ailus, Andy Shevchenko, Hans de Goede,
	rafael.j.wysocki, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Steven Rostedt, Sergey Senozhatsky, Andy Shevchenko, Shuah Khan,
	Anders Roxell, lkft-triage, Rasmus Villemoes

On Tue, Apr 14, 2020 at 12:18 PM Brendan Higgins
<brendanhiggins@google.com> wrote:
>
> On Tue, Apr 14, 2020 at 1:15 AM Heikki Krogerus
> <heikki.krogerus@linux.intel.com> wrote:
> >
> > Hi,
> >
> > On Tue, Apr 07, 2020 at 01:56:16PM -0700, Brendan Higgins wrote:
> > > On Tue, Apr 7, 2020 at 2:25 AM Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > > >
> > > > On Wed, 11 Mar 2020 at 02:16, Brendan Higgins <brendanhiggins@google.com> wrote:
> > > > > > > > > > > > > > Steps reproduce by using kselftests,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >           - lsmod || true
> > > > > > > > > > > > > >           - cd /opt/kselftests/default-in-kernel/lib/
> > > > > > > > > > > > > >           - export PATH=/opt/kselftests/default-in-kernel/kselftest:$PATH
> > > > > > > > > > > > > >           - ./printf.sh || true
> > > > > > > > > > > > > >           - ./bitmap.sh || true
> > > > > > > > > > > > > >           - ./prime_numbers.sh || true
> > > > > > > > > > > > > >           - ./strscpy.sh || true
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > x86_64 kernel BUG dump.
> > > > > > > > > > > > > > + ./printf.sh
> > > > > > > > > > > >
> > > > > > > > > > > > Oops, I am wondering if I broke this with my change "Revert "software
> > > > > > > > > > > > node: Simplify software_node_release() function"":
> > > > > > > > > > > >
> > > > > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=d1c19322388d6935b534b494a2c223dd089e30dd
> > > > > > > > > > > >
> > > > > > > > > > > > I am still investigating, will update later.
> > > > > > > > > > >
> > > > > > > > > > > Okay, yeah, I am pretty sure I caused the breakage. I got an email
> > > > > > > > > > > from kernel test robot a couple days ago that I didn't see:
> > > > > > > > > > >
> > > > > > > > > > > https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/N3ZN5XH7HK24JVEJ5WSQD2SK6YCDRILR/
> > > > > > > > > > >
> > > > > > > > > > > It shows the same breakage after applying this change.
> > > > > > > > > > >
> > > > > > > > > > > I am still investigating how my change broke it, nevertheless.
> > > > > > > > > >
> > > > > > > > > > As nodes in the tree are being removed, the code before the patch that
> > > > > > > > > > "simplified" the software_node_release() function accessed the node's parent
> > > > > > > > > > in its release function.
> > > > > > > > > >
> > > > > > > > > > And if CONFIG_DEBUG_KOBJECT_RELEASE is defined, the release functions are no
> > > > > > > > > > longer necessarily called in order, leading to referencing released memory.
> > > > > > > > > > Oops!
> > > > > > > > > >
> > > > > > > > > > So Heikki's patch actually fixed a bug. :-)
> > > > > > > > >
> > > > > > > > > Well, I think it just hid the problem. It looks like the core
> > > > > > > > > (lib/kobject.c) allows the parent kobject to be released before the
> > > > > > > > > last child kobject is released. To be honest, that does not sound
> > > > > > > > > right to me...
> > > > > > > > >
> > > > > > > > > I think we can workaround this problem by taking reference to the
> > > > > > > > > parent when the child is added, and then releasing it when the child
> > > > > > > > > is released, and in that way be guaranteed that the parent will not
> > > > > > > > > disappear before the child is fully released, but that still does not
> > > > > > > > > feel right. It feels more like the core is not doing it's job to me.
> > > > > > > > > The parent just should not be released before its children.
> > > > > > > > >
> > > > > > > > > Either I'm wrong about that, and we still should take the reference on
> > > > > > > > > the parent, or we revert my patch like Brendan proposed and then fix
> > > > > > > >
> > > > > > > > Either way, isn't it wrong to release the node ID before deleting the
> > > > > > > > sysfs entry? I am not sure that my fix was the correct one, but I
> > > > > > > > believe the bug that Heidi and I found is actually a bug.
> > > > > >
> > > > > > I agree.
> > > > > >
> > > > > > > > > the core with something like this (warning, I did not even try to
> > > > > > > > > compile that):
> > > > > > > >
> > > > > > > > I will try it out.
> > > > > > > >
> > > > > > > > > diff --git a/lib/kobject.c b/lib/kobject.c
> > > > > > > > > index 83198cb37d8d..ec5774992337 100644
> > > > > > > > > --- a/lib/kobject.c
> > > > > > > > > +++ b/lib/kobject.c
> > > > > > > > > @@ -680,6 +680,12 @@ static void kobject_cleanup(struct kobject *kobj)
> > > > > > > > >                 kobject_uevent(kobj, KOBJ_REMOVE);
> > > > > > > > >         }
> > > > > > > > >
> > > > > > > > > +       if (t && t->release) {
> > > > > > > > > +               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > > > > > > > +                        kobject_name(kobj), kobj);
> > > > > > > > > +               t->release(kobj);
> > > > > > > > > +       }
> > > > > > > > > +
> > > > > > > > >         /* remove from sysfs if the caller did not do it */
> > > > > > > > >         if (kobj->state_in_sysfs) {
> > > > > > > > >                 pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> > > > > > > > > @@ -687,12 +693,6 @@ static void kobject_cleanup(struct kobject *kobj)
> > > > > > > > >                 kobject_del(kobj);
> > > > > > > > >         }
> > > > > > > > >
> > > > > > > > > -       if (t && t->release) {
> > > > > > > > > -               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > > > > > > > -                        kobject_name(kobj), kobj);
> > > > > > > > > -               t->release(kobj);
> > > > > > > > > -       }
> > > > > > > > > -
> > > > > > > > >         /* free name if we allocated it */
> > > > > > > > >         if (name) {
> > > > > > > > >                 pr_debug("kobject: '%s': free name\n", name);
> > > > > > >
> > > > > > > Alright, so I tried it and it looks like Heikki's suggestion worked.
> > > > > > >
> > > > > > > Is everyone comfortable going this route?
> > > > > >
> > > > > > Hold on. Another way to fix the problem is to increment the parent's
> > > > > > reference count before that kobject_del(kobj) is called, and then
> > > > > > decrementing it after t->release(kobj) is called. It may be safer to
> > > > > > fix the problem like that.
> > > > >
> > > > > Right, this was your first suggestion above, right? That actually made
> > > > > more sense to me, but you seemed skeptical of it due to it being
> > > > > messier, which makes sense.
> > > > >
> > > > > Nevertheless, having children take a reference seems like the right
> > > > > thing to do because the children need to degregister themselves from
> > > > > the parent. Calling t->release() ahead of kobject_del() seems to
> > > > > reintroduce the problem that I pointed out, albeit *much* more
> > > > > briefly. If I understand correctly, it is always wrong to have a sysfs
> > > > > entry that points to a partially deallocated kobject. Please correct
> > > > > me if I am wrong.
> > > > >
> > > > > So I think there are two solutions: Either we have to ensure that each
> > > > > child is deallocated first so we can preserve the kobject_del() and
> > > > > then t->release() ordering, or we have to add some sort of "locking"
> > > > > mechanism to prevent the kobject from being accessed by anything other
> > > > > than the deallocation code until it is fully deallocated; well, it
> > > > > would have to prevent any access at all :-). I think it goes without
> > > > > saying that this "locking" idea is pretty flawed.
> > > > >
> > > > > The problem with just having children take a reference is that the
> > > > > kobject children already take a reference to their parent, so it seems
> > > > > like the kobject should be smart enough to deallocate children rather
> > > > > than having swnode have to keep a separate tally of children, no?
> > > > >
> > > > > Sorry if this all seems obvious, I am not an expert on this part of the kernel.
> > > > >
> > > > > > My example above proofs that there is the problem, but it changes the
> > > > > > order of execution which I think can always have other consequences.
> > > > > >
> > > > > > > Also, should I send this fix as a separate patch? Or do people want me
> > > > > > > to send an updated revision of my revert patch with the fix?
> > > > > >
> > > > > > This needs to be send in its own separate patch. Ideally it could be
> > > > > > send together with the revert in the same series, but I'm not sure
> > > > > > that's possible anymore. Didn't Greg pick the revert already?
> > > > >
> > > > > Sounds good.
> > > > >
> > > > > I did already let Greg know when he emailed us on backporting the
> > > > > patch to stable, and he acked saying he removed them. So as long as
> > > > > these are not in the queue for 5.6 (it is not in Linus' tree yet), we
> > > > > should be good.
> > > >
> > > > The reported bug is still noticed on Linux mainline master branch
> > > > The Kernel BUG noticed on x86_64 and i386 running selftest on Linux
> > > > mainline kernel 5.6.0.
> > >
> > > Oh sorry, I thought that this patch was dropped from the maintainer's
> > > for-next branch.
> > >
> > > Heikki, what do you think about my suggestion of having kobject
> > > deallocate its children?
> >
> > I'm not sure what was this suggestion?
> >
> > > In the meantime, are people cool with the patch that Heikki proposed
> > > as a temporary mitigation? I think my solution might be a bit more
> > > involved. If I don't hear anything back, I will send out Heikki's
> > > suggestion as a patch.
> >
> > Why not just take the reference to the parent like I proposed?
> >
> > diff --git a/lib/kobject.c b/lib/kobject.c
> > index 83198cb37d8d..173046c423f8 100644
> > --- a/lib/kobject.c
> > +++ b/lib/kobject.c
> > @@ -663,6 +663,7 @@ EXPORT_SYMBOL(kobject_get_unless_zero);
> >   */
> >  static void kobject_cleanup(struct kobject *kobj)
> >  {
> > +       struct kobject *parent = kobj->parent;
> >         struct kobj_type *t = get_ktype(kobj);
> >         const char *name = kobj->name;
> >
> > @@ -680,6 +681,9 @@ static void kobject_cleanup(struct kobject *kobj)
> >                 kobject_uevent(kobj, KOBJ_REMOVE);
> >         }
> >
> > +       /* make sure the parent is not released before the (last) child */
> > +       kobject_get(parent)
> > +
> >         /* remove from sysfs if the caller did not do it */
> >         if (kobj->state_in_sysfs) {
> >                 pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> > @@ -693,6 +697,8 @@ static void kobject_cleanup(struct kobject *kobj)
> >                 t->release(kobj);
> >         }
> >
> > +       kobject_put(parent);
> > +
> >         /* free name if we allocated it */
> >         if (name) {
> >                 pr_debug("kobject: '%s': free name\n", name);
>
> Ah, I think that will work. Sorry, I just thought that was a bit
> kludgy, but if you are okay with doing it that way, I think it will
> work.
>
> I will try it and send out a patch.

Alright, I tested Heikki's diff with some minor clean up and it works.
I just sent a patch to the list here:

https://lore.kernel.org/linux-kselftest/20200414204240.186377-1-brendanhiggins@google.com/T/#u

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140
  2020-04-14 19:27                           ` Brendan Higgins
@ 2020-04-14 21:06                             ` Brendan Higgins
  0 siblings, 0 replies; 19+ messages in thread
From: Brendan Higgins @ 2020-04-14 21:06 UTC (permalink / raw)
  To: Heikki Krogerus
  Cc: Naresh Kamboju, Sakari Ailus, Andy Shevchenko, Hans de Goede,
	rafael.j.wysocki, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Steven Rostedt, Sergey Senozhatsky, Andy Shevchenko, Shuah Khan,
	Anders Roxell, lkft-triage, Rasmus Villemoes

On Tue, Apr 14, 2020 at 12:27 PM Brendan Higgins
<brendanhiggins@google.com> wrote:
>
> On Tue, Apr 14, 2020 at 12:18 PM Brendan Higgins
> <brendanhiggins@google.com> wrote:
> >
> > On Tue, Apr 14, 2020 at 1:15 AM Heikki Krogerus
> > <heikki.krogerus@linux.intel.com> wrote:
> > >
> > > Hi,
> > >
> > > On Tue, Apr 07, 2020 at 01:56:16PM -0700, Brendan Higgins wrote:
> > > > On Tue, Apr 7, 2020 at 2:25 AM Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > > > >
> > > > > On Wed, 11 Mar 2020 at 02:16, Brendan Higgins <brendanhiggins@google.com> wrote:
> > > > > > > > > > > > > > > Steps reproduce by using kselftests,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >           - lsmod || true
> > > > > > > > > > > > > > >           - cd /opt/kselftests/default-in-kernel/lib/
> > > > > > > > > > > > > > >           - export PATH=/opt/kselftests/default-in-kernel/kselftest:$PATH
> > > > > > > > > > > > > > >           - ./printf.sh || true
> > > > > > > > > > > > > > >           - ./bitmap.sh || true
> > > > > > > > > > > > > > >           - ./prime_numbers.sh || true
> > > > > > > > > > > > > > >           - ./strscpy.sh || true
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > x86_64 kernel BUG dump.
> > > > > > > > > > > > > > > + ./printf.sh
> > > > > > > > > > > > >
> > > > > > > > > > > > > Oops, I am wondering if I broke this with my change "Revert "software
> > > > > > > > > > > > > node: Simplify software_node_release() function"":
> > > > > > > > > > > > >
> > > > > > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=d1c19322388d6935b534b494a2c223dd089e30dd
> > > > > > > > > > > > >
> > > > > > > > > > > > > I am still investigating, will update later.
> > > > > > > > > > > >
> > > > > > > > > > > > Okay, yeah, I am pretty sure I caused the breakage. I got an email
> > > > > > > > > > > > from kernel test robot a couple days ago that I didn't see:
> > > > > > > > > > > >
> > > > > > > > > > > > https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/N3ZN5XH7HK24JVEJ5WSQD2SK6YCDRILR/
> > > > > > > > > > > >
> > > > > > > > > > > > It shows the same breakage after applying this change.
> > > > > > > > > > > >
> > > > > > > > > > > > I am still investigating how my change broke it, nevertheless.
> > > > > > > > > > >
> > > > > > > > > > > As nodes in the tree are being removed, the code before the patch that
> > > > > > > > > > > "simplified" the software_node_release() function accessed the node's parent
> > > > > > > > > > > in its release function.
> > > > > > > > > > >
> > > > > > > > > > > And if CONFIG_DEBUG_KOBJECT_RELEASE is defined, the release functions are no
> > > > > > > > > > > longer necessarily called in order, leading to referencing released memory.
> > > > > > > > > > > Oops!
> > > > > > > > > > >
> > > > > > > > > > > So Heikki's patch actually fixed a bug. :-)
> > > > > > > > > >
> > > > > > > > > > Well, I think it just hid the problem. It looks like the core
> > > > > > > > > > (lib/kobject.c) allows the parent kobject to be released before the
> > > > > > > > > > last child kobject is released. To be honest, that does not sound
> > > > > > > > > > right to me...
> > > > > > > > > >
> > > > > > > > > > I think we can workaround this problem by taking reference to the
> > > > > > > > > > parent when the child is added, and then releasing it when the child
> > > > > > > > > > is released, and in that way be guaranteed that the parent will not
> > > > > > > > > > disappear before the child is fully released, but that still does not
> > > > > > > > > > feel right. It feels more like the core is not doing it's job to me.
> > > > > > > > > > The parent just should not be released before its children.
> > > > > > > > > >
> > > > > > > > > > Either I'm wrong about that, and we still should take the reference on
> > > > > > > > > > the parent, or we revert my patch like Brendan proposed and then fix
> > > > > > > > >
> > > > > > > > > Either way, isn't it wrong to release the node ID before deleting the
> > > > > > > > > sysfs entry? I am not sure that my fix was the correct one, but I
> > > > > > > > > believe the bug that Heidi and I found is actually a bug.
> > > > > > >
> > > > > > > I agree.
> > > > > > >
> > > > > > > > > > the core with something like this (warning, I did not even try to
> > > > > > > > > > compile that):
> > > > > > > > >
> > > > > > > > > I will try it out.
> > > > > > > > >
> > > > > > > > > > diff --git a/lib/kobject.c b/lib/kobject.c
> > > > > > > > > > index 83198cb37d8d..ec5774992337 100644
> > > > > > > > > > --- a/lib/kobject.c
> > > > > > > > > > +++ b/lib/kobject.c
> > > > > > > > > > @@ -680,6 +680,12 @@ static void kobject_cleanup(struct kobject *kobj)
> > > > > > > > > >                 kobject_uevent(kobj, KOBJ_REMOVE);
> > > > > > > > > >         }
> > > > > > > > > >
> > > > > > > > > > +       if (t && t->release) {
> > > > > > > > > > +               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > > > > > > > > +                        kobject_name(kobj), kobj);
> > > > > > > > > > +               t->release(kobj);
> > > > > > > > > > +       }
> > > > > > > > > > +
> > > > > > > > > >         /* remove from sysfs if the caller did not do it */
> > > > > > > > > >         if (kobj->state_in_sysfs) {
> > > > > > > > > >                 pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> > > > > > > > > > @@ -687,12 +693,6 @@ static void kobject_cleanup(struct kobject *kobj)
> > > > > > > > > >                 kobject_del(kobj);
> > > > > > > > > >         }
> > > > > > > > > >
> > > > > > > > > > -       if (t && t->release) {
> > > > > > > > > > -               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > > > > > > > > -                        kobject_name(kobj), kobj);
> > > > > > > > > > -               t->release(kobj);
> > > > > > > > > > -       }
> > > > > > > > > > -
> > > > > > > > > >         /* free name if we allocated it */
> > > > > > > > > >         if (name) {
> > > > > > > > > >                 pr_debug("kobject: '%s': free name\n", name);
> > > > > > > >
> > > > > > > > Alright, so I tried it and it looks like Heikki's suggestion worked.
> > > > > > > >
> > > > > > > > Is everyone comfortable going this route?
> > > > > > >
> > > > > > > Hold on. Another way to fix the problem is to increment the parent's
> > > > > > > reference count before that kobject_del(kobj) is called, and then
> > > > > > > decrementing it after t->release(kobj) is called. It may be safer to
> > > > > > > fix the problem like that.
> > > > > >
> > > > > > Right, this was your first suggestion above, right? That actually made
> > > > > > more sense to me, but you seemed skeptical of it due to it being
> > > > > > messier, which makes sense.
> > > > > >
> > > > > > Nevertheless, having children take a reference seems like the right
> > > > > > thing to do because the children need to degregister themselves from
> > > > > > the parent. Calling t->release() ahead of kobject_del() seems to
> > > > > > reintroduce the problem that I pointed out, albeit *much* more
> > > > > > briefly. If I understand correctly, it is always wrong to have a sysfs
> > > > > > entry that points to a partially deallocated kobject. Please correct
> > > > > > me if I am wrong.
> > > > > >
> > > > > > So I think there are two solutions: Either we have to ensure that each
> > > > > > child is deallocated first so we can preserve the kobject_del() and
> > > > > > then t->release() ordering, or we have to add some sort of "locking"
> > > > > > mechanism to prevent the kobject from being accessed by anything other
> > > > > > than the deallocation code until it is fully deallocated; well, it
> > > > > > would have to prevent any access at all :-). I think it goes without
> > > > > > saying that this "locking" idea is pretty flawed.
> > > > > >
> > > > > > The problem with just having children take a reference is that the
> > > > > > kobject children already take a reference to their parent, so it seems
> > > > > > like the kobject should be smart enough to deallocate children rather
> > > > > > than having swnode have to keep a separate tally of children, no?
> > > > > >
> > > > > > Sorry if this all seems obvious, I am not an expert on this part of the kernel.
> > > > > >
> > > > > > > My example above proofs that there is the problem, but it changes the
> > > > > > > order of execution which I think can always have other consequences.
> > > > > > >
> > > > > > > > Also, should I send this fix as a separate patch? Or do people want me
> > > > > > > > to send an updated revision of my revert patch with the fix?
> > > > > > >
> > > > > > > This needs to be send in its own separate patch. Ideally it could be
> > > > > > > send together with the revert in the same series, but I'm not sure
> > > > > > > that's possible anymore. Didn't Greg pick the revert already?
> > > > > >
> > > > > > Sounds good.
> > > > > >
> > > > > > I did already let Greg know when he emailed us on backporting the
> > > > > > patch to stable, and he acked saying he removed them. So as long as
> > > > > > these are not in the queue for 5.6 (it is not in Linus' tree yet), we
> > > > > > should be good.
> > > > >
> > > > > The reported bug is still noticed on Linux mainline master branch
> > > > > The Kernel BUG noticed on x86_64 and i386 running selftest on Linux
> > > > > mainline kernel 5.6.0.
> > > >
> > > > Oh sorry, I thought that this patch was dropped from the maintainer's
> > > > for-next branch.
> > > >
> > > > Heikki, what do you think about my suggestion of having kobject
> > > > deallocate its children?
> > >
> > > I'm not sure what was this suggestion?
> > >
> > > > In the meantime, are people cool with the patch that Heikki proposed
> > > > as a temporary mitigation? I think my solution might be a bit more
> > > > involved. If I don't hear anything back, I will send out Heikki's
> > > > suggestion as a patch.
> > >
> > > Why not just take the reference to the parent like I proposed?
> > >
> > > diff --git a/lib/kobject.c b/lib/kobject.c
> > > index 83198cb37d8d..173046c423f8 100644
> > > --- a/lib/kobject.c
> > > +++ b/lib/kobject.c
> > > @@ -663,6 +663,7 @@ EXPORT_SYMBOL(kobject_get_unless_zero);
> > >   */
> > >  static void kobject_cleanup(struct kobject *kobj)
> > >  {
> > > +       struct kobject *parent = kobj->parent;
> > >         struct kobj_type *t = get_ktype(kobj);
> > >         const char *name = kobj->name;
> > >
> > > @@ -680,6 +681,9 @@ static void kobject_cleanup(struct kobject *kobj)
> > >                 kobject_uevent(kobj, KOBJ_REMOVE);
> > >         }
> > >
> > > +       /* make sure the parent is not released before the (last) child */
> > > +       kobject_get(parent)
> > > +
> > >         /* remove from sysfs if the caller did not do it */
> > >         if (kobj->state_in_sysfs) {
> > >                 pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> > > @@ -693,6 +697,8 @@ static void kobject_cleanup(struct kobject *kobj)
> > >                 t->release(kobj);
> > >         }
> > >
> > > +       kobject_put(parent);
> > > +
> > >         /* free name if we allocated it */
> > >         if (name) {
> > >                 pr_debug("kobject: '%s': free name\n", name);
> >
> > Ah, I think that will work. Sorry, I just thought that was a bit
> > kludgy, but if you are okay with doing it that way, I think it will
> > work.
> >
> > I will try it and send out a patch.
>
> I think I am also going to send out a new test case, in a separate
> patch of course, that exercises this logic. Kind of annoying having
> the printf test catching this.

For anyone interested, I sent out an RFC for this test case here:

https://lore.kernel.org/lkml/20200414210142.191327-1-brendanhiggins@google.com/T/#u

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2020-04-14 21:06 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-05 15:57 BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140 Naresh Kamboju
2020-03-05 19:18 ` Andy Shevchenko
2020-03-05 19:40   ` Brendan Higgins
2020-03-05 19:51     ` Brendan Higgins
2020-03-05 22:33       ` Sakari Ailus
2020-03-06 12:05         ` Heikki Krogerus
2020-03-09 20:35           ` Brendan Higgins
2020-03-09 21:43             ` Brendan Higgins
2020-03-10 11:18               ` Heikki Krogerus
2020-03-10 20:46                 ` Brendan Higgins
2020-03-10 20:46                   ` Brendan Higgins
2020-04-07  9:25                   ` Naresh Kamboju
2020-04-07 20:56                     ` Brendan Higgins
2020-04-13 19:10                       ` Naresh Kamboju
2020-04-14  8:15                       ` Heikki Krogerus
2020-04-14 19:18                         ` Brendan Higgins
2020-04-14 19:27                           ` Brendan Higgins
2020-04-14 21:06                             ` Brendan Higgins
2020-04-14 20:44                           ` Brendan Higgins

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.