linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* linux 5.15-rc4: refcount underflow when unloading gpio-mockup
@ 2021-10-04  9:34 Kent Gibson
  2021-10-04  9:44 ` Greg Kroah-Hartman
  0 siblings, 1 reply; 15+ messages in thread
From: Kent Gibson @ 2021-10-04  9:34 UTC (permalink / raw)
  To: Andy Shevchenko, Heikki Krogerus, Greg Kroah-Hartman,
	Rafael J. Wysocki, linux-acpi, linux-kernel
  Cc: Bartosz Golaszewski

Hi,

I'm seeing a refcount underflow when I unload the gpio-mockup module on
Linux v5.15-rc4 (and going back to v5.15-rc1):

# modprobe gpio-mockup gpio_mockup_ranges=-1,4,-1,10
# rmmod gpio-mockup
------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 0 PID: 103 at lib/refcount.c:28 refcount_warn_saturate+0xd1/0x120
Modules linked in: gpio_mockup(-)
CPU: 0 PID: 103 Comm: rmmod Not tainted 5.15.0-rc4 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
EIP: refcount_warn_saturate+0xd1/0x120
Code: e8 a2 b0 3b 00 0f 0b eb 83 80 3d db 2a 8c c1 00 0f 85 76 ff ff ff c7 04 24 88 85 78 c1 b1 01 88 0d db 2a 8c c1 e8 7d b0 3b 00 <0f> 0b e9 5b ff ff ff 80 3d d9 2a 8c c1 00 0f 85 4e ff ff ff c7 04
EAX: 00000026 EBX: c250b100 ECX: f5fe8c28 EDX: 00000000
ESI: c244860c EDI: c250b100 EBP: c245be84 ESP: c245be80
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000296
CR0: 80050033 CR2: b7e3c3e1 CR3: 024ba000 CR4: 00000690
Call Trace:
 kobject_put+0xdc/0xf0
 software_node_notify_remove+0xa8/0xc0
 device_del+0x15a/0x3e0
 ? kfree_const+0xf/0x30
 ? kobject_put+0xa6/0xf0
 ? module_remove_driver+0x73/0xa0
 platform_device_del.part.0+0xf/0x80
 platform_device_unregister+0x19/0x40
 gpio_mockup_unregister_pdevs+0x13/0x1b [gpio_mockup]
 gpio_mockup_exit+0x1c/0x68c [gpio_mockup]
 __ia32_sys_delete_module+0x137/0x1e0
 ? task_work_run+0x61/0x90
 ? exit_to_user_mode_prepare+0x1b5/0x1c0
 __do_fast_syscall_32+0x50/0xc0
 do_fast_syscall_32+0x32/0x70
 do_SYSENTER_32+0x15/0x20
 entry_SYSENTER_32+0x98/0xe7
EIP: 0xb7eda549
Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
EAX: ffffffda EBX: 0045a19c ECX: 00000800 EDX: 0045a160
ESI: fffffffe EDI: 0045a160 EBP: bff19d08 ESP: bff19cc8
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000202
---[ end trace 3d71387f54bc2d06 ]---

I suspect this is related to the recent changes to swnode.c or
platform.c, as gpio-mockup hasn't changed, but haven't had the
chance to debug further.

Cheers,
Kent.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup
  2021-10-04  9:34 linux 5.15-rc4: refcount underflow when unloading gpio-mockup Kent Gibson
@ 2021-10-04  9:44 ` Greg Kroah-Hartman
  2021-10-04  9:58   ` Kent Gibson
  2021-10-04 12:19   ` Kent Gibson
  0 siblings, 2 replies; 15+ messages in thread
From: Greg Kroah-Hartman @ 2021-10-04  9:44 UTC (permalink / raw)
  To: Kent Gibson
  Cc: Andy Shevchenko, Heikki Krogerus, Rafael J. Wysocki, linux-acpi,
	linux-kernel, Bartosz Golaszewski

On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> Hi,
> 
> I'm seeing a refcount underflow when I unload the gpio-mockup module on
> Linux v5.15-rc4 (and going back to v5.15-rc1):
> 
> # modprobe gpio-mockup gpio_mockup_ranges=-1,4,-1,10
> # rmmod gpio-mockup
> ------------[ cut here ]------------
> refcount_t: underflow; use-after-free.
> WARNING: CPU: 0 PID: 103 at lib/refcount.c:28 refcount_warn_saturate+0xd1/0x120
> Modules linked in: gpio_mockup(-)
> CPU: 0 PID: 103 Comm: rmmod Not tainted 5.15.0-rc4 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> EIP: refcount_warn_saturate+0xd1/0x120
> Code: e8 a2 b0 3b 00 0f 0b eb 83 80 3d db 2a 8c c1 00 0f 85 76 ff ff ff c7 04 24 88 85 78 c1 b1 01 88 0d db 2a 8c c1 e8 7d b0 3b 00 <0f> 0b e9 5b ff ff ff 80 3d d9 2a 8c c1 00 0f 85 4e ff ff ff c7 04
> EAX: 00000026 EBX: c250b100 ECX: f5fe8c28 EDX: 00000000
> ESI: c244860c EDI: c250b100 EBP: c245be84 ESP: c245be80
> DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000296
> CR0: 80050033 CR2: b7e3c3e1 CR3: 024ba000 CR4: 00000690
> Call Trace:
>  kobject_put+0xdc/0xf0
>  software_node_notify_remove+0xa8/0xc0
>  device_del+0x15a/0x3e0
>  ? kfree_const+0xf/0x30
>  ? kobject_put+0xa6/0xf0
>  ? module_remove_driver+0x73/0xa0
>  platform_device_del.part.0+0xf/0x80
>  platform_device_unregister+0x19/0x40
>  gpio_mockup_unregister_pdevs+0x13/0x1b [gpio_mockup]
>  gpio_mockup_exit+0x1c/0x68c [gpio_mockup]
>  __ia32_sys_delete_module+0x137/0x1e0
>  ? task_work_run+0x61/0x90
>  ? exit_to_user_mode_prepare+0x1b5/0x1c0
>  __do_fast_syscall_32+0x50/0xc0
>  do_fast_syscall_32+0x32/0x70
>  do_SYSENTER_32+0x15/0x20
>  entry_SYSENTER_32+0x98/0xe7
> EIP: 0xb7eda549
> Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
> EAX: ffffffda EBX: 0045a19c ECX: 00000800 EDX: 0045a160
> ESI: fffffffe EDI: 0045a160 EBP: bff19d08 ESP: bff19cc8
> DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000202
> ---[ end trace 3d71387f54bc2d06 ]---
> 
> I suspect this is related to the recent changes to swnode.c or
> platform.c, as gpio-mockup hasn't changed, but haven't had the
> chance to debug further.

Any chance you can run 'git bisect' for this?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup
  2021-10-04  9:44 ` Greg Kroah-Hartman
@ 2021-10-04  9:58   ` Kent Gibson
  2021-10-04 12:19   ` Kent Gibson
  1 sibling, 0 replies; 15+ messages in thread
From: Kent Gibson @ 2021-10-04  9:58 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Andy Shevchenko, Heikki Krogerus, Rafael J. Wysocki, linux-acpi,
	linux-kernel, Bartosz Golaszewski

On Mon, Oct 04, 2021 at 11:44:17AM +0200, Greg Kroah-Hartman wrote:
> On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> > Hi,
> > 
> > I'm seeing a refcount underflow when I unload the gpio-mockup module on
> > Linux v5.15-rc4 (and going back to v5.15-rc1):
> > 
> > # modprobe gpio-mockup gpio_mockup_ranges=-1,4,-1,10
> > # rmmod gpio-mockup
> > ------------[ cut here ]------------
> > refcount_t: underflow; use-after-free.
> > WARNING: CPU: 0 PID: 103 at lib/refcount.c:28 refcount_warn_saturate+0xd1/0x120
> > Modules linked in: gpio_mockup(-)
> > CPU: 0 PID: 103 Comm: rmmod Not tainted 5.15.0-rc4 #1
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > EIP: refcount_warn_saturate+0xd1/0x120
> > Code: e8 a2 b0 3b 00 0f 0b eb 83 80 3d db 2a 8c c1 00 0f 85 76 ff ff ff c7 04 24 88 85 78 c1 b1 01 88 0d db 2a 8c c1 e8 7d b0 3b 00 <0f> 0b e9 5b ff ff ff 80 3d d9 2a 8c c1 00 0f 85 4e ff ff ff c7 04
> > EAX: 00000026 EBX: c250b100 ECX: f5fe8c28 EDX: 00000000
> > ESI: c244860c EDI: c250b100 EBP: c245be84 ESP: c245be80
> > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000296
> > CR0: 80050033 CR2: b7e3c3e1 CR3: 024ba000 CR4: 00000690
> > Call Trace:
> >  kobject_put+0xdc/0xf0
> >  software_node_notify_remove+0xa8/0xc0
> >  device_del+0x15a/0x3e0
> >  ? kfree_const+0xf/0x30
> >  ? kobject_put+0xa6/0xf0
> >  ? module_remove_driver+0x73/0xa0
> >  platform_device_del.part.0+0xf/0x80
> >  platform_device_unregister+0x19/0x40
> >  gpio_mockup_unregister_pdevs+0x13/0x1b [gpio_mockup]
> >  gpio_mockup_exit+0x1c/0x68c [gpio_mockup]
> >  __ia32_sys_delete_module+0x137/0x1e0
> >  ? task_work_run+0x61/0x90
> >  ? exit_to_user_mode_prepare+0x1b5/0x1c0
> >  __do_fast_syscall_32+0x50/0xc0
> >  do_fast_syscall_32+0x32/0x70
> >  do_SYSENTER_32+0x15/0x20
> >  entry_SYSENTER_32+0x98/0xe7
> > EIP: 0xb7eda549
> > Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
> > EAX: ffffffda EBX: 0045a19c ECX: 00000800 EDX: 0045a160
> > ESI: fffffffe EDI: 0045a160 EBP: bff19d08 ESP: bff19cc8
> > DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000202
> > ---[ end trace 3d71387f54bc2d06 ]---
> > 
> > I suspect this is related to the recent changes to swnode.c or
> > platform.c, as gpio-mockup hasn't changed, but haven't had the
> > chance to debug further.
> 
> Any chance you can run 'git bisect' for this?
> 

Will do - just wanted to get the problem out there as a heads-up and in
case anyone had some immediate insight.

Cheers,
Kent.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup
  2021-10-04  9:44 ` Greg Kroah-Hartman
  2021-10-04  9:58   ` Kent Gibson
@ 2021-10-04 12:19   ` Kent Gibson
  2021-10-04 12:30     ` Heikki Krogerus
  1 sibling, 1 reply; 15+ messages in thread
From: Kent Gibson @ 2021-10-04 12:19 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Andy Shevchenko, Heikki Krogerus, Rafael J. Wysocki, linux-acpi,
	linux-kernel, Bartosz Golaszewski

On Mon, Oct 04, 2021 at 11:44:17AM +0200, Greg Kroah-Hartman wrote:
> On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> > Hi,
> > 
> > I'm seeing a refcount underflow when I unload the gpio-mockup module on
> > Linux v5.15-rc4 (and going back to v5.15-rc1):
> > 
> > # modprobe gpio-mockup gpio_mockup_ranges=-1,4,-1,10
> > # rmmod gpio-mockup
> > ------------[ cut here ]------------
> > refcount_t: underflow; use-after-free.
> > WARNING: CPU: 0 PID: 103 at lib/refcount.c:28 refcount_warn_saturate+0xd1/0x120
> > Modules linked in: gpio_mockup(-)
> > CPU: 0 PID: 103 Comm: rmmod Not tainted 5.15.0-rc4 #1
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > EIP: refcount_warn_saturate+0xd1/0x120
> > Code: e8 a2 b0 3b 00 0f 0b eb 83 80 3d db 2a 8c c1 00 0f 85 76 ff ff ff c7 04 24 88 85 78 c1 b1 01 88 0d db 2a 8c c1 e8 7d b0 3b 00 <0f> 0b e9 5b ff ff ff 80 3d d9 2a 8c c1 00 0f 85 4e ff ff ff c7 04
> > EAX: 00000026 EBX: c250b100 ECX: f5fe8c28 EDX: 00000000
> > ESI: c244860c EDI: c250b100 EBP: c245be84 ESP: c245be80
> > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000296
> > CR0: 80050033 CR2: b7e3c3e1 CR3: 024ba000 CR4: 00000690
> > Call Trace:
> >  kobject_put+0xdc/0xf0
> >  software_node_notify_remove+0xa8/0xc0
> >  device_del+0x15a/0x3e0
> >  ? kfree_const+0xf/0x30
> >  ? kobject_put+0xa6/0xf0
> >  ? module_remove_driver+0x73/0xa0
> >  platform_device_del.part.0+0xf/0x80
> >  platform_device_unregister+0x19/0x40
> >  gpio_mockup_unregister_pdevs+0x13/0x1b [gpio_mockup]
> >  gpio_mockup_exit+0x1c/0x68c [gpio_mockup]
> >  __ia32_sys_delete_module+0x137/0x1e0
> >  ? task_work_run+0x61/0x90
> >  ? exit_to_user_mode_prepare+0x1b5/0x1c0
> >  __do_fast_syscall_32+0x50/0xc0
> >  do_fast_syscall_32+0x32/0x70
> >  do_SYSENTER_32+0x15/0x20
> >  entry_SYSENTER_32+0x98/0xe7
> > EIP: 0xb7eda549
> > Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
> > EAX: ffffffda EBX: 0045a19c ECX: 00000800 EDX: 0045a160
> > ESI: fffffffe EDI: 0045a160 EBP: bff19d08 ESP: bff19cc8
> > DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000202
> > ---[ end trace 3d71387f54bc2d06 ]---
> > 
> > I suspect this is related to the recent changes to swnode.c or
> > platform.c, as gpio-mockup hasn't changed, but haven't had the
> > chance to debug further.
> 
> Any chance you can run 'git bisect' for this?
> 

That results in:

bd1e336aa8535a99f339e2d66a611984262221ce is the first bad commit
commit bd1e336aa8535a99f339e2d66a611984262221ce
Author: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Date:   Tue Aug 17 13:24:49 2021 +0300

    driver core: platform: Remove platform_device_add_properties()

Cheers,
Kent.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup
  2021-10-04 12:19   ` Kent Gibson
@ 2021-10-04 12:30     ` Heikki Krogerus
  2021-10-04 12:47       ` Kent Gibson
  0 siblings, 1 reply; 15+ messages in thread
From: Heikki Krogerus @ 2021-10-04 12:30 UTC (permalink / raw)
  To: Kent Gibson
  Cc: Greg Kroah-Hartman, Andy Shevchenko, Rafael J. Wysocki,
	linux-acpi, linux-kernel, Bartosz Golaszewski

On Mon, Oct 04, 2021 at 08:19:42PM +0800, Kent Gibson wrote:
> On Mon, Oct 04, 2021 at 11:44:17AM +0200, Greg Kroah-Hartman wrote:
> > On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> > > Hi,
> > > 
> > > I'm seeing a refcount underflow when I unload the gpio-mockup module on
> > > Linux v5.15-rc4 (and going back to v5.15-rc1):
> > > 
> > > # modprobe gpio-mockup gpio_mockup_ranges=-1,4,-1,10
> > > # rmmod gpio-mockup
> > > ------------[ cut here ]------------
> > > refcount_t: underflow; use-after-free.
> > > WARNING: CPU: 0 PID: 103 at lib/refcount.c:28 refcount_warn_saturate+0xd1/0x120
> > > Modules linked in: gpio_mockup(-)
> > > CPU: 0 PID: 103 Comm: rmmod Not tainted 5.15.0-rc4 #1
> > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > > EIP: refcount_warn_saturate+0xd1/0x120
> > > Code: e8 a2 b0 3b 00 0f 0b eb 83 80 3d db 2a 8c c1 00 0f 85 76 ff ff ff c7 04 24 88 85 78 c1 b1 01 88 0d db 2a 8c c1 e8 7d b0 3b 00 <0f> 0b e9 5b ff ff ff 80 3d d9 2a 8c c1 00 0f 85 4e ff ff ff c7 04
> > > EAX: 00000026 EBX: c250b100 ECX: f5fe8c28 EDX: 00000000
> > > ESI: c244860c EDI: c250b100 EBP: c245be84 ESP: c245be80
> > > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000296
> > > CR0: 80050033 CR2: b7e3c3e1 CR3: 024ba000 CR4: 00000690
> > > Call Trace:
> > >  kobject_put+0xdc/0xf0
> > >  software_node_notify_remove+0xa8/0xc0
> > >  device_del+0x15a/0x3e0
> > >  ? kfree_const+0xf/0x30
> > >  ? kobject_put+0xa6/0xf0
> > >  ? module_remove_driver+0x73/0xa0
> > >  platform_device_del.part.0+0xf/0x80
> > >  platform_device_unregister+0x19/0x40
> > >  gpio_mockup_unregister_pdevs+0x13/0x1b [gpio_mockup]
> > >  gpio_mockup_exit+0x1c/0x68c [gpio_mockup]
> > >  __ia32_sys_delete_module+0x137/0x1e0
> > >  ? task_work_run+0x61/0x90
> > >  ? exit_to_user_mode_prepare+0x1b5/0x1c0
> > >  __do_fast_syscall_32+0x50/0xc0
> > >  do_fast_syscall_32+0x32/0x70
> > >  do_SYSENTER_32+0x15/0x20
> > >  entry_SYSENTER_32+0x98/0xe7
> > > EIP: 0xb7eda549
> > > Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
> > > EAX: ffffffda EBX: 0045a19c ECX: 00000800 EDX: 0045a160
> > > ESI: fffffffe EDI: 0045a160 EBP: bff19d08 ESP: bff19cc8
> > > DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000202
> > > ---[ end trace 3d71387f54bc2d06 ]---
> > > 
> > > I suspect this is related to the recent changes to swnode.c or
> > > platform.c, as gpio-mockup hasn't changed, but haven't had the
> > > chance to debug further.
> > 
> > Any chance you can run 'git bisect' for this?
> > 
> 
> That results in:
> 
> bd1e336aa8535a99f339e2d66a611984262221ce is the first bad commit
> commit bd1e336aa8535a99f339e2d66a611984262221ce
> Author: Heikki Krogerus <heikki.krogerus@linux.intel.com>
> Date:   Tue Aug 17 13:24:49 2021 +0300
> 
>     driver core: platform: Remove platform_device_add_properties()

Can you test does this patch help:
https://lore.kernel.org/all/20210930121246.22833-3-heikki.krogerus@linux.intel.com/

thanks,

-- 
heikki

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup
  2021-10-04 12:30     ` Heikki Krogerus
@ 2021-10-04 12:47       ` Kent Gibson
  2021-10-04 13:20         ` Heikki Krogerus
  0 siblings, 1 reply; 15+ messages in thread
From: Kent Gibson @ 2021-10-04 12:47 UTC (permalink / raw)
  To: Heikki Krogerus
  Cc: Greg Kroah-Hartman, Andy Shevchenko, Rafael J. Wysocki,
	linux-acpi, linux-kernel, Bartosz Golaszewski

On Mon, Oct 04, 2021 at 03:30:43PM +0300, Heikki Krogerus wrote:
> On Mon, Oct 04, 2021 at 08:19:42PM +0800, Kent Gibson wrote:
> > On Mon, Oct 04, 2021 at 11:44:17AM +0200, Greg Kroah-Hartman wrote:
> > > On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> > > > Hi,
> > > > 
> > > > I'm seeing a refcount underflow when I unload the gpio-mockup module on
> > > > Linux v5.15-rc4 (and going back to v5.15-rc1):
> > > > 
> > > > # modprobe gpio-mockup gpio_mockup_ranges=-1,4,-1,10
> > > > # rmmod gpio-mockup
> > > > ------------[ cut here ]------------
> > > > refcount_t: underflow; use-after-free.
> > > > WARNING: CPU: 0 PID: 103 at lib/refcount.c:28 refcount_warn_saturate+0xd1/0x120
> > > > Modules linked in: gpio_mockup(-)
> > > > CPU: 0 PID: 103 Comm: rmmod Not tainted 5.15.0-rc4 #1
> > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > > > EIP: refcount_warn_saturate+0xd1/0x120
> > > > Code: e8 a2 b0 3b 00 0f 0b eb 83 80 3d db 2a 8c c1 00 0f 85 76 ff ff ff c7 04 24 88 85 78 c1 b1 01 88 0d db 2a 8c c1 e8 7d b0 3b 00 <0f> 0b e9 5b ff ff ff 80 3d d9 2a 8c c1 00 0f 85 4e ff ff ff c7 04
> > > > EAX: 00000026 EBX: c250b100 ECX: f5fe8c28 EDX: 00000000
> > > > ESI: c244860c EDI: c250b100 EBP: c245be84 ESP: c245be80
> > > > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000296
> > > > CR0: 80050033 CR2: b7e3c3e1 CR3: 024ba000 CR4: 00000690
> > > > Call Trace:
> > > >  kobject_put+0xdc/0xf0
> > > >  software_node_notify_remove+0xa8/0xc0
> > > >  device_del+0x15a/0x3e0
> > > >  ? kfree_const+0xf/0x30
> > > >  ? kobject_put+0xa6/0xf0
> > > >  ? module_remove_driver+0x73/0xa0
> > > >  platform_device_del.part.0+0xf/0x80
> > > >  platform_device_unregister+0x19/0x40
> > > >  gpio_mockup_unregister_pdevs+0x13/0x1b [gpio_mockup]
> > > >  gpio_mockup_exit+0x1c/0x68c [gpio_mockup]
> > > >  __ia32_sys_delete_module+0x137/0x1e0
> > > >  ? task_work_run+0x61/0x90
> > > >  ? exit_to_user_mode_prepare+0x1b5/0x1c0
> > > >  __do_fast_syscall_32+0x50/0xc0
> > > >  do_fast_syscall_32+0x32/0x70
> > > >  do_SYSENTER_32+0x15/0x20
> > > >  entry_SYSENTER_32+0x98/0xe7
> > > > EIP: 0xb7eda549
> > > > Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
> > > > EAX: ffffffda EBX: 0045a19c ECX: 00000800 EDX: 0045a160
> > > > ESI: fffffffe EDI: 0045a160 EBP: bff19d08 ESP: bff19cc8
> > > > DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000202
> > > > ---[ end trace 3d71387f54bc2d06 ]---
> > > > 
> > > > I suspect this is related to the recent changes to swnode.c or
> > > > platform.c, as gpio-mockup hasn't changed, but haven't had the
> > > > chance to debug further.
> > > 
> > > Any chance you can run 'git bisect' for this?
> > > 
> > 
> > That results in:
> > 
> > bd1e336aa8535a99f339e2d66a611984262221ce is the first bad commit
> > commit bd1e336aa8535a99f339e2d66a611984262221ce
> > Author: Heikki Krogerus <heikki.krogerus@linux.intel.com>
> > Date:   Tue Aug 17 13:24:49 2021 +0300
> > 
> >     driver core: platform: Remove platform_device_add_properties()
> 
> Can you test does this patch help:
> https://lore.kernel.org/all/20210930121246.22833-3-heikki.krogerus@linux.intel.com/
> 

You sure that is the patch you have in mind? It only removes dead code,
so I don't see how that would help.  And it isn't quite dead either -
drivers/pci/quirks.c is still using device_add_properties(), so it won't
build.

Looking at the offending patch, it effectively replaces a call to
device_add_properties() with one to
device_create_managed_software_node(), and those two functions appear
quite different - at least at first glance.
Is that correct?

Cheers,
Kent.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup
  2021-10-04 12:47       ` Kent Gibson
@ 2021-10-04 13:20         ` Heikki Krogerus
  2021-10-04 14:17           ` Kent Gibson
  0 siblings, 1 reply; 15+ messages in thread
From: Heikki Krogerus @ 2021-10-04 13:20 UTC (permalink / raw)
  To: Kent Gibson
  Cc: Greg Kroah-Hartman, Andy Shevchenko, Rafael J. Wysocki,
	linux-acpi, linux-kernel, Bartosz Golaszewski

On Mon, Oct 04, 2021 at 08:47:01PM +0800, Kent Gibson wrote:
> On Mon, Oct 04, 2021 at 03:30:43PM +0300, Heikki Krogerus wrote:
> > On Mon, Oct 04, 2021 at 08:19:42PM +0800, Kent Gibson wrote:
> > > On Mon, Oct 04, 2021 at 11:44:17AM +0200, Greg Kroah-Hartman wrote:
> > > > On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> > > > > Hi,
> > > > > 
> > > > > I'm seeing a refcount underflow when I unload the gpio-mockup module on
> > > > > Linux v5.15-rc4 (and going back to v5.15-rc1):
> > > > > 
> > > > > # modprobe gpio-mockup gpio_mockup_ranges=-1,4,-1,10
> > > > > # rmmod gpio-mockup
> > > > > ------------[ cut here ]------------
> > > > > refcount_t: underflow; use-after-free.
> > > > > WARNING: CPU: 0 PID: 103 at lib/refcount.c:28 refcount_warn_saturate+0xd1/0x120
> > > > > Modules linked in: gpio_mockup(-)
> > > > > CPU: 0 PID: 103 Comm: rmmod Not tainted 5.15.0-rc4 #1
> > > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > > > > EIP: refcount_warn_saturate+0xd1/0x120
> > > > > Code: e8 a2 b0 3b 00 0f 0b eb 83 80 3d db 2a 8c c1 00 0f 85 76 ff ff ff c7 04 24 88 85 78 c1 b1 01 88 0d db 2a 8c c1 e8 7d b0 3b 00 <0f> 0b e9 5b ff ff ff 80 3d d9 2a 8c c1 00 0f 85 4e ff ff ff c7 04
> > > > > EAX: 00000026 EBX: c250b100 ECX: f5fe8c28 EDX: 00000000
> > > > > ESI: c244860c EDI: c250b100 EBP: c245be84 ESP: c245be80
> > > > > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000296
> > > > > CR0: 80050033 CR2: b7e3c3e1 CR3: 024ba000 CR4: 00000690
> > > > > Call Trace:
> > > > >  kobject_put+0xdc/0xf0
> > > > >  software_node_notify_remove+0xa8/0xc0
> > > > >  device_del+0x15a/0x3e0
> > > > >  ? kfree_const+0xf/0x30
> > > > >  ? kobject_put+0xa6/0xf0
> > > > >  ? module_remove_driver+0x73/0xa0
> > > > >  platform_device_del.part.0+0xf/0x80
> > > > >  platform_device_unregister+0x19/0x40
> > > > >  gpio_mockup_unregister_pdevs+0x13/0x1b [gpio_mockup]
> > > > >  gpio_mockup_exit+0x1c/0x68c [gpio_mockup]
> > > > >  __ia32_sys_delete_module+0x137/0x1e0
> > > > >  ? task_work_run+0x61/0x90
> > > > >  ? exit_to_user_mode_prepare+0x1b5/0x1c0
> > > > >  __do_fast_syscall_32+0x50/0xc0
> > > > >  do_fast_syscall_32+0x32/0x70
> > > > >  do_SYSENTER_32+0x15/0x20
> > > > >  entry_SYSENTER_32+0x98/0xe7
> > > > > EIP: 0xb7eda549
> > > > > Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
> > > > > EAX: ffffffda EBX: 0045a19c ECX: 00000800 EDX: 0045a160
> > > > > ESI: fffffffe EDI: 0045a160 EBP: bff19d08 ESP: bff19cc8
> > > > > DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000202
> > > > > ---[ end trace 3d71387f54bc2d06 ]---
> > > > > 
> > > > > I suspect this is related to the recent changes to swnode.c or
> > > > > platform.c, as gpio-mockup hasn't changed, but haven't had the
> > > > > chance to debug further.
> > > > 
> > > > Any chance you can run 'git bisect' for this?
> > > > 
> > > 
> > > That results in:
> > > 
> > > bd1e336aa8535a99f339e2d66a611984262221ce is the first bad commit
> > > commit bd1e336aa8535a99f339e2d66a611984262221ce
> > > Author: Heikki Krogerus <heikki.krogerus@linux.intel.com>
> > > Date:   Tue Aug 17 13:24:49 2021 +0300
> > > 
> > >     driver core: platform: Remove platform_device_add_properties()
> > 
> > Can you test does this patch help:
> > https://lore.kernel.org/all/20210930121246.22833-3-heikki.krogerus@linux.intel.com/
> > 
> 
> You sure that is the patch you have in mind? It only removes dead code,
> so I don't see how that would help.  And it isn't quite dead either -
> drivers/pci/quirks.c is still using device_add_properties(), so it won't
> build.

Right, so can you test with the whole series that patch is part of?

> Looking at the offending patch, it effectively replaces a call to
> device_add_properties() with one to
> device_create_managed_software_node(), and those two functions appear
> quite different - at least at first glance.
> Is that correct?

The only real difference between the two functions is that
device_create_managed_software_node() marks the software node it
creates (and it does it exactly the same way as
device_add_properties()) as "managed" with a specific flag.

It means that when the device is removed, so is the software node.
It happens when device_del() calls device_platform_notify_remove(),
which then calls software_node_notify_remove().

The problem is that after doing that step, device_del() then calls
device_remove_properties() unconditionally which also attempts to
remove the software node. So you end up doing the same thing twice.

So the code in the patch that we're interested, and that I would like
you to test, is this:

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 938cfcd1674eb..152a611a7e9ca 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -3583,7 +3583,6 @@ void device_del(struct device *dev)
        device_pm_remove(dev);
        driver_deferred_probe_del(dev);
        device_platform_notify_remove(dev);
-       device_remove_properties(dev);
        device_links_purge(dev);
 
        if (dev->bus)


thanks,

-- 
heikki

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup
  2021-10-04 13:20         ` Heikki Krogerus
@ 2021-10-04 14:17           ` Kent Gibson
  2021-10-04 15:28             ` Kent Gibson
  0 siblings, 1 reply; 15+ messages in thread
From: Kent Gibson @ 2021-10-04 14:17 UTC (permalink / raw)
  To: Heikki Krogerus
  Cc: Greg Kroah-Hartman, Andy Shevchenko, Rafael J. Wysocki,
	linux-acpi, linux-kernel, Bartosz Golaszewski

On Mon, Oct 04, 2021 at 04:20:55PM +0300, Heikki Krogerus wrote:
> On Mon, Oct 04, 2021 at 08:47:01PM +0800, Kent Gibson wrote:
> > On Mon, Oct 04, 2021 at 03:30:43PM +0300, Heikki Krogerus wrote:
> > > On Mon, Oct 04, 2021 at 08:19:42PM +0800, Kent Gibson wrote:
> > > > On Mon, Oct 04, 2021 at 11:44:17AM +0200, Greg Kroah-Hartman wrote:
> > > > > On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > I'm seeing a refcount underflow when I unload the gpio-mockup module on
> > > > > > Linux v5.15-rc4 (and going back to v5.15-rc1):
> > > > > > 
> > > > > > # modprobe gpio-mockup gpio_mockup_ranges=-1,4,-1,10
> > > > > > # rmmod gpio-mockup
> > > > > > ------------[ cut here ]------------
> > > > > > refcount_t: underflow; use-after-free.
> > > > > > WARNING: CPU: 0 PID: 103 at lib/refcount.c:28 refcount_warn_saturate+0xd1/0x120
> > > > > > Modules linked in: gpio_mockup(-)
> > > > > > CPU: 0 PID: 103 Comm: rmmod Not tainted 5.15.0-rc4 #1
> > > > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > > > > > EIP: refcount_warn_saturate+0xd1/0x120
> > > > > > Code: e8 a2 b0 3b 00 0f 0b eb 83 80 3d db 2a 8c c1 00 0f 85 76 ff ff ff c7 04 24 88 85 78 c1 b1 01 88 0d db 2a 8c c1 e8 7d b0 3b 00 <0f> 0b e9 5b ff ff ff 80 3d d9 2a 8c c1 00 0f 85 4e ff ff ff c7 04
> > > > > > EAX: 00000026 EBX: c250b100 ECX: f5fe8c28 EDX: 00000000
> > > > > > ESI: c244860c EDI: c250b100 EBP: c245be84 ESP: c245be80
> > > > > > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000296
> > > > > > CR0: 80050033 CR2: b7e3c3e1 CR3: 024ba000 CR4: 00000690
> > > > > > Call Trace:
> > > > > >  kobject_put+0xdc/0xf0
> > > > > >  software_node_notify_remove+0xa8/0xc0
> > > > > >  device_del+0x15a/0x3e0
> > > > > >  ? kfree_const+0xf/0x30
> > > > > >  ? kobject_put+0xa6/0xf0
> > > > > >  ? module_remove_driver+0x73/0xa0
> > > > > >  platform_device_del.part.0+0xf/0x80
> > > > > >  platform_device_unregister+0x19/0x40
> > > > > >  gpio_mockup_unregister_pdevs+0x13/0x1b [gpio_mockup]
> > > > > >  gpio_mockup_exit+0x1c/0x68c [gpio_mockup]
> > > > > >  __ia32_sys_delete_module+0x137/0x1e0
> > > > > >  ? task_work_run+0x61/0x90
> > > > > >  ? exit_to_user_mode_prepare+0x1b5/0x1c0
> > > > > >  __do_fast_syscall_32+0x50/0xc0
> > > > > >  do_fast_syscall_32+0x32/0x70
> > > > > >  do_SYSENTER_32+0x15/0x20
> > > > > >  entry_SYSENTER_32+0x98/0xe7
> > > > > > EIP: 0xb7eda549
> > > > > > Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
> > > > > > EAX: ffffffda EBX: 0045a19c ECX: 00000800 EDX: 0045a160
> > > > > > ESI: fffffffe EDI: 0045a160 EBP: bff19d08 ESP: bff19cc8
> > > > > > DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000202
> > > > > > ---[ end trace 3d71387f54bc2d06 ]---
> > > > > > 
> > > > > > I suspect this is related to the recent changes to swnode.c or
> > > > > > platform.c, as gpio-mockup hasn't changed, but haven't had the
> > > > > > chance to debug further.
> > > > > 
> > > > > Any chance you can run 'git bisect' for this?
> > > > > 
> > > > 
> > > > That results in:
> > > > 
> > > > bd1e336aa8535a99f339e2d66a611984262221ce is the first bad commit
> > > > commit bd1e336aa8535a99f339e2d66a611984262221ce
> > > > Author: Heikki Krogerus <heikki.krogerus@linux.intel.com>
> > > > Date:   Tue Aug 17 13:24:49 2021 +0300
> > > > 
> > > >     driver core: platform: Remove platform_device_add_properties()
> > > 
> > > Can you test does this patch help:
> > > https://lore.kernel.org/all/20210930121246.22833-3-heikki.krogerus@linux.intel.com/
> > > 
> > 
> > You sure that is the patch you have in mind? It only removes dead code,
> > so I don't see how that would help.  And it isn't quite dead either -
> > drivers/pci/quirks.c is still using device_add_properties(), so it won't
> > build.
> 
> Right, so can you test with the whole series that patch is part of?
> 

Well, you could've said that to start with ;-).

> > Looking at the offending patch, it effectively replaces a call to
> > device_add_properties() with one to
> > device_create_managed_software_node(), and those two functions appear
> > quite different - at least at first glance.
> > Is that correct?
> 
> The only real difference between the two functions is that
> device_create_managed_software_node() marks the software node it
> creates (and it does it exactly the same way as
> device_add_properties()) as "managed" with a specific flag.
> 

Yeah, my bad - not sure what function I was looking at but it wasn't
device_create_managed_software_node().

> It means that when the device is removed, so is the software node.
> It happens when device_del() calls device_platform_notify_remove(),
> which then calls software_node_notify_remove().
> 
> The problem is that after doing that step, device_del() then calls
> device_remove_properties() unconditionally which also attempts to
> remove the software node. So you end up doing the same thing twice.
> 
> So the code in the patch that we're interested, and that I would like
> you to test, is this:
> 
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 938cfcd1674eb..152a611a7e9ca 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -3583,7 +3583,6 @@ void device_del(struct device *dev)
>         device_pm_remove(dev);
>         driver_deferred_probe_del(dev);
>         device_platform_notify_remove(dev);
> -       device_remove_properties(dev);
>         device_links_purge(dev);
>  
>         if (dev->bus)
> 

Makes sense.
Good news is the patch built.
Bad news is I still get the same result:

------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 0 PID: 98 at lib/refcount.c:28 refcount_warn_saturate+0xf4/0x150
Modules linked in: gpio_mockup(-)
CPU: 0 PID: 98 Comm: rmmod Not tainted 5.15.0-rc4 #1
Hardware name: linux,dummy-virt (DT)
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : refcount_warn_saturate+0xf4/0x150
lr : refcount_warn_saturate+0xf4/0x150
sp : ffffffc010be3c50
x29: ffffffc010be3c50 x28: ffffff8001fa5780 x27: 0000000000000000
x26: 0000000000000000 x25: 0000000000000000 x24: ffffffc010a14518
x23: 0000000000000000 x22: ffffffc010a13798 x21: ffffffc010a7a480
x20: ffffff8002126c10 x19: ffffff8002146600 x18: fffffffffffe25f8
x17: 666f733d4d455453 x16: 5953425553003065 x15: ffffffc01098ac80
x14: fffffffffffc25f7 x13: 2e656572662d7265 x12: ffffffc01098acd0
x11: 0000000000000093 x10: 6e75203a745f746e x9 : 00000000ffffefff
x8 : ffffffc0109e2cd0 x7 : 0000000000017fe8 x6 : 0000000000000001
x5 : ffffff8007f8fa68 x4 : 0000000000000000 x3 : 0000000000000027
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffff8001fa5780
Call trace:
 refcount_warn_saturate+0xf4/0x150
 kobject_put+0xf4/0x110
 software_node_notify_remove+0xc4/0xe0
 device_del+0x18c/0x430
 platform_device_del.part.0+0x1c/0x90
 platform_device_unregister+0x28/0x50
 gpio_mockup_unregister_pdevs+0x28/0x50 [gpio_mockup]
 gpio_mockup_exit+0x28/0x3c0 [gpio_mockup]
 __arm64_sys_delete_module+0x180/0x200
 invoke_syscall+0x54/0x130
 el0_svc_common.constprop.0+0x44/0xf0
 do_el0_svc+0x40/0xa0
 el0_svc+0x20/0x60
 el0t_64_sync_handler+0x1a4/0x1b0
 el0t_64_sync+0x1a0/0x1a4
---[ end trace 44421b7a22d450dd ]---


Cheers,
Kent.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup
  2021-10-04 14:17           ` Kent Gibson
@ 2021-10-04 15:28             ` Kent Gibson
  2021-10-04 19:56               ` Andy Shevchenko
  0 siblings, 1 reply; 15+ messages in thread
From: Kent Gibson @ 2021-10-04 15:28 UTC (permalink / raw)
  To: Heikki Krogerus
  Cc: Greg Kroah-Hartman, Andy Shevchenko, Rafael J. Wysocki,
	linux-acpi, linux-kernel, Bartosz Golaszewski

On Mon, Oct 04, 2021 at 10:17:54PM +0800, Kent Gibson wrote:
> On Mon, Oct 04, 2021 at 04:20:55PM +0300, Heikki Krogerus wrote:
> > On Mon, Oct 04, 2021 at 08:47:01PM +0800, Kent Gibson wrote:
> > > On Mon, Oct 04, 2021 at 03:30:43PM +0300, Heikki Krogerus wrote:
> > > > On Mon, Oct 04, 2021 at 08:19:42PM +0800, Kent Gibson wrote:
> > > > > On Mon, Oct 04, 2021 at 11:44:17AM +0200, Greg Kroah-Hartman wrote:
> > > > > > On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> > > > > > > Hi,
> > > > > > > 

[snip]

> > > Looking at the offending patch, it effectively replaces a call to
> > > device_add_properties() with one to
> > > device_create_managed_software_node(), and those two functions appear
> > > quite different - at least at first glance.
> > > Is that correct?
> > 
> > The only real difference between the two functions is that
> > device_create_managed_software_node() marks the software node it
> > creates (and it does it exactly the same way as
> > device_add_properties()) as "managed" with a specific flag.
> > 
> 

That managed flag makes all the difference.

I've tried to find a fix along the same lines as Laurentiu Tudor's
5aeb05b27f8 software node: balance refcount for managed software nodes
but haven't found anything that works.

What does work for me is to revert the call to
device_create_managed_software_node() to a call to
device_add_properties():

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 652531f67135..2f30bdb94fab 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -826,8 +826,7 @@ struct platform_device *platform_device_register_full(
                goto err;

        if (pdevinfo->properties) {
-               ret = device_create_managed_software_node(&pdev->dev,
-                                                         pdevinfo->properties, NULL);
+               ret = device_add_properties(&pdev->dev, pdevinfo->properties);
                if (ret)
                        goto err;
        }

That obviously wont work with your latest series that removes
device_add_properties(), but that is the simplest, and only, solution
that I've found so far.

Cheers,
Kent.

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup
  2021-10-04 15:28             ` Kent Gibson
@ 2021-10-04 19:56               ` Andy Shevchenko
  2021-10-05  0:40                 ` Kent Gibson
  0 siblings, 1 reply; 15+ messages in thread
From: Andy Shevchenko @ 2021-10-04 19:56 UTC (permalink / raw)
  To: Kent Gibson
  Cc: Heikki Krogerus, Greg Kroah-Hartman, Andy Shevchenko,
	Rafael J. Wysocki, ACPI Devel Maling List,
	Linux Kernel Mailing List, Bartosz Golaszewski

On Mon, Oct 4, 2021 at 8:51 PM Kent Gibson <warthog618@gmail.com> wrote:
>
> On Mon, Oct 04, 2021 at 10:17:54PM +0800, Kent Gibson wrote:
> > On Mon, Oct 04, 2021 at 04:20:55PM +0300, Heikki Krogerus wrote:
> > > On Mon, Oct 04, 2021 at 08:47:01PM +0800, Kent Gibson wrote:
> > > > On Mon, Oct 04, 2021 at 03:30:43PM +0300, Heikki Krogerus wrote:
> > > > > On Mon, Oct 04, 2021 at 08:19:42PM +0800, Kent Gibson wrote:
> > > > > > On Mon, Oct 04, 2021 at 11:44:17AM +0200, Greg Kroah-Hartman wrote:
> > > > > > > On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> > > > > > > > Hi,
> > > > > > > >
>
> [snip]
>
> > > > Looking at the offending patch, it effectively replaces a call to
> > > > device_add_properties() with one to
> > > > device_create_managed_software_node(), and those two functions appear
> > > > quite different - at least at first glance.
> > > > Is that correct?
> > >
> > > The only real difference between the two functions is that
> > > device_create_managed_software_node() marks the software node it
> > > creates (and it does it exactly the same way as
> > > device_add_properties()) as "managed" with a specific flag.
> > >
> >
>
> That managed flag makes all the difference.
>
> I've tried to find a fix along the same lines as Laurentiu Tudor's
> 5aeb05b27f8 software node: balance refcount for managed software nodes
> but haven't found anything that works.
>
> What does work for me is to revert the call to
> device_create_managed_software_node() to a call to
> device_add_properties():
>
> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> index 652531f67135..2f30bdb94fab 100644
> --- a/drivers/base/platform.c
> +++ b/drivers/base/platform.c
> @@ -826,8 +826,7 @@ struct platform_device *platform_device_register_full(
>                 goto err;
>
>         if (pdevinfo->properties) {
> -               ret = device_create_managed_software_node(&pdev->dev,
> -                                                         pdevinfo->properties, NULL);
> +               ret = device_add_properties(&pdev->dev, pdevinfo->properties);
>                 if (ret)
>                         goto err;
>         }
>
> That obviously wont work with your latest series that removes
> device_add_properties(), but that is the simplest, and only, solution
> that I've found so far.

Here is debug prints of what happens

# modprobe gpio-mockup gpio_mockup_ranges=-1,10
[  212.850993]  (null): device_create_managed_software_node <<< 0
[  212.858177] platform gpio-mockup.0: software_node_notify 0 <<<
[  212.865264] platform gpio-mockup.0: software_node_notify 1 <<< 1
[  212.874159] gpio-mockup gpio-mockup.0: no of_node; not parsing pinctrl DT
[  212.882962] gpiochip_find_base: found new base at 840
[  212.889873] gpio gpiochip3: software_node_notify 0 <<<
[  212.896164] gpio gpiochip3: software_node_notify 1 <<< 1
[  212.905099] gpio gpiochip3: (gpio-mockup-A): added GPIO chardev (254:3)
[  212.913313] gpio gpiochip840: software_node_notify 0 <<<
[  212.920676] gpio gpiochip3: registered GPIOs 840 to 849 on gpio-mockup-A
[  212.935601] modprobe (185) used greatest stack depth: 12744 bytes left

As I read it the software node is created for gpio-mockup device and
then _shared_ with the GPIO device, since it's managed it's gone when
GPIO device gone followed by UAF when mockup (platform) device itself
gone. I.o.w. this software node mustn't be managed because it covers
the lifetime of two different objects. The correct fix is to create
manually software node and assign it to the pdev and manually free in
gpio_mockup_unregister_pdevs()/

Tl;DR: it's a bug in gpio-mockup.


-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup
  2021-10-04 19:56               ` Andy Shevchenko
@ 2021-10-05  0:40                 ` Kent Gibson
  2021-10-05  8:21                   ` Andy Shevchenko
  0 siblings, 1 reply; 15+ messages in thread
From: Kent Gibson @ 2021-10-05  0:40 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Heikki Krogerus, Greg Kroah-Hartman, Andy Shevchenko,
	Rafael J. Wysocki, ACPI Devel Maling List,
	Linux Kernel Mailing List, Bartosz Golaszewski

On Mon, Oct 04, 2021 at 10:56:00PM +0300, Andy Shevchenko wrote:
> On Mon, Oct 4, 2021 at 8:51 PM Kent Gibson <warthog618@gmail.com> wrote:
> >
> > On Mon, Oct 04, 2021 at 10:17:54PM +0800, Kent Gibson wrote:
> > > On Mon, Oct 04, 2021 at 04:20:55PM +0300, Heikki Krogerus wrote:
> > > > On Mon, Oct 04, 2021 at 08:47:01PM +0800, Kent Gibson wrote:
> > > > > On Mon, Oct 04, 2021 at 03:30:43PM +0300, Heikki Krogerus wrote:
> > > > > > On Mon, Oct 04, 2021 at 08:19:42PM +0800, Kent Gibson wrote:
> > > > > > > On Mon, Oct 04, 2021 at 11:44:17AM +0200, Greg Kroah-Hartman wrote:
> > > > > > > > On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> > > > > > > > > Hi,
> > > > > > > > >
> >
> > [snip]
> >
> > > > > Looking at the offending patch, it effectively replaces a call to
> > > > > device_add_properties() with one to
> > > > > device_create_managed_software_node(), and those two functions appear
> > > > > quite different - at least at first glance.
> > > > > Is that correct?
> > > >
> > > > The only real difference between the two functions is that
> > > > device_create_managed_software_node() marks the software node it
> > > > creates (and it does it exactly the same way as
> > > > device_add_properties()) as "managed" with a specific flag.
> > > >
> > >
> >
> > That managed flag makes all the difference.
> >
> > I've tried to find a fix along the same lines as Laurentiu Tudor's
> > 5aeb05b27f8 software node: balance refcount for managed software nodes
> > but haven't found anything that works.
> >
> > What does work for me is to revert the call to
> > device_create_managed_software_node() to a call to
> > device_add_properties():
> >
> > diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> > index 652531f67135..2f30bdb94fab 100644
> > --- a/drivers/base/platform.c
> > +++ b/drivers/base/platform.c
> > @@ -826,8 +826,7 @@ struct platform_device *platform_device_register_full(
> >                 goto err;
> >
> >         if (pdevinfo->properties) {
> > -               ret = device_create_managed_software_node(&pdev->dev,
> > -                                                         pdevinfo->properties, NULL);
> > +               ret = device_add_properties(&pdev->dev, pdevinfo->properties);
> >                 if (ret)
> >                         goto err;
> >         }
> >
> > That obviously wont work with your latest series that removes
> > device_add_properties(), but that is the simplest, and only, solution
> > that I've found so far.
> 
> Here is debug prints of what happens
> 
> # modprobe gpio-mockup gpio_mockup_ranges=-1,10
> [  212.850993]  (null): device_create_managed_software_node <<< 0
> [  212.858177] platform gpio-mockup.0: software_node_notify 0 <<<
> [  212.865264] platform gpio-mockup.0: software_node_notify 1 <<< 1
> [  212.874159] gpio-mockup gpio-mockup.0: no of_node; not parsing pinctrl DT
> [  212.882962] gpiochip_find_base: found new base at 840
> [  212.889873] gpio gpiochip3: software_node_notify 0 <<<
> [  212.896164] gpio gpiochip3: software_node_notify 1 <<< 1
> [  212.905099] gpio gpiochip3: (gpio-mockup-A): added GPIO chardev (254:3)
> [  212.913313] gpio gpiochip840: software_node_notify 0 <<<
> [  212.920676] gpio gpiochip3: registered GPIOs 840 to 849 on gpio-mockup-A
> [  212.935601] modprobe (185) used greatest stack depth: 12744 bytes left
> 
> As I read it the software node is created for gpio-mockup device and
> then _shared_ with the GPIO device, since it's managed it's gone when
> GPIO device gone followed by UAF when mockup (platform) device itself
> gone. I.o.w. this software node mustn't be managed because it covers
> the lifetime of two different objects. The correct fix is to create
> manually software node and assign it to the pdev and manually free in
> gpio_mockup_unregister_pdevs()/
> 
> Tl;DR: it's a bug in gpio-mockup.
> 

So the bug is in the behaviour of gpio_mockup_register_chip()?
That is out of my court, so I'll leave it to you and Bart to sort out.

Cheers,
Kent.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup
  2021-10-05  0:40                 ` Kent Gibson
@ 2021-10-05  8:21                   ` Andy Shevchenko
  2021-10-05  9:50                     ` Heikki Krogerus
  0 siblings, 1 reply; 15+ messages in thread
From: Andy Shevchenko @ 2021-10-05  8:21 UTC (permalink / raw)
  To: Kent Gibson
  Cc: Heikki Krogerus, Greg Kroah-Hartman, Rafael J. Wysocki,
	ACPI Devel Maling List, Linux Kernel Mailing List,
	Bartosz Golaszewski

On Tue, Oct 05, 2021 at 08:40:35AM +0800, Kent Gibson wrote:
> On Mon, Oct 04, 2021 at 10:56:00PM +0300, Andy Shevchenko wrote:
> > On Mon, Oct 4, 2021 at 8:51 PM Kent Gibson <warthog618@gmail.com> wrote:

...

> > Here is debug prints of what happens
> > 
> > # modprobe gpio-mockup gpio_mockup_ranges=-1,10
> > [  212.850993]  (null): device_create_managed_software_node <<< 0
> > [  212.858177] platform gpio-mockup.0: software_node_notify 0 <<<
> > [  212.865264] platform gpio-mockup.0: software_node_notify 1 <<< 1
> > [  212.874159] gpio-mockup gpio-mockup.0: no of_node; not parsing pinctrl DT
> > [  212.882962] gpiochip_find_base: found new base at 840
> > [  212.889873] gpio gpiochip3: software_node_notify 0 <<<
> > [  212.896164] gpio gpiochip3: software_node_notify 1 <<< 1
> > [  212.905099] gpio gpiochip3: (gpio-mockup-A): added GPIO chardev (254:3)
> > [  212.913313] gpio gpiochip840: software_node_notify 0 <<<
> > [  212.920676] gpio gpiochip3: registered GPIOs 840 to 849 on gpio-mockup-A
> > [  212.935601] modprobe (185) used greatest stack depth: 12744 bytes left
> > 
> > As I read it the software node is created for gpio-mockup device and
> > then _shared_ with the GPIO device, since it's managed it's gone when
> > GPIO device gone followed by UAF when mockup (platform) device itself
> > gone. I.o.w. this software node mustn't be managed because it covers
> > the lifetime of two different objects. The correct fix is to create
> > manually software node and assign it to the pdev and manually free in
> > gpio_mockup_unregister_pdevs()/
> > 
> > Tl;DR: it's a bug in gpio-mockup.
> 
> So the bug is in the behaviour of gpio_mockup_register_chip()?

Not really. The utter root cause is the former API (device_add_properties()
et al) which Heikki is getting rid of in particular because of this issue,
i.e. when users blindly call it without fully understanding the picture of
the object lifetimes.

> That is out of my court, so I'll leave it to you and Bart to sort out.

I'll see what I can do about.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup
  2021-10-05  8:21                   ` Andy Shevchenko
@ 2021-10-05  9:50                     ` Heikki Krogerus
  2021-10-05  9:53                       ` Andy Shevchenko
  0 siblings, 1 reply; 15+ messages in thread
From: Heikki Krogerus @ 2021-10-05  9:50 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Kent Gibson, Greg Kroah-Hartman, Rafael J. Wysocki,
	ACPI Devel Maling List, Linux Kernel Mailing List,
	Bartosz Golaszewski

[-- Attachment #1: Type: text/plain, Size: 2232 bytes --]

On Tue, Oct 05, 2021 at 11:21:53AM +0300, Andy Shevchenko wrote:
> On Tue, Oct 05, 2021 at 08:40:35AM +0800, Kent Gibson wrote:
> > On Mon, Oct 04, 2021 at 10:56:00PM +0300, Andy Shevchenko wrote:
> > > On Mon, Oct 4, 2021 at 8:51 PM Kent Gibson <warthog618@gmail.com> wrote:
> 
> ...
> 
> > > Here is debug prints of what happens
> > > 
> > > # modprobe gpio-mockup gpio_mockup_ranges=-1,10
> > > [  212.850993]  (null): device_create_managed_software_node <<< 0
> > > [  212.858177] platform gpio-mockup.0: software_node_notify 0 <<<
> > > [  212.865264] platform gpio-mockup.0: software_node_notify 1 <<< 1
> > > [  212.874159] gpio-mockup gpio-mockup.0: no of_node; not parsing pinctrl DT
> > > [  212.882962] gpiochip_find_base: found new base at 840
> > > [  212.889873] gpio gpiochip3: software_node_notify 0 <<<
> > > [  212.896164] gpio gpiochip3: software_node_notify 1 <<< 1
> > > [  212.905099] gpio gpiochip3: (gpio-mockup-A): added GPIO chardev (254:3)
> > > [  212.913313] gpio gpiochip840: software_node_notify 0 <<<
> > > [  212.920676] gpio gpiochip3: registered GPIOs 840 to 849 on gpio-mockup-A
> > > [  212.935601] modprobe (185) used greatest stack depth: 12744 bytes left
> > > 
> > > As I read it the software node is created for gpio-mockup device and
> > > then _shared_ with the GPIO device, since it's managed it's gone when
> > > GPIO device gone followed by UAF when mockup (platform) device itself
> > > gone. I.o.w. this software node mustn't be managed because it covers
> > > the lifetime of two different objects. The correct fix is to create
> > > manually software node and assign it to the pdev and manually free in
> > > gpio_mockup_unregister_pdevs()/
> > > 
> > > Tl;DR: it's a bug in gpio-mockup.
> > 
> > So the bug is in the behaviour of gpio_mockup_register_chip()?
> 
> Not really. The utter root cause is the former API (device_add_properties()
> et al) which Heikki is getting rid of in particular because of this issue,
> i.e. when users blindly call it without fully understanding the picture of
> the object lifetimes.
> 
> > That is out of my court, so I'll leave it to you and Bart to sort out.
> 
> I'll see what I can do about.

So, something like this (attached)?

-- 
heikki

[-- Attachment #2: gpio-mockup.diff --]
[-- Type: text/plain, Size: 2459 bytes --]

diff --git a/drivers/gpio/gpio-mockup.c b/drivers/gpio/gpio-mockup.c
index 0a9d746a0fe0a..469f3cbfd6b05 100644
--- a/drivers/gpio/gpio-mockup.c
+++ b/drivers/gpio/gpio-mockup.c
@@ -476,10 +476,22 @@ static struct platform_device *gpio_mockup_pdevs[GPIO_MOCKUP_MAX_GC];
 
 static void gpio_mockup_unregister_pdevs(void)
 {
+	struct software_node *swnode;
 	int i;
 
-	for (i = 0; i < GPIO_MOCKUP_MAX_GC; i++)
+	for (i = 0; i < GPIO_MOCKUP_MAX_GC; i++) {
+		swnode = to_software_node(gpio_mockup_pdevs[i].dev.fwnode);
+
+		/*
+		 * Note. The software node must be unregistered before the
+		 * device to prevent the device_remove_properties() call in
+		 * device_del() from causing ref count underflow.
+		 */
+		software_node_unregister(swnode);
 		platform_device_unregister(gpio_mockup_pdevs[i]);
+		property_entries_free(swnode->properties);
+		kfree(swnode);
+	}
 }
 
 static __init char **gpio_mockup_make_line_names(const char *label,
@@ -508,9 +520,11 @@ static int __init gpio_mockup_register_chip(int idx)
 	struct property_entry properties[GPIO_MOCKUP_MAX_PROP];
 	struct platform_device_info pdevinfo;
 	struct platform_device *pdev;
+	struct software_node *swnode;
 	char **line_names = NULL;
 	char chip_label[32];
 	int prop = 0, base;
+	int ret = -ENOMEM;
 	u16 ngpio;
 
 	memset(properties, 0, sizeof(properties));
@@ -536,20 +550,47 @@ static int __init gpio_mockup_register_chip(int idx)
 					"gpio-line-names", line_names, ngpio);
 	}
 
+	swnode = kzalloc(sizeof(*swnode), GFP_KERNEL);
+	if (!swnode)
+		goto err_free_line_names;
+
+	swnode->properties = property_entry_dup(properties);
+	if (!swnode->properties)
+		goto err_free_swnode;
+
+	ret = software_node_register(swnode);
+	if (ret)
+		goto err_free_properties;
+
 	pdevinfo.name = "gpio-mockup";
 	pdevinfo.id = idx;
-	pdevinfo.properties = properties;
+	pdevinfo.fwnode = software_node_fwnode(swnode);
 
 	pdev = platform_device_register_full(&pdevinfo);
 	kfree_strarray(line_names, ngpio);
 	if (IS_ERR(pdev)) {
 		pr_err("error registering device");
-		return PTR_ERR(pdev);
+		ret = PTR_ERR(pdev);
+		goto err_unregister_swnode;
 	}
 
 	gpio_mockup_pdevs[idx] = pdev;
 
 	return 0;
+
+err_unregister_swnode:
+	software_node_unregister(swnode);
+
+err_free_properties:
+	property_entries_free(swnode->properties);
+
+err_free_swnode:
+	kfree(swnode);
+
+err_free_line_names:
+	kfree_strarray(line_names, ngpio);
+
+	return ret;
 }
 
 static int __init gpio_mockup_init(void)

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup
  2021-10-05  9:50                     ` Heikki Krogerus
@ 2021-10-05  9:53                       ` Andy Shevchenko
  2021-10-05  9:57                         ` Heikki Krogerus
  0 siblings, 1 reply; 15+ messages in thread
From: Andy Shevchenko @ 2021-10-05  9:53 UTC (permalink / raw)
  To: Heikki Krogerus
  Cc: Kent Gibson, Greg Kroah-Hartman, Rafael J. Wysocki,
	ACPI Devel Maling List, Linux Kernel Mailing List,
	Bartosz Golaszewski

On Tue, Oct 5, 2021 at 12:50 PM Heikki Krogerus
<heikki.krogerus@linux.intel.com> wrote:
>
> On Tue, Oct 05, 2021 at 11:21:53AM +0300, Andy Shevchenko wrote:
> > On Tue, Oct 05, 2021 at 08:40:35AM +0800, Kent Gibson wrote:

...

> > I'll see what I can do about.
>
> So, something like this (attached)?

Actually I have sent another solution:
https://lore.kernel.org/linux-gpio/20211005093731.62743-1-andriy.shevchenko@linux.intel.com/T/#u

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup
  2021-10-05  9:53                       ` Andy Shevchenko
@ 2021-10-05  9:57                         ` Heikki Krogerus
  0 siblings, 0 replies; 15+ messages in thread
From: Heikki Krogerus @ 2021-10-05  9:57 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Kent Gibson, Greg Kroah-Hartman, Rafael J. Wysocki,
	ACPI Devel Maling List, Linux Kernel Mailing List,
	Bartosz Golaszewski

On Tue, Oct 05, 2021 at 12:53:04PM +0300, Andy Shevchenko wrote:
> On Tue, Oct 5, 2021 at 12:50 PM Heikki Krogerus
> <heikki.krogerus@linux.intel.com> wrote:
> >
> > On Tue, Oct 05, 2021 at 11:21:53AM +0300, Andy Shevchenko wrote:
> > > On Tue, Oct 05, 2021 at 08:40:35AM +0800, Kent Gibson wrote:
> 
> ...
> 
> > > I'll see what I can do about.
> >
> > So, something like this (attached)?
> 
> Actually I have sent another solution:
> https://lore.kernel.org/linux-gpio/20211005093731.62743-1-andriy.shevchenko@linux.intel.com/T/#u

Ah, OK. Thanks Andy.

-- 
heikki

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-10-05  9:57 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-04  9:34 linux 5.15-rc4: refcount underflow when unloading gpio-mockup Kent Gibson
2021-10-04  9:44 ` Greg Kroah-Hartman
2021-10-04  9:58   ` Kent Gibson
2021-10-04 12:19   ` Kent Gibson
2021-10-04 12:30     ` Heikki Krogerus
2021-10-04 12:47       ` Kent Gibson
2021-10-04 13:20         ` Heikki Krogerus
2021-10-04 14:17           ` Kent Gibson
2021-10-04 15:28             ` Kent Gibson
2021-10-04 19:56               ` Andy Shevchenko
2021-10-05  0:40                 ` Kent Gibson
2021-10-05  8:21                   ` Andy Shevchenko
2021-10-05  9:50                     ` Heikki Krogerus
2021-10-05  9:53                       ` Andy Shevchenko
2021-10-05  9:57                         ` Heikki Krogerus

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).