netdev.vger.kernel.org archive mirror
* WARNING: ODEBUG bug in netdev_freemem (2)
@ 2019-06-24  8:53 syzbot
  2019-06-24  9:33 ` Thomas Gleixner
  2019-08-08  0:25 ` syzbot
  0 siblings, 2 replies; 9+ messages in thread
From: syzbot @ 2019-06-24  8:53 UTC (permalink / raw)
  To: alexander.h.duyck, amritha.nambiar, andriy.shevchenko, davem,
	dmitry.torokhov, f.fainelli, gregkh, idosch, linux-kernel,
	netdev, syzkaller-bugs, tglx, tyhicks, wanghai26, yuehaibing

Hello,

syzbot found the following crash on:

HEAD commit:    fd6b99fa Merge branch 'akpm' (patches from Andrew)
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=144de256a00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=fa9f7e1b6a8bb586
dashboard link: https://syzkaller.appspot.com/bug?extid=c4521ac872a4ccc3afec
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+c4521ac872a4ccc3afec@syzkaller.appspotmail.com

device hsr_slave_0 left promiscuous mode
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed
bond0 (unregistering): Releasing backup interface bond_slave_1
bond0 (unregistering): Releasing backup interface bond_slave_0
bond0 (unregistering): Released all slaves
------------[ cut here ]------------
ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x90 arch/x86/include/asm/paravirt.h:767
WARNING: CPU: 1 PID: 25149 at lib/debugobjects.c:325 debug_print_object+0x168/0x250 lib/debugobjects.c:325
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 25149 Comm: kworker/u4:1 Not tainted 5.2.0-rc4+ #31
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
  panic+0x2cb/0x744 kernel/panic.c:219
  __warn.cold+0x20/0x4d kernel/panic.c:576
  report_bug+0x263/0x2b0 lib/bug.c:186
  fixup_bug arch/x86/kernel/traps.c:179 [inline]
  fixup_bug arch/x86/kernel/traps.c:174 [inline]
  do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
  do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
  invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:986
RIP: 0010:debug_print_object+0x168/0x250 lib/debugobjects.c:325
Code: dd e0 c9 a4 87 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 b5 00 00 00 48 8b 14 dd e0 c9 a4 87 48 c7 c7 80 bf a4 87 e8 16 75 0d fe <0f> 0b 83 05 4b 46 4b 06 01 48 83 c4 20 5b 41 5c 41 5d 41 5e 5d c3
RSP: 0018:ffff888058c07838 EFLAGS: 00010086
RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff815ac956 RDI: ffffed100b180ef9
RBP: ffff888058c07878 R08: ffff88805692a340 R09: ffffed1015d240f1
R10: ffffed1015d240f0 R11: ffff8880ae920787 R12: 0000000000000001
R13: ffffffff88bad1a0 R14: ffffffff816039d0 R15: ffff88805f992e60
  __debug_check_no_obj_freed lib/debugobjects.c:785 [inline]
  debug_check_no_obj_freed+0x29f/0x464 lib/debugobjects.c:817
  kfree+0xbd/0x220 mm/slab.c:3754
  kvfree+0x61/0x70 mm/util.c:460
  netdev_freemem+0x4c/0x60 net/core/dev.c:9070
  netdev_release+0x86/0xb0 net/core/net-sysfs.c:1635
  device_release+0x7a/0x210 drivers/base/core.c:1064
  kobject_cleanup lib/kobject.c:691 [inline]
  kobject_release lib/kobject.c:720 [inline]
  kref_put include/linux/kref.h:65 [inline]
  kobject_put.cold+0x289/0x2e6 lib/kobject.c:737
  netdev_run_todo+0x53b/0x7c0 net/core/dev.c:8975
  rtnl_unlock+0xe/0x10 net/core/rtnetlink.c:112
  default_device_exit_batch+0x358/0x410 net/core/dev.c:9756
  ops_exit_list.isra.0+0xfc/0x150 net/core/net_namespace.c:157
  cleanup_net+0x3fb/0x960 net/core/net_namespace.c:553
  process_one_work+0x989/0x1790 kernel/workqueue.c:2269
  worker_thread+0x98/0xe40 kernel/workqueue.c:2415
  kthread+0x354/0x420 kernel/kthread.c:255
  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

======================================================


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.


* Re: WARNING: ODEBUG bug in netdev_freemem (2)
  2019-06-24  8:53 WARNING: ODEBUG bug in netdev_freemem (2) syzbot
@ 2019-06-24  9:33 ` Thomas Gleixner
  2019-06-24 10:54   ` Dmitry Vyukov
  2019-08-08  0:25 ` syzbot
  1 sibling, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2019-06-24  9:33 UTC (permalink / raw)
  To: syzbot
  Cc: alexander.h.duyck, amritha.nambiar, andriy.shevchenko, davem,
	dmitry.torokhov, f.fainelli, gregkh, idosch, linux-kernel,
	netdev, syzkaller-bugs, tyhicks, wanghai26, yuehaibing

On Mon, 24 Jun 2019, syzbot wrote:

> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    fd6b99fa Merge branch 'akpm' (patches from Andrew)
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=144de256a00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=fa9f7e1b6a8bb586
> dashboard link: https://syzkaller.appspot.com/bug?extid=c4521ac872a4ccc3afec
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> 
> Unfortunately, I don't have any reproducer for this crash yet.
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+c4521ac872a4ccc3afec@syzkaller.appspotmail.com
> 
> device hsr_slave_0 left promiscuous mode
> team0 (unregistering): Port device team_slave_1 removed
> team0 (unregistering): Port device team_slave_0 removed
> bond0 (unregistering): Releasing backup interface bond_slave_1
> bond0 (unregistering): Releasing backup interface bond_slave_0
> bond0 (unregistering): Released all slaves
> ------------[ cut here ]------------
> ODEBUG: free active (active state 0) object type: timer_list hint:
> delayed_work_timer_fn+0x0/0x90 arch/x86/include/asm/paravirt.h:767

One of the cleaned up devices has left an active timer which belongs to a
delayed work. That's all I can decode out of that splat. :(

Thanks,

	tglx
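
A minimal userspace sketch of the failure mode described above, under stated assumptions: every `m_`-prefixed name is a hypothetical stand-in, not a kernel symbol. The memory holding a delayed work item is freed while its timer is still armed, which is the condition ODEBUG reports as "free active". In the kernel, the usual remedy is to call cancel_delayed_work_sync() before the containing object is released; `m_cancel_sync()` only models that step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for timer_list / delayed_work / driver private data. */
struct m_timer { bool armed; };
struct m_dwork { struct m_timer timer; void (*func)(void); };
struct m_netdev_priv { struct m_dwork poll; };

static int m_warnings;	/* counts modeled "free active" reports */

static void m_arm(struct m_dwork *dw)         { dw->timer.armed = true; }
static void m_cancel_sync(struct m_dwork *dw) { dw->timer.armed = false; }

/* Models the check done at free time: an armed timer inside the block is a bug. */
static void m_free_priv(struct m_netdev_priv *p)
{
	assert(p != NULL);
	if (p->poll.timer.armed)
		m_warnings++;	/* corresponds to "ODEBUG: free active ..." */
	free(p);
}

static void m_poll_fn(void) { }

/* Returns the number of warnings raised by one allocate/arm/free cycle. */
static int m_teardown(bool cancel_first)
{
	struct m_netdev_priv *p = calloc(1, sizeof(*p));
	int before = m_warnings;

	if (!p)
		return -1;
	p->poll.func = m_poll_fn;
	m_arm(&p->poll);
	if (cancel_first)
		m_cancel_sync(&p->poll);	/* the step the buggy driver skips */
	m_free_priv(p);
	return m_warnings - before;
}
```

Calling `m_teardown(false)` models the buggy teardown path and raises one warning; `m_teardown(true)` models the fixed path and raises none.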




* Re: WARNING: ODEBUG bug in netdev_freemem (2)
  2019-06-24  9:33 ` Thomas Gleixner
@ 2019-06-24 10:54   ` Dmitry Vyukov
  2019-06-24 12:08     ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Dmitry Vyukov @ 2019-06-24 10:54 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: syzbot, Alexander Duyck, amritha.nambiar, Andy Shevchenko,
	David Miller, Dmitry Torokhov, Florian Fainelli,
	Greg Kroah-Hartman, Ido Schimmel, LKML, netdev, syzkaller-bugs,
	tyhicks, wanghai26, yuehaibing

On Mon, Jun 24, 2019 at 11:34 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Mon, 24 Jun 2019, syzbot wrote:
>
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit:    fd6b99fa Merge branch 'akpm' (patches from Andrew)
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=144de256a00000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=fa9f7e1b6a8bb586
> > dashboard link: https://syzkaller.appspot.com/bug?extid=c4521ac872a4ccc3afec
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> >
> > Unfortunately, I don't have any reproducer for this crash yet.
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+c4521ac872a4ccc3afec@syzkaller.appspotmail.com
> >
> > device hsr_slave_0 left promiscuous mode
> > team0 (unregistering): Port device team_slave_1 removed
> > team0 (unregistering): Port device team_slave_0 removed
> > bond0 (unregistering): Releasing backup interface bond_slave_1
> > bond0 (unregistering): Releasing backup interface bond_slave_0
> > bond0 (unregistering): Released all slaves
> > ------------[ cut here ]------------
> > ODEBUG: free active (active state 0) object type: timer_list hint:
> > delayed_work_timer_fn+0x0/0x90 arch/x86/include/asm/paravirt.h:767
>
> One of the cleaned up devices has left an active timer which belongs to a
> delayed work. That's all I can decode out of that splat. :(

Hi Thomas,

If ODEBUG would memorize full stack traces for object allocation
(using lib/stackdepot.c), it would make this splat actionable, right?
I've filed https://bugzilla.kernel.org/show_bug.cgi?id=203969 for this.


* Re: WARNING: ODEBUG bug in netdev_freemem (2)
  2019-06-24 10:54   ` Dmitry Vyukov
@ 2019-06-24 12:08     ` Eric Dumazet
  2019-06-24 12:22       ` Dmitry Vyukov
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2019-06-24 12:08 UTC (permalink / raw)
  To: Dmitry Vyukov, Thomas Gleixner
  Cc: syzbot, Alexander Duyck, amritha.nambiar, Andy Shevchenko,
	David Miller, Dmitry Torokhov, Florian Fainelli,
	Greg Kroah-Hartman, Ido Schimmel, LKML, netdev, syzkaller-bugs,
	tyhicks, wanghai26, yuehaibing



On 6/24/19 3:54 AM, Dmitry Vyukov wrote:
> On Mon, Jun 24, 2019 at 11:34 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>>
>> On Mon, 24 Jun 2019, syzbot wrote:
>>
>>> Hello,
>>>
>>> syzbot found the following crash on:
>>>
>>> HEAD commit:    fd6b99fa Merge branch 'akpm' (patches from Andrew)
>>> git tree:       upstream
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=144de256a00000
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=fa9f7e1b6a8bb586
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=c4521ac872a4ccc3afec
>>> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
>>>
>>> Unfortunately, I don't have any reproducer for this crash yet.
>>>
>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>> Reported-by: syzbot+c4521ac872a4ccc3afec@syzkaller.appspotmail.com
>>>
>>> device hsr_slave_0 left promiscuous mode
>>> team0 (unregistering): Port device team_slave_1 removed
>>> team0 (unregistering): Port device team_slave_0 removed
>>> bond0 (unregistering): Releasing backup interface bond_slave_1
>>> bond0 (unregistering): Releasing backup interface bond_slave_0
>>> bond0 (unregistering): Released all slaves
>>> ------------[ cut here ]------------
>>> ODEBUG: free active (active state 0) object type: timer_list hint:
>>> delayed_work_timer_fn+0x0/0x90 arch/x86/include/asm/paravirt.h:767
>>
>> One of the cleaned up devices has left an active timer which belongs to a
>> delayed work. That's all I can decode out of that splat. :(
> 
> Hi Thomas,
> 
> If ODEBUG would memorize full stack traces for object allocation
> (using lib/stackdepot.c), it would make this splat actionable, right?
> I've filed https://bugzilla.kernel.org/show_bug.cgi?id=203969 for this.
> 

Not sure this would help in this case as some netdev are allocated through a generic helper.

The driver specific portion might not show up in the stack trace.

It would be nice here to get the work queue function pointer,
so that it gives us a clue which driver needs a fix.
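
A hedged sketch of why the work function pointer is recoverable in principle: the timer is embedded in the delayed work item, so the address of the timer is enough to find the enclosing structure and read its callback. All `demo_` names are hypothetical userspace stand-ins; the kernel's own container_of() and struct delayed_work differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the container_of idiom. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

typedef void (*demo_work_fn)(void *arg);

struct demo_work  { demo_work_fn func; };
struct demo_timer { int armed; };

/* Compound object: a work item glued to the timer that delays it. */
struct demo_dwork {
	struct demo_work  work;
	struct demo_timer timer;
};

/* Given only the leaked timer, recover the driver's work function. */
static demo_work_fn demo_func_from_timer(struct demo_timer *t)
{
	struct demo_dwork *dw;

	assert(t != NULL);
	dw = demo_container_of(t, struct demo_dwork, timer);
	return dw->work.func;
}

/* Stand-in for a driver-specific work function. */
static void demo_driver_poll(void *arg) { (void)arg; }

/* Returns 1 if the lookup through the timer yields the driver function. */
static int demo_lookup_matches(void)
{
	struct demo_dwork dw = { .work = { .func = demo_driver_poll } };

	return demo_func_from_timer(&dw.timer) == demo_driver_poll;
}
```

The timer callback is the same generic function for every delayed work item, so it identifies nothing; the function stored in the enclosing work item is the driver-specific part.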





* Re: WARNING: ODEBUG bug in netdev_freemem (2)
  2019-06-24 12:08     ` Eric Dumazet
@ 2019-06-24 12:22       ` Dmitry Vyukov
  2019-06-24 13:18         ` Thomas Gleixner
  0 siblings, 1 reply; 9+ messages in thread
From: Dmitry Vyukov @ 2019-06-24 12:22 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Thomas Gleixner, syzbot, Alexander Duyck, amritha.nambiar,
	Andy Shevchenko, David Miller, Dmitry Torokhov, Florian Fainelli,
	Greg Kroah-Hartman, Ido Schimmel, LKML, netdev, syzkaller-bugs,
	tyhicks, wanghai26, yuehaibing

On Mon, Jun 24, 2019 at 2:08 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On 6/24/19 3:54 AM, Dmitry Vyukov wrote:
> > On Mon, Jun 24, 2019 at 11:34 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >>
> >> On Mon, 24 Jun 2019, syzbot wrote:
> >>
> >>> Hello,
> >>>
> >>> syzbot found the following crash on:
> >>>
> >>> HEAD commit:    fd6b99fa Merge branch 'akpm' (patches from Andrew)
> >>> git tree:       upstream
> >>> console output: https://syzkaller.appspot.com/x/log.txt?x=144de256a00000
> >>> kernel config:  https://syzkaller.appspot.com/x/.config?x=fa9f7e1b6a8bb586
> >>> dashboard link: https://syzkaller.appspot.com/bug?extid=c4521ac872a4ccc3afec
> >>> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> >>>
> >>> Unfortunately, I don't have any reproducer for this crash yet.
> >>>
> >>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >>> Reported-by: syzbot+c4521ac872a4ccc3afec@syzkaller.appspotmail.com
> >>>
> >>> device hsr_slave_0 left promiscuous mode
> >>> team0 (unregistering): Port device team_slave_1 removed
> >>> team0 (unregistering): Port device team_slave_0 removed
> >>> bond0 (unregistering): Releasing backup interface bond_slave_1
> >>> bond0 (unregistering): Releasing backup interface bond_slave_0
> > >>> bond0 (unregistering): Released all slaves
> >>> ------------[ cut here ]------------
> >>> ODEBUG: free active (active state 0) object type: timer_list hint:
> >>> delayed_work_timer_fn+0x0/0x90 arch/x86/include/asm/paravirt.h:767
> >>
> >> One of the cleaned up devices has left an active timer which belongs to a
> >> delayed work. That's all I can decode out of that splat. :(
> >
> > Hi Thomas,
> >
> > If ODEBUG would memorize full stack traces for object allocation
> > (using lib/stackdepot.c), it would make this splat actionable, right?
> > I've filed https://bugzilla.kernel.org/show_bug.cgi?id=203969 for this.
> >
>
> Not sure this would help in this case as some netdev are allocated through a generic helper.
>
> The driver specific portion might not show up in the stack trace.
>
> It would be nice here to get the work queue function pointer,
> so that it gives us a clue which driver needs a fix.

I see. But isn't the workqueue callback cleanup_net in this case,
and isn't it in the stack?

  cleanup_net+0x3fb/0x960 net/core/net_namespace.c:553
  process_one_work+0x989/0x1790 kernel/workqueue.c:2269


* Re: WARNING: ODEBUG bug in netdev_freemem (2)
  2019-06-24 12:22       ` Dmitry Vyukov
@ 2019-06-24 13:18         ` Thomas Gleixner
  2019-06-24 17:27           ` Thomas Gleixner
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2019-06-24 13:18 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Dumazet, syzbot, Alexander Duyck, amritha.nambiar,
	Andy Shevchenko, David Miller, Dmitry Torokhov, Florian Fainelli,
	Greg Kroah-Hartman, Ido Schimmel, LKML, netdev, syzkaller-bugs,
	tyhicks, wanghai26, yuehaibing

On Mon, 24 Jun 2019, Dmitry Vyukov wrote:
> On Mon, Jun 24, 2019 at 2:08 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > >>> ------------[ cut here ]------------
> > >>> ODEBUG: free active (active state 0) object type: timer_list hint:
> > >>> delayed_work_timer_fn+0x0/0x90 arch/x86/include/asm/paravirt.h:767
> > >>
> > >> One of the cleaned up devices has left an active timer which belongs to a
> > >> delayed work. That's all I can decode out of that splat. :(
> > >
> > > Hi Thomas,
> > >
> > > If ODEBUG would memorize full stack traces for object allocation
> > > (using lib/stackdepot.c), it would make this splat actionable, right?
> > > > I've filed https://bugzilla.kernel.org/show_bug.cgi?id=203969 for this.
> > >
> >
> > Not sure this would help in this case as some netdev are allocated through a generic helper.
> >
> > The driver specific portion might not show up in the stack trace.
> >
> > It would be nice here to get the work queue function pointer,
> > so that it gives us a clue which driver needs a fix.

Hrm. Let me think about a way to achieve that after I handled that
regression which is on my desk.

> I see. But isn't the workqueue callback cleanup_net in this case,
> and isn't it in the stack?
> 
>   cleanup_net+0x3fb/0x960 net/core/net_namespace.c:553
>   process_one_work+0x989/0x1790 kernel/workqueue.c:2269

That's the work which does the cleanup, but I doubt that this is part of
the offending net_device.

Thanks,

	tglx





* Re: WARNING: ODEBUG bug in netdev_freemem (2)
  2019-06-24 13:18         ` Thomas Gleixner
@ 2019-06-24 17:27           ` Thomas Gleixner
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Gleixner @ 2019-06-24 17:27 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Dumazet, syzbot, Alexander Duyck, amritha.nambiar,
	Andy Shevchenko, David Miller, Dmitry Torokhov, Florian Fainelli,
	Greg Kroah-Hartman, Ido Schimmel, LKML, netdev, syzkaller-bugs,
	tyhicks, wanghai26, yuehaibing

On Mon, 24 Jun 2019, Thomas Gleixner wrote:
> On Mon, 24 Jun 2019, Dmitry Vyukov wrote:
> > On Mon, Jun 24, 2019 at 2:08 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > > >>> ------------[ cut here ]------------
> > > >>> ODEBUG: free active (active state 0) object type: timer_list hint:
> > > >>> delayed_work_timer_fn+0x0/0x90 arch/x86/include/asm/paravirt.h:767
> > > >>
> > > >> One of the cleaned up devices has left an active timer which belongs to a
> > > >> delayed work. That's all I can decode out of that splat. :(
> > > >
> > > > Hi Thomas,
> > > >
> > > > If ODEBUG would memorize full stack traces for object allocation
> > > > (using lib/stackdepot.c), it would make this splat actionable, right?
> > > > > I've filed https://bugzilla.kernel.org/show_bug.cgi?id=203969 for this.
> > > >
> > >
> > > Not sure this would help in this case as some netdev are allocated through a generic helper.
> > >
> > > The driver specific portion might not show up in the stack trace.
> > >
> > > It would be nice here to get the work queue function pointer,
> > > so that it gives us a clue which driver needs a fix.
> 
> Hrm. Let me think about a way to achieve that after I handled that
> regression which is on my desk.

Here is a quick and dirty hack which solves the issue at least for all run
time initialized delayed work objects. Here is the output of a test I
whipped up for this:

 OBJ: Init delayed work, arm timer
 OBJ: Leak timer
 ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x20 chint: foo_fun+0x0/0x17

chint is the debug object hint of the compound object, i.e. the work
function 'foo_fun'.
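
The idea can be modeled in userspace roughly as follows. This is a sketch under stated assumptions, not the patch itself: the `sk_` names are hypothetical, the table is a flat array instead of hashed buckets with locking, and hints are plain strings where the kernel prints code addresses with %pS.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef const char *(*sk_hint_fn)(void *obj);

struct sk_descr {
	const char *name;
	sk_hint_fn debug_hint;
};

struct sk_obj {
	void *object;			/* tracked object, e.g. the timer */
	struct sk_descr *descr;
	void *comp_addr;		/* enclosing object, e.g. the work item */
	struct sk_descr *comp_descr;
};

#define SK_MAX 8
static struct sk_obj sk_table[SK_MAX];
static int sk_count;

static struct sk_obj *sk_lookup(void *addr)
{
	for (int i = 0; i < sk_count; i++)
		if (sk_table[i].object == addr)
			return &sk_table[i];
	return NULL;
}

static int sk_track(void *addr, struct sk_descr *descr)
{
	if (sk_count == SK_MAX)
		return -1;
	sk_table[sk_count++] = (struct sk_obj){ .object = addr, .descr = descr };
	return 0;
}

/* Glue a compound object to an already tracked one. */
static void sk_set_compound(void *addr, void *comp_addr,
			    struct sk_descr *comp_descr)
{
	struct sk_obj *obj = sk_lookup(addr);

	if (obj) {
		obj->comp_addr = comp_addr;
		obj->comp_descr = comp_descr;
	}
}

/* Format a report carrying both the object hint and the compound hint. */
static int sk_report(void *addr, char *buf, size_t len)
{
	struct sk_obj *obj = sk_lookup(addr);
	const char *hint = "(none)", *chint = "(none)";

	assert(buf != NULL && len > 0);
	if (!obj)
		return -1;
	if (obj->descr->debug_hint)
		hint = obj->descr->debug_hint(obj->object);
	if (obj->comp_descr && obj->comp_descr->debug_hint)
		chint = obj->comp_descr->debug_hint(obj->comp_addr);
	return snprintf(buf, len, "object type: %s hint: %s chint: %s",
			obj->descr->name, hint, chint);
}

struct sk_timer { const char *fn_name; };
struct sk_work  { const char *func_name; };

static const char *sk_timer_hint(void *t) { return ((struct sk_timer *)t)->fn_name; }
static const char *sk_work_hint(void *w)  { return ((struct sk_work *)w)->func_name; }

/* Track a timer, attach its work item, and format the report into buf. */
static int sk_demo(char *buf, size_t len)
{
	static struct sk_descr timer_descr = { "timer_list", sk_timer_hint };
	static struct sk_descr work_descr  = { "work_struct", sk_work_hint };
	static struct sk_timer timer = { "delayed_work_timer_fn" };
	static struct sk_work  work  = { "foo_fun" };

	if (sk_track(&timer, &timer_descr))
		return -1;
	sk_set_compound(&timer, &work, &work_descr);
	return sk_report(&timer, buf, len);
}
```

The compound descriptor is looked up only at report time, so tracking an object that has no compound attached costs two extra pointer fields and nothing else.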

Yes, naming sucks and there is still the option to use the existing
debug_obj::astate mechanics, but I was not able to quickly wrap my head
around all the nasty corner cases which the workqueue code provides. Needs
more thought.

Anyway, this should definitely help to diagnose the issue at hand.

Thanks,

	tglx

8<--------------------
 include/linux/debugobjects.h |   10 ++++++++++
 include/linux/workqueue.h    |   26 ++++++++++++++++----------
 kernel/workqueue.c           |    9 ++++++++-
 lib/debugobjects.c           |   43 +++++++++++++++++++++++++++++++++++++++++--
 4 files changed, 75 insertions(+), 13 deletions(-)

--- a/include/linux/debugobjects.h
+++ b/include/linux/debugobjects.h
@@ -24,6 +24,9 @@ struct debug_obj_descr;
  * @astate:	current active state
  * @object:	pointer to the real object
  * @descr:	pointer to an object type specific debug description structure
+ * @comp_addr:	pointer to a compound object which is glued with @object
+ * @comp_descr:	pointer to a compound object type specific debug description
+ *		structure
  */
 struct debug_obj {
 	struct hlist_node	node;
@@ -31,6 +34,8 @@ struct debug_obj {
 	unsigned int		astate;
 	void			*object;
 	struct debug_obj_descr	*descr;
+	void			*comp_addr;
+	struct debug_obj_descr	*comp_descr;
 };
 
 /**
@@ -82,6 +87,9 @@ extern void
 debug_object_active_state(void *addr, struct debug_obj_descr *descr,
 			  unsigned int expect, unsigned int next);
 
+extern void debug_object_set_compound(void *addr, void *comp_addr,
+				      struct debug_obj_descr *comp_descr);
+
 extern void debug_objects_early_init(void);
 extern void debug_objects_mem_init(void);
 #else
@@ -99,6 +107,8 @@ static inline void
 debug_object_free      (void *addr, struct debug_obj_descr *descr) { }
 static inline void
 debug_object_assert_init(void *addr, struct debug_obj_descr *descr) { }
+static inline void
+debug_object_set_compound(void *addr, void *ca, struct debug_obj_descr *cd) { }
 
 static inline void debug_objects_early_init(void) { }
 static inline void debug_objects_mem_init(void) { }
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -204,7 +204,7 @@ struct execute_work {
 	struct delayed_work n = __DELAYED_WORK_INITIALIZER(n, f, TIMER_DEFERRABLE)
 
 #ifdef CONFIG_DEBUG_OBJECTS_WORK
-extern void __init_work(struct work_struct *work, int onstack);
+extern void __init_work(struct work_struct *work, int onstack, bool delayed);
 extern void destroy_work_on_stack(struct work_struct *work);
 extern void destroy_delayed_work_on_stack(struct delayed_work *work);
 static inline unsigned int work_static(struct work_struct *work)
@@ -212,7 +212,7 @@ static inline unsigned int work_static(s
 	return *work_data_bits(work) & WORK_STRUCT_STATIC;
 }
 #else
-static inline void __init_work(struct work_struct *work, int onstack) { }
+static inline void __init_work(struct work_struct *work, int onstack, bool delayed) { }
 static inline void destroy_work_on_stack(struct work_struct *work) { }
 static inline void destroy_delayed_work_on_stack(struct delayed_work *work) { }
 static inline unsigned int work_static(struct work_struct *work) { return 0; }
@@ -226,20 +226,20 @@ static inline unsigned int work_static(s
  * to generate better code.
  */
 #ifdef CONFIG_LOCKDEP
-#define __INIT_WORK(_work, _func, _onstack)				\
+#define __INIT_WORK(_work, _func, _onstack, _delayed)			\
 	do {								\
 		static struct lock_class_key __key;			\
 									\
-		__init_work((_work), _onstack);				\
+		__init_work((_work), _onstack, _delayed);		\
 		(_work)->data = (atomic_long_t) WORK_DATA_INIT();	\
 		lockdep_init_map(&(_work)->lockdep_map, "(work_completion)"#_work, &__key, 0); \
 		INIT_LIST_HEAD(&(_work)->entry);			\
 		(_work)->func = (_func);				\
 	} while (0)
 #else
-#define __INIT_WORK(_work, _func, _onstack)				\
+#define __INIT_WORK(_work, _func, _onstack, _delayed)			\
 	do {								\
-		__init_work((_work), _onstack);				\
+		__init_work((_work), _onstack, _delayed);		\
 		(_work)->data = (atomic_long_t) WORK_DATA_INIT();	\
 		INIT_LIST_HEAD(&(_work)->entry);			\
 		(_work)->func = (_func);				\
@@ -247,25 +247,31 @@ static inline unsigned int work_static(s
 #endif
 
 #define INIT_WORK(_work, _func)						\
-	__INIT_WORK((_work), (_func), 0)
+	__INIT_WORK((_work), (_func), 0, 0)
 
 #define INIT_WORK_ONSTACK(_work, _func)					\
-	__INIT_WORK((_work), (_func), 1)
+	__INIT_WORK((_work), (_func), 1, 0)
+
+#define __INIT_DWORK(_work, _func)					\
+	__INIT_WORK((_work), (_func), 0, 1)
+
+#define __INIT_DWORK_ONSTACK(_work, _func)				\
+	__INIT_WORK((_work), (_func), 1, 1)
 
 #define __INIT_DELAYED_WORK(_work, _func, _tflags)			\
 	do {								\
-		INIT_WORK(&(_work)->work, (_func));			\
 		__init_timer(&(_work)->timer,				\
 			     delayed_work_timer_fn,			\
 			     (_tflags) | TIMER_IRQSAFE);		\
+		__INIT_DWORK(&(_work)->work, (_func));			\
 	} while (0)
 
 #define __INIT_DELAYED_WORK_ONSTACK(_work, _func, _tflags)		\
 	do {								\
-		INIT_WORK_ONSTACK(&(_work)->work, (_func));		\
 		__init_timer_on_stack(&(_work)->timer,			\
 				      delayed_work_timer_fn,		\
 				      (_tflags) | TIMER_IRQSAFE);	\
+		__INIT_DWORK_ONSTACK(&(_work)->work, (_func));		\
 	} while (0)
 
 #define INIT_DELAYED_WORK(_work, _func)					\
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -499,12 +499,19 @@ static inline void debug_work_deactivate
 	debug_object_deactivate(work, &work_debug_descr);
 }
 
-void __init_work(struct work_struct *work, int onstack)
+void __init_work(struct work_struct *work, int onstack, bool delayed)
 {
 	if (onstack)
 		debug_object_init_on_stack(work, &work_debug_descr);
 	else
 		debug_object_init(work, &work_debug_descr);
+
+	if (delayed) {
+		struct delayed_work *dwork = to_delayed_work(work);
+
+		debug_object_set_compound(&dwork->timer, work,
+					  &work_debug_descr);
+	}
 }
 EXPORT_SYMBOL_GPL(__init_work);
 
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -179,6 +179,8 @@ alloc_object(void *addr, struct debug_bu
 		obj->descr  = descr;
 		obj->state  = ODEBUG_STATE_NONE;
 		obj->astate = 0;
+		obj->comp_addr = NULL;
+		obj->comp_descr = NULL;
 		hlist_del(&obj->node);
 
 		hlist_add_head(&obj->node, &b->list);
@@ -321,11 +323,17 @@ static void debug_print_object(struct de
 	if (limit < 5 && descr != descr_test) {
 		void *hint = descr->debug_hint ?
 			descr->debug_hint(obj->object) : NULL;
+		void *chint = NULL;
+
+		/* Get a hint about a compound object */
+		if (obj->comp_descr && obj->comp_descr->debug_hint)
+			chint = obj->comp_descr->debug_hint(obj->comp_addr);
+
 		limit++;
 		WARN(1, KERN_ERR "ODEBUG: %s %s (active state %u) "
-				 "object type: %s hint: %pS\n",
+				 "object type: %s hint: %pS chint: %pS\n",
 			msg, obj_states[obj->state], obj->astate,
-			descr->name, hint);
+		     descr->name, hint, chint);
 	}
 	debug_objects_warnings++;
 }
@@ -448,6 +456,37 @@ void debug_object_init_on_stack(void *ad
 EXPORT_SYMBOL_GPL(debug_object_init_on_stack);
 
 /**
+ * debug_object_set_compound - Set a pointer to a compound object
+ * @addr:	address of the object
+ * @comp_addr:	pointer to the compound object related to @addr
+ * @comp_descr:	pointer to an object specific debug description structure for
+ *		@comp_addr
+ *
+ * Useful for delayed work and similar constructs where the
+ * debug_obj::astate tracking would be complex to achieve.
+ */
+void debug_object_set_compound(void *addr, void *comp_addr,
+			       struct debug_obj_descr *comp_descr)
+{
+	struct debug_bucket *db;
+	struct debug_obj *obj;
+	unsigned long flags;
+
+	if (!debug_objects_enabled)
+		return;
+
+	db = get_bucket((unsigned long) addr);
+
+	raw_spin_lock_irqsave(&db->lock, flags);
+	obj = lookup_object(addr, db);
+	if (obj) {
+		obj->comp_addr = comp_addr;
+		obj->comp_descr = comp_descr;
+	}
+	raw_spin_unlock_irqrestore(&db->lock, flags);
+}
+
+/**
  * debug_object_activate - debug checks when an object is activated
  * @addr:	address of the object
  * @descr:	pointer to an object specific debug description structure


* Re: WARNING: ODEBUG bug in netdev_freemem (2)
  2019-06-24  8:53 WARNING: ODEBUG bug in netdev_freemem (2) syzbot
  2019-06-24  9:33 ` Thomas Gleixner
@ 2019-08-08  0:25 ` syzbot
  2019-08-12 19:45   ` Thomas Gleixner
  1 sibling, 1 reply; 9+ messages in thread
From: syzbot @ 2019-08-08  0:25 UTC (permalink / raw)
  To: alexander.h.duyck, amritha.nambiar, andriy.shevchenko, avagin,
	davem, dmitry.torokhov, dvyukov, eric.dumazet, f.fainelli,
	gregkh, idosch, jiri, kimbrownkd, linux-kernel, netdev,
	syzkaller-bugs, tglx, tyhicks, wanghai26, yuehaibing

syzbot has found a reproducer for the following crash on:

HEAD commit:    13dfb3fa Merge git://git.kernel.org/pub/scm/linux/kernel/g..
git tree:       net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1671e69a600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=d4cf1ffb87d590d7
dashboard link: https://syzkaller.appspot.com/bug?extid=c4521ac872a4ccc3afec
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=170542c2600000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+c4521ac872a4ccc3afec@syzkaller.appspotmail.com

bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): Released all slaves
------------[ cut here ]------------
ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x90 arch/x86/include/asm/paravirt.h:768
WARNING: CPU: 0 PID: 9919 at lib/debugobjects.c:481 debug_print_object+0x168/0x250 lib/debugobjects.c:481
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 9919 Comm: kworker/u4:6 Not tainted 5.3.0-rc3+ #122
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
  panic+0x2dc/0x755 kernel/panic.c:219
  __warn.cold+0x20/0x4c kernel/panic.c:576
  report_bug+0x263/0x2b0 lib/bug.c:186
  fixup_bug arch/x86/kernel/traps.c:179 [inline]
  fixup_bug arch/x86/kernel/traps.c:174 [inline]
  do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
  do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
  invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1028
RIP: 0010:debug_print_object+0x168/0x250 lib/debugobjects.c:481
Code: dd e0 32 c6 87 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 b5 00 00 00 48 8b 14 dd e0 32 c6 87 48 c7 c7 e0 27 c6 87 e8 70 cd 05 fe <0f> 0b 83 05 a3 7c 67 06 01 48 83 c4 20 5b 41 5c 41 5d 41 5e 5d c3
RSP: 0018:ffff8880898ff838 EFLAGS: 00010086
RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff815c3ba6 RDI: ffffed101131fef9
RBP: ffff8880898ff878 R08: ffff88808ece42c0 R09: ffffed1015d04101
R10: ffffed1015d04100 R11: ffff8880ae820807 R12: 0000000000000001
R13: ffffffff88db6660 R14: ffffffff8161da40 R15: ffff88808f639af0
  __debug_check_no_obj_freed lib/debugobjects.c:963 [inline]
  debug_check_no_obj_freed+0x2d4/0x43f lib/debugobjects.c:994
  kfree+0xf8/0x2c0 mm/slab.c:3755
  kvfree+0x61/0x70 mm/util.c:488
  netdev_freemem+0x4c/0x60 net/core/dev.c:9093
  netdev_release+0x86/0xb0 net/core/net-sysfs.c:1635
  device_release+0x7a/0x210 drivers/base/core.c:1064
  kobject_cleanup lib/kobject.c:693 [inline]
  kobject_release lib/kobject.c:722 [inline]
  kref_put include/linux/kref.h:65 [inline]
  kobject_put.cold+0x289/0x2e6 lib/kobject.c:739
  netdev_run_todo+0x53b/0x7b0 net/core/dev.c:8998
  rtnl_unlock+0xe/0x10 net/core/rtnetlink.c:112
  default_device_exit_batch+0x358/0x410 net/core/dev.c:9781
  ops_exit_list.isra.0+0xfc/0x150 net/core/net_namespace.c:175
  cleanup_net+0x4e2/0xa70 net/core/net_namespace.c:594
  process_one_work+0x9af/0x1740 kernel/workqueue.c:2269
  worker_thread+0x98/0xe40 kernel/workqueue.c:2415
  kthread+0x361/0x430 kernel/kthread.c:255
  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
Kernel Offset: disabled
Rebooting in 86400 seconds..



* Re: WARNING: ODEBUG bug in netdev_freemem (2)
  2019-08-08  0:25 ` syzbot
@ 2019-08-12 19:45   ` Thomas Gleixner
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Gleixner @ 2019-08-12 19:45 UTC (permalink / raw)
  To: syzbot
  Cc: alexander.h.duyck, amritha.nambiar, andriy.shevchenko, avagin,
	davem, dmitry.torokhov, dvyukov, eric.dumazet, f.fainelli,
	gregkh, idosch, jiri, kimbrownkd, linux-kernel, netdev,
	syzkaller-bugs, tyhicks, wanghai26, yuehaibing

On Wed, 7 Aug 2019, syzbot wrote:

> syzbot has found a reproducer for the following crash on:
> 
> HEAD commit:    13dfb3fa Merge git://git.kernel.org/pub/scm/linux/kernel/g..
> git tree:       net-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1671e69a600000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=d4cf1ffb87d590d7
> dashboard link: https://syzkaller.appspot.com/bug?extid=c4521ac872a4ccc3afec
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=170542c2600000

I can't reproduce that here. Can you please apply the patch from:

  https://lore.kernel.org/lkml/alpine.DEB.2.21.1906241920540.32342@nanos.tec.linutronix.de

and try to reproduce with that applied? That should give us more
information about the actual delayed work.

Thanks,

	tglx

