* WARNING: ODEBUG bug in netdev_freemem (2)
From: syzbot @ 2019-06-24 8:53 UTC
To: alexander.h.duyck, amritha.nambiar, andriy.shevchenko, davem,
    dmitry.torokhov, f.fainelli, gregkh, idosch, linux-kernel, netdev,
    syzkaller-bugs, tglx, tyhicks, wanghai26, yuehaibing

Hello,

syzbot found the following crash on:

HEAD commit:    fd6b99fa Merge branch 'akpm' (patches from Andrew)
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=144de256a00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=fa9f7e1b6a8bb586
dashboard link: https://syzkaller.appspot.com/bug?extid=c4521ac872a4ccc3afec
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+c4521ac872a4ccc3afec@syzkaller.appspotmail.com

device hsr_slave_0 left promiscuous mode
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed
bond0 (unregistering): Releasing backup interface bond_slave_1
bond0 (unregistering): Releasing backup interface bond_slave_0
bond0 (unregistering): Released all slaves
------------[ cut here ]------------
ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x90 arch/x86/include/asm/paravirt.h:767
WARNING: CPU: 1 PID: 25149 at lib/debugobjects.c:325 debug_print_object+0x168/0x250 lib/debugobjects.c:325
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 25149 Comm: kworker/u4:1 Not tainted 5.2.0-rc4+ #31
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 panic+0x2cb/0x744 kernel/panic.c:219
 __warn.cold+0x20/0x4d kernel/panic.c:576
 report_bug+0x263/0x2b0 lib/bug.c:186
 fixup_bug arch/x86/kernel/traps.c:179 [inline]
 fixup_bug arch/x86/kernel/traps.c:174 [inline]
 do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
 do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:986
RIP: 0010:debug_print_object+0x168/0x250 lib/debugobjects.c:325
Code: dd e0 c9 a4 87 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 b5 00 00 00 48 8b 14 dd e0 c9 a4 87 48 c7 c7 80 bf a4 87 e8 16 75 0d fe <0f> 0b 83 05 4b 46 4b 06 01 48 83 c4 20 5b 41 5c 41 5d 41 5e 5d c3
RSP: 0018:ffff888058c07838 EFLAGS: 00010086
RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff815ac956 RDI: ffffed100b180ef9
RBP: ffff888058c07878 R08: ffff88805692a340 R09: ffffed1015d240f1
R10: ffffed1015d240f0 R11: ffff8880ae920787 R12: 0000000000000001
R13: ffffffff88bad1a0 R14: ffffffff816039d0 R15: ffff88805f992e60
 __debug_check_no_obj_freed lib/debugobjects.c:785 [inline]
 debug_check_no_obj_freed+0x29f/0x464 lib/debugobjects.c:817
 kfree+0xbd/0x220 mm/slab.c:3754
 kvfree+0x61/0x70 mm/util.c:460
 netdev_freemem+0x4c/0x60 net/core/dev.c:9070
 netdev_release+0x86/0xb0 net/core/net-sysfs.c:1635
 device_release+0x7a/0x210 drivers/base/core.c:1064
 kobject_cleanup lib/kobject.c:691 [inline]
 kobject_release lib/kobject.c:720 [inline]
 kref_put include/linux/kref.h:65 [inline]
 kobject_put.cold+0x289/0x2e6 lib/kobject.c:737
 netdev_run_todo+0x53b/0x7c0 net/core/dev.c:8975
 rtnl_unlock+0xe/0x10 net/core/rtnetlink.c:112
 default_device_exit_batch+0x358/0x410 net/core/dev.c:9756
 ops_exit_list.isra.0+0xfc/0x150 net/core/net_namespace.c:157
 cleanup_net+0x3fb/0x960 net/core/net_namespace.c:553
 process_one_work+0x989/0x1790 kernel/workqueue.c:2269
 worker_thread+0x98/0xe40 kernel/workqueue.c:2415
 kthread+0x354/0x420 kernel/kthread.c:255
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
======================================================

---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

* Re: WARNING: ODEBUG bug in netdev_freemem (2)
From: Thomas Gleixner @ 2019-06-24 9:33 UTC
To: syzbot
Cc: alexander.h.duyck, amritha.nambiar, andriy.shevchenko, davem,
    dmitry.torokhov, f.fainelli, gregkh, idosch, linux-kernel, netdev,
    syzkaller-bugs, tyhicks, wanghai26, yuehaibing

On Mon, 24 Jun 2019, syzbot wrote:

> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    fd6b99fa Merge branch 'akpm' (patches from Andrew)
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=144de256a00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=fa9f7e1b6a8bb586
> dashboard link: https://syzkaller.appspot.com/bug?extid=c4521ac872a4ccc3afec
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
>
> Unfortunately, I don't have any reproducer for this crash yet.
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+c4521ac872a4ccc3afec@syzkaller.appspotmail.com
>
> device hsr_slave_0 left promiscuous mode
> team0 (unregistering): Port device team_slave_1 removed
> team0 (unregistering): Port device team_slave_0 removed
> bond0 (unregistering): Releasing backup interface bond_slave_1
> bond0 (unregistering): Releasing backup interface bond_slave_0
> bond0 (unregistering): Released all slaves
> ------------[ cut here ]------------
> ODEBUG: free active (active state 0) object type: timer_list hint:
> delayed_work_timer_fn+0x0/0x90 arch/x86/include/asm/paravirt.h:767

One of the cleaned up devices has left an active timer which belongs to a
delayed work. That's all I can decode out of that splat. :(

Thanks,

	tglx

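The "free active" check behind this splat can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not the kernel's debugobjects code: the table, the state enum and the functions track_init(), track_set_state() and check_no_obj_freed() are hypothetical names invented for illustration. The idea it shows is the one in the call trace: when a memory range is freed, every tracked object whose address falls inside that range is looked up, and any that is still active is reported.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum obj_state { OBJ_INIT, OBJ_ACTIVE, OBJ_INACTIVE };

struct tracked_obj {
	const void *addr;	/* address of the tracked object, e.g. a timer */
	enum obj_state state;
};

#define MAX_TRACKED 16
static struct tracked_obj tracked[MAX_TRACKED];
static size_t nr_tracked;

/* Start tracking an object; returns false when the table is full. */
bool track_init(const void *addr)
{
	if (nr_tracked == MAX_TRACKED)
		return false;
	tracked[nr_tracked].addr = addr;
	tracked[nr_tracked].state = OBJ_INIT;
	nr_tracked++;
	return true;
}

/* Record a state change, e.g. arming or cancelling a timer. */
void track_set_state(const void *addr, enum obj_state state)
{
	for (size_t i = 0; i < nr_tracked; i++)
		if (tracked[i].addr == addr)
			tracked[i].state = state;
}

/*
 * Count tracked objects that are still active inside the range
 * [start, start + len). A non-zero result corresponds to the
 * "free active" report in the splat above.
 */
size_t check_no_obj_freed(const void *start, size_t len)
{
	uintptr_t lo = (uintptr_t)start;
	uintptr_t hi = lo + len;
	size_t active = 0;

	for (size_t i = 0; i < nr_tracked; i++) {
		uintptr_t a = (uintptr_t)tracked[i].addr;

		if (a >= lo && a < hi && tracked[i].state == OBJ_ACTIVE)
			active++;
	}
	return active;
}
```

In this model, a device structure that embeds an armed timer and is handed to the free path without the timer being cancelled first makes check_no_obj_freed() return a non-zero count; marking the timer inactive before the free makes it return zero.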
* Re: WARNING: ODEBUG bug in netdev_freemem (2)
From: Dmitry Vyukov @ 2019-06-24 10:54 UTC
To: Thomas Gleixner
Cc: syzbot, Alexander Duyck, amritha.nambiar, Andy Shevchenko,
    David Miller, Dmitry Torokhov, Florian Fainelli, Greg Kroah-Hartman,
    Ido Schimmel, LKML, netdev, syzkaller-bugs, tyhicks, wanghai26,
    yuehaibing

On Mon, Jun 24, 2019 at 11:34 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Mon, 24 Jun 2019, syzbot wrote:
>
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit:    fd6b99fa Merge branch 'akpm' (patches from Andrew)
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=144de256a00000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=fa9f7e1b6a8bb586
> > dashboard link: https://syzkaller.appspot.com/bug?extid=c4521ac872a4ccc3afec
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> >
> > Unfortunately, I don't have any reproducer for this crash yet.
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+c4521ac872a4ccc3afec@syzkaller.appspotmail.com
> >
> > device hsr_slave_0 left promiscuous mode
> > team0 (unregistering): Port device team_slave_1 removed
> > team0 (unregistering): Port device team_slave_0 removed
> > bond0 (unregistering): Releasing backup interface bond_slave_1
> > bond0 (unregistering): Releasing backup interface bond_slave_0
> > bond0 (unregistering): Released all slaves
> > ------------[ cut here ]------------
> > ODEBUG: free active (active state 0) object type: timer_list hint:
> > delayed_work_timer_fn+0x0/0x90 arch/x86/include/asm/paravirt.h:767
>
> One of the cleaned up devices has left an active timer which belongs to a
> delayed work. That's all I can decode out of that splat. :(

Hi Thomas,

If ODEBUG would memorize full stack traces for object allocation
(using lib/stackdepot.c), it would make this splat actionable, right?
I've filed https://bugzilla.kernel.org/show_bug.cgi?id=203969 for this.

* Re: WARNING: ODEBUG bug in netdev_freemem (2)
From: Eric Dumazet @ 2019-06-24 12:08 UTC
To: Dmitry Vyukov, Thomas Gleixner
Cc: syzbot, Alexander Duyck, amritha.nambiar, Andy Shevchenko,
    David Miller, Dmitry Torokhov, Florian Fainelli, Greg Kroah-Hartman,
    Ido Schimmel, LKML, netdev, syzkaller-bugs, tyhicks, wanghai26,
    yuehaibing

On 6/24/19 3:54 AM, Dmitry Vyukov wrote:
> On Mon, Jun 24, 2019 at 11:34 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>>
>> On Mon, 24 Jun 2019, syzbot wrote:
>>
>>> Hello,
>>>
>>> syzbot found the following crash on:
>>>
>>> HEAD commit:    fd6b99fa Merge branch 'akpm' (patches from Andrew)
>>> git tree:       upstream
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=144de256a00000
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=fa9f7e1b6a8bb586
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=c4521ac872a4ccc3afec
>>> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
>>>
>>> Unfortunately, I don't have any reproducer for this crash yet.
>>>
>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>> Reported-by: syzbot+c4521ac872a4ccc3afec@syzkaller.appspotmail.com
>>>
>>> device hsr_slave_0 left promiscuous mode
>>> team0 (unregistering): Port device team_slave_1 removed
>>> team0 (unregistering): Port device team_slave_0 removed
>>> bond0 (unregistering): Releasing backup interface bond_slave_1
>>> bond0 (unregistering): Releasing backup interface bond_slave_0
>>> bond0 (unregistering): Released all slaves
>>> ------------[ cut here ]------------
>>> ODEBUG: free active (active state 0) object type: timer_list hint:
>>> delayed_work_timer_fn+0x0/0x90 arch/x86/include/asm/paravirt.h:767
>>
>> One of the cleaned up devices has left an active timer which belongs to a
>> delayed work. That's all I can decode out of that splat. :(
>
> Hi Thomas,
>
> If ODEBUG would memorize full stack traces for object allocation
> (using lib/stackdepot.c), it would make this splat actionable, right?
> I've filed https://bugzilla.kernel.org/show_bug.cgi?id=203969 for this.
>

Not sure this would help in this case, as some netdevs are allocated through a generic helper.

The driver specific portion might not show up in the stack trace.

It would be nice here to get the work queue function pointer,
so that it gives us a clue which driver needs a fix.

* Re: WARNING: ODEBUG bug in netdev_freemem (2)
From: Dmitry Vyukov @ 2019-06-24 12:22 UTC
To: Eric Dumazet
Cc: Thomas Gleixner, syzbot, Alexander Duyck, amritha.nambiar,
    Andy Shevchenko, David Miller, Dmitry Torokhov, Florian Fainelli,
    Greg Kroah-Hartman, Ido Schimmel, LKML, netdev, syzkaller-bugs,
    tyhicks, wanghai26, yuehaibing

On Mon, Jun 24, 2019 at 2:08 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On 6/24/19 3:54 AM, Dmitry Vyukov wrote:
> > On Mon, Jun 24, 2019 at 11:34 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >>
> >> On Mon, 24 Jun 2019, syzbot wrote:
> >>
> >>> Hello,
> >>>
> >>> syzbot found the following crash on:
> >>>
> >>> HEAD commit:    fd6b99fa Merge branch 'akpm' (patches from Andrew)
> >>> git tree:       upstream
> >>> console output: https://syzkaller.appspot.com/x/log.txt?x=144de256a00000
> >>> kernel config:  https://syzkaller.appspot.com/x/.config?x=fa9f7e1b6a8bb586
> >>> dashboard link: https://syzkaller.appspot.com/bug?extid=c4521ac872a4ccc3afec
> >>> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> >>>
> >>> Unfortunately, I don't have any reproducer for this crash yet.
> >>>
> >>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >>> Reported-by: syzbot+c4521ac872a4ccc3afec@syzkaller.appspotmail.com
> >>>
> >>> device hsr_slave_0 left promiscuous mode
> >>> team0 (unregistering): Port device team_slave_1 removed
> >>> team0 (unregistering): Port device team_slave_0 removed
> >>> bond0 (unregistering): Releasing backup interface bond_slave_1
> >>> bond0 (unregistering): Releasing backup interface bond_slave_0
> >>> bond0 (unregistering): Released all slaves
> >>> ------------[ cut here ]------------
> >>> ODEBUG: free active (active state 0) object type: timer_list hint:
> >>> delayed_work_timer_fn+0x0/0x90 arch/x86/include/asm/paravirt.h:767
> >>
> >> One of the cleaned up devices has left an active timer which belongs to a
> >> delayed work. That's all I can decode out of that splat. :(
> >
> > Hi Thomas,
> >
> > If ODEBUG would memorize full stack traces for object allocation
> > (using lib/stackdepot.c), it would make this splat actionable, right?
> > I've filed https://bugzilla.kernel.org/show_bug.cgi?id=203969 for this.
> >
>
> Not sure this would help in this case, as some netdevs are allocated through a generic helper.
>
> The driver specific portion might not show up in the stack trace.
>
> It would be nice here to get the work queue function pointer,
> so that it gives us a clue which driver needs a fix.

I see. But isn't the workqueue callback cleanup_net in this case,
and isn't it in the stack?

 cleanup_net+0x3fb/0x960 net/core/net_namespace.c:553
 process_one_work+0x989/0x1790 kernel/workqueue.c:2269

* Re: WARNING: ODEBUG bug in netdev_freemem (2)
From: Thomas Gleixner @ 2019-06-24 13:18 UTC
To: Dmitry Vyukov
Cc: Eric Dumazet, syzbot, Alexander Duyck, amritha.nambiar,
    Andy Shevchenko, David Miller, Dmitry Torokhov, Florian Fainelli,
    Greg Kroah-Hartman, Ido Schimmel, LKML, netdev, syzkaller-bugs,
    tyhicks, wanghai26, yuehaibing

On Mon, 24 Jun 2019, Dmitry Vyukov wrote:
> On Mon, Jun 24, 2019 at 2:08 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > >>> ------------[ cut here ]------------
> > >>> ODEBUG: free active (active state 0) object type: timer_list hint:
> > >>> delayed_work_timer_fn+0x0/0x90 arch/x86/include/asm/paravirt.h:767
> > >>
> > >> One of the cleaned up devices has left an active timer which belongs to a
> > >> delayed work. That's all I can decode out of that splat. :(
> > >
> > > Hi Thomas,
> > >
> > > If ODEBUG would memorize full stack traces for object allocation
> > > (using lib/stackdepot.c), it would make this splat actionable, right?
> > > I've filed https://bugzilla.kernel.org/show_bug.cgi?id=203969 for this.
> > >
> >
> > Not sure this would help in this case, as some netdevs are allocated through a generic helper.
> >
> > The driver specific portion might not show up in the stack trace.
> >
> > It would be nice here to get the work queue function pointer,
> > so that it gives us a clue which driver needs a fix.

Hrm. Let me think about a way to achieve that after I handled that
regression which is on my desk.

> I see. But isn't the workqueue callback cleanup_net in this case,
> and isn't it in the stack?
>
>  cleanup_net+0x3fb/0x960 net/core/net_namespace.c:553
>  process_one_work+0x989/0x1790 kernel/workqueue.c:2269

That's the work which does the cleanup, but I doubt that this is part of
the offending net_device.

Thanks,

	tglx

* Re: WARNING: ODEBUG bug in netdev_freemem (2)
From: Thomas Gleixner @ 2019-06-24 17:27 UTC
To: Dmitry Vyukov
Cc: Eric Dumazet, syzbot, Alexander Duyck, amritha.nambiar,
    Andy Shevchenko, David Miller, Dmitry Torokhov, Florian Fainelli,
    Greg Kroah-Hartman, Ido Schimmel, LKML, netdev, syzkaller-bugs,
    tyhicks, wanghai26, yuehaibing

On Mon, 24 Jun 2019, Thomas Gleixner wrote:
> On Mon, 24 Jun 2019, Dmitry Vyukov wrote:
> > On Mon, Jun 24, 2019 at 2:08 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > > >>> ------------[ cut here ]------------
> > > >>> ODEBUG: free active (active state 0) object type: timer_list hint:
> > > >>> delayed_work_timer_fn+0x0/0x90 arch/x86/include/asm/paravirt.h:767
> > > >>
> > > >> One of the cleaned up devices has left an active timer which belongs to a
> > > >> delayed work. That's all I can decode out of that splat. :(
> > > >
> > > > Hi Thomas,
> > > >
> > > > If ODEBUG would memorize full stack traces for object allocation
> > > > (using lib/stackdepot.c), it would make this splat actionable, right?
> > > > I've filed https://bugzilla.kernel.org/show_bug.cgi?id=203969 for this.
> > > >
> > >
> > > Not sure this would help in this case, as some netdevs are allocated through a generic helper.
> > >
> > > The driver specific portion might not show up in the stack trace.
> > >
> > > It would be nice here to get the work queue function pointer,
> > > so that it gives us a clue which driver needs a fix.
>
> Hrm. Let me think about a way to achieve that after I handled that
> regression which is on my desk.

Here is a quick and dirty hack which solves the issue at least for all
run time initialized delayed work objects.

Here is the output of a test I whipped up for this:

  OBJ: Init delayed work, arm timer
  OBJ: Leak timer
  ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x20 chint: foo_fun+0x0/0x17

chint is the debug object hint of the compound object, i.e. the work
function 'foo_fun'. Yes, naming sucks and there is still the option to use
the existing debug_obj::astate mechanics, but I was not able to quickly
wrap my head around all the nasty corner cases which the workqueue code
provides. Needs more thought.

Anyway, this should definitely help to diagnose the issue at hand.

Thanks,

	tglx

8<--------------------
 include/linux/debugobjects.h |   10 ++++++++++
 include/linux/workqueue.h    |   26 ++++++++++++++++----------
 kernel/workqueue.c           |    9 ++++++++-
 lib/debugobjects.c           |   43 +++++++++++++++++++++++++++++++++++++++++--
 4 files changed, 75 insertions(+), 13 deletions(-)

--- a/include/linux/debugobjects.h
+++ b/include/linux/debugobjects.h
@@ -24,6 +24,9 @@ struct debug_obj_descr;
  * @astate:	current active state
  * @object:	pointer to the real object
  * @descr:	pointer to an object type specific debug description structure
+ * @comp_addr:	pointer to a compound object which is glued with @object
+ * @comp_descr:	pointer to a compound object type specific debug description
+ *		structure
  */
 struct debug_obj {
 	struct hlist_node	node;
@@ -31,6 +34,8 @@ struct debug_obj {
 	unsigned int		astate;
 	void			*object;
 	struct debug_obj_descr	*descr;
+	void			*comp_addr;
+	struct debug_obj_descr	*comp_descr;
 };
 
 /**
@@ -82,6 +87,9 @@ extern void
 debug_object_active_state(void *addr, struct debug_obj_descr *descr,
 			  unsigned int expect, unsigned int next);
 
+extern void debug_object_set_compound(void *addr, void *comp_addr,
+				      struct debug_obj_descr *comp_descr);
+
 extern void debug_objects_early_init(void);
 extern void debug_objects_mem_init(void);
 #else
@@ -99,6 +107,8 @@ static inline void
 debug_object_free      (void *addr, struct debug_obj_descr *descr) { }
 static inline void
 debug_object_assert_init(void *addr, struct debug_obj_descr *descr) { }
+static inline void
+debug_object_set_compound(void *addr, void *ca, struct debug_obj_descr *cd) { }
 
 static inline void debug_objects_early_init(void) { }
 static inline void debug_objects_mem_init(void) { }
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -204,7 +204,7 @@ struct execute_work {
 	struct delayed_work n = __DELAYED_WORK_INITIALIZER(n, f, TIMER_DEFERRABLE)
 
 #ifdef CONFIG_DEBUG_OBJECTS_WORK
-extern void __init_work(struct work_struct *work, int onstack);
+extern void __init_work(struct work_struct *work, int onstack, bool delayed);
 extern void destroy_work_on_stack(struct work_struct *work);
 extern void destroy_delayed_work_on_stack(struct delayed_work *work);
 static inline unsigned int work_static(struct work_struct *work)
@@ -212,7 +212,7 @@ static inline unsigned int work_static(s
 	return *work_data_bits(work) & WORK_STRUCT_STATIC;
 }
 #else
-static inline void __init_work(struct work_struct *work, int onstack) { }
+static inline void __init_work(struct work_struct *work, int onstack, bool delayed) { }
 static inline void destroy_work_on_stack(struct work_struct *work) { }
 static inline void destroy_delayed_work_on_stack(struct delayed_work *work) { }
 static inline unsigned int work_static(struct work_struct *work) { return 0; }
@@ -226,20 +226,20 @@ static inline unsigned int work_static(s
  * to generate better code.
  */
 #ifdef CONFIG_LOCKDEP
-#define __INIT_WORK(_work, _func, _onstack)				\
+#define __INIT_WORK(_work, _func, _onstack, _delayed)			\
 	do {								\
 		static struct lock_class_key __key;			\
 									\
-		__init_work((_work), _onstack);				\
+		__init_work((_work), _onstack, _delayed);		\
 		(_work)->data = (atomic_long_t) WORK_DATA_INIT();	\
 		lockdep_init_map(&(_work)->lockdep_map, "(work_completion)"#_work, &__key, 0); \
 		INIT_LIST_HEAD(&(_work)->entry);			\
 		(_work)->func = (_func);				\
 	} while (0)
 #else
-#define __INIT_WORK(_work, _func, _onstack)				\
+#define __INIT_WORK(_work, _func, _onstack, _delayed)			\
 	do {								\
-		__init_work((_work), _onstack);				\
+		__init_work((_work), _onstack, _delayed);		\
 		(_work)->data = (atomic_long_t) WORK_DATA_INIT();	\
 		INIT_LIST_HEAD(&(_work)->entry);			\
 		(_work)->func = (_func);				\
@@ -247,25 +247,31 @@ static inline unsigned int work_static(s
 #endif
 
 #define INIT_WORK(_work, _func)						\
-	__INIT_WORK((_work), (_func), 0)
+	__INIT_WORK((_work), (_func), 0, 0)
 
 #define INIT_WORK_ONSTACK(_work, _func)					\
-	__INIT_WORK((_work), (_func), 1)
+	__INIT_WORK((_work), (_func), 1, 0)
+
+#define __INIT_DWORK(_work, _func)					\
+	__INIT_WORK((_work), (_func), 0, 1)
+
+#define __INIT_DWORK_ONSTACK(_work, _func)				\
+	__INIT_WORK((_work), (_func), 1, 1)
 
 #define __INIT_DELAYED_WORK(_work, _func, _tflags)			\
 	do {								\
-		INIT_WORK(&(_work)->work, (_func));			\
 		__init_timer(&(_work)->timer,				\
 			     delayed_work_timer_fn,			\
 			     (_tflags) | TIMER_IRQSAFE);		\
+		__INIT_DWORK(&(_work)->work, (_func));			\
 	} while (0)
 
 #define __INIT_DELAYED_WORK_ONSTACK(_work, _func, _tflags)		\
 	do {								\
-		INIT_WORK_ONSTACK(&(_work)->work, (_func));		\
 		__init_timer_on_stack(&(_work)->timer,			\
 				      delayed_work_timer_fn,		\
 				      (_tflags) | TIMER_IRQSAFE);	\
+		__INIT_DWORK_ONSTACK(&(_work)->work, (_func));		\
 	} while (0)
 
 #define INIT_DELAYED_WORK(_work, _func)					\
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -499,12 +499,19 @@ static inline void debug_work_deactivate
 	debug_object_deactivate(work, &work_debug_descr);
 }
 
-void __init_work(struct work_struct *work, int onstack)
+void __init_work(struct work_struct *work, int onstack, bool delayed)
 {
 	if (onstack)
 		debug_object_init_on_stack(work, &work_debug_descr);
 	else
 		debug_object_init(work, &work_debug_descr);
+
+	if (delayed) {
+		struct delayed_work *dwork = to_delayed_work(work);
+
+		debug_object_set_compound(&dwork->timer, work,
+					  &work_debug_descr);
+	}
 }
 EXPORT_SYMBOL_GPL(__init_work);
 
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -179,6 +179,8 @@ alloc_object(void *addr, struct debug_bu
 		obj->descr  = descr;
 		obj->state  = ODEBUG_STATE_NONE;
 		obj->astate = 0;
+		obj->comp_addr = NULL;
+		obj->comp_descr = NULL;
 		hlist_del(&obj->node);
 
 		hlist_add_head(&obj->node, &b->list);
@@ -321,11 +323,17 @@ static void debug_print_object(struct de
 	if (limit < 5 && descr != descr_test) {
 		void *hint = descr->debug_hint ?
 			descr->debug_hint(obj->object) : NULL;
+		void *chint = NULL;
+
+		/* Get a hint about a compound object */
+		if (obj->comp_descr && obj->comp_descr->debug_hint)
+			chint = obj->comp_descr->debug_hint(obj->comp_addr);
+
 		limit++;
 		WARN(1, KERN_ERR "ODEBUG: %s %s (active state %u) "
-				 "object type: %s hint: %pS\n",
+				 "object type: %s hint: %pS chint: %pS\n",
 			msg, obj_states[obj->state], obj->astate,
-			descr->name, hint);
+			descr->name, hint, chint);
 	}
 	debug_objects_warnings++;
 }
@@ -448,6 +456,37 @@ void debug_object_init_on_stack(void *ad
 EXPORT_SYMBOL_GPL(debug_object_init_on_stack);
 
 /**
+ * debug_object_set_compound - Set a pointer to a compound object
+ * @addr:	address of the object
+ * @comp_addr:	pointer to the compound object related to @addr
+ * @comp_descr:	pointer to an object specific debug description structure for
+ *		@comp_addr
+ *
+ * Useful for delayed work and similar constructs where the
+ * debug_obj::astate tracking would be complex to achieve.
+ */
+void debug_object_set_compound(void *addr, void *comp_addr,
+			       struct debug_obj_descr *comp_descr)
+{
+	struct debug_bucket *db;
+	struct debug_obj *obj;
+	unsigned long flags;
+
+	if (!debug_objects_enabled)
+		return;
+
+	db = get_bucket((unsigned long) addr);
+
+	raw_spin_lock_irqsave(&db->lock, flags);
+	obj = lookup_object(addr, db);
+	if (obj) {
+		obj->comp_addr = comp_addr;
+		obj->comp_descr = comp_descr;
+	}
+	raw_spin_unlock_irqrestore(&db->lock, flags);
+}
+
+/**
  * debug_object_activate - debug checks when an object is activated
  * @addr:	address of the object
  * @descr:	pointer to an object specific debug description structure

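As a rough illustration of what the compound hint adds to the report, the lookup can be modelled in userspace. This is a minimal sketch, assuming simplified stand-ins for the kernel structures: the model_* types, timer_hint(), work_hint() and model_report() are hypothetical names, and function names are plain strings here rather than the %pS-resolved addresses the kernel prints. It shows only the mechanism: the tracked object (the timer) carries a second address and descriptor pointing at its enclosing object (the work item), and the report asks that second descriptor for its own hint.

```c
#include <stddef.h>
#include <stdio.h>

/* Userspace model only: these types mimic, but are not, the kernel's. */
struct model_descr {
	const char *name;
	/* Returns a printable hint for the object at addr, or NULL. */
	const char *(*debug_hint)(void *addr);
};

struct model_obj {
	void *object;			/* tracked object, e.g. the timer */
	struct model_descr *descr;
	void *comp_addr;		/* enclosing object, e.g. the work item */
	struct model_descr *comp_descr;
};

struct model_timer { const char *func_name; };
struct model_work  { const char *func_name; };

const char *timer_hint(void *addr)
{
	return ((struct model_timer *)addr)->func_name;
}

const char *work_hint(void *addr)
{
	return ((struct model_work *)addr)->func_name;
}

/* Attach the compound object to an already tracked object. */
void model_set_compound(struct model_obj *obj, void *comp_addr,
			struct model_descr *comp_descr)
{
	obj->comp_addr = comp_addr;
	obj->comp_descr = comp_descr;
}

/* Format the report line; a missing hint is printed as "(null)". */
int model_report(const struct model_obj *obj, char *buf, size_t len)
{
	const char *hint = obj->descr->debug_hint ?
		obj->descr->debug_hint(obj->object) : NULL;
	const char *chint = NULL;

	/* Ask the compound object's descriptor for its own hint. */
	if (obj->comp_descr && obj->comp_descr->debug_hint)
		chint = obj->comp_descr->debug_hint(obj->comp_addr);

	return snprintf(buf, len, "object type: %s hint: %s chint: %s",
			obj->descr->name,
			hint ? hint : "(null)",
			chint ? chint : "(null)");
}
```

Without a compound object the model can only name the generic timer callback, which is the situation in the original splat. Once model_set_compound() has linked the work item, the second hint names the work function, which is the piece of information that identifies the driver.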
* Re: WARNING: ODEBUG bug in netdev_freemem (2)
From: syzbot @ 2019-08-08 0:25 UTC
To: alexander.h.duyck, amritha.nambiar, andriy.shevchenko, avagin, davem,
    dmitry.torokhov, dvyukov, eric.dumazet, f.fainelli, gregkh, idosch,
    jiri, kimbrownkd, linux-kernel, netdev, syzkaller-bugs, tglx,
    tyhicks, wanghai26, yuehaibing

syzbot has found a reproducer for the following crash on:

HEAD commit:    13dfb3fa Merge git://git.kernel.org/pub/scm/linux/kernel/g..
git tree:       net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1671e69a600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=d4cf1ffb87d590d7
dashboard link: https://syzkaller.appspot.com/bug?extid=c4521ac872a4ccc3afec
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=170542c2600000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+c4521ac872a4ccc3afec@syzkaller.appspotmail.com

bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): Released all slaves
------------[ cut here ]------------
ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x90 arch/x86/include/asm/paravirt.h:768
WARNING: CPU: 0 PID: 9919 at lib/debugobjects.c:481 debug_print_object+0x168/0x250 lib/debugobjects.c:481
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 9919 Comm: kworker/u4:6 Not tainted 5.3.0-rc3+ #122
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 panic+0x2dc/0x755 kernel/panic.c:219
 __warn.cold+0x20/0x4c kernel/panic.c:576
 report_bug+0x263/0x2b0 lib/bug.c:186
 fixup_bug arch/x86/kernel/traps.c:179 [inline]
 fixup_bug arch/x86/kernel/traps.c:174 [inline]
 do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
 do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
 invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1028
RIP: 0010:debug_print_object+0x168/0x250 lib/debugobjects.c:481
Code: dd e0 32 c6 87 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 b5 00 00 00 48 8b 14 dd e0 32 c6 87 48 c7 c7 e0 27 c6 87 e8 70 cd 05 fe <0f> 0b 83 05 a3 7c 67 06 01 48 83 c4 20 5b 41 5c 41 5d 41 5e 5d c3
RSP: 0018:ffff8880898ff838 EFLAGS: 00010086
RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff815c3ba6 RDI: ffffed101131fef9
RBP: ffff8880898ff878 R08: ffff88808ece42c0 R09: ffffed1015d04101
R10: ffffed1015d04100 R11: ffff8880ae820807 R12: 0000000000000001
R13: ffffffff88db6660 R14: ffffffff8161da40 R15: ffff88808f639af0
 __debug_check_no_obj_freed lib/debugobjects.c:963 [inline]
 debug_check_no_obj_freed+0x2d4/0x43f lib/debugobjects.c:994
 kfree+0xf8/0x2c0 mm/slab.c:3755
 kvfree+0x61/0x70 mm/util.c:488
 netdev_freemem+0x4c/0x60 net/core/dev.c:9093
 netdev_release+0x86/0xb0 net/core/net-sysfs.c:1635
 device_release+0x7a/0x210 drivers/base/core.c:1064
 kobject_cleanup lib/kobject.c:693 [inline]
 kobject_release lib/kobject.c:722 [inline]
 kref_put include/linux/kref.h:65 [inline]
 kobject_put.cold+0x289/0x2e6 lib/kobject.c:739
 netdev_run_todo+0x53b/0x7b0 net/core/dev.c:8998
 rtnl_unlock+0xe/0x10 net/core/rtnetlink.c:112
 default_device_exit_batch+0x358/0x410 net/core/dev.c:9781
 ops_exit_list.isra.0+0xfc/0x150 net/core/net_namespace.c:175
 cleanup_net+0x4e2/0xa70 net/core/net_namespace.c:594
 process_one_work+0x9af/0x1740 kernel/workqueue.c:2269
 worker_thread+0x98/0xe40 kernel/workqueue.c:2415
 kthread+0x361/0x430 kernel/kthread.c:255
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
Kernel Offset: disabled
Rebooting in 86400 seconds..

* Re: WARNING: ODEBUG bug in netdev_freemem (2)
From: Thomas Gleixner @ 2019-08-12 19:45 UTC
To: syzbot
Cc: alexander.h.duyck, amritha.nambiar, andriy.shevchenko, avagin, davem,
    dmitry.torokhov, dvyukov, eric.dumazet, f.fainelli, gregkh, idosch,
    jiri, kimbrownkd, linux-kernel, netdev, syzkaller-bugs, tyhicks,
    wanghai26, yuehaibing

On Wed, 7 Aug 2019, syzbot wrote:

> syzbot has found a reproducer for the following crash on:
>
> HEAD commit:    13dfb3fa Merge git://git.kernel.org/pub/scm/linux/kernel/g..
> git tree:       net-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1671e69a600000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=d4cf1ffb87d590d7
> dashboard link: https://syzkaller.appspot.com/bug?extid=c4521ac872a4ccc3afec
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=170542c2600000

I can't reproduce that here. Can you please apply the patch from:

  https://lore.kernel.org/lkml/alpine.DEB.2.21.1906241920540.32342@nanos.tec.linutronix.de

and try to reproduce with that applied? That should give us more
information about the actual delayed work.

Thanks,

	tglx