All of lore.kernel.org
 help / color / mirror / Atom feed
* [syzbot] WARNING: ODEBUG bug in blk_mq_hw_sysfs_release
@ 2021-09-21  3:15 syzbot
  2021-09-22 22:56 ` Thomas Gleixner
  0 siblings, 1 reply; 4+ messages in thread
From: syzbot @ 2021-09-21  3:15 UTC (permalink / raw)
  To: axboe, linux-block, linux-kernel, netdev, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    85c698863c15 net/ipv4/tcp_minisocks.c: remove superfluous ..
git tree:       net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=13d9d3e7300000
kernel config:  https://syzkaller.appspot.com/x/.config?x=6d93fe4341f98704
dashboard link: https://syzkaller.appspot.com/bug?extid=a10a3d280be23be45d04
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11bd98f7300000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+a10a3d280be23be45d04@syzkaller.appspotmail.com

------------[ cut here ]------------
ODEBUG: assert_init not available (active state 0) object type: timer_list hint: 0x0
WARNING: CPU: 0 PID: 3816 at lib/debugobjects.c:505 debug_print_object+0x16e/0x250 lib/debugobjects.c:505
Modules linked in:
CPU: 0 PID: 3816 Comm: syz-executor.0 Not tainted 5.15.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:debug_print_object+0x16e/0x250 lib/debugobjects.c:505
Code: ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 af 00 00 00 48 8b 14 dd c0 39 e4 89 4c 89 ee 48 c7 c7 c0 2d e4 89 e8 ef e3 14 05 <0f> 0b 83 05 35 09 91 09 01 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e c3
RSP: 0018:ffffc9000a6770f0 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000005 RCX: 0000000000000000
RDX: ffff88805b3db900 RSI: ffffffff815dbd88 RDI: fffff520014cee10
RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff815d5b2e R11: 0000000000000000 R12: ffffffff898de180
R13: ffffffff89e43440 R14: ffffffff8164bae0 R15: 1ffff920014cee29
FS:  00007fee66b3a700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f362471f3a0 CR3: 00000000272af000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 debug_object_assert_init lib/debugobjects.c:895 [inline]
 debug_object_assert_init+0x1f4/0x2e0 lib/debugobjects.c:866
 debug_timer_assert_init kernel/time/timer.c:739 [inline]
 debug_assert_init kernel/time/timer.c:784 [inline]
 try_to_del_timer_sync+0x6d/0x110 kernel/time/timer.c:1229
 del_timer_sync+0x138/0x1b0 kernel/time/timer.c:1382
 cleanup_srcu_struct+0x14c/0x2f0 kernel/rcu/srcutree.c:379
 blk_mq_hw_sysfs_release+0x147/0x190 block/blk-mq-sysfs.c:40
 kobject_cleanup lib/kobject.c:705 [inline]
 kobject_release lib/kobject.c:736 [inline]
 kref_put include/linux/kref.h:65 [inline]
 kobject_put+0x1c8/0x540 lib/kobject.c:753
 blk_mq_release+0x259/0x430 block/blk-mq.c:3094
 blk_release_queue+0x2a7/0x4a0 block/blk-sysfs.c:823
 kobject_cleanup lib/kobject.c:705 [inline]
 kobject_release lib/kobject.c:736 [inline]
 kref_put include/linux/kref.h:65 [inline]
 kobject_put+0x1c8/0x540 lib/kobject.c:753
 __blk_mq_alloc_disk+0x12e/0x160 block/blk-mq.c:3142
 nbd_dev_add+0x3be/0xb90 drivers/block/nbd.c:1716
 nbd_genl_connect+0x11f3/0x1930 drivers/block/nbd.c:1884
 genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:731
 genl_family_rcv_msg net/netlink/genetlink.c:775 [inline]
 genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:792
 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2504
 genl_rcv+0x24/0x40 net/netlink/genetlink.c:803
 netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline]
 netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1340
 netlink_sendmsg+0x86d/0xdb0 net/netlink/af_netlink.c:1929
 sock_sendmsg_nosec net/socket.c:704 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:724
 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2409
 ___sys_sendmsg+0xf3/0x170 net/socket.c:2463
 __sys_sendmsg+0xf3/0x1c0 net/socket.c:2492
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fee673e4739
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fee66b3a188 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fee674e9038 RCX: 00007fee673e4739
RDX: 0000000000000000 RSI: 0000000020000000 RDI: 0000000000000007
RBP: 00007fee6743ecc4 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fee674e9038
R13: 00007ffe3695106f R14: 00007fee66b3a300 R15: 0000000000022000


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [syzbot] WARNING: ODEBUG bug in blk_mq_hw_sysfs_release
  2021-09-21  3:15 [syzbot] WARNING: ODEBUG bug in blk_mq_hw_sysfs_release syzbot
@ 2021-09-22 22:56 ` Thomas Gleixner
  2021-09-22 23:27   ` Paul E. McKenney
  0 siblings, 1 reply; 4+ messages in thread
From: Thomas Gleixner @ 2021-09-22 22:56 UTC (permalink / raw)
  To: syzbot, axboe, linux-block, linux-kernel, netdev, syzkaller-bugs
  Cc: Paul E. McKenney

On Mon, Sep 20 2021 at 20:15, syzbot wrote:

Cc+: paulmck

> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    85c698863c15 net/ipv4/tcp_minisocks.c: remove superfluous ..
> git tree:       net-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=13d9d3e7300000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=6d93fe4341f98704
> dashboard link: https://syzkaller.appspot.com/bug?extid=a10a3d280be23be45d04
> compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11bd98f7300000
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+a10a3d280be23be45d04@syzkaller.appspotmail.com
>
> ------------[ cut here ]------------
> ODEBUG: assert_init not available (active state 0) object type: timer_list hint: 0x0
> WARNING: CPU: 0 PID: 3816 at lib/debugobjects.c:505 debug_print_object+0x16e/0x250 lib/debugobjects.c:505
> Modules linked in:
> CPU: 0 PID: 3816 Comm: syz-executor.0 Not tainted 5.15.0-rc1-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:debug_print_object+0x16e/0x250 lib/debugobjects.c:505
> Code: ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 af 00 00 00 48 8b 14 dd c0 39 e4 89 4c 89 ee 48 c7 c7 c0 2d e4 89 e8 ef e3 14 05 <0f> 0b 83 05 35 09 91 09 01 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e c3
> RSP: 0018:ffffc9000a6770f0 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: 0000000000000005 RCX: 0000000000000000
> RDX: ffff88805b3db900 RSI: ffffffff815dbd88 RDI: fffff520014cee10
> RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
> R10: ffffffff815d5b2e R11: 0000000000000000 R12: ffffffff898de180
> R13: ffffffff89e43440 R14: ffffffff8164bae0 R15: 1ffff920014cee29
> FS:  00007fee66b3a700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f362471f3a0 CR3: 00000000272af000 CR4: 00000000001506f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  debug_object_assert_init lib/debugobjects.c:895 [inline]
>  debug_object_assert_init+0x1f4/0x2e0 lib/debugobjects.c:866
>  debug_timer_assert_init kernel/time/timer.c:739 [inline]
>  debug_assert_init kernel/time/timer.c:784 [inline]
>  try_to_del_timer_sync+0x6d/0x110 kernel/time/timer.c:1229
>  del_timer_sync+0x138/0x1b0 kernel/time/timer.c:1382
>  cleanup_srcu_struct+0x14c/0x2f0 kernel/rcu/srcutree.c:379

So cleanup_srcu_struct() tries to delete a timer which was never initialized....

>  blk_mq_hw_sysfs_release+0x147/0x190 block/blk-mq-sysfs.c:40
>  kobject_cleanup lib/kobject.c:705 [inline]
>  kobject_release lib/kobject.c:736 [inline]
>  kref_put include/linux/kref.h:65 [inline]
>  kobject_put+0x1c8/0x540 lib/kobject.c:753
>  blk_mq_release+0x259/0x430 block/blk-mq.c:3094
>  blk_release_queue+0x2a7/0x4a0 block/blk-sysfs.c:823
>  kobject_cleanup lib/kobject.c:705 [inline]
>  kobject_release lib/kobject.c:736 [inline]
>  kref_put include/linux/kref.h:65 [inline]
>  kobject_put+0x1c8/0x540 lib/kobject.c:753
>  __blk_mq_alloc_disk+0x12e/0x160 block/blk-mq.c:3142
>  nbd_dev_add+0x3be/0xb90 drivers/block/nbd.c:1716
>  nbd_genl_connect+0x11f3/0x1930 drivers/block/nbd.c:1884
>  genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:731
>  genl_family_rcv_msg net/netlink/genetlink.c:775 [inline]
>  genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:792
>  netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2504
>  genl_rcv+0x24/0x40 net/netlink/genetlink.c:803
>  netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline]
>  netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1340
>  netlink_sendmsg+0x86d/0xdb0 net/netlink/af_netlink.c:1929
>  sock_sendmsg_nosec net/socket.c:704 [inline]
>  sock_sendmsg+0xcf/0x120 net/socket.c:724
>  ____sys_sendmsg+0x6e8/0x810 net/socket.c:2409
>  ___sys_sendmsg+0xf3/0x170 net/socket.c:2463
>  __sys_sendmsg+0xf3/0x1c0 net/socket.c:2492
>  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>  do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x7fee673e4739
> Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007fee66b3a188 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 00007fee674e9038 RCX: 00007fee673e4739
> RDX: 0000000000000000 RSI: 0000000020000000 RDI: 0000000000000007
> RBP: 00007fee6743ecc4 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 00007fee674e9038
> R13: 00007ffe3695106f R14: 00007fee66b3a300 R15: 0000000000022000
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> syzbot can test patches for this issue, for details see:
> https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [syzbot] WARNING: ODEBUG bug in blk_mq_hw_sysfs_release
  2021-09-22 22:56 ` Thomas Gleixner
@ 2021-09-22 23:27   ` Paul E. McKenney
  2021-09-23  3:03     ` Ming Lei
  0 siblings, 1 reply; 4+ messages in thread
From: Paul E. McKenney @ 2021-09-22 23:27 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: syzbot, axboe, linux-block, linux-kernel, netdev, syzkaller-bugs

On Thu, Sep 23, 2021 at 12:56:39AM +0200, Thomas Gleixner wrote:
> On Mon, Sep 20 2021 at 20:15, syzbot wrote:
> 
> Cc+: paulmck
> 
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit:    85c698863c15 net/ipv4/tcp_minisocks.c: remove superfluous ..
> > git tree:       net-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=13d9d3e7300000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=6d93fe4341f98704
> > dashboard link: https://syzkaller.appspot.com/bug?extid=a10a3d280be23be45d04
> > compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11bd98f7300000
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+a10a3d280be23be45d04@syzkaller.appspotmail.com
> >
> > ------------[ cut here ]------------
> > ODEBUG: assert_init not available (active state 0) object type: timer_list hint: 0x0
> > WARNING: CPU: 0 PID: 3816 at lib/debugobjects.c:505 debug_print_object+0x16e/0x250 lib/debugobjects.c:505
> > Modules linked in:
> > CPU: 0 PID: 3816 Comm: syz-executor.0 Not tainted 5.15.0-rc1-syzkaller #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > RIP: 0010:debug_print_object+0x16e/0x250 lib/debugobjects.c:505
> > Code: ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 af 00 00 00 48 8b 14 dd c0 39 e4 89 4c 89 ee 48 c7 c7 c0 2d e4 89 e8 ef e3 14 05 <0f> 0b 83 05 35 09 91 09 01 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e c3
> > RSP: 0018:ffffc9000a6770f0 EFLAGS: 00010282
> > RAX: 0000000000000000 RBX: 0000000000000005 RCX: 0000000000000000
> > RDX: ffff88805b3db900 RSI: ffffffff815dbd88 RDI: fffff520014cee10
> > RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
> > R10: ffffffff815d5b2e R11: 0000000000000000 R12: ffffffff898de180
> > R13: ffffffff89e43440 R14: ffffffff8164bae0 R15: 1ffff920014cee29
> > FS:  00007fee66b3a700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007f362471f3a0 CR3: 00000000272af000 CR4: 00000000001506f0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> >  debug_object_assert_init lib/debugobjects.c:895 [inline]
> >  debug_object_assert_init+0x1f4/0x2e0 lib/debugobjects.c:866
> >  debug_timer_assert_init kernel/time/timer.c:739 [inline]
> >  debug_assert_init kernel/time/timer.c:784 [inline]
> >  try_to_del_timer_sync+0x6d/0x110 kernel/time/timer.c:1229
> >  del_timer_sync+0x138/0x1b0 kernel/time/timer.c:1382
> >  cleanup_srcu_struct+0x14c/0x2f0 kernel/rcu/srcutree.c:379
> 
> So cleanup_srcu_struct() tries to delete a timer which was never initialized....

That does not sound like a way to keep a kernel running...

For dynamically allocated srcu_struct structures, what is supposed to
happen is that init_srcu_struct() is invoked at some point before the
first use.  Then init_srcu_struct() invokes init_srcu_struct_fields()
which invokes init_srcu_struct_nodes() which initializes that timer.
Unless the memory allocations in init_srcu_struct_fields() failed,
in which case init_srcu_struct() should have handed you back a -ENOMEM.

OK, there is a call to init_srcu_struct() in blk_mq_alloc_hctx().
It does ignore the init_srcu_struct() return value, so maybe a
WARN_ON_ONCE(init_srcu_struct(hctx->srcu)) would be good.

The ->srcu field is declared as follows:

	struct srcu_struct	srcu[];

The blk_mq_hw_ctx_size() function adjusts the size.  All of this is
controlled by a BLK_MQ_F_BLOCKING flag.  If this flag were to change,
clearly bad things could happen.  The places where it is set look to me
to be initialization time, but it might be good to have someone familiar
with this code double-check this.  Or have a separate bit that records
the state of BLK_MQ_F_BLOCKING at blk_mq_alloc_hctx() time and complain
bitterly if there was a change at blk_mq_hw_sysfs_release() time.

Other thoughts?

							Thanx, Paul

> >  blk_mq_hw_sysfs_release+0x147/0x190 block/blk-mq-sysfs.c:40
> >  kobject_cleanup lib/kobject.c:705 [inline]
> >  kobject_release lib/kobject.c:736 [inline]
> >  kref_put include/linux/kref.h:65 [inline]
> >  kobject_put+0x1c8/0x540 lib/kobject.c:753
> >  blk_mq_release+0x259/0x430 block/blk-mq.c:3094
> >  blk_release_queue+0x2a7/0x4a0 block/blk-sysfs.c:823
> >  kobject_cleanup lib/kobject.c:705 [inline]
> >  kobject_release lib/kobject.c:736 [inline]
> >  kref_put include/linux/kref.h:65 [inline]
> >  kobject_put+0x1c8/0x540 lib/kobject.c:753
> >  __blk_mq_alloc_disk+0x12e/0x160 block/blk-mq.c:3142
> >  nbd_dev_add+0x3be/0xb90 drivers/block/nbd.c:1716
> >  nbd_genl_connect+0x11f3/0x1930 drivers/block/nbd.c:1884
> >  genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:731
> >  genl_family_rcv_msg net/netlink/genetlink.c:775 [inline]
> >  genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:792
> >  netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2504
> >  genl_rcv+0x24/0x40 net/netlink/genetlink.c:803
> >  netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline]
> >  netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1340
> >  netlink_sendmsg+0x86d/0xdb0 net/netlink/af_netlink.c:1929
> >  sock_sendmsg_nosec net/socket.c:704 [inline]
> >  sock_sendmsg+0xcf/0x120 net/socket.c:724
> >  ____sys_sendmsg+0x6e8/0x810 net/socket.c:2409
> >  ___sys_sendmsg+0xf3/0x170 net/socket.c:2463
> >  __sys_sendmsg+0xf3/0x1c0 net/socket.c:2492
> >  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> >  do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
> >  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > RIP: 0033:0x7fee673e4739
> > Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
> > RSP: 002b:00007fee66b3a188 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> > RAX: ffffffffffffffda RBX: 00007fee674e9038 RCX: 00007fee673e4739
> > RDX: 0000000000000000 RSI: 0000000020000000 RDI: 0000000000000007
> > RBP: 00007fee6743ecc4 R08: 0000000000000000 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000246 R12: 00007fee674e9038
> > R13: 00007ffe3695106f R14: 00007fee66b3a300 R15: 0000000000022000
> >
> >
> > ---
> > This report is generated by a bot. It may contain errors.
> > See https://goo.gl/tpsmEJ for more information about syzbot.
> > syzbot engineers can be reached at syzkaller@googlegroups.com.
> >
> > syzbot will keep track of this issue. See:
> > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> > syzbot can test patches for this issue, for details see:
> > https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [syzbot] WARNING: ODEBUG bug in blk_mq_hw_sysfs_release
  2021-09-22 23:27   ` Paul E. McKenney
@ 2021-09-23  3:03     ` Ming Lei
  0 siblings, 0 replies; 4+ messages in thread
From: Ming Lei @ 2021-09-23  3:03 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Thomas Gleixner, syzbot, axboe, linux-block, linux-kernel,
	netdev, syzkaller-bugs

On Wed, Sep 22, 2021 at 04:27:02PM -0700, Paul E. McKenney wrote:
> On Thu, Sep 23, 2021 at 12:56:39AM +0200, Thomas Gleixner wrote:
> > On Mon, Sep 20 2021 at 20:15, syzbot wrote:
> > 
> > Cc+: paulmck
> > 
> > > Hello,
> > >
> > > syzbot found the following issue on:
> > >
> > > HEAD commit:    85c698863c15 net/ipv4/tcp_minisocks.c: remove superfluous ..
> > > git tree:       net-next
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=13d9d3e7300000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=6d93fe4341f98704
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=a10a3d280be23be45d04
> > > compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11bd98f7300000
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+a10a3d280be23be45d04@syzkaller.appspotmail.com
> > >
> > > ------------[ cut here ]------------
> > > ODEBUG: assert_init not available (active state 0) object type: timer_list hint: 0x0
> > > WARNING: CPU: 0 PID: 3816 at lib/debugobjects.c:505 debug_print_object+0x16e/0x250 lib/debugobjects.c:505
> > > Modules linked in:
> > > CPU: 0 PID: 3816 Comm: syz-executor.0 Not tainted 5.15.0-rc1-syzkaller #0
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > > RIP: 0010:debug_print_object+0x16e/0x250 lib/debugobjects.c:505
> > > Code: ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 af 00 00 00 48 8b 14 dd c0 39 e4 89 4c 89 ee 48 c7 c7 c0 2d e4 89 e8 ef e3 14 05 <0f> 0b 83 05 35 09 91 09 01 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e c3
> > > RSP: 0018:ffffc9000a6770f0 EFLAGS: 00010282
> > > RAX: 0000000000000000 RBX: 0000000000000005 RCX: 0000000000000000
> > > RDX: ffff88805b3db900 RSI: ffffffff815dbd88 RDI: fffff520014cee10
> > > RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
> > > R10: ffffffff815d5b2e R11: 0000000000000000 R12: ffffffff898de180
> > > R13: ffffffff89e43440 R14: ffffffff8164bae0 R15: 1ffff920014cee29
> > > FS:  00007fee66b3a700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
> > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 00007f362471f3a0 CR3: 00000000272af000 CR4: 00000000001506f0
> > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > Call Trace:
> > >  debug_object_assert_init lib/debugobjects.c:895 [inline]
> > >  debug_object_assert_init+0x1f4/0x2e0 lib/debugobjects.c:866
> > >  debug_timer_assert_init kernel/time/timer.c:739 [inline]
> > >  debug_assert_init kernel/time/timer.c:784 [inline]
> > >  try_to_del_timer_sync+0x6d/0x110 kernel/time/timer.c:1229
> > >  del_timer_sync+0x138/0x1b0 kernel/time/timer.c:1382
> > >  cleanup_srcu_struct+0x14c/0x2f0 kernel/rcu/srcutree.c:379
> > 
> > So cleanup_srcu_struct() tries to delete a timer which was never initialized....
> 
> That does not sound like a way to keep a kernel running...
> 
> For dynamically allocated srcu_struct structures, what is supposed to
> happen is that init_srcu_struct() is invoked at some point before the
> first use.  Then init_srcu_struct() invokes init_srcu_struct_fields()
> which invokes init_srcu_struct_nodes() which initializes that timer.
> Unless the memory allocations in init_srcu_struct_fields() failed,
> in which case init_srcu_struct() should have handed you back a -ENOMEM.
> 
> OK, there is a call to init_srcu_struct() in blk_mq_alloc_hctx().
> It does ignore the init_srcu_struct() return value, so maybe a
> WARN_ON_ONCE(init_srcu_struct(hctx->srcu)) would be good.

Yeah.

That should be the issue because del_timer_sync() from flush_delayed_work() in
cleanup_srcu_struct() isn't complained from the stack trace log, and
just the timer in perpcu field of ->sda isn't initialized, and ->sda is
allocated via alloc_percpu().

> 
> The ->srcu field is declared as follows:
> 
> 	struct srcu_struct	srcu[];
> 
> The blk_mq_hw_ctx_size() function adjusts the size.  All of this is
> controlled by a BLK_MQ_F_BLOCKING flag.  If this flag were to change,
> clearly bad things could happen.  The places where it is set look to me
> to be initialization time, but it might be good to have someone familiar
> with this code double-check this.  Or have a separate bit that records
> the state of BLK_MQ_F_BLOCKING at blk_mq_alloc_hctx() time and complain
> bitterly if there was a change at blk_mq_hw_sysfs_release() time.

BLK_MQ_F_BLOCKING is set from the beginning and never changed.


Thanks,
Ming


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-09-23  3:03 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-21  3:15 [syzbot] WARNING: ODEBUG bug in blk_mq_hw_sysfs_release syzbot
2021-09-22 22:56 ` Thomas Gleixner
2021-09-22 23:27   ` Paul E. McKenney
2021-09-23  3:03     ` Ming Lei

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.