linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* BUG: sleeping function called from invalid context in tcf_chain0_head_change_cb_del
@ 2019-09-16 23:39 syzbot
  2019-09-17  1:58 ` Cong Wang
  0 siblings, 1 reply; 5+ messages in thread
From: syzbot @ 2019-09-16 23:39 UTC (permalink / raw)
  To: ast, daniel, davem, dsahern, f.fainelli, hawk, idosch,
	jakub.kicinski, jhs, jiri, jiri, john.fastabend, kafai,
	linux-kernel, netdev, nikolay, petrm, roopa, songliubraving,
	syzkaller-bugs, vladbu, xdp-newbies, xiyou.wangcong, yhs

Hello,

syzbot found the following crash on:

HEAD commit:    1609d760 Merge tag 'for-linus' of git://git.kernel.org/pub..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10236abe600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=ed2b148cd67382ec
dashboard link: https://syzkaller.appspot.com/bug?extid=ac54455281db908c581e
compiler:       clang version 9.0.0 (/home/glider/llvm/clang  
80fee25776c2fb61e74c1ecb1a523375c2500b69)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=116c4b11600000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15ff270d600000

The bug was bisected to:

commit c266f64dbfa2a970a13b0574246c0ddfec492365
Author: Vlad Buslov <vladbu@mellanox.com>
Date:   Mon Feb 11 08:55:32 2019 +0000

     net: sched: protect block state with mutex

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=16e7ca65600000
final crash:    https://syzkaller.appspot.com/x/report.txt?x=15e7ca65600000
console output: https://syzkaller.appspot.com/x/log.txt?x=11e7ca65600000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+ac54455281db908c581e@syzkaller.appspotmail.com
Fixes: c266f64dbfa2 ("net: sched: protect block state with mutex")

BUG: sleeping function called from invalid context at  
kernel/locking/mutex.c:909
in_atomic(): 1, irqs_disabled(): 0, pid: 9297, name: syz-executor942
INFO: lockdep is turned off.
Preemption disabled at:
[<ffffffff8604de24>] spin_lock_bh include/linux/spinlock.h:343 [inline]
[<ffffffff8604de24>] sch_tree_lock include/net/sch_generic.h:570 [inline]
[<ffffffff8604de24>] sfb_change+0x284/0xd30 net/sched/sch_sfb.c:519
CPU: 0 PID: 9297 Comm: syz-executor942 Not tainted 5.3.0-rc8+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x1d8/0x2f8 lib/dump_stack.c:113
  ___might_sleep+0x3ff/0x530 kernel/sched/core.c:6608
  __might_sleep+0x8f/0x100 kernel/sched/core.c:6561
  __mutex_lock_common+0x4e/0x2820 kernel/locking/mutex.c:909
  __mutex_lock kernel/locking/mutex.c:1077 [inline]
  mutex_lock_nested+0x1b/0x30 kernel/locking/mutex.c:1092
  tcf_chain0_head_change_cb_del+0x30/0x390 net/sched/cls_api.c:932
  tcf_block_put_ext+0x3d/0x2a0 net/sched/cls_api.c:1502
  tcf_block_put+0x6e/0x90 net/sched/cls_api.c:1515
  sfb_destroy+0x47/0x70 net/sched/sch_sfb.c:467
  qdisc_destroy+0x147/0x4d0 net/sched/sch_generic.c:968
  qdisc_put+0x58/0x90 net/sched/sch_generic.c:992
  sfb_change+0x52d/0xd30 net/sched/sch_sfb.c:522
  qdisc_change net/sched/sch_api.c:1321 [inline]
  tc_modify_qdisc+0x184d/0x1ea0 net/sched/sch_api.c:1623
  rtnetlink_rcv_msg+0x889/0xd40 net/core/rtnetlink.c:5223
  netlink_rcv_skb+0x19e/0x3d0 net/netlink/af_netlink.c:2477
  rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:5241
  netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
  netlink_unicast+0x787/0x900 net/netlink/af_netlink.c:1328
  netlink_sendmsg+0x993/0xc50 net/netlink/af_netlink.c:1917
  sock_sendmsg_nosec net/socket.c:637 [inline]
  sock_sendmsg net/socket.c:657 [inline]
  ___sys_sendmsg+0x60d/0x910 net/socket.c:2311
  __sys_sendmsg net/socket.c:2356 [inline]
  __do_sys_sendmsg net/socket.c:2365 [inline]
  __se_sys_sendmsg net/socket.c:2363 [inline]
  __x64_sys_sendmsg+0x17c/0x200 net/socket.c:2363
  do_syscall_64+0xfe/0x140 arch/x86/entry/common.c:296
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x447509
Code: e8 5c 14 03 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 ab 0e fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f49d6c94db8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000006dcc78 RCX: 0000000000447509
RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000007
RBP: 00000000006dcc70 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000ffffffff R11: 0000000000000246 R12: 00000000006dcc7c
R13: 00007ffc5c2e9dff R14: 00007f49d6c959c0 R15: 000000000000002d


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: BUG: sleeping function called from invalid context in tcf_chain0_head_change_cb_del
  2019-09-16 23:39 BUG: sleeping function called from invalid context in tcf_chain0_head_change_cb_del syzbot
@ 2019-09-17  1:58 ` Cong Wang
  2019-09-17  8:27   ` Vlad Buslov
  0 siblings, 1 reply; 5+ messages in thread
From: Cong Wang @ 2019-09-17  1:58 UTC (permalink / raw)
  To: syzbot
  Cc: Alexei Starovoitov, Daniel Borkmann, David Miller, David Ahern,
	Florian Fainelli, hawk, Ido Schimmel, Jakub Kicinski,
	Jamal Hadi Salim, Jiri Pirko, Jiri Pirko, John Fastabend,
	Martin KaFai Lau, LKML, Linux Kernel Network Developers,
	Nikolay Aleksandrov, petrm, Roopa Prabhu, Song Liu,
	syzkaller-bugs, Vlad Buslov, xdp-newbies, yhs

On Mon, Sep 16, 2019 at 4:39 PM syzbot
<syzbot+ac54455281db908c581e@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    1609d760 Merge tag 'for-linus' of git://git.kernel.org/pub..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=10236abe600000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=ed2b148cd67382ec
> dashboard link: https://syzkaller.appspot.com/bug?extid=ac54455281db908c581e
> compiler:       clang version 9.0.0 (/home/glider/llvm/clang
> 80fee25776c2fb61e74c1ecb1a523375c2500b69)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=116c4b11600000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15ff270d600000
>
> The bug was bisected to:
>
> commit c266f64dbfa2a970a13b0574246c0ddfec492365
> Author: Vlad Buslov <vladbu@mellanox.com>
> Date:   Mon Feb 11 08:55:32 2019 +0000
>
>      net: sched: protect block state with mutex
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=16e7ca65600000
> final crash:    https://syzkaller.appspot.com/x/report.txt?x=15e7ca65600000
> console output: https://syzkaller.appspot.com/x/log.txt?x=11e7ca65600000
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+ac54455281db908c581e@syzkaller.appspotmail.com
> Fixes: c266f64dbfa2 ("net: sched: protect block state with mutex")
>
> BUG: sleeping function called from invalid context at
> kernel/locking/mutex.c:909
> in_atomic(): 1, irqs_disabled(): 0, pid: 9297, name: syz-executor942
> INFO: lockdep is turned off.
> Preemption disabled at:
> [<ffffffff8604de24>] spin_lock_bh include/linux/spinlock.h:343 [inline]
> [<ffffffff8604de24>] sch_tree_lock include/net/sch_generic.h:570 [inline]
> [<ffffffff8604de24>] sfb_change+0x284/0xd30 net/sched/sch_sfb.c:519
> CPU: 0 PID: 9297 Comm: syz-executor942 Not tainted 5.3.0-rc8+ #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>   __dump_stack lib/dump_stack.c:77 [inline]
>   dump_stack+0x1d8/0x2f8 lib/dump_stack.c:113
>   ___might_sleep+0x3ff/0x530 kernel/sched/core.c:6608
>   __might_sleep+0x8f/0x100 kernel/sched/core.c:6561
>   __mutex_lock_common+0x4e/0x2820 kernel/locking/mutex.c:909
>   __mutex_lock kernel/locking/mutex.c:1077 [inline]
>   mutex_lock_nested+0x1b/0x30 kernel/locking/mutex.c:1092
>   tcf_chain0_head_change_cb_del+0x30/0x390 net/sched/cls_api.c:932
>   tcf_block_put_ext+0x3d/0x2a0 net/sched/cls_api.c:1502
>   tcf_block_put+0x6e/0x90 net/sched/cls_api.c:1515
>   sfb_destroy+0x47/0x70 net/sched/sch_sfb.c:467
>   qdisc_destroy+0x147/0x4d0 net/sched/sch_generic.c:968
>   qdisc_put+0x58/0x90 net/sched/sch_generic.c:992
>   sfb_change+0x52d/0xd30 net/sched/sch_sfb.c:522

I don't think we have to hold the qdisc tree lock when destroying
the old qdisc. Does the following change make sense?

diff --git a/net/sched/sch_sfb.c b/net/sched/sch_sfb.c
index 1dff8506a715..726d0fa956b1 100644
--- a/net/sched/sch_sfb.c
+++ b/net/sched/sch_sfb.c
@@ -488,7 +488,7 @@ static int sfb_change(struct Qdisc *sch, struct nlattr *opt,
                      struct netlink_ext_ack *extack)
 {
        struct sfb_sched_data *q = qdisc_priv(sch);
-       struct Qdisc *child;
+       struct Qdisc *child, *tmp;
        struct nlattr *tb[TCA_SFB_MAX + 1];
        const struct tc_sfb_qopt *ctl = &sfb_default_ops;
        u32 limit;
@@ -519,7 +519,7 @@ static int sfb_change(struct Qdisc *sch, struct nlattr *opt,
        sch_tree_lock(sch);

        qdisc_tree_flush_backlog(q->qdisc);
-       qdisc_put(q->qdisc);
+       tmp = q->qdisc;
        q->qdisc = child;

        q->rehash_interval = msecs_to_jiffies(ctl->rehash_interval);
@@ -543,6 +543,7 @@ static int sfb_change(struct Qdisc *sch, struct nlattr *opt,

        sch_tree_unlock(sch);

+       qdisc_put(tmp);
        return 0;
 }


What do you think, Vlad?

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: BUG: sleeping function called from invalid context in tcf_chain0_head_change_cb_del
  2019-09-17  1:58 ` Cong Wang
@ 2019-09-17  8:27   ` Vlad Buslov
  2019-09-17 17:03     ` Cong Wang
  0 siblings, 1 reply; 5+ messages in thread
From: Vlad Buslov @ 2019-09-17  8:27 UTC (permalink / raw)
  To: Cong Wang
  Cc: syzbot, Alexei Starovoitov, Daniel Borkmann, David Miller,
	David Ahern, Florian Fainelli, hawk, Ido Schimmel,
	Jakub Kicinski, Jamal Hadi Salim, Jiri Pirko, Jiri Pirko,
	John Fastabend, Martin KaFai Lau, LKML,
	Linux Kernel Network Developers, Nikolay Aleksandrov,
	Petr Machata, Roopa Prabhu, Song Liu, syzkaller-bugs,
	Vlad Buslov, xdp-newbies, yhs


On Tue 17 Sep 2019 at 04:58, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Mon, Sep 16, 2019 at 4:39 PM syzbot
> <syzbot+ac54455281db908c581e@syzkaller.appspotmail.com> wrote:
>>
>> Hello,
>>
>> syzbot found the following crash on:
>>
>> HEAD commit:    1609d760 Merge tag 'for-linus' of git://git.kernel.org/pub..
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=10236abe600000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=ed2b148cd67382ec
>> dashboard link: https://syzkaller.appspot.com/bug?extid=ac54455281db908c581e
>> compiler:       clang version 9.0.0 (/home/glider/llvm/clang
>> 80fee25776c2fb61e74c1ecb1a523375c2500b69)
>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=116c4b11600000
>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15ff270d600000
>>
>> The bug was bisected to:
>>
>> commit c266f64dbfa2a970a13b0574246c0ddfec492365
>> Author: Vlad Buslov <vladbu@mellanox.com>
>> Date:   Mon Feb 11 08:55:32 2019 +0000
>>
>>      net: sched: protect block state with mutex
>>
>> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=16e7ca65600000
>> final crash:    https://syzkaller.appspot.com/x/report.txt?x=15e7ca65600000
>> console output: https://syzkaller.appspot.com/x/log.txt?x=11e7ca65600000
>>
>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: syzbot+ac54455281db908c581e@syzkaller.appspotmail.com
>> Fixes: c266f64dbfa2 ("net: sched: protect block state with mutex")
>>
>> BUG: sleeping function called from invalid context at
>> kernel/locking/mutex.c:909
>> in_atomic(): 1, irqs_disabled(): 0, pid: 9297, name: syz-executor942
>> INFO: lockdep is turned off.
>> Preemption disabled at:
>> [<ffffffff8604de24>] spin_lock_bh include/linux/spinlock.h:343 [inline]
>> [<ffffffff8604de24>] sch_tree_lock include/net/sch_generic.h:570 [inline]
>> [<ffffffff8604de24>] sfb_change+0x284/0xd30 net/sched/sch_sfb.c:519
>> CPU: 0 PID: 9297 Comm: syz-executor942 Not tainted 5.3.0-rc8+ #0
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> Google 01/01/2011
>> Call Trace:
>>   __dump_stack lib/dump_stack.c:77 [inline]
>>   dump_stack+0x1d8/0x2f8 lib/dump_stack.c:113
>>   ___might_sleep+0x3ff/0x530 kernel/sched/core.c:6608
>>   __might_sleep+0x8f/0x100 kernel/sched/core.c:6561
>>   __mutex_lock_common+0x4e/0x2820 kernel/locking/mutex.c:909
>>   __mutex_lock kernel/locking/mutex.c:1077 [inline]
>>   mutex_lock_nested+0x1b/0x30 kernel/locking/mutex.c:1092
>>   tcf_chain0_head_change_cb_del+0x30/0x390 net/sched/cls_api.c:932
>>   tcf_block_put_ext+0x3d/0x2a0 net/sched/cls_api.c:1502
>>   tcf_block_put+0x6e/0x90 net/sched/cls_api.c:1515
>>   sfb_destroy+0x47/0x70 net/sched/sch_sfb.c:467
>>   qdisc_destroy+0x147/0x4d0 net/sched/sch_generic.c:968
>>   qdisc_put+0x58/0x90 net/sched/sch_generic.c:992
>>   sfb_change+0x52d/0xd30 net/sched/sch_sfb.c:522
>
> I don't think we have to hold the qdisc tree lock when destroying
> the old qdisc. Does the following change make sense?
>
> diff --git a/net/sched/sch_sfb.c b/net/sched/sch_sfb.c
> index 1dff8506a715..726d0fa956b1 100644
> --- a/net/sched/sch_sfb.c
> +++ b/net/sched/sch_sfb.c
> @@ -488,7 +488,7 @@ static int sfb_change(struct Qdisc *sch, struct nlattr *opt,
>                       struct netlink_ext_ack *extack)
>  {
>         struct sfb_sched_data *q = qdisc_priv(sch);
> -       struct Qdisc *child;
> +       struct Qdisc *child, *tmp;
>         struct nlattr *tb[TCA_SFB_MAX + 1];
>         const struct tc_sfb_qopt *ctl = &sfb_default_ops;
>         u32 limit;
> @@ -519,7 +519,7 @@ static int sfb_change(struct Qdisc *sch, struct nlattr *opt,
>         sch_tree_lock(sch);
>
>         qdisc_tree_flush_backlog(q->qdisc);
> -       qdisc_put(q->qdisc);
> +       tmp = q->qdisc;
>         q->qdisc = child;
>
>         q->rehash_interval = msecs_to_jiffies(ctl->rehash_interval);
> @@ -543,6 +543,7 @@ static int sfb_change(struct Qdisc *sch, struct nlattr *opt,
>
>         sch_tree_unlock(sch);
>
> +       qdisc_put(tmp);
>         return 0;
>  }
>
>
> What do you think, Vlad?

Hi Cong,

Don't see why we would need qdisc tree lock while releasing the
reference to (or destroying) previous Qdisc. I've skimmed through other
scheds and it looks like sch_multiq, sch_htb and sch_tbf are also
affected. Do you want me to send patches?

Regards,
Vlad

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: BUG: sleeping function called from invalid context in tcf_chain0_head_change_cb_del
  2019-09-17  8:27   ` Vlad Buslov
@ 2019-09-17 17:03     ` Cong Wang
  2019-09-17 19:57       ` Vlad Buslov
  0 siblings, 1 reply; 5+ messages in thread
From: Cong Wang @ 2019-09-17 17:03 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: syzbot, Alexei Starovoitov, Daniel Borkmann, David Miller,
	David Ahern, Florian Fainelli, hawk, Ido Schimmel,
	Jakub Kicinski, Jamal Hadi Salim, Jiri Pirko, Jiri Pirko,
	John Fastabend, Martin KaFai Lau, LKML,
	Linux Kernel Network Developers, Nikolay Aleksandrov,
	Petr Machata, Roopa Prabhu, Song Liu, syzkaller-bugs,
	xdp-newbies, yhs

On Tue, Sep 17, 2019 at 1:27 AM Vlad Buslov <vladbu@mellanox.com> wrote:
> Hi Cong,
>
> Don't see why we would need qdisc tree lock while releasing the
> reference to (or destroying) previous Qdisc. I've skimmed through other
> scheds and it looks like sch_multiq, sch_htb and sch_tbf are also
> affected. Do you want me to send patches?

Yes, please do.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: BUG: sleeping function called from invalid context in tcf_chain0_head_change_cb_del
  2019-09-17 17:03     ` Cong Wang
@ 2019-09-17 19:57       ` Vlad Buslov
  0 siblings, 0 replies; 5+ messages in thread
From: Vlad Buslov @ 2019-09-17 19:57 UTC (permalink / raw)
  To: Cong Wang
  Cc: Vlad Buslov, syzbot, Alexei Starovoitov, Daniel Borkmann,
	David Miller, David Ahern, Florian Fainelli, hawk, Ido Schimmel,
	Jakub Kicinski, Jamal Hadi Salim, Jiri Pirko, Jiri Pirko,
	John Fastabend, Martin KaFai Lau, LKML,
	Linux Kernel Network Developers, Nikolay Aleksandrov,
	Petr Machata, Roopa Prabhu, Song Liu, syzkaller-bugs,
	xdp-newbies, yhs


On Tue 17 Sep 2019 at 20:03, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Tue, Sep 17, 2019 at 1:27 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>> Hi Cong,
>>
>> Don't see why we would need qdisc tree lock while releasing the
>> reference to (or destroying) previous Qdisc. I've skimmed through other
>> scheds and it looks like sch_multiq, sch_htb and sch_tbf are also
>> affected. Do you want me to send patches?
>
> Yes, please do.

It looks like tbf is not affected by the bug after all. Relevant part of
code from tbf_change():

	if (q->qdisc != &noop_qdisc) {
		err = fifo_set_limit(q->qdisc, qopt->limit);
		if (err)
			goto done;
	} else if (qopt->limit > 0) {
		child = fifo_create_dflt(sch, &bfifo_qdisc_ops, qopt->limit,
					 extack);
		if (IS_ERR(child)) {
			err = PTR_ERR(child);
			goto done;
		}

		/* child is fifo, no need to check for noop_qdisc */
		qdisc_hash_add(child, true);
	}

	sch_tree_lock(sch);
	if (child) {
		qdisc_tree_flush_backlog(q->qdisc);
		qdisc_put(q->qdisc);
		q->qdisc = child;
	}

It seems that qdisc_put() is redundant here because it is only called
q->qdisc == &noop_qdisc, which is a noop.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-09-17 19:57 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-16 23:39 BUG: sleeping function called from invalid context in tcf_chain0_head_change_cb_del syzbot
2019-09-17  1:58 ` Cong Wang
2019-09-17  8:27   ` Vlad Buslov
2019-09-17 17:03     ` Cong Wang
2019-09-17 19:57       ` Vlad Buslov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).