* [syzbot] BUG: unable to handle kernel NULL pointer dereference in htb_select_queue @ 2021-03-09 15:13 syzbot 2021-03-09 15:20 ` Eric Dumazet 0 siblings, 1 reply; 6+ messages in thread From: syzbot @ 2021-03-09 15:13 UTC (permalink / raw) To: davem, jhs, jiri, kuba, linux-kernel, maximmi, netdev, syzkaller-bugs, tariqt, xiyou.wangcong Hello, syzbot found the following issue on: HEAD commit: 38b5133a octeontx2-pf: Fix otx2_get_fecparam() git tree: net-next console output: https://syzkaller.appspot.com/x/log.txt?x=166288a8d00000 kernel config: https://syzkaller.appspot.com/x/.config?x=dbc1ca9e55dc1f9f dashboard link: https://syzkaller.appspot.com/bug?extid=b53a709f04722ca12a3c syz repro: https://syzkaller.appspot.com/x/repro.syz?x=119454ccd00000 The issue was bisected to: commit d03b195b5aa015f6c11988b86a3625f8d5dbac52 Author: Maxim Mikityanskiy <maximmi@mellanox.com> Date: Tue Jan 19 12:08:13 2021 +0000 sch_htb: Hierarchical QoS hardware offload bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=13ab12ecd00000 final oops: https://syzkaller.appspot.com/x/report.txt?x=106b12ecd00000 console output: https://syzkaller.appspot.com/x/log.txt?x=17ab12ecd00000 IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+b53a709f04722ca12a3c@syzkaller.appspotmail.com Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload") BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor instruction fetch in kernel mode #PF: error_code(0x0010) - not-present page PGD 183fe067 P4D 183fe067 PUD 21aef067 PMD 0 Oops: 0010 [#1] PREEMPT SMP KASAN CPU: 0 PID: 10125 Comm: syz-executor.0 Not tainted 5.11.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:0x0 Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. RSP: 0018:ffffc9000a9c74e8 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: 1ffff92001538e9e RCX: 0000000000000000 RDX: ffffc9000a9c7520 RSI: 0000000000000012 RDI: ffff88802d158000 RBP: ffff88802d158000 R08: 00000000fffffff1 R09: 0000000000000400 R10: ffffffff871631c4 R11: 0000000000000000 R12: ffffffff89ea6b40 R13: dffffc0000000000 R14: ffff888012b79c00 R15: 00000000ffff0000 FS: 00007f73f9698700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 00000000173b5000 CR4: 00000000001506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: htb_offload net/sched/sch_htb.c:1011 [inline] htb_select_queue+0x17f/0x2c0 net/sched/sch_htb.c:1349 tc_modify_qdisc+0x44a/0x1a50 net/sched/sch_api.c:1657 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5553 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:672 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2348 ___sys_sendmsg+0xf3/0x170 net/socket.c:2402 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2435 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x466019 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f73f9698188 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 0000000000466019 RDX: 0000000000000000 RSI: 00000000200007c0 RDI: 0000000000000004 RBP: 00000000004bd067 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf60 R13: 00007fffefccc11f R14: 00007f73f9698300 R15: 0000000000022000 Modules linked in: CR2: 0000000000000000 ---[ end trace e1544e8206616773 ]--- RIP: 0010:0x0 Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. RSP: 0018:ffffc9000a9c74e8 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: 1ffff92001538e9e RCX: 0000000000000000 RDX: ffffc9000a9c7520 RSI: 0000000000000012 RDI: ffff88802d158000 RBP: ffff88802d158000 R08: 00000000fffffff1 R09: 0000000000000400 R10: ffffffff871631c4 R11: 0000000000000000 R12: ffffffff89ea6b40 R13: dffffc0000000000 R14: ffff888012b79c00 R15: 00000000ffff0000 FS: 00007f73f9698700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 00000000173b5000 CR4: 00000000001506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. For information about bisection process see: https://goo.gl/tpsmEJ#bisection syzbot can test patches for this issue, for details see: https://goo.gl/tpsmEJ#testing-patches ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [syzbot] BUG: unable to handle kernel NULL pointer dereference in htb_select_queue 2021-03-09 15:13 [syzbot] BUG: unable to handle kernel NULL pointer dereference in htb_select_queue syzbot @ 2021-03-09 15:20 ` Eric Dumazet 2021-03-10 14:54 ` Maxim Mikityanskiy 0 siblings, 1 reply; 6+ messages in thread From: Eric Dumazet @ 2021-03-09 15:20 UTC (permalink / raw) To: syzbot, davem, jhs, jiri, kuba, linux-kernel, maximmi, netdev, syzkaller-bugs, tariqt, xiyou.wangcong On 3/9/21 4:13 PM, syzbot wrote: > Hello, > > syzbot found the following issue on: > > HEAD commit: 38b5133a octeontx2-pf: Fix otx2_get_fecparam() > git tree: net-next > console output: https://syzkaller.appspot.com/x/log.txt?x=166288a8d00000 > kernel config: https://syzkaller.appspot.com/x/.config?x=dbc1ca9e55dc1f9f > dashboard link: https://syzkaller.appspot.com/bug?extid=b53a709f04722ca12a3c > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=119454ccd00000 > > The issue was bisected to: > > commit d03b195b5aa015f6c11988b86a3625f8d5dbac52 > Author: Maxim Mikityanskiy <maximmi@mellanox.com> > Date: Tue Jan 19 12:08:13 2021 +0000 > > sch_htb: Hierarchical QoS hardware offload > > bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=13ab12ecd00000 > final oops: https://syzkaller.appspot.com/x/report.txt?x=106b12ecd00000 > console output: https://syzkaller.appspot.com/x/log.txt?x=17ab12ecd00000 > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > Reported-by: syzbot+b53a709f04722ca12a3c@syzkaller.appspotmail.com > Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload") > > BUG: kernel NULL pointer dereference, address: 0000000000000000 > #PF: supervisor instruction fetch in kernel mode > #PF: error_code(0x0010) - not-present page > PGD 183fe067 P4D 183fe067 PUD 21aef067 PMD 0 > Oops: 0010 [#1] PREEMPT SMP KASAN > CPU: 0 PID: 10125 Comm: syz-executor.0 Not tainted 5.11.0-rc7-syzkaller #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 > RIP: 0010:0x0 > Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. > RSP: 0018:ffffc9000a9c74e8 EFLAGS: 00010246 > RAX: dffffc0000000000 RBX: 1ffff92001538e9e RCX: 0000000000000000 > RDX: ffffc9000a9c7520 RSI: 0000000000000012 RDI: ffff88802d158000 > RBP: ffff88802d158000 R08: 00000000fffffff1 R09: 0000000000000400 > R10: ffffffff871631c4 R11: 0000000000000000 R12: ffffffff89ea6b40 > R13: dffffc0000000000 R14: ffff888012b79c00 R15: 00000000ffff0000 > FS: 00007f73f9698700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: ffffffffffffffd6 CR3: 00000000173b5000 CR4: 00000000001506f0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > htb_offload net/sched/sch_htb.c:1011 [inline] > htb_select_queue+0x17f/0x2c0 net/sched/sch_htb.c:1349 > tc_modify_qdisc+0x44a/0x1a50 net/sched/sch_api.c:1657 > rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5553 > netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 > netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] > netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 > netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 > sock_sendmsg_nosec net/socket.c:652 [inline] > sock_sendmsg+0xcf/0x120 net/socket.c:672 > ____sys_sendmsg+0x6e8/0x810 net/socket.c:2348 > ___sys_sendmsg+0xf3/0x170 net/socket.c:2402 > __sys_sendmsg+0xe5/0x1b0 net/socket.c:2435 > do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > RIP: 0033:0x466019 > Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 > RSP: 002b:00007f73f9698188 EFLAGS: 00000246 ORIG_RAX: 000000000000002e > RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 0000000000466019 > RDX: 0000000000000000 RSI: 00000000200007c0 RDI: 0000000000000004 > RBP: 00000000004bd067 R08: 0000000000000000 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf60 > R13: 00007fffefccc11f R14: 00007f73f9698300 R15: 0000000000022000 > Modules linked in: > CR2: 0000000000000000 > ---[ end trace e1544e8206616773 ]--- > RIP: 0010:0x0 > Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. > RSP: 0018:ffffc9000a9c74e8 EFLAGS: 00010246 > RAX: dffffc0000000000 RBX: 1ffff92001538e9e RCX: 0000000000000000 > RDX: ffffc9000a9c7520 RSI: 0000000000000012 RDI: ffff88802d158000 > RBP: ffff88802d158000 R08: 00000000fffffff1 R09: 0000000000000400 > R10: ffffffff871631c4 R11: 0000000000000000 R12: ffffffff89ea6b40 > R13: dffffc0000000000 R14: ffff888012b79c00 R15: 00000000ffff0000 > FS: 00007f73f9698700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: ffffffffffffffd6 CR3: 00000000173b5000 CR4: 00000000001506e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > > --- > This report is generated by a bot. It may contain errors. > See https://goo.gl/tpsmEJ for more information about syzbot. > syzbot engineers can be reached at syzkaller@googlegroups.com. > > syzbot will keep track of this issue. See: > https://goo.gl/tpsmEJ#status for how to communicate with syzbot. > For information about bisection process see: https://goo.gl/tpsmEJ#bisection > syzbot can test patches for this issue, for details see: > https://goo.gl/tpsmEJ#testing-patches > Hmm... what about this : diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c index f87d07736a1404edcfd17a792321758cd4bdd173..680afb5bfe2294a5531c7aaeed698b95ea3ab20c 100644 --- a/net/sched/sch_api.c +++ b/net/sched/sch_api.c @@ -1651,15 +1651,16 @@ static int tc_modify_qdisc(struct sk_buff *skb, struct nlmsghdr *n, err = -ENOENT; } } else { - struct netdev_queue *dev_queue; + struct netdev_queue *dev_queue = NULL; if (p && p->ops->cl_ops && p->ops->cl_ops->select_queue) dev_queue = p->ops->cl_ops->select_queue(p, tcm); - else if (p) - dev_queue = p->dev_queue; - else - dev_queue = netdev_get_tx_queue(dev, 0); - + if (!dev_queue) { + if (p) + dev_queue = p->dev_queue; + else + dev_queue = netdev_get_tx_queue(dev, 0); + } q = qdisc_create(dev, dev_queue, p, tcm->tcm_parent, tcm->tcm_handle, tca, &err, extack); diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c index dff3adf5a9156c2412c64a10ad1b2ce9e1367433..cc6eccd688701ae00255f07e32fb4b0efbaf45ce 100644 --- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -1008,6 +1008,8 @@ static void htb_set_lockdep_class_child(struct Qdisc *q) static int htb_offload(struct net_device *dev, struct tc_htb_qopt_offload *opt) { + if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc) + return -EOPNOTSUPP; return dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_HTB, opt); } ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [syzbot] BUG: unable to handle kernel NULL pointer dereference in htb_select_queue 2021-03-09 15:20 ` Eric Dumazet @ 2021-03-10 14:54 ` Maxim Mikityanskiy 2021-03-10 17:03 ` Eric Dumazet 0 siblings, 1 reply; 6+ messages in thread From: Maxim Mikityanskiy @ 2021-03-10 14:54 UTC (permalink / raw) To: Eric Dumazet, syzbot, davem, jhs, jiri, kuba, linux-kernel, maximmi, netdev, syzkaller-bugs, tariqt, xiyou.wangcong On 2021-03-09 17:20, Eric Dumazet wrote: > > > On 3/9/21 4:13 PM, syzbot wrote: >> Hello, >> >> syzbot found the following issue on: >> >> HEAD commit: 38b5133a octeontx2-pf: Fix otx2_get_fecparam() >> git tree: net-next >> console output: https://syzkaller.appspot.com/x/log.txt?x=166288a8d00000 >> kernel config: https://syzkaller.appspot.com/x/.config?x=dbc1ca9e55dc1f9f >> dashboard link: https://syzkaller.appspot.com/bug?extid=b53a709f04722ca12a3c >> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=119454ccd00000 >> >> The issue was bisected to: >> >> commit d03b195b5aa015f6c11988b86a3625f8d5dbac52 >> Author: Maxim Mikityanskiy <maximmi@mellanox.com> >> Date: Tue Jan 19 12:08:13 2021 +0000 >> >> sch_htb: Hierarchical QoS hardware offload >> >> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=13ab12ecd00000 >> final oops: https://syzkaller.appspot.com/x/report.txt?x=106b12ecd00000 >> console output: https://syzkaller.appspot.com/x/log.txt?x=17ab12ecd00000 >> >> IMPORTANT: if you fix the issue, please add the following tag to the commit: >> Reported-by: syzbot+b53a709f04722ca12a3c@syzkaller.appspotmail.com >> Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload") >> >> BUG: kernel NULL pointer dereference, address: 0000000000000000 >> #PF: supervisor instruction fetch in kernel mode >> #PF: error_code(0x0010) - not-present page >> PGD 183fe067 P4D 183fe067 PUD 21aef067 PMD 0 >> Oops: 0010 [#1] PREEMPT SMP KASAN >> CPU: 0 PID: 10125 Comm: syz-executor.0 Not tainted 5.11.0-rc7-syzkaller #0 >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 >> RIP: 0010:0x0 >> Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. >> RSP: 0018:ffffc9000a9c74e8 EFLAGS: 00010246 >> RAX: dffffc0000000000 RBX: 1ffff92001538e9e RCX: 0000000000000000 >> RDX: ffffc9000a9c7520 RSI: 0000000000000012 RDI: ffff88802d158000 >> RBP: ffff88802d158000 R08: 00000000fffffff1 R09: 0000000000000400 >> R10: ffffffff871631c4 R11: 0000000000000000 R12: ffffffff89ea6b40 >> R13: dffffc0000000000 R14: ffff888012b79c00 R15: 00000000ffff0000 >> FS: 00007f73f9698700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> CR2: ffffffffffffffd6 CR3: 00000000173b5000 CR4: 00000000001506f0 >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >> Call Trace: >> htb_offload net/sched/sch_htb.c:1011 [inline] >> htb_select_queue+0x17f/0x2c0 net/sched/sch_htb.c:1349 >> tc_modify_qdisc+0x44a/0x1a50 net/sched/sch_api.c:1657 >> rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5553 >> netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 >> netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] >> netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 >> netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 >> sock_sendmsg_nosec net/socket.c:652 [inline] >> sock_sendmsg+0xcf/0x120 net/socket.c:672 >> ____sys_sendmsg+0x6e8/0x810 net/socket.c:2348 >> ___sys_sendmsg+0xf3/0x170 net/socket.c:2402 >> __sys_sendmsg+0xe5/0x1b0 net/socket.c:2435 >> do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 >> entry_SYSCALL_64_after_hwframe+0x44/0xa9 >> RIP: 0033:0x466019 >> Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 >> RSP: 002b:00007f73f9698188 EFLAGS: 00000246 ORIG_RAX: 000000000000002e >> RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 0000000000466019 >> RDX: 0000000000000000 RSI: 00000000200007c0 RDI: 0000000000000004 >> RBP: 00000000004bd067 R08: 0000000000000000 R09: 0000000000000000 >> R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf60 >> R13: 00007fffefccc11f R14: 00007f73f9698300 R15: 0000000000022000 >> Modules linked in: >> CR2: 0000000000000000 >> ---[ end trace e1544e8206616773 ]--- >> RIP: 0010:0x0 >> Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. >> RSP: 0018:ffffc9000a9c74e8 EFLAGS: 00010246 >> RAX: dffffc0000000000 RBX: 1ffff92001538e9e RCX: 0000000000000000 >> RDX: ffffc9000a9c7520 RSI: 0000000000000012 RDI: ffff88802d158000 >> RBP: ffff88802d158000 R08: 00000000fffffff1 R09: 0000000000000400 >> R10: ffffffff871631c4 R11: 0000000000000000 R12: ffffffff89ea6b40 >> R13: dffffc0000000000 R14: ffff888012b79c00 R15: 00000000ffff0000 >> FS: 00007f73f9698700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> CR2: ffffffffffffffd6 CR3: 00000000173b5000 CR4: 00000000001506e0 >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >> >> >> --- >> This report is generated by a bot. It may contain errors. >> See https://goo.gl/tpsmEJ for more information about syzbot. >> syzbot engineers can be reached at syzkaller@googlegroups.com. >> >> syzbot will keep track of this issue. See: >> https://goo.gl/tpsmEJ#status for how to communicate with syzbot. >> For information about bisection process see: https://goo.gl/tpsmEJ#bisection >> syzbot can test patches for this issue, for details see: >> https://goo.gl/tpsmEJ#testing-patches >> > > > Hmm... what about this : > > diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c > index f87d07736a1404edcfd17a792321758cd4bdd173..680afb5bfe2294a5531c7aaeed698b95ea3ab20c 100644 > --- a/net/sched/sch_api.c > +++ b/net/sched/sch_api.c > @@ -1651,15 +1651,16 @@ static int tc_modify_qdisc(struct sk_buff *skb, struct nlmsghdr *n, > err = -ENOENT; > } > } else { > - struct netdev_queue *dev_queue; > + struct netdev_queue *dev_queue = NULL; > > if (p && p->ops->cl_ops && p->ops->cl_ops->select_queue) > dev_queue = p->ops->cl_ops->select_queue(p, tcm); > - else if (p) > - dev_queue = p->dev_queue; > - else > - dev_queue = netdev_get_tx_queue(dev, 0); > - > + if (!dev_queue) { > + if (p) > + dev_queue = p->dev_queue; > + else > + dev_queue = netdev_get_tx_queue(dev, 0); > + } > q = qdisc_create(dev, dev_queue, p, > tcm->tcm_parent, tcm->tcm_handle, > tca, &err, extack); > diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c > index dff3adf5a9156c2412c64a10ad1b2ce9e1367433..cc6eccd688701ae00255f07e32fb4b0efbaf45ce 100644 > --- a/net/sched/sch_htb.c > +++ b/net/sched/sch_htb.c > @@ -1008,6 +1008,8 @@ static void htb_set_lockdep_class_child(struct Qdisc *q) > > static int htb_offload(struct net_device *dev, struct tc_htb_qopt_offload *opt) > { > + if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc) > + return -EOPNOTSUPP; My fault, all calls to htb_offload must be protected by if (q->offload). Rather than checking tc_can_offload and ndo_setup_tc in htb_offload every time, I suggest to fix htb_select_queue: diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c index dff3adf5a915..b23203159996 100644 --- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -1340,8 +1340,12 @@ htb_select_queue(struct Qdisc *sch, struct tcmsg *tcm) { struct net_device *dev = qdisc_dev(sch); struct tc_htb_qopt_offload offload_opt; + struct htb_sched *q = qdisc_priv(sch); int err; + if (!q->offload) + return sch->dev_queue; + offload_opt = (struct tc_htb_qopt_offload) { .command = TC_HTB_LEAF_QUERY_QUEUE, .classid = TC_H_MIN(tcm->tcm_parent), htb_init ensures that tc_can_offload and ndo_setup_tc are checked if q->offload is true. Also, we can avoid changing tc_modify_qdisc if htb_select_queue mimics its behavior in non-offload mode, as shown above. There is also a case where htb_select_queue returns NULL on errors, and that is handled in qdisc_create (the error message will be "No device queue given"), which I think is a sane behavior. What do you think of this fix? If it fits, I'll send it as a patch. > return dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_HTB, opt); > } > > ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [syzbot] BUG: unable to handle kernel NULL pointer dereference in htb_select_queue 2021-03-10 14:54 ` Maxim Mikityanskiy @ 2021-03-10 17:03 ` Eric Dumazet 2021-03-10 18:55 ` Maxim Mikityanskiy 0 siblings, 1 reply; 6+ messages in thread From: Eric Dumazet @ 2021-03-10 17:03 UTC (permalink / raw) To: Maxim Mikityanskiy, Eric Dumazet, syzbot, davem, jhs, jiri, kuba, linux-kernel, maximmi, netdev, syzkaller-bugs, tariqt, xiyou.wangcong On 3/10/21 3:54 PM, Maxim Mikityanskiy wrote: > On 2021-03-09 17:20, Eric Dumazet wrote: >> >> >> On 3/9/21 4:13 PM, syzbot wrote: >>> Hello, >>> >>> syzbot found the following issue on: >>> >>> HEAD commit: 38b5133a octeontx2-pf: Fix otx2_get_fecparam() >>> git tree: net-next >>> console output: https://syzkaller.appspot.com/x/log.txt?x=166288a8d00000 >>> kernel config: https://syzkaller.appspot.com/x/.config?x=dbc1ca9e55dc1f9f >>> dashboard link: https://syzkaller.appspot.com/bug?extid=b53a709f04722ca12a3c >>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=119454ccd00000 >>> >>> The issue was bisected to: >>> >>> commit d03b195b5aa015f6c11988b86a3625f8d5dbac52 >>> Author: Maxim Mikityanskiy <maximmi@mellanox.com> >>> Date: Tue Jan 19 12:08:13 2021 +0000 >>> >>> sch_htb: Hierarchical QoS hardware offload >>> >>> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=13ab12ecd00000 >>> final oops: https://syzkaller.appspot.com/x/report.txt?x=106b12ecd00000 >>> console output: https://syzkaller.appspot.com/x/log.txt?x=17ab12ecd00000 >>> >>> IMPORTANT: if you fix the issue, please add the following tag to the commit: >>> Reported-by: syzbot+b53a709f04722ca12a3c@syzkaller.appspotmail.com >>> Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload") >>> >>> BUG: kernel NULL pointer dereference, address: 0000000000000000 >>> #PF: supervisor instruction fetch in kernel mode >>> #PF: error_code(0x0010) - not-present page >>> PGD 183fe067 P4D 183fe067 PUD 21aef067 PMD 0 >>> Oops: 0010 [#1] PREEMPT SMP KASAN >>> CPU: 0 PID: 10125 Comm: syz-executor.0 Not tainted 5.11.0-rc7-syzkaller #0 >>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 >>> RIP: 0010:0x0 >>> Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. >>> RSP: 0018:ffffc9000a9c74e8 EFLAGS: 00010246 >>> RAX: dffffc0000000000 RBX: 1ffff92001538e9e RCX: 0000000000000000 >>> RDX: ffffc9000a9c7520 RSI: 0000000000000012 RDI: ffff88802d158000 >>> RBP: ffff88802d158000 R08: 00000000fffffff1 R09: 0000000000000400 >>> R10: ffffffff871631c4 R11: 0000000000000000 R12: ffffffff89ea6b40 >>> R13: dffffc0000000000 R14: ffff888012b79c00 R15: 00000000ffff0000 >>> FS: 00007f73f9698700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 >>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> CR2: ffffffffffffffd6 CR3: 00000000173b5000 CR4: 00000000001506f0 >>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >>> Call Trace: >>> htb_offload net/sched/sch_htb.c:1011 [inline] >>> htb_select_queue+0x17f/0x2c0 net/sched/sch_htb.c:1349 >>> tc_modify_qdisc+0x44a/0x1a50 net/sched/sch_api.c:1657 >>> rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5553 >>> netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 >>> netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] >>> netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 >>> netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 >>> sock_sendmsg_nosec net/socket.c:652 [inline] >>> sock_sendmsg+0xcf/0x120 net/socket.c:672 >>> ____sys_sendmsg+0x6e8/0x810 net/socket.c:2348 >>> ___sys_sendmsg+0xf3/0x170 net/socket.c:2402 >>> __sys_sendmsg+0xe5/0x1b0 net/socket.c:2435 >>> do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 >>> entry_SYSCALL_64_after_hwframe+0x44/0xa9 >>> RIP: 0033:0x466019 >>> Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 >>> RSP: 002b:00007f73f9698188 EFLAGS: 00000246 ORIG_RAX: 000000000000002e >>> RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 0000000000466019 >>> RDX: 0000000000000000 RSI: 00000000200007c0 RDI: 0000000000000004 >>> RBP: 00000000004bd067 R08: 0000000000000000 R09: 0000000000000000 >>> R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf60 >>> R13: 00007fffefccc11f R14: 00007f73f9698300 R15: 0000000000022000 >>> Modules linked in: >>> CR2: 0000000000000000 >>> ---[ end trace e1544e8206616773 ]--- >>> RIP: 0010:0x0 >>> Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. >>> RSP: 0018:ffffc9000a9c74e8 EFLAGS: 00010246 >>> RAX: dffffc0000000000 RBX: 1ffff92001538e9e RCX: 0000000000000000 >>> RDX: ffffc9000a9c7520 RSI: 0000000000000012 RDI: ffff88802d158000 >>> RBP: ffff88802d158000 R08: 00000000fffffff1 R09: 0000000000000400 >>> R10: ffffffff871631c4 R11: 0000000000000000 R12: ffffffff89ea6b40 >>> R13: dffffc0000000000 R14: ffff888012b79c00 R15: 00000000ffff0000 >>> FS: 00007f73f9698700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 >>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> CR2: ffffffffffffffd6 CR3: 00000000173b5000 CR4: 00000000001506e0 >>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >>> >>> >>> --- >>> This report is generated by a bot. It may contain errors. >>> See https://goo.gl/tpsmEJ for more information about syzbot. >>> syzbot engineers can be reached at syzkaller@googlegroups.com. >>> >>> syzbot will keep track of this issue. See: >>> https://goo.gl/tpsmEJ#status for how to communicate with syzbot. >>> For information about bisection process see: https://goo.gl/tpsmEJ#bisection >>> syzbot can test patches for this issue, for details see: >>> https://goo.gl/tpsmEJ#testing-patches >>> >> >> >> Hmm... what about this : >> >> diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c >> index f87d07736a1404edcfd17a792321758cd4bdd173..680afb5bfe2294a5531c7aaeed698b95ea3ab20c 100644 >> --- a/net/sched/sch_api.c >> +++ b/net/sched/sch_api.c >> @@ -1651,15 +1651,16 @@ static int tc_modify_qdisc(struct sk_buff *skb, struct nlmsghdr *n, >> err = -ENOENT; >> } >> } else { >> - struct netdev_queue *dev_queue; >> + struct netdev_queue *dev_queue = NULL; >> if (p && p->ops->cl_ops && p->ops->cl_ops->select_queue) >> dev_queue = p->ops->cl_ops->select_queue(p, tcm); >> - else if (p) >> - dev_queue = p->dev_queue; >> - else >> - dev_queue = netdev_get_tx_queue(dev, 0); >> - >> + if (!dev_queue) { >> + if (p) >> + dev_queue = p->dev_queue; >> + else >> + dev_queue = netdev_get_tx_queue(dev, 0); >> + } >> q = qdisc_create(dev, dev_queue, p, >> tcm->tcm_parent, tcm->tcm_handle, >> tca, &err, extack); >> diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c >> index dff3adf5a9156c2412c64a10ad1b2ce9e1367433..cc6eccd688701ae00255f07e32fb4b0efbaf45ce 100644 >> --- a/net/sched/sch_htb.c >> +++ b/net/sched/sch_htb.c >> @@ -1008,6 +1008,8 @@ static void htb_set_lockdep_class_child(struct Qdisc *q) >> static int htb_offload(struct net_device *dev, struct tc_htb_qopt_offload *opt) >> { >> + if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc) >> + return -EOPNOTSUPP; > > My fault, all calls to htb_offload must be protected by if (q->offload). Rather than checking tc_can_offload and ndo_setup_tc in htb_offload every time, I suggest to fix htb_select_queue: > > diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c > index dff3adf5a915..b23203159996 100644 > --- a/net/sched/sch_htb.c > +++ b/net/sched/sch_htb.c > @@ -1340,8 +1340,12 @@ htb_select_queue(struct Qdisc *sch, struct tcmsg *tcm) > { > struct net_device *dev = qdisc_dev(sch); > struct tc_htb_qopt_offload offload_opt; > + struct htb_sched *q = qdisc_priv(sch); > int err; > > + if (!q->offload) > + return sch->dev_queue; > + > offload_opt = (struct tc_htb_qopt_offload) { > .command = TC_HTB_LEAF_QUERY_QUEUE, > .classid = TC_H_MIN(tcm->tcm_parent), > > htb_init ensures that tc_can_offload and ndo_setup_tc are checked if q->offload is true. Also, we can avoid changing tc_modify_qdisc if htb_select_queue mimics its behavior in non-offload mode, as shown above. > > There is also a case where htb_select_queue returns NULL on errors, and that is handled in qdisc_create (the error message will be "No device queue given"), which I think is a sane behavior. > > What do you think of this fix? If it fits, I'll send it as a patch. I think that it is not enough, since you overwrite q->offload in htb_init() even if an error will be provided. So a malicious user will find its way. You probably also need this : diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c index dff3adf5a9156c2412c64a10ad1b2ce9e1367433..d15ee7cf33b34221d09dfc81105dcb6c2b2fd489 100644 --- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -1020,6 +1020,7 @@ static int htb_init(struct Qdisc *sch, struct nlattr *opt, struct nlattr *tb[TCA_HTB_MAX + 1]; struct tc_htb_glob *gopt; unsigned int ntx; + bool offload; int err; qdisc_watchdog_init(&q->watchdog, sch); @@ -1044,9 +1045,9 @@ static int htb_init(struct Qdisc *sch, struct nlattr *opt, if (gopt->version != HTB_VER >> 16) return -EINVAL; - q->offload = nla_get_flag(tb[TCA_HTB_OFFLOAD]); + offload = nla_get_flag(tb[TCA_HTB_OFFLOAD]); - if (q->offload) { + if (offload) { if (sch->parent != TC_H_ROOT) return -EOPNOTSUPP; @@ -1060,6 +1061,7 @@ static int htb_init(struct Qdisc *sch, struct nlattr *opt, if (!q->direct_qdiscs) return -ENOMEM; } + q->offload = offload; err = qdisc_class_hash_init(&q->clhash); if (err < 0) ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [syzbot] BUG: unable to handle kernel NULL pointer dereference in htb_select_queue 2021-03-10 17:03 ` Eric Dumazet @ 2021-03-10 18:55 ` Maxim Mikityanskiy 2021-03-10 19:39 ` Eric Dumazet 0 siblings, 1 reply; 6+ messages in thread From: Maxim Mikityanskiy @ 2021-03-10 18:55 UTC (permalink / raw) To: Eric Dumazet, syzbot, davem, jhs, jiri, kuba, linux-kernel, maximmi, netdev, syzkaller-bugs, tariqt, xiyou.wangcong On 2021-03-10 19:03, Eric Dumazet wrote: > > > On 3/10/21 3:54 PM, Maxim Mikityanskiy wrote: >> On 2021-03-09 17:20, Eric Dumazet wrote: >>> >>> >>> On 3/9/21 4:13 PM, syzbot wrote: >>>> Hello, >>>> >>>> syzbot found the following issue on: >>>> >>>> HEAD commit: 38b5133a octeontx2-pf: Fix otx2_get_fecparam() >>>> git tree: net-next >>>> console output: https://syzkaller.appspot.com/x/log.txt?x=166288a8d00000 >>>> kernel config: https://syzkaller.appspot.com/x/.config?x=dbc1ca9e55dc1f9f >>>> dashboard link: https://syzkaller.appspot.com/bug?extid=b53a709f04722ca12a3c >>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=119454ccd00000 >>>> >>>> The issue was bisected to: >>>> >>>> commit d03b195b5aa015f6c11988b86a3625f8d5dbac52 >>>> Author: Maxim Mikityanskiy <maximmi@mellanox.com> >>>> Date: Tue Jan 19 12:08:13 2021 +0000 >>>> >>>> sch_htb: Hierarchical QoS hardware offload >>>> >>>> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=13ab12ecd00000 >>>> final oops: https://syzkaller.appspot.com/x/report.txt?x=106b12ecd00000 >>>> console output: https://syzkaller.appspot.com/x/log.txt?x=17ab12ecd00000 >>>> >>>> IMPORTANT: if you fix the issue, please add the following tag to the commit: >>>> Reported-by: syzbot+b53a709f04722ca12a3c@syzkaller.appspotmail.com >>>> Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload") >>>> >>>> BUG: kernel NULL pointer dereference, address: 0000000000000000 >>>> #PF: supervisor instruction fetch in kernel mode >>>> #PF: error_code(0x0010) - not-present page >>>> PGD 183fe067 P4D 183fe067 PUD 21aef067 PMD 0 >>>> Oops: 0010 [#1] PREEMPT SMP KASAN >>>> CPU: 0 PID: 10125 Comm: syz-executor.0 Not tainted 5.11.0-rc7-syzkaller #0 >>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 >>>> RIP: 0010:0x0 >>>> Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. >>>> RSP: 0018:ffffc9000a9c74e8 EFLAGS: 00010246 >>>> RAX: dffffc0000000000 RBX: 1ffff92001538e9e RCX: 0000000000000000 >>>> RDX: ffffc9000a9c7520 RSI: 0000000000000012 RDI: ffff88802d158000 >>>> RBP: ffff88802d158000 R08: 00000000fffffff1 R09: 0000000000000400 >>>> R10: ffffffff871631c4 R11: 0000000000000000 R12: ffffffff89ea6b40 >>>> R13: dffffc0000000000 R14: ffff888012b79c00 R15: 00000000ffff0000 >>>> FS: 00007f73f9698700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 >>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>>> CR2: ffffffffffffffd6 CR3: 00000000173b5000 CR4: 00000000001506f0 >>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >>>> Call Trace: >>>> htb_offload net/sched/sch_htb.c:1011 [inline] >>>> htb_select_queue+0x17f/0x2c0 net/sched/sch_htb.c:1349 >>>> tc_modify_qdisc+0x44a/0x1a50 net/sched/sch_api.c:1657 >>>> rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5553 >>>> netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 >>>> netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] >>>> netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 >>>> netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 >>>> sock_sendmsg_nosec net/socket.c:652 [inline] >>>> sock_sendmsg+0xcf/0x120 net/socket.c:672 >>>> ____sys_sendmsg+0x6e8/0x810 net/socket.c:2348 >>>> ___sys_sendmsg+0xf3/0x170 net/socket.c:2402 >>>> __sys_sendmsg+0xe5/0x1b0 net/socket.c:2435 >>>> do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 >>>> entry_SYSCALL_64_after_hwframe+0x44/0xa9 >>>> RIP: 0033:0x466019 >>>> Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 >>>> RSP: 002b:00007f73f9698188 EFLAGS: 00000246 ORIG_RAX: 000000000000002e >>>> RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 0000000000466019 >>>> RDX: 0000000000000000 RSI: 00000000200007c0 RDI: 0000000000000004 >>>> RBP: 00000000004bd067 R08: 0000000000000000 R09: 0000000000000000 >>>> R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf60 >>>> R13: 00007fffefccc11f R14: 00007f73f9698300 R15: 0000000000022000 >>>> Modules linked in: >>>> CR2: 0000000000000000 >>>> ---[ end trace e1544e8206616773 ]--- >>>> RIP: 0010:0x0 >>>> Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. >>>> RSP: 0018:ffffc9000a9c74e8 EFLAGS: 00010246 >>>> RAX: dffffc0000000000 RBX: 1ffff92001538e9e RCX: 0000000000000000 >>>> RDX: ffffc9000a9c7520 RSI: 0000000000000012 RDI: ffff88802d158000 >>>> RBP: ffff88802d158000 R08: 00000000fffffff1 R09: 0000000000000400 >>>> R10: ffffffff871631c4 R11: 0000000000000000 R12: ffffffff89ea6b40 >>>> R13: dffffc0000000000 R14: ffff888012b79c00 R15: 00000000ffff0000 >>>> FS: 00007f73f9698700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 >>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>>> CR2: ffffffffffffffd6 CR3: 00000000173b5000 CR4: 00000000001506e0 >>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >>>> >>>> >>>> --- >>>> This report is generated by a bot. It may contain errors. >>>> See https://goo.gl/tpsmEJ for more information about syzbot. >>>> syzbot engineers can be reached at syzkaller@googlegroups.com. >>>> >>>> syzbot will keep track of this issue. See: >>>> https://goo.gl/tpsmEJ#status for how to communicate with syzbot. >>>> For information about bisection process see: https://goo.gl/tpsmEJ#bisection >>>> syzbot can test patches for this issue, for details see: >>>> https://goo.gl/tpsmEJ#testing-patches >>>> >>> >>> >>> Hmm... what about this : >>> >>> diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c >>> index f87d07736a1404edcfd17a792321758cd4bdd173..680afb5bfe2294a5531c7aaeed698b95ea3ab20c 100644 >>> --- a/net/sched/sch_api.c >>> +++ b/net/sched/sch_api.c >>> @@ -1651,15 +1651,16 @@ static int tc_modify_qdisc(struct sk_buff *skb, struct nlmsghdr *n, >>> err = -ENOENT; >>> } >>> } else { >>> - struct netdev_queue *dev_queue; >>> + struct netdev_queue *dev_queue = NULL; >>> if (p && p->ops->cl_ops && p->ops->cl_ops->select_queue) >>> dev_queue = p->ops->cl_ops->select_queue(p, tcm); >>> - else if (p) >>> - dev_queue = p->dev_queue; >>> - else >>> - dev_queue = netdev_get_tx_queue(dev, 0); >>> - >>> + if (!dev_queue) { >>> + if (p) >>> + dev_queue = p->dev_queue; >>> + else >>> + dev_queue = netdev_get_tx_queue(dev, 0); >>> + } >>> q = qdisc_create(dev, dev_queue, p, >>> tcm->tcm_parent, tcm->tcm_handle, >>> tca, &err, extack); >>> diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c >>> index dff3adf5a9156c2412c64a10ad1b2ce9e1367433..cc6eccd688701ae00255f07e32fb4b0efbaf45ce 100644 >>> --- a/net/sched/sch_htb.c >>> +++ b/net/sched/sch_htb.c >>> @@ -1008,6 +1008,8 @@ static void htb_set_lockdep_class_child(struct Qdisc *q) >>> static int htb_offload(struct net_device *dev, struct tc_htb_qopt_offload *opt) >>> { >>> + if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc) >>> + return -EOPNOTSUPP; >> >> My fault, all calls to htb_offload must be protected by if (q->offload). Rather than checking tc_can_offload and ndo_setup_tc in htb_offload every time, I suggest to fix htb_select_queue: >> >> diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c >> index dff3adf5a915..b23203159996 100644 >> --- a/net/sched/sch_htb.c >> +++ b/net/sched/sch_htb.c >> @@ -1340,8 +1340,12 @@ htb_select_queue(struct Qdisc *sch, struct tcmsg *tcm) >> { >> struct net_device *dev = qdisc_dev(sch); >> struct tc_htb_qopt_offload offload_opt; >> + struct htb_sched *q = qdisc_priv(sch); >> int err; >> >> + if (!q->offload) >> + return sch->dev_queue; >> + >> offload_opt = (struct tc_htb_qopt_offload) { >> .command = TC_HTB_LEAF_QUERY_QUEUE, >> .classid = TC_H_MIN(tcm->tcm_parent), >> >> htb_init ensures that tc_can_offload and ndo_setup_tc are checked if q->offload is true. Also, we can avoid changing tc_modify_qdisc if htb_select_queue mimics its behavior in non-offload mode, as shown above. >> >> There is also a case where htb_select_queue returns NULL on errors, and that is handled in qdisc_create (the error message will be "No device queue given"), which I think is a sane behavior. >> >> What do you think of this fix? If it fits, I'll send it as a patch. > > > I think that it is not enough, since you overwrite q->offload in htb_init() > even if an error will be provided. > > So a malicious user will find its way. I doubt that, because if htb_init returns an error, the qdisc gets destroyed immediately (well, after a call to htb_destroy), and I believe all these operations are protected by RTNL, so a malicious user has no way to insert a call to another callback. > You probably also need this : However, I'll likely need something like this anyway, because HTB must not call ndo_setup_tc on destroy if it didn't call it on init. It may crash in a similar way if ndo_setup_tc is not implemented. Thanks for helping me spot that - if you don't mind, I'll base my second patch on your code below. > > > diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c > index dff3adf5a9156c2412c64a10ad1b2ce9e1367433..d15ee7cf33b34221d09dfc81105dcb6c2b2fd489 100644 > --- a/net/sched/sch_htb.c > +++ b/net/sched/sch_htb.c > @@ -1020,6 +1020,7 @@ static int htb_init(struct Qdisc *sch, struct nlattr *opt, > struct nlattr *tb[TCA_HTB_MAX + 1]; > struct tc_htb_glob *gopt; > unsigned int ntx; > + bool offload; > int err; > > qdisc_watchdog_init(&q->watchdog, sch); > @@ -1044,9 +1045,9 @@ static int htb_init(struct Qdisc *sch, struct nlattr *opt, > if (gopt->version != HTB_VER >> 16) > return -EINVAL; > > - q->offload = nla_get_flag(tb[TCA_HTB_OFFLOAD]); > + offload = nla_get_flag(tb[TCA_HTB_OFFLOAD]); > > - if (q->offload) { > + if (offload) { > if (sch->parent != TC_H_ROOT) > return -EOPNOTSUPP; > > @@ -1060,6 +1061,7 @@ static int htb_init(struct Qdisc *sch, struct nlattr *opt, > if (!q->direct_qdiscs) > return -ENOMEM; > } > + q->offload = offload; > > err = qdisc_class_hash_init(&q->clhash); > if (err < 0) > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [syzbot] BUG: unable to handle kernel NULL pointer dereference in htb_select_queue 2021-03-10 18:55 ` Maxim Mikityanskiy @ 2021-03-10 19:39 ` Eric Dumazet 0 siblings, 0 replies; 6+ messages in thread From: Eric Dumazet @ 2021-03-10 19:39 UTC (permalink / raw) To: Maxim Mikityanskiy, Eric Dumazet, syzbot, davem, jhs, jiri, kuba, linux-kernel, maximmi, netdev, syzkaller-bugs, tariqt, xiyou.wangcong On 3/10/21 7:55 PM, Maxim Mikityanskiy wrote: > On 2021-03-10 19:03, Eric Dumazet wrote: >> >> >> On 3/10/21 3:54 PM, Maxim Mikityanskiy wrote: >>> On 2021-03-09 17:20, Eric Dumazet wrote: >>>> >>>> >>>> On 3/9/21 4:13 PM, syzbot wrote: >>>>> Hello, >>>>> >>>>> syzbot found the following issue on: >>>>> >>>>> HEAD commit: 38b5133a octeontx2-pf: Fix otx2_get_fecparam() >>>>> git tree: net-next >>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=166288a8d00000 >>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=dbc1ca9e55dc1f9f >>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=b53a709f04722ca12a3c >>>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=119454ccd00000 >>>>> >>>>> The issue was bisected to: >>>>> >>>>> commit d03b195b5aa015f6c11988b86a3625f8d5dbac52 >>>>> Author: Maxim Mikityanskiy <maximmi@mellanox.com> >>>>> Date: Tue Jan 19 12:08:13 2021 +0000 >>>>> >>>>> sch_htb: Hierarchical QoS hardware offload >>>>> >>>>> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=13ab12ecd00000 >>>>> final oops: https://syzkaller.appspot.com/x/report.txt?x=106b12ecd00000 >>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=17ab12ecd00000 >>>>> >>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit: >>>>> Reported-by: syzbot+b53a709f04722ca12a3c@syzkaller.appspotmail.com >>>>> Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload") >>>>> >>>>> BUG: kernel NULL pointer dereference, address: 0000000000000000 >>>>> #PF: supervisor instruction fetch in kernel mode >>>>> #PF: error_code(0x0010) - not-present page >>>>> PGD 183fe067 P4D 183fe067 PUD 21aef067 PMD 0 >>>>> Oops: 0010 [#1] PREEMPT SMP KASAN >>>>> CPU: 0 PID: 10125 Comm: syz-executor.0 Not tainted 5.11.0-rc7-syzkaller #0 >>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 >>>>> RIP: 0010:0x0 >>>>> Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. >>>>> RSP: 0018:ffffc9000a9c74e8 EFLAGS: 00010246 >>>>> RAX: dffffc0000000000 RBX: 1ffff92001538e9e RCX: 0000000000000000 >>>>> RDX: ffffc9000a9c7520 RSI: 0000000000000012 RDI: ffff88802d158000 >>>>> RBP: ffff88802d158000 R08: 00000000fffffff1 R09: 0000000000000400 >>>>> R10: ffffffff871631c4 R11: 0000000000000000 R12: ffffffff89ea6b40 >>>>> R13: dffffc0000000000 R14: ffff888012b79c00 R15: 00000000ffff0000 >>>>> FS: 00007f73f9698700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 >>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>>>> CR2: ffffffffffffffd6 CR3: 00000000173b5000 CR4: 00000000001506f0 >>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >>>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >>>>> Call Trace: >>>>> htb_offload net/sched/sch_htb.c:1011 [inline] >>>>> htb_select_queue+0x17f/0x2c0 net/sched/sch_htb.c:1349 >>>>> tc_modify_qdisc+0x44a/0x1a50 net/sched/sch_api.c:1657 >>>>> rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5553 >>>>> netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 >>>>> netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] >>>>> netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 >>>>> netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 >>>>> sock_sendmsg_nosec net/socket.c:652 [inline] >>>>> sock_sendmsg+0xcf/0x120 net/socket.c:672 >>>>> ____sys_sendmsg+0x6e8/0x810 net/socket.c:2348 >>>>> ___sys_sendmsg+0xf3/0x170 net/socket.c:2402 >>>>> __sys_sendmsg+0xe5/0x1b0 net/socket.c:2435 >>>>> do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 >>>>> entry_SYSCALL_64_after_hwframe+0x44/0xa9 >>>>> RIP: 0033:0x466019 >>>>> Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 >>>>> RSP: 002b:00007f73f9698188 EFLAGS: 00000246 ORIG_RAX: 000000000000002e >>>>> RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 0000000000466019 >>>>> RDX: 0000000000000000 RSI: 00000000200007c0 RDI: 0000000000000004 >>>>> RBP: 00000000004bd067 R08: 0000000000000000 R09: 0000000000000000 >>>>> R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf60 >>>>> R13: 00007fffefccc11f R14: 00007f73f9698300 R15: 0000000000022000 >>>>> Modules linked in: >>>>> CR2: 0000000000000000 >>>>> ---[ end trace e1544e8206616773 ]--- >>>>> RIP: 0010:0x0 >>>>> Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. >>>>> RSP: 0018:ffffc9000a9c74e8 EFLAGS: 00010246 >>>>> RAX: dffffc0000000000 RBX: 1ffff92001538e9e RCX: 0000000000000000 >>>>> RDX: ffffc9000a9c7520 RSI: 0000000000000012 RDI: ffff88802d158000 >>>>> RBP: ffff88802d158000 R08: 00000000fffffff1 R09: 0000000000000400 >>>>> R10: ffffffff871631c4 R11: 0000000000000000 R12: ffffffff89ea6b40 >>>>> R13: dffffc0000000000 R14: ffff888012b79c00 R15: 00000000ffff0000 >>>>> FS: 00007f73f9698700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 >>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>>>> CR2: ffffffffffffffd6 CR3: 00000000173b5000 CR4: 00000000001506e0 >>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >>>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >>>>> >>>>> >>>>> --- >>>>> This report is generated by a bot. It may contain errors. >>>>> See https://goo.gl/tpsmEJ for more information about syzbot. >>>>> syzbot engineers can be reached at syzkaller@googlegroups.com. >>>>> >>>>> syzbot will keep track of this issue. See: >>>>> https://goo.gl/tpsmEJ#status for how to communicate with syzbot. >>>>> For information about bisection process see: https://goo.gl/tpsmEJ#bisection >>>>> syzbot can test patches for this issue, for details see: >>>>> https://goo.gl/tpsmEJ#testing-patches >>>>> >>>> >>>> >>>> Hmm... what about this : >>>> >>>> diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c >>>> index f87d07736a1404edcfd17a792321758cd4bdd173..680afb5bfe2294a5531c7aaeed698b95ea3ab20c 100644 >>>> --- a/net/sched/sch_api.c >>>> +++ b/net/sched/sch_api.c >>>> @@ -1651,15 +1651,16 @@ static int tc_modify_qdisc(struct sk_buff *skb, struct nlmsghdr *n, >>>> err = -ENOENT; >>>> } >>>> } else { >>>> - struct netdev_queue *dev_queue; >>>> + struct netdev_queue *dev_queue = NULL; >>>> if (p && p->ops->cl_ops && p->ops->cl_ops->select_queue) >>>> dev_queue = p->ops->cl_ops->select_queue(p, tcm); >>>> - else if (p) >>>> - dev_queue = p->dev_queue; >>>> - else >>>> - dev_queue = netdev_get_tx_queue(dev, 0); >>>> - >>>> + if (!dev_queue) { >>>> + if (p) >>>> + dev_queue = p->dev_queue; >>>> + else >>>> + dev_queue = netdev_get_tx_queue(dev, 0); >>>> + } >>>> q = qdisc_create(dev, dev_queue, p, >>>> tcm->tcm_parent, tcm->tcm_handle, >>>> tca, &err, extack); >>>> diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c >>>> index dff3adf5a9156c2412c64a10ad1b2ce9e1367433..cc6eccd688701ae00255f07e32fb4b0efbaf45ce 100644 >>>> --- a/net/sched/sch_htb.c >>>> +++ b/net/sched/sch_htb.c >>>> @@ -1008,6 +1008,8 @@ static void htb_set_lockdep_class_child(struct Qdisc *q) >>>> static int htb_offload(struct net_device *dev, struct tc_htb_qopt_offload *opt) >>>> { >>>> + if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc) >>>> + return -EOPNOTSUPP; >>> >>> My fault, all calls to htb_offload must be protected by if (q->offload). Rather than checking tc_can_offload and ndo_setup_tc in htb_offload every time, I suggest to fix htb_select_queue: >>> >>> diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c >>> index dff3adf5a915..b23203159996 100644 >>> --- a/net/sched/sch_htb.c >>> +++ b/net/sched/sch_htb.c >>> @@ -1340,8 +1340,12 @@ htb_select_queue(struct Qdisc *sch, struct tcmsg *tcm) >>> { >>> struct net_device *dev = qdisc_dev(sch); >>> struct tc_htb_qopt_offload offload_opt; >>> + struct htb_sched *q = qdisc_priv(sch); >>> int err; >>> >>> + if (!q->offload) >>> + return sch->dev_queue; >>> + >>> offload_opt = (struct tc_htb_qopt_offload) { >>> .command = TC_HTB_LEAF_QUERY_QUEUE, >>> .classid = TC_H_MIN(tcm->tcm_parent), >>> >>> htb_init ensures that tc_can_offload and ndo_setup_tc are checked if q->offload is true. Also, we can avoid changing tc_modify_qdisc if htb_select_queue mimics its behavior in non-offload mode, as shown above. >>> >>> There is also a case where htb_select_queue returns NULL on errors, and that is handled in qdisc_create (the error message will be "No device queue given"), which I think is a sane behavior. >>> >>> What do you think of this fix? If it fits, I'll send it as a patch. >> >> >> I think that it is not enough, since you overwrite q->offload in htb_init() >> even if an error will be provided. >> >> So a malicious user will find its way. > > I doubt that, because if htb_init returns an error, the qdisc gets destroyed immediately (well, after a call to htb_destroy), and I believe all these operations are protected by RTNL, so a malicious user has no way to insert a call to another callback. Yes, but htb_destroy() will crash, as I tried to point out ;) > >> You probably also need this : > > However, I'll likely need something like this anyway, because HTB must not call ndo_setup_tc on destroy if it didn't call it on init. It may crash in a similar way if ndo_setup_tc is not implemented. Thanks for helping me spot that - if you don't mind, I'll base my second patch on your code below. Sure. > >> >> >> diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c >> index dff3adf5a9156c2412c64a10ad1b2ce9e1367433..d15ee7cf33b34221d09dfc81105dcb6c2b2fd489 100644 >> --- a/net/sched/sch_htb.c >> +++ b/net/sched/sch_htb.c >> @@ -1020,6 +1020,7 @@ static int htb_init(struct Qdisc *sch, struct nlattr *opt, >> struct nlattr *tb[TCA_HTB_MAX + 1]; >> struct tc_htb_glob *gopt; >> unsigned int ntx; >> + bool offload; >> int err; >> qdisc_watchdog_init(&q->watchdog, sch); >> @@ -1044,9 +1045,9 @@ static int htb_init(struct Qdisc *sch, struct nlattr *opt, >> if (gopt->version != HTB_VER >> 16) >> return -EINVAL; >> - q->offload = nla_get_flag(tb[TCA_HTB_OFFLOAD]); >> + offload = nla_get_flag(tb[TCA_HTB_OFFLOAD]); >> - if (q->offload) { >> + if (offload) { >> if (sch->parent != TC_H_ROOT) >> return -EOPNOTSUPP; >> @@ -1060,6 +1061,7 @@ static int htb_init(struct Qdisc *sch, struct nlattr *opt, >> if (!q->direct_qdiscs) >> return -ENOMEM; >> } >> + q->offload = offload; >> err = qdisc_class_hash_init(&q->clhash); >> if (err < 0) >> > ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2021-03-10 19:40 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-03-09 15:13 [syzbot] BUG: unable to handle kernel NULL pointer dereference in htb_select_queue syzbot 2021-03-09 15:20 ` Eric Dumazet 2021-03-10 14:54 ` Maxim Mikityanskiy 2021-03-10 17:03 ` Eric Dumazet 2021-03-10 18:55 ` Maxim Mikityanskiy 2021-03-10 19:39 ` Eric Dumazet
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).