linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* INFO: rcu detected stall in tc_modify_qdisc
@ 2020-07-29  5:53 syzbot
  2020-07-29  7:28 ` 回复: " Zhang, Qiang
  0 siblings, 1 reply; 9+ messages in thread
From: syzbot @ 2020-07-29  5:53 UTC (permalink / raw)
  To: davem, fweisbec, jhs, jiri, linux-kernel, mingo, netdev,
	syzkaller-bugs, tglx, vinicius.gomes, xiyou.wangcong

Hello,

syzbot found the following issue on:

HEAD commit:    181964e6 fix a braino in cmsghdr_from_user_compat_to_kern()
git tree:       net
console output: https://syzkaller.appspot.com/x/log.txt?x=12925e38900000
kernel config:  https://syzkaller.appspot.com/x/.config?x=f87a5e4232fdb267
dashboard link: https://syzkaller.appspot.com/bug?extid=9f78d5c664a8c33f4cce
compiler:       gcc (GCC) 10.1.0-syz 20200507
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16587f8c900000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15b2d790900000

The issue was bisected to:

commit 5a781ccbd19e4664babcbe4b4ead7aa2b9283d22
Author: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Date:   Sat Sep 29 00:59:43 2018 +0000

    tc: Add support for configuring the taprio scheduler

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=160e1bac900000
console output: https://syzkaller.appspot.com/x/log.txt?x=110e1bac900000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+9f78d5c664a8c33f4cce@syzkaller.appspotmail.com
Fixes: 5a781ccbd19e ("tc: Add support for configuring the taprio scheduler")

rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 	1-...!: (1 GPs behind) idle=6f6/1/0x4000000000000000 softirq=10195/10196 fqs=1 
	(t=27930 jiffies g=9233 q=413)
rcu: rcu_preempt kthread starved for 27901 jiffies! g9233 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
rcu_preempt     R  running task    29112    10      2 0x00004000
Call Trace:
 context_switch kernel/sched/core.c:3458 [inline]
 __schedule+0x8ea/0x2210 kernel/sched/core.c:4219
 schedule+0xd0/0x2a0 kernel/sched/core.c:4294
 schedule_timeout+0x148/0x250 kernel/time/timer.c:1908
 rcu_gp_fqs_loop kernel/rcu/tree.c:1874 [inline]
 rcu_gp_kthread+0xae5/0x1b50 kernel/rcu/tree.c:2044
 kthread+0x3b5/0x4a0 kernel/kthread.c:291
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
NMI backtrace for cpu 1
CPU: 1 PID: 6799 Comm: syz-executor494 Not tainted 5.8.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x18f/0x20d lib/dump_stack.c:118
 nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101
 nmi_trigger_cpumask_backtrace+0x1b3/0x223 lib/nmi_backtrace.c:62
 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
 rcu_dump_cpu_stacks+0x194/0x1cf kernel/rcu/tree_stall.h:320
 print_cpu_stall kernel/rcu/tree_stall.h:553 [inline]
 check_cpu_stall kernel/rcu/tree_stall.h:627 [inline]
 rcu_pending kernel/rcu/tree.c:3489 [inline]
 rcu_sched_clock_irq.cold+0x5b3/0xccc kernel/rcu/tree.c:2504
 update_process_times+0x25/0x60 kernel/time/timer.c:1737
 tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:176
 tick_sched_timer+0x108/0x290 kernel/time/tick-sched.c:1320
 __run_hrtimer kernel/time/hrtimer.c:1520 [inline]
 __hrtimer_run_queues+0x1d5/0xfc0 kernel/time/hrtimer.c:1584
 hrtimer_interrupt+0x32a/0x930 kernel/time/hrtimer.c:1646
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1080 [inline]
 __sysvec_apic_timer_interrupt+0x142/0x5e0 arch/x86/kernel/apic/apic.c:1097
 asm_call_on_stack+0xf/0x20 arch/x86/entry/entry_64.S:711
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:48 [inline]
 sysvec_apic_timer_interrupt+0xe0/0x120 arch/x86/kernel/apic/apic.c:1091
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:585
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:770 [inline]
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0x8c/0xe0 kernel/locking/spinlock.c:191
Code: 48 c7 c0 88 e0 b4 89 48 ba 00 00 00 00 00 fc ff df 48 c1 e8 03 80 3c 10 00 75 37 48 83 3d e3 52 cc 01 00 74 22 48 89 df 57 9d <0f> 1f 44 00 00 bf 01 00 00 00 e8 35 e5 66 f9 65 8b 05 fe 70 19 78
RSP: 0018:ffffc900016672c0 EFLAGS: 00000282
RAX: 1ffffffff1369c11 RBX: 0000000000000282 RCX: 0000000000000002
RDX: dffffc0000000000 RSI: 0000000000000000 RDI: 0000000000000282
RBP: ffff888093a052e8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000282
R13: 00000078100c35c3 R14: ffff888093a05000 R15: 0000000000000000
 spin_unlock_irqrestore include/linux/spinlock.h:408 [inline]
 taprio_change+0x1fdc/0x2960 net/sched/sch_taprio.c:1557
 taprio_init+0x52e/0x670 net/sched/sch_taprio.c:1670
 qdisc_create+0x4b6/0x12e0 net/sched/sch_api.c:1246
 tc_modify_qdisc+0x4c8/0x1990 net/sched/sch_api.c:1662
 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5461
 netlink_rcv_skb+0x15a/0x430 net/netlink/af_netlink.c:2469
 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
 netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1329
 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1918
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:672
 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2352
 ___sys_sendmsg+0xf3/0x170 net/socket.c:2406
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2439
 do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:384
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x443819
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 0f fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fff687c83d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000443819
RDX: 0000000000000000 RSI: 00000000200007c0 RDI: 0000000000000004
RBP: 00007fff687c83e0 R08: 0000000001bbbbbb R09: 0000000001bbbbbb
R10: 0000000001bbbbbb R11: 0000000000000246 R12: 00007fff687c83f0
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 9+ messages in thread

* 回复: INFO: rcu detected stall in tc_modify_qdisc
  2020-07-29  5:53 INFO: rcu detected stall in tc_modify_qdisc syzbot
@ 2020-07-29  7:28 ` Zhang, Qiang
  2020-07-29 19:13   ` Vinicius Costa Gomes
  0 siblings, 1 reply; 9+ messages in thread
From: Zhang, Qiang @ 2020-07-29  7:28 UTC (permalink / raw)
  To: syzbot, davem, fweisbec, jhs, jiri, linux-kernel, mingo, netdev,
	syzkaller-bugs, tglx, vinicius.gomes, xiyou.wangcong



________________________________________
发件人: linux-kernel-owner@vger.kernel.org <linux-kernel-owner@vger.kernel.org> 代表 syzbot <syzbot+9f78d5c664a8c33f4cce@syzkaller.appspotmail.com>
发送时间: 2020年7月29日 13:53
收件人: davem@davemloft.net; fweisbec@gmail.com; jhs@mojatatu.com; jiri@resnulli.us; linux-kernel@vger.kernel.org; mingo@kernel.org; netdev@vger.kernel.org; syzkaller-bugs@googlegroups.com; tglx@linutronix.de; vinicius.gomes@intel.com; xiyou.wangcong@gmail.com
主题: INFO: rcu detected stall in tc_modify_qdisc

Hello,

syzbot found the following issue on:

HEAD commit:    181964e6 fix a braino in cmsghdr_from_user_compat_to_kern()
git tree:       net
console output: https://syzkaller.appspot.com/x/log.txt?x=12925e38900000
kernel config:  https://syzkaller.appspot.com/x/.config?x=f87a5e4232fdb267
dashboard link: https://syzkaller.appspot.com/bug?extid=9f78d5c664a8c33f4cce
compiler:       gcc (GCC) 10.1.0-syz 20200507
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16587f8c900000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15b2d790900000

The issue was bisected to:

commit 5a781ccbd19e4664babcbe4b4ead7aa2b9283d22
Author: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Date:   Sat Sep 29 00:59:43 2018 +0000

    tc: Add support for configuring the taprio scheduler

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=160e1bac900000
console output: https://syzkaller.appspot.com/x/log.txt?x=110e1bac900000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+9f78d5c664a8c33f4cce@syzkaller.appspotmail.com
Fixes: 5a781ccbd19e ("tc: Add support for configuring the taprio scheduler")

rcu: INFO: rcu_preempt self-detected stall on CPU
rcu:    1-...!: (1 GPs behind) idle=6f6/1/0x4000000000000000 softirq=10195/10196 fqs=1
        (t=27930 jiffies g=9233 q=413)
rcu: rcu_preempt kthread starved for 27901 jiffies! g9233 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
rcu:    Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
rcu_preempt     R  running task    29112    10      2 0x00004000
Call Trace:
 context_switch kernel/sched/core.c:3458 [inline]
 __schedule+0x8ea/0x2210 kernel/sched/core.c:4219
 schedule+0xd0/0x2a0 kernel/sched/core.c:4294
 schedule_timeout+0x148/0x250 kernel/time/timer.c:1908
 rcu_gp_fqs_loop kernel/rcu/tree.c:1874 [inline]
 rcu_gp_kthread+0xae5/0x1b50 kernel/rcu/tree.c:2044
 kthread+0x3b5/0x4a0 kernel/kthread.c:291
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
NMI backtrace for cpu 1
CPU: 1 PID: 6799 Comm: syz-executor494 Not tainted 5.8.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x18f/0x20d lib/dump_stack.c:118
 nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101
 nmi_trigger_cpumask_backtrace+0x1b3/0x223 lib/nmi_backtrace.c:62
 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
 rcu_dump_cpu_stacks+0x194/0x1cf kernel/rcu/tree_stall.h:320
 print_cpu_stall kernel/rcu/tree_stall.h:553 [inline]
 check_cpu_stall kernel/rcu/tree_stall.h:627 [inline]
 rcu_pending kernel/rcu/tree.c:3489 [inline]
 rcu_sched_clock_irq.cold+0x5b3/0xccc kernel/rcu/tree.c:2504
 update_process_times+0x25/0x60 kernel/time/timer.c:1737
 tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:176
 tick_sched_timer+0x108/0x290 kernel/time/tick-sched.c:1320
 __run_hrtimer kernel/time/hrtimer.c:1520 [inline]
 __hrtimer_run_queues+0x1d5/0xfc0 kernel/time/hrtimer.c:1584
 hrtimer_interrupt+0x32a/0x930 kernel/time/hrtimer.c:1646
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1080 [inline]
 __sysvec_apic_timer_interrupt+0x142/0x5e0 arch/x86/kernel/apic/apic.c:1097
 asm_call_on_stack+0xf/0x20 arch/x86/entry/entry_64.S:711
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:48 [inline]
 sysvec_apic_timer_interrupt+0xe0/0x120 arch/x86/kernel/apic/apic.c:1091
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:585
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:770 [inline]
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0x8c/0xe0 kernel/locking/spinlock.c:191
Code: 48 c7 c0 88 e0 b4 89 48 ba 00 00 00 00 00 fc ff df 48 c1 e8 03 80 3c 10 00 75 37 48 83 3d e3 52 cc 01 00 74 22 48 89 df 57 9d <0f> 1f 44 00 00 bf 01 00 00 00 e8 35 e5 66 f9 65 8b 05 fe 70 19 78
RSP: 0018:ffffc900016672c0 EFLAGS: 00000282
RAX: 1ffffffff1369c11 RBX: 0000000000000282 RCX: 0000000000000002
RDX: dffffc0000000000 RSI: 0000000000000000 RDI: 0000000000000282
RBP: ffff888093a052e8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000282
R13: 00000078100c35c3 R14: ffff888093a05000 R15: 0000000000000000
 spin_unlock_irqrestore include/linux/spinlock.h:408 [inline]
 taprio_change+0x1fdc/0x2960 net/sched/sch_taprio.c:1557


 It looks like  that some loops in "taprio_init" func  occupy CPU 1 for a long time, resulting in the fact that the RCU has not detected CPU1 quiescent state.


 taprio_init+0x52e/0x670 net/sched/sch_taprio.c:1670
 qdisc_create+0x4b6/0x12e0 net/sched/sch_api.c:1246
 tc_modify_qdisc+0x4c8/0x1990 net/sched/sch_api.c:1662
 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5461
 netlink_rcv_skb+0x15a/0x430 net/netlink/af_netlink.c:2469
 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
 netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1329
 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1918
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:672
 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2352
 ___sys_sendmsg+0xf3/0x170 net/socket.c:2406
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2439
 do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:384
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x443819
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 0f fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fff687c83d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000443819
RDX: 0000000000000000 RSI: 00000000200007c0 RDI: 0000000000000004
RBP: 00007fff687c83e0 R08: 0000000001bbbbbb R09: 0000000001bbbbbb
R10: 0000000001bbbbbb R11: 0000000000000246 R12: 00007fff687c83f0
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: 回复: INFO: rcu detected stall in tc_modify_qdisc
  2020-07-29  7:28 ` 回复: " Zhang, Qiang
@ 2020-07-29 19:13   ` Vinicius Costa Gomes
  2020-07-30  5:58     ` Dmitry Vyukov
  0 siblings, 1 reply; 9+ messages in thread
From: Vinicius Costa Gomes @ 2020-07-29 19:13 UTC (permalink / raw)
  To: Zhang, Qiang, syzbot, davem, fweisbec, jhs, jiri, linux-kernel,
	mingo, netdev, syzkaller-bugs, tglx, xiyou.wangcong

Hi,

"Zhang, Qiang" <Qiang.Zhang@windriver.com> writes:

> ________________________________________
> 发件人: linux-kernel-owner@vger.kernel.org <linux-kernel-owner@vger.kernel.org> 代表 syzbot <syzbot+9f78d5c664a8c33f4cce@syzkaller.appspotmail.com>
> 发送时间: 2020年7月29日 13:53
> 收件人: davem@davemloft.net; fweisbec@gmail.com; jhs@mojatatu.com; jiri@resnulli.us; linux-kernel@vger.kernel.org; mingo@kernel.org; netdev@vger.kernel.org; syzkaller-bugs@googlegroups.com; tglx@linutronix.de; vinicius.gomes@intel.com; xiyou.wangcong@gmail.com
> 主题: INFO: rcu detected stall in tc_modify_qdisc
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    181964e6 fix a braino in cmsghdr_from_user_compat_to_kern()
> git tree:       net
> console output: https://syzkaller.appspot.com/x/log.txt?x=12925e38900000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=f87a5e4232fdb267
> dashboard link: https://syzkaller.appspot.com/bug?extid=9f78d5c664a8c33f4cce
> compiler:       gcc (GCC) 10.1.0-syz 20200507
> syz repro:
> https://syzkaller.appspot.com/x/repro.syz?x=16587f8c900000

It seems that syzkaller is generating an schedule with too small
intervals (3ns in this case) which causes a hrtimer busy-loop which
starves other kernel threads.

We could put some limits on the interval when running in software mode,
but I don't like this too much, because we are talking about users with
CAP_NET_ADMIN and they have easier ways to do bad things to the system.


Cheers,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: 回复: INFO: rcu detected stall in tc_modify_qdisc
  2020-07-29 19:13   ` Vinicius Costa Gomes
@ 2020-07-30  5:58     ` Dmitry Vyukov
  2020-07-30 17:44       ` Vinicius Costa Gomes
  0 siblings, 1 reply; 9+ messages in thread
From: Dmitry Vyukov @ 2020-07-30  5:58 UTC (permalink / raw)
  To: Vinicius Costa Gomes
  Cc: Zhang, Qiang, syzbot, davem, fweisbec, jhs, jiri, linux-kernel,
	mingo, netdev, syzkaller-bugs, tglx, xiyou.wangcong

On Wed, Jul 29, 2020 at 9:13 PM Vinicius Costa Gomes
<vinicius.gomes@intel.com> wrote:
>
> Hi,
>
> "Zhang, Qiang" <Qiang.Zhang@windriver.com> writes:
>
> > ________________________________________
> > 发件人: linux-kernel-owner@vger.kernel.org <linux-kernel-owner@vger.kernel.org> 代表 syzbot <syzbot+9f78d5c664a8c33f4cce@syzkaller.appspotmail.com>
> > 发送时间: 2020年7月29日 13:53
> > 收件人: davem@davemloft.net; fweisbec@gmail.com; jhs@mojatatu.com; jiri@resnulli.us; linux-kernel@vger.kernel.org; mingo@kernel.org; netdev@vger.kernel.org; syzkaller-bugs@googlegroups.com; tglx@linutronix.de; vinicius.gomes@intel.com; xiyou.wangcong@gmail.com
> > 主题: INFO: rcu detected stall in tc_modify_qdisc
> >
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit:    181964e6 fix a braino in cmsghdr_from_user_compat_to_kern()
> > git tree:       net
> > console output: https://syzkaller.appspot.com/x/log.txt?x=12925e38900000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=f87a5e4232fdb267
> > dashboard link: https://syzkaller.appspot.com/bug?extid=9f78d5c664a8c33f4cce
> > compiler:       gcc (GCC) 10.1.0-syz 20200507
> > syz repro:
> > https://syzkaller.appspot.com/x/repro.syz?x=16587f8c900000
>
> It seems that syzkaller is generating an schedule with too small
> intervals (3ns in this case) which causes a hrtimer busy-loop which
> starves other kernel threads.
>
> We could put some limits on the interval when running in software mode,
> but I don't like this too much, because we are talking about users with
> CAP_NET_ADMIN and they have easier ways to do bad things to the system.

Hi Vinicius,

Could you explain why you don't like the argument if it's for CAP_NET_ADMIN?
Good code should check arguments regardless I think and it's useful to
protect root from, say, programming bugs rather than kill the machine
on any bug and misconfiguration. What am I missing?

Also are we talking about CAP_NET_ADMIN in a user ns as well
(effectively nobody)?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: 回复: INFO: rcu detected stall in tc_modify_qdisc
  2020-07-30  5:58     ` Dmitry Vyukov
@ 2020-07-30 17:44       ` Vinicius Costa Gomes
  2020-07-30 18:36         ` Eric Dumazet
  2020-07-30 19:19         ` Dmitry Vyukov
  0 siblings, 2 replies; 9+ messages in thread
From: Vinicius Costa Gomes @ 2020-07-30 17:44 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Zhang, Qiang, syzbot, davem, fweisbec, jhs, jiri, linux-kernel,
	mingo, netdev, syzkaller-bugs, tglx, xiyou.wangcong

Hi,

Dmitry Vyukov <dvyukov@google.com> writes:

> On Wed, Jul 29, 2020 at 9:13 PM Vinicius Costa Gomes
> <vinicius.gomes@intel.com> wrote:
>>
>> Hi,
>>
>> "Zhang, Qiang" <Qiang.Zhang@windriver.com> writes:
>>
>> > ________________________________________
>> > 发件人: linux-kernel-owner@vger.kernel.org <linux-kernel-owner@vger.kernel.org> 代表 syzbot <syzbot+9f78d5c664a8c33f4cce@syzkaller.appspotmail.com>
>> > 发送时间: 2020年7月29日 13:53
>> > 收件人: davem@davemloft.net; fweisbec@gmail.com; jhs@mojatatu.com; jiri@resnulli.us; linux-kernel@vger.kernel.org; mingo@kernel.org; netdev@vger.kernel.org; syzkaller-bugs@googlegroups.com; tglx@linutronix.de; vinicius.gomes@intel.com; xiyou.wangcong@gmail.com
>> > 主题: INFO: rcu detected stall in tc_modify_qdisc
>> >
>> > Hello,
>> >
>> > syzbot found the following issue on:
>> >
>> > HEAD commit:    181964e6 fix a braino in cmsghdr_from_user_compat_to_kern()
>> > git tree:       net
>> > console output: https://syzkaller.appspot.com/x/log.txt?x=12925e38900000
>> > kernel config:  https://syzkaller.appspot.com/x/.config?x=f87a5e4232fdb267
>> > dashboard link: https://syzkaller.appspot.com/bug?extid=9f78d5c664a8c33f4cce
>> > compiler:       gcc (GCC) 10.1.0-syz 20200507
>> > syz repro:
>> > https://syzkaller.appspot.com/x/repro.syz?x=16587f8c900000
>>
>> It seems that syzkaller is generating an schedule with too small
>> intervals (3ns in this case) which causes a hrtimer busy-loop which
>> starves other kernel threads.
>>
>> We could put some limits on the interval when running in software mode,
>> but I don't like this too much, because we are talking about users with
>> CAP_NET_ADMIN and they have easier ways to do bad things to the system.
>
> Hi Vinicius,
>
> Could you explain why you don't like the argument if it's for CAP_NET_ADMIN?
> Good code should check arguments regardless I think and it's useful to
> protect root from, say, programming bugs rather than kill the machine
> on any bug and misconfiguration. What am I missing?

I admit that I am on the fence on that argument: do not let even root
crash the system (the point that my code is crashing the system gives
weight to this side) vs. root has great powers, they need to know what
they are doing.

The argument that I used to convince myself was: root can easily create
a bunch of processes and give them the highest priority and do
effectively the same thing as this issue, so I went with a the "they
need to know what they are doing side".

A bit more on the specifics here:

  - Using a small interval size, is only a limitation of the taprio
  software mode, when using hardware offloads (which I think most users
  do), any interval size (supported by the hardware) can be used;

  - Choosing a good lower limit for this seems kind of hard: something
  below 1us would never work well, I think, but things 1us < x < 100us
  will depend on the hardware/kernel config/system load, and this is the
  range includes "useful" values for many systems.

Perhaps a middle ground would be to impose a limit based on the link
speed, the interval can never be smaller than the time it takes to send
the minimum ethernet frame (for 1G links this would be ~480ns, should be
enough to catch most programming mistakes). I am going to add this and
see how it looks like.

Sorry for the brain dump :-)

>
> Also are we talking about CAP_NET_ADMIN in a user ns as well
> (effectively nobody)?

Just checked, we are talking about CAP_NET_ADMIN in user namespace as
well.


Cheers,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: 回复: INFO: rcu detected stall in tc_modify_qdisc
  2020-07-30 17:44       ` Vinicius Costa Gomes
@ 2020-07-30 18:36         ` Eric Dumazet
  2020-07-30 21:01           ` Vinicius Costa Gomes
  2020-07-30 19:19         ` Dmitry Vyukov
  1 sibling, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2020-07-30 18:36 UTC (permalink / raw)
  To: Vinicius Costa Gomes, Dmitry Vyukov
  Cc: Zhang, Qiang, syzbot, davem, fweisbec, jhs, jiri, linux-kernel,
	mingo, netdev, syzkaller-bugs, tglx, xiyou.wangcong



On 7/30/20 10:44 AM, Vinicius Costa Gomes wrote:
> Hi,
> 
> Dmitry Vyukov <dvyukov@google.com> writes:
> 
>> On Wed, Jul 29, 2020 at 9:13 PM Vinicius Costa Gomes
>> <vinicius.gomes@intel.com> wrote:
>>>
>>> Hi,
>>>
>>> "Zhang, Qiang" <Qiang.Zhang@windriver.com> writes:
>>>
>>>> ________________________________________
>>>> 发件人: linux-kernel-owner@vger.kernel.org <linux-kernel-owner@vger.kernel.org> 代表 syzbot <syzbot+9f78d5c664a8c33f4cce@syzkaller.appspotmail.com>
>>>> 发送时间: 2020年7月29日 13:53
>>>> 收件人: davem@davemloft.net; fweisbec@gmail.com; jhs@mojatatu.com; jiri@resnulli.us; linux-kernel@vger.kernel.org; mingo@kernel.org; netdev@vger.kernel.org; syzkaller-bugs@googlegroups.com; tglx@linutronix.de; vinicius.gomes@intel.com; xiyou.wangcong@gmail.com
>>>> 主题: INFO: rcu detected stall in tc_modify_qdisc
>>>>
>>>> Hello,
>>>>
>>>> syzbot found the following issue on:
>>>>
>>>> HEAD commit:    181964e6 fix a braino in cmsghdr_from_user_compat_to_kern()
>>>> git tree:       net
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=12925e38900000
>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=f87a5e4232fdb267
>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=9f78d5c664a8c33f4cce
>>>> compiler:       gcc (GCC) 10.1.0-syz 20200507
>>>> syz repro:
>>>> https://syzkaller.appspot.com/x/repro.syz?x=16587f8c900000
>>>
>>> It seems that syzkaller is generating an schedule with too small
>>> intervals (3ns in this case) which causes a hrtimer busy-loop which
>>> starves other kernel threads.
>>>
>>> We could put some limits on the interval when running in software mode,
>>> but I don't like this too much, because we are talking about users with
>>> CAP_NET_ADMIN and they have easier ways to do bad things to the system.
>>
>> Hi Vinicius,
>>
>> Could you explain why you don't like the argument if it's for CAP_NET_ADMIN?
>> Good code should check arguments regardless I think and it's useful to
>> protect root from, say, programming bugs rather than kill the machine
>> on any bug and misconfiguration. What am I missing?
> 
> I admit that I am on the fence on that argument: do not let even root
> crash the system (the point that my code is crashing the system gives
> weight to this side) vs. root has great powers, they need to know what
> they are doing.
> 
> The argument that I used to convince myself was: root can easily create
> a bunch of processes and give them the highest priority and do
> effectively the same thing as this issue, so I went with a the "they
> need to know what they are doing side".
> 
> A bit more on the specifics here:
> 
>   - Using a small interval size, is only a limitation of the taprio
>   software mode, when using hardware offloads (which I think most users
>   do), any interval size (supported by the hardware) can be used;
> 
>   - Choosing a good lower limit for this seems kind of hard: something
>   below 1us would never work well, I think, but things 1us < x < 100us
>   will depend on the hardware/kernel config/system load, and this is the
>   range includes "useful" values for many systems.
> 
> Perhaps a middle ground would be to impose a limit based on the link
> speed, the interval can never be smaller than the time it takes to send
> the minimum ethernet frame (for 1G links this would be ~480ns, should be
> enough to catch most programming mistakes). I am going to add this and
> see how it looks like.
> 
> Sorry for the brain dump :-)


I do not know taprio details, but do you really need a periodic timer ?

Presumably there is no need to fire a timer before next packet departure time ?



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: 回复: INFO: rcu detected stall in tc_modify_qdisc
  2020-07-30 17:44       ` Vinicius Costa Gomes
  2020-07-30 18:36         ` Eric Dumazet
@ 2020-07-30 19:19         ` Dmitry Vyukov
  2020-07-30 22:01           ` Vinicius Costa Gomes
  1 sibling, 1 reply; 9+ messages in thread
From: Dmitry Vyukov @ 2020-07-30 19:19 UTC (permalink / raw)
  To: Vinicius Costa Gomes
  Cc: Zhang, Qiang, syzbot, davem, fweisbec, jhs, jiri, linux-kernel,
	mingo, netdev, syzkaller-bugs, tglx, xiyou.wangcong

On Thu, Jul 30, 2020 at 7:44 PM Vinicius Costa Gomes
<vinicius.gomes@intel.com> wrote:
>
> Hi,
>
> Dmitry Vyukov <dvyukov@google.com> writes:
>
> > On Wed, Jul 29, 2020 at 9:13 PM Vinicius Costa Gomes
> > <vinicius.gomes@intel.com> wrote:
> >>
> >> Hi,
> >>
> >> "Zhang, Qiang" <Qiang.Zhang@windriver.com> writes:
> >>
> >> > ________________________________________
> >> > 发件人: linux-kernel-owner@vger.kernel.org <linux-kernel-owner@vger.kernel.org> 代表 syzbot <syzbot+9f78d5c664a8c33f4cce@syzkaller.appspotmail.com>
> >> > 发送时间: 2020年7月29日 13:53
> >> > 收件人: davem@davemloft.net; fweisbec@gmail.com; jhs@mojatatu.com; jiri@resnulli.us; linux-kernel@vger.kernel.org; mingo@kernel.org; netdev@vger.kernel.org; syzkaller-bugs@googlegroups.com; tglx@linutronix.de; vinicius.gomes@intel.com; xiyou.wangcong@gmail.com
> >> > 主题: INFO: rcu detected stall in tc_modify_qdisc
> >> >
> >> > Hello,
> >> >
> >> > syzbot found the following issue on:
> >> >
> >> > HEAD commit:    181964e6 fix a braino in cmsghdr_from_user_compat_to_kern()
> >> > git tree:       net
> >> > console output: https://syzkaller.appspot.com/x/log.txt?x=12925e38900000
> >> > kernel config:  https://syzkaller.appspot.com/x/.config?x=f87a5e4232fdb267
> >> > dashboard link: https://syzkaller.appspot.com/bug?extid=9f78d5c664a8c33f4cce
> >> > compiler:       gcc (GCC) 10.1.0-syz 20200507
> >> > syz repro:
> >> > https://syzkaller.appspot.com/x/repro.syz?x=16587f8c900000
> >>
> >> It seems that syzkaller is generating an schedule with too small
> >> intervals (3ns in this case) which causes a hrtimer busy-loop which
> >> starves other kernel threads.
> >>
> >> We could put some limits on the interval when running in software mode,
> >> but I don't like this too much, because we are talking about users with
> >> CAP_NET_ADMIN and they have easier ways to do bad things to the system.
> >
> > Hi Vinicius,
> >
> > Could you explain why you don't like the argument if it's for CAP_NET_ADMIN?
> > Good code should check arguments regardless I think and it's useful to
> > protect root from, say, programming bugs rather than kill the machine
> > on any bug and misconfiguration. What am I missing?
>
> I admit that I am on the fence on that argument: do not let even root
> crash the system (the point that my code is crashing the system gives
> weight to this side) vs. root has great powers, they need to know what
> they are doing.
>
> The argument that I used to convince myself was: root can easily create
> a bunch of processes and give them the highest priority and do
> effectively the same thing as this issue, so I went with a the "they
> need to know what they are doing side".
>
> A bit more on the specifics here:
>
>   - Using a small interval size, is only a limitation of the taprio
>   software mode, when using hardware offloads (which I think most users
>   do), any interval size (supported by the hardware) can be used;
>
>   - Choosing a good lower limit for this seems kind of hard: something
>   below 1us would never work well, I think, but things 1us < x < 100us
>   will depend on the hardware/kernel config/system load, and this is the
>   range includes "useful" values for many systems.
>
> Perhaps a middle ground would be to impose a limit based on the link
> speed, the interval can never be smaller than the time it takes to send
> the minimum ethernet frame (for 1G links this would be ~480ns, should be
> enough to catch most programming mistakes). I am going to add this and
> see how it looks like.
>
> Sorry for the brain dump :-)
>
> >
> > Also are we talking about CAP_NET_ADMIN in a user ns as well
> > (effectively nobody)?
>
> Just checked, we are talking about CAP_NET_ADMIN in user namespace as
> well.

OK, so this is not root/admin, this is just any user.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: 回复: INFO: rcu detected stall in tc_modify_qdisc
  2020-07-30 18:36         ` Eric Dumazet
@ 2020-07-30 21:01           ` Vinicius Costa Gomes
  0 siblings, 0 replies; 9+ messages in thread
From: Vinicius Costa Gomes @ 2020-07-30 21:01 UTC (permalink / raw)
  To: Eric Dumazet, Dmitry Vyukov
  Cc: Zhang, Qiang, syzbot, davem, fweisbec, jhs, jiri, linux-kernel,
	mingo, netdev, syzkaller-bugs, tglx, xiyou.wangcong

Hi Eric,

Eric Dumazet <eric.dumazet@gmail.com> writes:

>> I admit that I am on the fence on that argument: do not let even root
>> crash the system (the point that my code is crashing the system gives
>> weight to this side) vs. root has great powers, they need to know what
>> they are doing.
>> 
>> The argument that I used to convince myself was: root can easily create
>> a bunch of processes and give them the highest priority and do
>> effectively the same thing as this issue, so I went with a the "they
>> need to know what they are doing side".
>> 
>> A bit more on the specifics here:
>> 
>>   - Using a small interval size, is only a limitation of the taprio
>>   software mode, when using hardware offloads (which I think most users
>>   do), any interval size (supported by the hardware) can be used;
>> 
>>   - Choosing a good lower limit for this seems kind of hard: something
>>   below 1us would never work well, I think, but things 1us < x < 100us
>>   will depend on the hardware/kernel config/system load, and this is the
>>   range includes "useful" values for many systems.
>> 
>> Perhaps a middle ground would be to impose a limit based on the link
>> speed, the interval can never be smaller than the time it takes to send
>> the minimum ethernet frame (for 1G links this would be ~480ns, should be
>> enough to catch most programming mistakes). I am going to add this and
>> see how it looks like.
>> 
>> Sorry for the brain dump :-)
>
>
> I do not know taprio details, but do you really need a periodic timer
> ?

As we can control the transmission time of packets, you are right, I
don't.

Just a bit more detail about the current implementation taprio,
basically it has a sequence of { Traffic Classes that are open; Interval
} that repeats cyclicly, it uses an hrtimer to advance the pointer for
the current element, so during dequeue I can check if a traffic class is
"open" or "closed".

But again, if I calculate the 'skb->tstamp' of each packet during
enqueue, I don't need the hrtimer. What we have in the txtime-assisted
mode is half way there.

I think this is what you had in mind.


Cheers,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: 回复: INFO: rcu detected stall in tc_modify_qdisc
  2020-07-30 19:19         ` Dmitry Vyukov
@ 2020-07-30 22:01           ` Vinicius Costa Gomes
  0 siblings, 0 replies; 9+ messages in thread
From: Vinicius Costa Gomes @ 2020-07-30 22:01 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Zhang, Qiang, syzbot, davem, fweisbec, jhs, jiri, linux-kernel,
	mingo, netdev, syzkaller-bugs, tglx, xiyou.wangcong

Dmitry Vyukov <dvyukov@google.com> writes:

>> >
>> > Also are we talking about CAP_NET_ADMIN in a user ns as well
>> > (effectively nobody)?
>>
>> Just checked, we are talking about CAP_NET_ADMIN in user namespace as
>> well.
>
> OK, so this is not root/admin, this is just any user.

Yeah, will fix this.


Thanks,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-07-30 22:01 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-29  5:53 INFO: rcu detected stall in tc_modify_qdisc syzbot
2020-07-29  7:28 ` 回复: " Zhang, Qiang
2020-07-29 19:13   ` Vinicius Costa Gomes
2020-07-30  5:58     ` Dmitry Vyukov
2020-07-30 17:44       ` Vinicius Costa Gomes
2020-07-30 18:36         ` Eric Dumazet
2020-07-30 21:01           ` Vinicius Costa Gomes
2020-07-30 19:19         ` Dmitry Vyukov
2020-07-30 22:01           ` Vinicius Costa Gomes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).