All of lore.kernel.org
 help / color / mirror / Atom feed
* INFO: rcu detected stall in is_bpf_text_address
@ 2018-05-19 15:55 syzbot
  2018-05-19 15:57 ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: syzbot @ 2018-05-19 15:55 UTC (permalink / raw)
  To: ast, daniel, linux-kernel, netdev, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    73fcb1a370c7 Merge branch 'akpm' (patches from Andrew)
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1462ec0f800000
kernel config:  https://syzkaller.appspot.com/x/.config?x=f3b4e30da84ec1ed
dashboard link: https://syzkaller.appspot.com/bug?extid=3dcd59a1f907245f891f
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=1079cf8f800000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=156daf97800000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+3dcd59a1f907245f891f@syzkaller.appspotmail.com

8021q: adding VLAN 0 to HW filter on device team0
8021q: adding VLAN 0 to HW filter on device team0
8021q: adding VLAN 0 to HW filter on device team0
8021q: adding VLAN 0 to HW filter on device team0
8021q: adding VLAN 0 to HW filter on device team0
INFO: rcu_sched self-detected stall on CPU
	0-...!: (124998 ticks this GP) idle=0be/1/4611686018427387908  
softirq=15234/15234 fqs=59
	 (t=125000 jiffies g=7610 c=7609 q=351640)
rcu_sched kthread starved for 124739 jiffies! g7610 c7609 f0x2  
RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=1
RCU grace-period kthread stack dump:
rcu_sched       R  running task    23896     9      2 0x80000000
Call Trace:
  context_switch kernel/sched/core.c:2859 [inline]
  __schedule+0x801/0x1e30 kernel/sched/core.c:3501
  schedule+0xef/0x430 kernel/sched/core.c:3545
  schedule_timeout+0x138/0x240 kernel/time/timer.c:1801
  rcu_gp_kthread+0x6b5/0x1940 kernel/rcu/tree.c:2231
  kthread+0x345/0x410 kernel/kthread.c:240
  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
NMI backtrace for cpu 0
CPU: 0 PID: 6381 Comm: sh Not tainted 4.17.0-rc5+ #58
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  <IRQ>
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
  nmi_cpu_backtrace.cold.4+0x19/0xce lib/nmi_backtrace.c:103
  nmi_trigger_cpumask_backtrace+0x151/0x192 lib/nmi_backtrace.c:62
  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
  trigger_single_cpu_backtrace include/linux/nmi.h:156 [inline]
  rcu_dump_cpu_stacks+0x175/0x1c2 kernel/rcu/tree.c:1376
  print_cpu_stall kernel/rcu/tree.c:1525 [inline]
  check_cpu_stall.isra.61.cold.80+0x36c/0x59a kernel/rcu/tree.c:1593
  __rcu_pending kernel/rcu/tree.c:3356 [inline]
  rcu_pending kernel/rcu/tree.c:3401 [inline]
  rcu_check_callbacks+0x21b/0xad0 kernel/rcu/tree.c:2763
  update_process_times+0x2d/0x70 kernel/time/timer.c:1636
  tick_sched_handle+0x9f/0x180 kernel/time/tick-sched.c:164
  tick_sched_timer+0x45/0x130 kernel/time/tick-sched.c:1274
  __run_hrtimer kernel/time/hrtimer.c:1398 [inline]
  __hrtimer_run_queues+0x3e3/0x10a0 kernel/time/hrtimer.c:1460
  hrtimer_interrupt+0x2f3/0x750 kernel/time/hrtimer.c:1518
  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1025 [inline]
  smp_apic_timer_interrupt+0x15d/0x710 arch/x86/kernel/apic/apic.c:1050
  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:783  
[inline]
RIP: 0010:lock_acquire+0x257/0x520 kernel/locking/lockdep.c:3923
RSP: 0000:ffff8801dae06b60 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
RAX: dffffc0000000000 RBX: 1ffff1003b5c0d71 RCX: 0000000000000000
RDX: 1ffffffff11a30e5 RSI: ffff8801c7b54c38 RDI: 0000000000000286
RBP: ffff8801dae06c50 R08: 0000000000000008 R09: 0000000000000003
R10: ffff8801c7b54cb0 R11: ffff8801c7b54400 R12: ffff8801c7b54400
R13: 0000000000000002 R14: 0000000000000000 R15: 0000000000000000
  rcu_lock_acquire include/linux/rcupdate.h:246 [inline]
  rcu_read_lock include/linux/rcupdate.h:632 [inline]
  is_bpf_text_address+0x3b/0x170 kernel/bpf/core.c:478
  kernel_text_address+0x79/0xf0 kernel/extable.c:152
  __kernel_text_address+0xd/0x40 kernel/extable.c:107
  unwind_get_return_address+0x61/0xa0 arch/x86/kernel/unwind_frame.c:18
  __save_stack_trace+0x7e/0xd0 arch/x86/kernel/stacktrace.c:45
  save_stack_trace+0x1a/0x20 arch/x86/kernel/stacktrace.c:60
  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
  set_track mm/kasan/kasan.c:460 [inline]
  kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
  slab_post_alloc_hook mm/slab.h:444 [inline]
  slab_alloc_node mm/slab.c:3335 [inline]
  kmem_cache_alloc_node+0x131/0x780 mm/slab.c:3642
  __alloc_skb+0x111/0x780 net/core/skbuff.c:193
  alloc_skb include/linux/skbuff.h:987 [inline]
  _sctp_make_chunk+0x58/0x280 net/sctp/sm_make_chunk.c:1417
  sctp_make_control net/sctp/sm_make_chunk.c:1464 [inline]
  sctp_make_heartbeat+0x8f/0x430 net/sctp/sm_make_chunk.c:1177
  sctp_sf_heartbeat.isra.23+0x26/0x180 net/sctp/sm_statefuns.c:1005
  sctp_sf_sendbeat_8_3+0x38e/0x550 net/sctp/sm_statefuns.c:1049
  sctp_do_sm+0x1ab/0x7160 net/sctp/sm_sideeffect.c:1188
  sctp_generate_heartbeat_event+0x218/0x450 net/sctp/sm_sideeffect.c:406
  call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
  expire_timers kernel/time/timer.c:1363 [inline]
  __run_timers+0x79e/0xc50 kernel/time/timer.c:1666
  run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
  __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
  invoke_softirq kernel/softirq.c:365 [inline]
  irq_exit+0x1d1/0x200 kernel/softirq.c:405
  exiting_irq arch/x86/include/asm/apic.h:525 [inline]
  smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052
  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863
  </IRQ>
RIP: 0010:__read_once_size include/linux/compiler.h:188 [inline]
RIP: 0010:arch_atomic_read arch/x86/include/asm/atomic.h:31 [inline]
RIP: 0010:atomic_read include/asm-generic/atomic-instrumented.h:22 [inline]
RIP: 0010:rcu_dynticks_curr_cpu_in_eqs kernel/rcu/tree.c:349 [inline]
RIP: 0010:rcu_is_watching+0xc4/0x140 kernel/rcu/tree.c:1075
RSP: 0000:ffff8801c957f398 EFLAGS: 00000a06 ORIG_RAX: ffffffffffffff13
RAX: 00000000000130be RBX: 1ffff100392afe74 RCX: 1ffff100392afe78
RDX: 0000000000000004 RSI: 0000000000000004 RDI: ffff8801dae23610
RBP: ffff8801c957f428 R08: ffffed003b5c46c3 R09: ffffed003b5c46c2
R10: ffffed003b5c46c2 R11: ffff8801dae23613 R12: ffff8801c957f3c0
R13: ffff8801dae23610 R14: ffff8801c957f400 R15: dffffc0000000000
  rcu_read_lock_sched_held+0x8d/0x120 kernel/rcu/update.c:112
  trace_kmem_cache_alloc include/trace/events/kmem.h:54 [inline]
  kmem_cache_alloc+0x5fa/0x760 mm/slab.c:3555
  anon_vma_alloc mm/rmap.c:81 [inline]
  __anon_vma_prepare+0x3b3/0x700 mm/rmap.c:189
  anon_vma_prepare include/linux/rmap.h:153 [inline]
  wp_page_copy+0xdf1/0x1440 mm/memory.c:2481
  do_wp_page+0x425/0x1990 mm/memory.c:2776
  handle_pte_fault mm/memory.c:3979 [inline]
  __handle_mm_fault+0x2996/0x4310 mm/memory.c:4087
  handle_mm_fault+0x53a/0xc70 mm/memory.c:4124
  __do_page_fault+0x60b/0xe40 arch/x86/mm/fault.c:1399
  do_page_fault+0xee/0x8a7 arch/x86/mm/fault.c:1474
  page_fault+0x1e/0x30 arch/x86/entry/entry_64.S:1160
RIP: 0033:0x7f4232a6d1fb
RSP: 002b:00007fff2aeebd28 EFLAGS: 00010246
RAX: 000000000000000d RBX: 00000000756e6547 RCX: 000000006c65746e
RDX: 0000000049656e69 RSI: 0000000000000025 RDI: 0000000000000002
RBP: 00007fff2aeebe20 R08: 0000000000000001 R09: 000000000000002f
R10: 00007f4232a6caf0 R11: 00007fff2aeebb70 R12: 00007f4232a4e000
R13: 00007f4232dd5038 R14: 00007f4232a6cac0 R15: 00007f4232dd5d98
BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 126s!
BUG: workqueue lockup - pool cpus=0-1 flags=0x4 nice=0 stuck for 125s!
Showing busy workqueues and worker pools:
workqueue events: flags=0x0
   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=7/256
     pending: defense_work_handler, defense_work_handler,  
defense_work_handler, defense_work_handler, defense_work_handler,  
vmstat_shepherd, cache_reap
workqueue events_power_efficient: flags=0x80
   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
     in-flight: 2107:neigh_periodic_work
workqueue mm_percpu_wq: flags=0x8
   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
     pending: vmstat_update
workqueue writeback: flags=0x4e
   pwq 4: cpus=0-1 flags=0x4 nice=0 active=1/256
     in-flight: 4784:wb_workfn
workqueue ib_addr: flags=0xa0002
workqueue dm_bufio_cache: flags=0x8
   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
     pending: work_fn
workqueue ipv6_addrconf: flags=0x40008
   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/1
     pending: addrconf_dad_work
     delayed: addrconf_dad_work, addrconf_dad_work, addrconf_dad_work,  
addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work,  
addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work,  
addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work,  
addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work,  
addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work,  
addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work,  
addrconf_dad_work, addrconf_verify_work
   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1
     pending: addrconf_dad_work
     delayed: addrconf_dad_work, addrconf_dad_work, addrconf_dad_work,  
addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work,  
addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work,  
addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work,  
addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work,  
addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work,  
addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work,  
addrconf_dad_work
pool 2: cpus=1 node=0 flags=0x0 nice=0 hung=0s workers=4 idle: 4707 24 4924
pool 4: cpus=0-1 flags=0x4 nice=0 hung=0s workers=4 idle: 44 4846 6


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with  
syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: INFO: rcu detected stall in is_bpf_text_address
  2018-05-19 15:55 INFO: rcu detected stall in is_bpf_text_address syzbot
@ 2018-05-19 15:57 ` Eric Dumazet
  2018-05-20  1:45   ` Marcelo Ricardo Leitner
  2018-05-20  8:26   ` Xin Long
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Dumazet @ 2018-05-19 15:57 UTC (permalink / raw)
  To: syzbot, ast, daniel, linux-kernel, netdev, syzkaller-bugs,
	Xin Long, Marcelo Ricardo Leitner

SCTP experts, please take a look.

On 05/19/2018 08:55 AM, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    73fcb1a370c7 Merge branch 'akpm' (patches from Andrew)
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1462ec0f800000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=f3b4e30da84ec1ed
> dashboard link: https://syzkaller.appspot.com/bug?extid=3dcd59a1f907245f891f
> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=1079cf8f800000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=156daf97800000
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+3dcd59a1f907245f891f@syzkaller.appspotmail.com
> 
> 8021q: adding VLAN 0 to HW filter on device team0
> 8021q: adding VLAN 0 to HW filter on device team0
> 8021q: adding VLAN 0 to HW filter on device team0
> 8021q: adding VLAN 0 to HW filter on device team0
> 8021q: adding VLAN 0 to HW filter on device team0
> INFO: rcu_sched self-detected stall on CPU
>     0-...!: (124998 ticks this GP) idle=0be/1/4611686018427387908 softirq=15234/15234 fqs=59
>      (t=125000 jiffies g=7610 c=7609 q=351640)
> rcu_sched kthread starved for 124739 jiffies! g7610 c7609 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=1
> RCU grace-period kthread stack dump:
> rcu_sched       R  running task    23896     9      2 0x80000000
> Call Trace:
>  context_switch kernel/sched/core.c:2859 [inline]
>  __schedule+0x801/0x1e30 kernel/sched/core.c:3501
>  schedule+0xef/0x430 kernel/sched/core.c:3545
>  schedule_timeout+0x138/0x240 kernel/time/timer.c:1801
>  rcu_gp_kthread+0x6b5/0x1940 kernel/rcu/tree.c:2231
>  kthread+0x345/0x410 kernel/kthread.c:240
>  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
> NMI backtrace for cpu 0
> CPU: 0 PID: 6381 Comm: sh Not tainted 4.17.0-rc5+ #58
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  <IRQ>
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
>  nmi_cpu_backtrace.cold.4+0x19/0xce lib/nmi_backtrace.c:103
>  nmi_trigger_cpumask_backtrace+0x151/0x192 lib/nmi_backtrace.c:62
>  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
>  trigger_single_cpu_backtrace include/linux/nmi.h:156 [inline]
>  rcu_dump_cpu_stacks+0x175/0x1c2 kernel/rcu/tree.c:1376
>  print_cpu_stall kernel/rcu/tree.c:1525 [inline]
>  check_cpu_stall.isra.61.cold.80+0x36c/0x59a kernel/rcu/tree.c:1593
>  __rcu_pending kernel/rcu/tree.c:3356 [inline]
>  rcu_pending kernel/rcu/tree.c:3401 [inline]
>  rcu_check_callbacks+0x21b/0xad0 kernel/rcu/tree.c:2763
>  update_process_times+0x2d/0x70 kernel/time/timer.c:1636
>  tick_sched_handle+0x9f/0x180 kernel/time/tick-sched.c:164
>  tick_sched_timer+0x45/0x130 kernel/time/tick-sched.c:1274
>  __run_hrtimer kernel/time/hrtimer.c:1398 [inline]
>  __hrtimer_run_queues+0x3e3/0x10a0 kernel/time/hrtimer.c:1460
>  hrtimer_interrupt+0x2f3/0x750 kernel/time/hrtimer.c:1518
>  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1025 [inline]
>  smp_apic_timer_interrupt+0x15d/0x710 arch/x86/kernel/apic/apic.c:1050
>  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863
> RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:783 [inline]
> RIP: 0010:lock_acquire+0x257/0x520 kernel/locking/lockdep.c:3923
> RSP: 0000:ffff8801dae06b60 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
> RAX: dffffc0000000000 RBX: 1ffff1003b5c0d71 RCX: 0000000000000000
> RDX: 1ffffffff11a30e5 RSI: ffff8801c7b54c38 RDI: 0000000000000286
> RBP: ffff8801dae06c50 R08: 0000000000000008 R09: 0000000000000003
> R10: ffff8801c7b54cb0 R11: ffff8801c7b54400 R12: ffff8801c7b54400
> R13: 0000000000000002 R14: 0000000000000000 R15: 0000000000000000
>  rcu_lock_acquire include/linux/rcupdate.h:246 [inline]
>  rcu_read_lock include/linux/rcupdate.h:632 [inline]
>  is_bpf_text_address+0x3b/0x170 kernel/bpf/core.c:478
>  kernel_text_address+0x79/0xf0 kernel/extable.c:152
>  __kernel_text_address+0xd/0x40 kernel/extable.c:107
>  unwind_get_return_address+0x61/0xa0 arch/x86/kernel/unwind_frame.c:18
>  __save_stack_trace+0x7e/0xd0 arch/x86/kernel/stacktrace.c:45
>  save_stack_trace+0x1a/0x20 arch/x86/kernel/stacktrace.c:60
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
>  set_track mm/kasan/kasan.c:460 [inline]
>  kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
>  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
>  slab_post_alloc_hook mm/slab.h:444 [inline]
>  slab_alloc_node mm/slab.c:3335 [inline]
>  kmem_cache_alloc_node+0x131/0x780 mm/slab.c:3642
>  __alloc_skb+0x111/0x780 net/core/skbuff.c:193
>  alloc_skb include/linux/skbuff.h:987 [inline]
>  _sctp_make_chunk+0x58/0x280 net/sctp/sm_make_chunk.c:1417
>  sctp_make_control net/sctp/sm_make_chunk.c:1464 [inline]
>  sctp_make_heartbeat+0x8f/0x430 net/sctp/sm_make_chunk.c:1177
>  sctp_sf_heartbeat.isra.23+0x26/0x180 net/sctp/sm_statefuns.c:1005
>  sctp_sf_sendbeat_8_3+0x38e/0x550 net/sctp/sm_statefuns.c:1049
>  sctp_do_sm+0x1ab/0x7160 net/sctp/sm_sideeffect.c:1188
>  sctp_generate_heartbeat_event+0x218/0x450 net/sctp/sm_sideeffect.c:406
>  call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
>  expire_timers kernel/time/timer.c:1363 [inline]
>  __run_timers+0x79e/0xc50 kernel/time/timer.c:1666
>  run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
>  __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
>  invoke_softirq kernel/softirq.c:365 [inline]
>  irq_exit+0x1d1/0x200 kernel/softirq.c:405
>  exiting_irq arch/x86/include/asm/apic.h:525 [inline]
>  smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052
>  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863
>  </IRQ>
> RIP: 0010:__read_once_size include/linux/compiler.h:188 [inline]
> RIP: 0010:arch_atomic_read arch/x86/include/asm/atomic.h:31 [inline]
> RIP: 0010:atomic_read include/asm-generic/atomic-instrumented.h:22 [inline]
> RIP: 0010:rcu_dynticks_curr_cpu_in_eqs kernel/rcu/tree.c:349 [inline]
> RIP: 0010:rcu_is_watching+0xc4/0x140 kernel/rcu/tree.c:1075
> RSP: 0000:ffff8801c957f398 EFLAGS: 00000a06 ORIG_RAX: ffffffffffffff13
> RAX: 00000000000130be RBX: 1ffff100392afe74 RCX: 1ffff100392afe78
> RDX: 0000000000000004 RSI: 0000000000000004 RDI: ffff8801dae23610
> RBP: ffff8801c957f428 R08: ffffed003b5c46c3 R09: ffffed003b5c46c2
> R10: ffffed003b5c46c2 R11: ffff8801dae23613 R12: ffff8801c957f3c0
> R13: ffff8801dae23610 R14: ffff8801c957f400 R15: dffffc0000000000
>  rcu_read_lock_sched_held+0x8d/0x120 kernel/rcu/update.c:112
>  trace_kmem_cache_alloc include/trace/events/kmem.h:54 [inline]
>  kmem_cache_alloc+0x5fa/0x760 mm/slab.c:3555
>  anon_vma_alloc mm/rmap.c:81 [inline]
>  __anon_vma_prepare+0x3b3/0x700 mm/rmap.c:189
>  anon_vma_prepare include/linux/rmap.h:153 [inline]
>  wp_page_copy+0xdf1/0x1440 mm/memory.c:2481
>  do_wp_page+0x425/0x1990 mm/memory.c:2776
>  handle_pte_fault mm/memory.c:3979 [inline]
>  __handle_mm_fault+0x2996/0x4310 mm/memory.c:4087
>  handle_mm_fault+0x53a/0xc70 mm/memory.c:4124
>  __do_page_fault+0x60b/0xe40 arch/x86/mm/fault.c:1399
>  do_page_fault+0xee/0x8a7 arch/x86/mm/fault.c:1474
>  page_fault+0x1e/0x30 arch/x86/entry/entry_64.S:1160
> RIP: 0033:0x7f4232a6d1fb
> RSP: 002b:00007fff2aeebd28 EFLAGS: 00010246
> RAX: 000000000000000d RBX: 00000000756e6547 RCX: 000000006c65746e
> RDX: 0000000049656e69 RSI: 0000000000000025 RDI: 0000000000000002
> RBP: 00007fff2aeebe20 R08: 0000000000000001 R09: 000000000000002f
> R10: 00007f4232a6caf0 R11: 00007fff2aeebb70 R12: 00007f4232a4e000
> R13: 00007f4232dd5038 R14: 00007f4232a6cac0 R15: 00007f4232dd5d98
> BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 126s!
> BUG: workqueue lockup - pool cpus=0-1 flags=0x4 nice=0 stuck for 125s!
> Showing busy workqueues and worker pools:
> workqueue events: flags=0x0
>   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=7/256
>     pending: defense_work_handler, defense_work_handler, defense_work_handler, defense_work_handler, defense_work_handler, vmstat_shepherd, cache_reap
> workqueue events_power_efficient: flags=0x80
>   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
>     in-flight: 2107:neigh_periodic_work
> workqueue mm_percpu_wq: flags=0x8
>   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
>     pending: vmstat_update
> workqueue writeback: flags=0x4e
>   pwq 4: cpus=0-1 flags=0x4 nice=0 active=1/256
>     in-flight: 4784:wb_workfn
> workqueue ib_addr: flags=0xa0002
> workqueue dm_bufio_cache: flags=0x8
>   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
>     pending: work_fn
> workqueue ipv6_addrconf: flags=0x40008
>   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/1
>     pending: addrconf_dad_work
>     delayed: addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_verify_work
>   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1
>     pending: addrconf_dad_work
>     delayed: addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work, addrconf_dad_work
> pool 2: cpus=1 node=0 flags=0x0 nice=0 hung=0s workers=4 idle: 4707 24 4924
> pool 4: cpus=0-1 flags=0x4 nice=0 hung=0s workers=4 idle: 44 4846 6
> 
> 
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
> 
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with syzbot.
> syzbot can test patches for this bug, for details see:
> https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: INFO: rcu detected stall in is_bpf_text_address
  2018-05-19 15:57 ` Eric Dumazet
@ 2018-05-20  1:45   ` Marcelo Ricardo Leitner
  2018-05-20  8:26   ` Xin Long
  1 sibling, 0 replies; 9+ messages in thread
From: Marcelo Ricardo Leitner @ 2018-05-20  1:45 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: syzbot, ast, daniel, linux-kernel, netdev, syzkaller-bugs, Xin Long

On Sat, May 19, 2018 at 08:57:19AM -0700, Eric Dumazet wrote:
> SCTP experts, please take a look.

Ugh, same sctp footprint as in the other reports.

>
> >  _sctp_make_chunk+0x58/0x280 net/sctp/sm_make_chunk.c:1417
> >  sctp_make_control net/sctp/sm_make_chunk.c:1464 [inline]
> >  sctp_make_heartbeat+0x8f/0x430 net/sctp/sm_make_chunk.c:1177
> >  sctp_sf_heartbeat.isra.23+0x26/0x180 net/sctp/sm_statefuns.c:1005
> >  sctp_sf_sendbeat_8_3+0x38e/0x550 net/sctp/sm_statefuns.c:1049
> >  sctp_do_sm+0x1ab/0x7160 net/sctp/sm_sideeffect.c:1188
> >  sctp_generate_heartbeat_event+0x218/0x450 net/sctp/sm_sideeffect.c:406
> >  call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
> >  expire_timers kernel/time/timer.c:1363 [inline]
> >  __run_timers+0x79e/0xc50 kernel/time/timer.c:1666
> >  run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: INFO: rcu detected stall in is_bpf_text_address
  2018-05-19 15:57 ` Eric Dumazet
  2018-05-20  1:45   ` Marcelo Ricardo Leitner
@ 2018-05-20  8:26   ` Xin Long
  2018-05-22  0:43       ` Marcelo Ricardo Leitner
                       ` (2 more replies)
  1 sibling, 3 replies; 9+ messages in thread
From: Xin Long @ 2018-05-20  8:26 UTC (permalink / raw)
  To: Eric Dumazet, Marcelo Ricardo Leitner
  Cc: syzbot, ast, Daniel Borkmann, LKML, network dev, syzkaller-bugs

On Sat, May 19, 2018 at 11:57 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> SCTP experts, please take a look.
>
> On 05/19/2018 08:55 AM, syzbot wrote:
>> Hello,
>>
>> syzbot found the following crash on:
>>
>> HEAD commit:    73fcb1a370c7 Merge branch 'akpm' (patches from Andrew)
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=1462ec0f800000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=f3b4e30da84ec1ed
>> dashboard link: https://syzkaller.appspot.com/bug?extid=3dcd59a1f907245f891f
>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=1079cf8f800000
Thank you.
The Reproducer is more than helpful.

setsockopt$inet_sctp6_SCTP_RTOINFO(r0, 0x84, 0x0,
&(0x7f0000000140)={0x0, 0x6, 0x7, 0x4}, 0x10)

It set rto_min=6 and rto_max=7, these are too small values.
t3_rtx timer works fine with it. But hb_timer will get stuck there, as
in its timer handler it starts this timer again with this value, then
it goes to the timer handler again...

HB has to repeat this and the hb timer's expire may also have to use
'trans->rto >> 1 ...' stuff. But we can limit the RTO's min value, like
HZ/20, which is 'Try again later.' number used when sock lock is
owned by others in all timer handlers.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: INFO: rcu detected stall in is_bpf_text_address
  2018-05-20  8:26   ` Xin Long
@ 2018-05-22  0:43       ` Marcelo Ricardo Leitner
  2018-05-28 17:55     ` Marcelo Ricardo Leitner
  2018-05-28 18:22     ` Marcelo Ricardo Leitner
  2 siblings, 0 replies; 9+ messages in thread
From: Marcelo Ricardo Leitner @ 2018-05-22  0:43 UTC (permalink / raw)
  To: Xin Long
  Cc: Eric Dumazet, syzbot, ast, Daniel Borkmann, LKML, network dev,
	syzkaller-bugs, linux-sctp

On Sun, May 20, 2018 at 04:26:03PM +0800, Xin Long wrote:
> On Sat, May 19, 2018 at 11:57 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > SCTP experts, please take a look.
> >
> > On 05/19/2018 08:55 AM, syzbot wrote:
> >> Hello,
> >>
> >> syzbot found the following crash on:
> >>
> >> HEAD commit:    73fcb1a370c7 Merge branch 'akpm' (patches from Andrew)
> >> git tree:       upstream
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=1462ec0f800000
> >> kernel config:  https://syzkaller.appspot.com/x/.config?x=f3b4e30da84ec1ed
> >> dashboard link: https://syzkaller.appspot.com/bug?extid=3dcd59a1f907245f891f
> >> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> >> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=1079cf8f800000
> Thank you.
> The Reproducer is more than helpful.
> 
> setsockopt$inet_sctp6_SCTP_RTOINFO(r0, 0x84, 0x0,
> &(0x7f0000000140)={0x0, 0x6, 0x7, 0x4}, 0x10)
> 
> It set rto_min=6 and rto_max=7, these are too small values.
> t3_rtx timer works fine with it. But hb_timer will get stuck there, as
> in its timer handler it starts this timer again with this value, then
> it goes to the timer handler again...

Nice, thanks Xin.

> 
> HB has to repeat this and the hb timer's expire may also have to use
> 'trans->rto >> 1 ...' stuff. But we can limit the RTO's min value, like
> HZ/20, which is 'Try again later.' number used when sock lock is
> owned by others in all timer handlers.

I think a good fix for this is to not allow the application to go
below net.sctp.rto_min, and neither above net.sctp.rto_max.

Then they can even be close to each other, won't be an issue, as long
as rto_min is something sensible. Which then brings it to the second
step of a fix: to restrict rto_min to be >= HZ/5 (copying from TCP
here).

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: INFO: rcu detected stall in is_bpf_text_address
@ 2018-05-22  0:43       ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 9+ messages in thread
From: Marcelo Ricardo Leitner @ 2018-05-22  0:43 UTC (permalink / raw)
  To: Xin Long
  Cc: Eric Dumazet, syzbot, ast, Daniel Borkmann, LKML, network dev,
	syzkaller-bugs, linux-sctp

On Sun, May 20, 2018 at 04:26:03PM +0800, Xin Long wrote:
> On Sat, May 19, 2018 at 11:57 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > SCTP experts, please take a look.
> >
> > On 05/19/2018 08:55 AM, syzbot wrote:
> >> Hello,
> >>
> >> syzbot found the following crash on:
> >>
> >> HEAD commit:    73fcb1a370c7 Merge branch 'akpm' (patches from Andrew)
> >> git tree:       upstream
> >> console output: https://syzkaller.appspot.com/x/log.txt?x\x1462ec0f800000
> >> kernel config:  https://syzkaller.appspot.com/x/.config?xób4e30da84ec1ed
> >> dashboard link: https://syzkaller.appspot.com/bug?extid=cd59a1f907245f891f
> >> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> >> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x\x1079cf8f800000
> Thank you.
> The Reproducer is more than helpful.
> 
> setsockopt$inet_sctp6_SCTP_RTOINFO(r0, 0x84, 0x0,
> &(0x7f0000000140)={0x0, 0x6, 0x7, 0x4}, 0x10)
> 
> It set rto_min=6 and rto_max=7, these are too small values.
> t3_rtx timer works fine with it. But hb_timer will get stuck there, as
> in its timer handler it starts this timer again with this value, then
> it goes to the timer handler again...

Nice, thanks Xin.

> 
> HB has to repeat this and the hb timer's expire may also have to use
> 'trans->rto >> 1 ...' stuff. But we can limit the RTO's min value, like
> HZ/20, which is 'Try again later.' number used when sock lock is
> owned by others in all timer handlers.

I think a good fix for this is to not allow the application to go
below net.sctp.rto_min, and neither above net.sctp.rto_max.

Then they can even be close to each other, won't be an issue, as long
as rto_min is something sensible. Which then brings it to the second
step of a fix: to restrict rto_min to be >= HZ/5 (copying from TCP
here).


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: INFO: rcu detected stall in is_bpf_text_address
  2018-05-20  8:26   ` Xin Long
  2018-05-22  0:43       ` Marcelo Ricardo Leitner
@ 2018-05-28 17:55     ` Marcelo Ricardo Leitner
  2018-05-28 18:22     ` Marcelo Ricardo Leitner
  2 siblings, 0 replies; 9+ messages in thread
From: Marcelo Ricardo Leitner @ 2018-05-28 17:55 UTC (permalink / raw)
  To: Xin Long
  Cc: Eric Dumazet, syzbot, ast, Daniel Borkmann, LKML, network dev,
	syzkaller-bugs

On Sun, May 20, 2018 at 04:26:03PM +0800, Xin Long wrote:
> On Sat, May 19, 2018 at 11:57 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > SCTP experts, please take a look.
> >
> > On 05/19/2018 08:55 AM, syzbot wrote:
> >> Hello,
> >>
> >> syzbot found the following crash on:
> >>
> >> HEAD commit:    73fcb1a370c7 Merge branch 'akpm' (patches from Andrew)
> >> git tree:       upstream
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=1462ec0f800000
> >> kernel config:  https://syzkaller.appspot.com/x/.config?x=f3b4e30da84ec1ed
> >> dashboard link: https://syzkaller.appspot.com/bug?extid=3dcd59a1f907245f891f
> >> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> >> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=1079cf8f800000
> Thank you.
> The Reproducer is more than helpful.
> 
> setsockopt$inet_sctp6_SCTP_RTOINFO(r0, 0x84, 0x0,
> &(0x7f0000000140)={0x0, 0x6, 0x7, 0x4}, 0x10)
> 
> It set rto_min=6 and rto_max=7, these are too small values.

Considering
struct sctp_rtoinfo {
        sctp_assoc_t    srto_assoc_id;
        __u32           srto_initial;
        __u32           srto_max;
        __u32           srto_min;
};

Isn't this actually equivalent to:
struct sctp_rtoinfo foo = {
	.srto_assoc_id = 0,
	.srto_initial = 6,
	.srto_max = 7,
	.srto_min = 4     /* instead of 6 */
};

?

This doesn't change a thing in the analysis, it's just to be sure
which one is right.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: INFO: rcu detected stall in is_bpf_text_address
  2018-05-20  8:26   ` Xin Long
  2018-05-22  0:43       ` Marcelo Ricardo Leitner
  2018-05-28 17:55     ` Marcelo Ricardo Leitner
@ 2018-05-28 18:22     ` Marcelo Ricardo Leitner
  2018-05-29 11:53       ` Andrey Konovalov
  2 siblings, 1 reply; 9+ messages in thread
From: Marcelo Ricardo Leitner @ 2018-05-28 18:22 UTC (permalink / raw)
  To: Xin Long
  Cc: Eric Dumazet, syzbot, ast, Daniel Borkmann, LKML, network dev,
	syzkaller-bugs

On Sun, May 20, 2018 at 04:26:03PM +0800, Xin Long wrote:
> On Sat, May 19, 2018 at 11:57 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > SCTP experts, please take a look.
> >
> > On 05/19/2018 08:55 AM, syzbot wrote:
> >> Hello,
> >>
> >> syzbot found the following crash on:
> >>
> >> HEAD commit:    73fcb1a370c7 Merge branch 'akpm' (patches from Andrew)
> >> git tree:       upstream
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=1462ec0f800000
> >> kernel config:  https://syzkaller.appspot.com/x/.config?x=f3b4e30da84ec1ed
> >> dashboard link: https://syzkaller.appspot.com/bug?extid=3dcd59a1f907245f891f
> >> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> >> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=1079cf8f800000

The reproducer has:
r0 = socket$inet6(0xa, 0x1, 0x8010000000000084)

Considering socket() prototype uses an int for the 3rd argument, how
should I interpret this 64b value?

0x84 is IPPROTO_SCTP, but don't know about the 0x8010000000000000 in
there.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: INFO: rcu detected stall in is_bpf_text_address
  2018-05-28 18:22     ` Marcelo Ricardo Leitner
@ 2018-05-29 11:53       ` Andrey Konovalov
  0 siblings, 0 replies; 9+ messages in thread
From: Andrey Konovalov @ 2018-05-29 11:53 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: Xin Long, Eric Dumazet, syzbot, ast, Daniel Borkmann, LKML,
	network dev, syzkaller-bugs

On Mon, May 28, 2018 at 8:22 PM, Marcelo Ricardo Leitner
<marcelo.leitner@gmail.com> wrote:
>
> The reproducer has:
> r0 = socket$inet6(0xa, 0x1, 0x8010000000000084)
>
> Considering socket() prototype uses an int for the 3rd argument, how
> should I interpret this 64b value?
>
> 0x84 is IPPROTO_SCTP, but don't know about the 0x8010000000000000 in
> there.

Hi Marcelo,

It gets stripped to int the same way C does it, so the value would be 0x84.

Thanks!

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-05-29 11:53 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-19 15:55 INFO: rcu detected stall in is_bpf_text_address syzbot
2018-05-19 15:57 ` Eric Dumazet
2018-05-20  1:45   ` Marcelo Ricardo Leitner
2018-05-20  8:26   ` Xin Long
2018-05-22  0:43     ` Marcelo Ricardo Leitner
2018-05-22  0:43       ` Marcelo Ricardo Leitner
2018-05-28 17:55     ` Marcelo Ricardo Leitner
2018-05-28 18:22     ` Marcelo Ricardo Leitner
2018-05-29 11:53       ` Andrey Konovalov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.