* [syzbot] INFO: rcu detected stall in gc_worker (3)
@ 2022-03-20 12:02 syzbot
  2023-09-24 10:59 ` [syzbot] [netfilter?] " syzbot
  2024-03-06 22:38 ` syzbot
  0 siblings, 2 replies; 11+ messages in thread
From: syzbot @ 2022-03-20 12:02 UTC
  To: linux-kernel, syzkaller-bugs, tglx

Hello,

syzbot found the following issue on:

HEAD commit:    91265a6da44d Add linux-next specific files for 20220303
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=157605d5700000
kernel config:  https://syzkaller.appspot.com/x/.config?x=617f79440a35673a
dashboard link: https://syzkaller.appspot.com/bug?extid=eec403943a2a2455adaa
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=143195d9700000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=170736a9700000

Bisection is inconclusive: the issue happens on the oldest tested release.

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=141bc691700000
final oops:     https://syzkaller.appspot.com/x/report.txt?x=161bc691700000
console output: https://syzkaller.appspot.com/x/log.txt?x=121bc691700000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+eec403943a2a2455adaa@syzkaller.appspotmail.com

rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 	0-...!: (1 GPs behind) idle=a59/1/0x4000000000000000 softirq=5468/5472 fqs=5 
	(detected by 1, t=10502 jiffies, g=4825, q=140)
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 3644 Comm: kworker/0:1 Not tainted 5.17.0-rc6-next-20220303-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events_power_efficient gc_worker
RIP: 0010:hlock_class kernel/locking/lockdep.c:240 [inline]
RIP: 0010:__lock_acquire+0x1460/0x56c0 kernel/locking/lockdep.c:5056
Code: 0f b7 db be 08 00 00 00 48 89 d8 48 c1 f8 06 48 8d 3c c5 80 59 01 90 e8 fe 54 67 00 48 0f a3 1d f6 ed a3 0e 0f 83 2c 06 00 00 <48> 8d 1c 5b 48 c1 e3 06 48 81 c3 a0 5d 01 90 48 8d 7b 40 48 b8 00
RSP: 0018:ffffc90000007be0 EFLAGS: 00000047
RAX: 0000000000000001 RBX: 0000000000000029 RCX: ffffffff815d6b82
RDX: fffffbfff2002b31 RSI: 0000000000000008 RDI: ffffffff90015980
RBP: ffff88801bc4a842 R08: 0000000000000000 R09: ffffffff90015987
R10: fffffbfff2002b30 R11: 0000000000000001 R12: ffff88801bc4a820
R13: ffff88801bc49d40 R14: 0000000000000001 R15: 3d0b7b89c89fff91
FS:  0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000600 CR3: 0000000076bbe000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 lock_acquire kernel/locking/lockdep.c:5672 [inline]
 lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5637
 __raw_spin_lock_irq include/linux/spinlock_api_smp.h:119 [inline]
 _raw_spin_lock_irq+0x32/0x50 kernel/locking/spinlock.c:170
 __run_hrtimer kernel/time/hrtimer.c:1689 [inline]
 __hrtimer_run_queues+0x243/0xe50 kernel/time/hrtimer.c:1749
 hrtimer_interrupt+0x31c/0x790 kernel/time/hrtimer.c:1811
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline]
 __sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1103
 sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1097
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638
RIP: 0010:preempt_count arch/x86/include/asm/preempt.h:27 [inline]
RIP: 0010:check_kcov_mode kernel/kcov.c:166 [inline]
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x60 kernel/kcov.c:200
Code: 48 89 ef 5d e9 51 f3 4a 00 5d be 03 00 00 00 e9 96 96 74 02 66 0f 1f 44 00 00 48 8b be b0 01 00 00 e8 b4 ff ff ff 31 c0 c3 90 <65> 8b 05 29 49 89 7e 89 c1 48 8b 34 24 81 e1 00 01 00 00 65 48 8b
RSP: 0018:ffffc90003ccfc30 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 0000000000000200 RCX: 0000000000000000
RDX: ffff88801bc49d40 RSI: ffffffff87821df7 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff90015a17
R10: ffffffff87821ded R11: 0000000000000001 R12: dffffc0000000000
R13: 0000000000000000 R14: 0000000000040000 R15: 0000000000040000
 __seqprop_spinlock_sequence include/linux/seqlock.h:277 [inline]
 nf_conntrack_get_ht include/net/netfilter/nf_conntrack.h:331 [inline]
 gc_worker+0x24d/0x12b0 net/netfilter/nf_conntrack_core.c:1441
 process_one_work+0x996/0x1610 kernel/workqueue.c:2289
 worker_thread+0x665/0x1080 kernel/workqueue.c:2436
 kthread+0x2e9/0x3a0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
 </TASK>
INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 1.570 msecs
rcu: rcu_preempt kthread starved for 10492 jiffies! g4825 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1
rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt     state:R  running task     stack:28752 pid:   16 ppid:     2 flags:0x00004000
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:5043 [inline]
 __schedule+0xa94/0x4910 kernel/sched/core.c:6352
 schedule+0xd2/0x1f0 kernel/sched/core.c:6424
 schedule_timeout+0x14a/0x2a0 kernel/time/timer.c:1881
 rcu_gp_fqs_loop+0x186/0x810 kernel/rcu/tree.c:1999
 rcu_gp_kthread+0x1de/0x320 kernel/rcu/tree.c:2172
 kthread+0x2e9/0x3a0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
 </TASK>
rcu: Stack dump where RCU GP kthread last ran:
NMI backtrace for cpu 1
CPU: 1 PID: 45 Comm: kworker/u4:2 Not tainted 5.17.0-rc6-next-20220303-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events_unbound toggle_allocation_gate
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 nmi_cpu_backtrace.cold+0x47/0x144 lib/nmi_backtrace.c:111
 nmi_trigger_cpumask_backtrace+0x1e6/0x230 lib/nmi_backtrace.c:62
 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
 rcu_check_gp_kthread_starvation.cold+0x1fb/0x200 kernel/rcu/tree_stall.h:516
 print_other_cpu_stall kernel/rcu/tree_stall.h:621 [inline]
 check_cpu_stall kernel/rcu/tree_stall.h:767 [inline]
 rcu_pending kernel/rcu/tree.c:3960 [inline]
 rcu_sched_clock_irq+0x21ae/0x22a0 kernel/rcu/tree.c:2660
 update_process_times+0x16d/0x200 kernel/time/timer.c:1785
 tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:243
 tick_sched_timer+0xee/0x120 kernel/time/tick-sched.c:1473
 __run_hrtimer kernel/time/hrtimer.c:1685 [inline]
 __hrtimer_run_queues+0x1c0/0xe50 kernel/time/hrtimer.c:1749
 hrtimer_interrupt+0x31c/0x790 kernel/time/hrtimer.c:1811
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline]
 __sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1103
 sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1097
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638
RIP: 0010:check_kcov_mode+0xf/0x40 kernel/kcov.c:166
Code: 7c 24 08 e8 b3 9b 4b 00 e9 61 fd ff ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc 65 8b 05 79 4d 89 7e 89 c2 81 e2 00 01 00 00 <a9> 00 01 ff 00 74 10 31 c0 85 d2 74 15 8b 96 ac 15 00 00 85 d2 74
RSP: 0018:ffffc90000b679c8 EFLAGS: 00000246
RAX: 0000000000000001 RBX: ffff8880b9c42380 RCX: 0000000000000001
RDX: 0000000000000000 RSI: ffff888016df8000 RDI: 0000000000000003
RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000001
R10: ffffffff816d99aa R11: 0000000000000000 R12: ffffed1017388471
R13: 0000000000000000 R14: ffff8880b9c42388 R15: 0000000000000001
 write_comp_data kernel/kcov.c:221 [inline]
 __sanitizer_cov_trace_const_cmp4+0x1c/0x70 kernel/kcov.c:287
 csd_lock_wait kernel/smp.c:443 [inline]
 smp_call_function_many_cond+0x50a/0xc90 kernel/smp.c:972
 on_each_cpu_cond_mask+0x56/0xa0 kernel/smp.c:1138
 on_each_cpu include/linux/smp.h:71 [inline]
 text_poke_sync arch/x86/kernel/alternative.c:1146 [inline]
 text_poke_bp_batch+0x3e9/0x6b0 arch/x86/kernel/alternative.c:1387
 text_poke_flush arch/x86/kernel/alternative.c:1504 [inline]
 text_poke_flush arch/x86/kernel/alternative.c:1501 [inline]
 text_poke_finish+0x16/0x30 arch/x86/kernel/alternative.c:1511
 arch_jump_label_transform_apply+0x13/0x20 arch/x86/kernel/jump_label.c:146
 jump_label_update+0x32f/0x410 kernel/jump_label.c:830
 static_key_enable_cpuslocked+0x1b1/0x260 kernel/jump_label.c:177
 static_key_enable+0x16/0x20 kernel/jump_label.c:190
 toggle_allocation_gate mm/kfence/core.c:735 [inline]
 toggle_allocation_gate+0x100/0x390 mm/kfence/core.c:727
 process_one_work+0x996/0x1610 kernel/workqueue.c:2289
 worker_thread+0x665/0x1080 kernel/workqueue.c:2436
 kthread+0x2e9/0x3a0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
 </TASK>
----------------
Code disassembly (best guess):
   0:	0f b7 db             	movzwl %bx,%ebx
   3:	be 08 00 00 00       	mov    $0x8,%esi
   8:	48 89 d8             	mov    %rbx,%rax
   b:	48 c1 f8 06          	sar    $0x6,%rax
   f:	48 8d 3c c5 80 59 01 	lea    -0x6ffea680(,%rax,8),%rdi
  16:	90
  17:	e8 fe 54 67 00       	callq  0x67551a
  1c:	48 0f a3 1d f6 ed a3 	bt     %rbx,0xea3edf6(%rip)        # 0xea3ee1a
  23:	0e
  24:	0f 83 2c 06 00 00    	jae    0x656
* 2a:	48 8d 1c 5b          	lea    (%rbx,%rbx,2),%rbx <-- trapping instruction
  2e:	48 c1 e3 06          	shl    $0x6,%rbx
  32:	48 81 c3 a0 5d 01 90 	add    $0xffffffff90015da0,%rbx
  39:	48 8d 7b 40          	lea    0x40(%rbx),%rdi
  3d:	48                   	rex.W
  3e:	b8                   	.byte 0xb8


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches
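
For readers triaging reports like the one above, the "detected by" line of an RCU stall warning can be pulled apart mechanically. The following is a minimal sketch, not part of syzbot or the kernel: the function names are hypothetical, and the tick rate (HZ) is not stated anywhere in this report, so it has to be supplied by the caller from the kernel config.

```python
import re

# Matches the continuation line of an RCU stall warning, e.g.
#   (detected by 1, t=10502 jiffies, g=4825, q=140)
STALL_RE = re.compile(
    r"\(detected by (\d+), t=(\d+) jiffies, g=(-?\d+), q=(\d+)"
)


def parse_stall_header(line):
    """Return the fields of a stall header line as a dict, or None."""
    m = STALL_RE.search(line)
    if m is None:
        return None
    return {
        "detected_by": int(m.group(1)),  # CPU that noticed the stall
        "jiffies": int(m.group(2)),      # age of the grace period in ticks
        "gp": int(m.group(3)),           # grace-period sequence number
        "queued": int(m.group(4)),       # queued RCU callbacks
    }


def jiffies_to_seconds(jiffies, hz):
    """Convert ticks to seconds; hz must come from the kernel config."""
    return jiffies / hz


if __name__ == "__main__":
    info = parse_stall_header("\t(detected by 1, t=10502 jiffies, g=4825, q=140)")
    # hz=100 is an assumption for illustration only; check CONFIG_HZ.
    print(info, jiffies_to_seconds(info["jiffies"], hz=100))
```

The conversion is only as good as the HZ value passed in; the linked kernel config is the place to confirm it.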


* Re: [syzbot] [netfilter?] INFO: rcu detected stall in gc_worker (3)
  2022-03-20 12:02 [syzbot] INFO: rcu detected stall in gc_worker (3) syzbot
@ 2023-09-24 10:59 ` syzbot
  2024-03-06 22:38 ` syzbot
  1 sibling, 0 replies; 11+ messages in thread
From: syzbot @ 2023-09-24 10:59 UTC
  To: bpf, coreteam, davem, dvyukov, edumazet, fw, gautamramk, hdanton,
	jhs, jiri, kadlec, kuba, lesliemonis, linux-kernel,
	mohitbhasi1998, netdev, netfilter-devel, pabeni, pablo, paulmck,
	sdp.sachin, syzkaller-bugs, tahiliani, tglx, vsaicharan1998,
	xiyou.wangcong

syzbot has bisected this issue to:

commit ec97ecf1ebe485a17cd8395a5f35e6b80b57665a
Author: Mohit P. Tahiliani <tahiliani@nitk.edu.in>
Date:   Wed Jan 22 18:22:33 2020 +0000

    net: sched: add Flow Queue PIE packet scheduler

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=15c5748e680000
start commit:   d4a7ce642100 igc: Fix Kernel Panic during ndo_tx_timeout c..
git tree:       net
final oops:     https://syzkaller.appspot.com/x/report.txt?x=17c5748e680000
console output: https://syzkaller.appspot.com/x/log.txt?x=13c5748e680000
kernel config:  https://syzkaller.appspot.com/x/.config?x=77b9a3cf8f44c6da
dashboard link: https://syzkaller.appspot.com/bug?extid=eec403943a2a2455adaa
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1504b511a80000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=137bf931a80000

Reported-by: syzbot+eec403943a2a2455adaa@syzkaller.appspotmail.com
Fixes: ec97ecf1ebe4 ("net: sched: add Flow Queue PIE packet scheduler")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection


* Re: [syzbot] [netfilter?] INFO: rcu detected stall in gc_worker (3)
  2022-03-20 12:02 [syzbot] INFO: rcu detected stall in gc_worker (3) syzbot
  2023-09-24 10:59 ` [syzbot] [netfilter?] " syzbot
@ 2024-03-06 22:38 ` syzbot
  1 sibling, 0 replies; 11+ messages in thread
From: syzbot @ 2024-03-06 22:38 UTC
  To: bpf, coreteam, davem, dvyukov, edumazet, fw, gautamramk, hdanton,
	jhs, jiri, kadlec, kuba, lesliemonis, linux-kernel,
	michal.kubiak, mohitbhasi1998, netdev, netfilter-devel, pabeni,
	pablo, paulmck, sdp.sachin, syzkaller-bugs, tahiliani, tglx,
	vsaicharan1998, xiyou.wangcong

syzbot suspects this issue was fixed by commit:

commit 8c21ab1bae945686c602c5bfa4e3f3352c2452c5
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Aug 29 12:35:41 2023 +0000

    net/sched: fq_pie: avoid stalls in fq_pie_timer()

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1338df0e180000
start commit:   d4a7ce642100 igc: Fix Kernel Panic during ndo_tx_timeout c..
git tree:       net
kernel config:  https://syzkaller.appspot.com/x/.config?x=77b9a3cf8f44c6da
dashboard link: https://syzkaller.appspot.com/bug?extid=eec403943a2a2455adaa
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1504b511a80000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=137bf931a80000

If the result looks correct, please mark the issue as fixed by replying with:

#syz fix: net/sched: fq_pie: avoid stalls in fq_pie_timer()

For information about bisection process see: https://goo.gl/tpsmEJ#bisection
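
The commit subject above refers to stalls caused by a timer callback. One general way to avoid that class of stall is to cap the work done per callback invocation and resume from a saved cursor on the next run. The sketch below illustrates that general technique in user space only; it is not taken from the commit, and the class name, method name, and batch limit are all hypothetical.

```python
class BatchedTimerWork:
    """Walk a list in bounded batches, remembering where the last run stopped."""

    def __init__(self, items, batch_limit):
        if batch_limit <= 0:
            raise ValueError("batch_limit must be positive")
        self.items = items
        self.batch_limit = batch_limit  # hypothetical cap on items per run
        self.cursor = 0                 # index of the next item to process

    def run_once(self, update):
        """Process at most batch_limit items.

        Returns True when this run completed a full pass over the list,
        False when more items remain for the next run.
        """
        end = min(self.cursor + self.batch_limit, len(self.items))
        for i in range(self.cursor, end):
            update(self.items[i])
        self.cursor = end
        if self.cursor >= len(self.items):
            self.cursor = 0  # start over on the next pass
            return True
        return False


if __name__ == "__main__":
    seen = []
    work = BatchedTimerWork(list(range(5)), batch_limit=2)
    while not work.run_once(seen.append):
        pass  # a real timer would re-arm itself here instead of looping
    print(seen)
```

The trade-off is latency for fairness: a full pass takes several callback invocations, but no single invocation holds the CPU for an unbounded time.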


* Re: [syzbot] INFO: rcu detected stall in gc_worker (3)
       [not found] <20220731094805.847-1-hdanton@sina.com>
@ 2022-07-31 10:09 ` syzbot
  0 siblings, 0 replies; 11+ messages in thread
From: syzbot @ 2022-07-31 10:09 UTC
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: rcu detected stall in corrupted

rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 	0-...!: (1 GPs behind) idle=cad/1/0x4000000000000000 softirq=7023/7025 fqs=2 
	(detected by 1, t=10502 jiffies, g=7257, q=133)
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 4073 Comm: syz-executor.0 Not tainted 5.17.0-rc6-next-20220303-syzkaller-dirty #0
Harclient_loop: send disconnect: Broken pipe


Tested on:

commit:         91265a6d Add linux-next specific files for 20220303
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=166fe82e080000
kernel config:  https://syzkaller.appspot.com/x/.config?x=617f79440a35673a
dashboard link: https://syzkaller.appspot.com/bug?extid=eec403943a2a2455adaa
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=15c240c1080000



* Re: [syzbot] INFO: rcu detected stall in gc_worker (3)
       [not found] <20220731081548.659-1-hdanton@sina.com>
@ 2022-07-31  8:31 ` syzbot
  0 siblings, 0 replies; 11+ messages in thread
From: syzbot @ 2022-07-31  8:31 UTC
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: rcu detected stall in corrupted

rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-... } 2654 jiffies s: 2049 root: 0x1/.
rcu: blocking rcu_node structures (internal RCU debug):
Task dump for CPU 0:
task:syz-executor.0  state:R  running task     stack:27224 pid: 4073 ppid:  4053 flags:0x0000400e
Call Trace:
 <TASK>
 </TASK>


Tested on:

commit:         91265a6d Add linux-next specific files for 20220303
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=11978a82080000
kernel config:  https://syzkaller.appspot.com/x/.config?x=617f79440a35673a
dashboard link: https://syzkaller.appspot.com/bug?extid=eec403943a2a2455adaa
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=14ff72fe080000



* Re: [syzbot] INFO: rcu detected stall in gc_worker (3)
       [not found] <20220731074633.519-1-hdanton@sina.com>
@ 2022-07-31  8:03 ` syzbot
  0 siblings, 0 replies; 11+ messages in thread
From: syzbot @ 2022-07-31  8:03 UTC
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
WARNING in __hrtimer_run_queues

------------[ cut here ]------------
On CPU1 hrtimer tick_sched_timer took more than 4 ticks
WARNING: CPU: 1 PID: 0 at kernel/time/hrtimer.c:1689 __run_hrtimer kernel/time/hrtimer.c:1689 [inline]
WARNING: CPU: 1 PID: 0 at kernel/time/hrtimer.c:1689 __hrtimer_run_queues+0xe3c/0x1000 kernel/time/hrtimer.c:1754
Modules linked in:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.17.0-rc6-next-20220303-syzkaller-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/22/2022
RIP: 0010:__run_hrtimer kernel/time/hrtimer.c:1689 [inline]
RIP: 0010:__hrtimer_run_queues+0xe3c/0x1000 kernel/time/hrtimer.c:1754
Code: fa 48 c1 ea 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e b7 01 00 00 41 8b 77 40 48 c7 c7 a0 68 ce 89 48 8b 54 24 08 e8 50 82 b0 07 <0f> 0b e9 62 f4 ff ff e8 c8 42 5c 00 e9 5c f6 ff ff 48 8b 7c 24 18
RSP: 0018:ffffc900001e0e30 EFLAGS: 00010082
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
RDX: ffff888011a21d40 RSI: ffffffff81602878 RDI: fffff5200003c1b8
RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff815fd23e R11: 0000000000000000 R12: ffff8880b9d2afa0
R13: 0000000000000000 R14: ffff8880b9d2a680 R15: ffff8880b9d2a600
FS:  0000000000000000(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fba30aa0097 CR3: 000000002354c000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 hrtimer_interrupt+0x31c/0x790 kernel/time/hrtimer.c:1816
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline]
 __sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1103
 sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1097
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638
RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:29 [inline]
RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:70 [inline]
RIP: 0010:arch_irqs_disabled arch/x86/include/asm/irqflags.h:130 [inline]
RIP: 0010:acpi_safe_halt drivers/acpi/processor_idle.c:116 [inline]
RIP: 0010:acpi_idle_do_entry+0x1c6/0x250 drivers/acpi/processor_idle.c:556
Code: 89 de e8 4d e6 18 f8 84 db 75 ac e8 64 e2 18 f8 e8 1f 2b 1f f8 eb 0c e8 58 e2 18 f8 0f 00 2d 91 ed d1 00 e8 4c e2 18 f8 fb f4 <9c> 5b 81 e3 00 02 00 00 fa 31 ff 48 89 de e8 c7 e4 18 f8 48 85 db
RSP: 0018:ffffc90000177d18 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff888011a21d40 RSI: ffffffff896045f4 RDI: 0000000000000000
RBP: ffff888016545864 R08: 0000000000000001 R09: 0000000000000001
R10: ffffffff817f7138 R11: 0000000000000000 R12: 0000000000000001
R13: ffff888016545800 R14: ffff888016545864 R15: ffff8880199b0004
 acpi_idle_enter+0x361/0x500 drivers/acpi/processor_idle.c:692
 cpuidle_enter_state+0x1b1/0xc80 drivers/cpuidle/cpuidle.c:237
 cpuidle_enter+0x4a/0xa0 drivers/cpuidle/cpuidle.c:351
 call_cpuidle kernel/sched/idle.c:158 [inline]
 cpuidle_idle_call kernel/sched/idle.c:239 [inline]
 do_idle+0x3e8/0x590 kernel/sched/idle.c:306
 cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:403
 start_secondary+0x265/0x340 arch/x86/kernel/smpboot.c:272
 secondary_startup_64_no_verify+0xc3/0xcb
 </TASK>
----------------
Code disassembly (best guess):
   0:	89 de                	mov    %ebx,%esi
   2:	e8 4d e6 18 f8       	callq  0xf818e654
   7:	84 db                	test   %bl,%bl
   9:	75 ac                	jne    0xffffffb7
   b:	e8 64 e2 18 f8       	callq  0xf818e274
  10:	e8 1f 2b 1f f8       	callq  0xf81f2b34
  15:	eb 0c                	jmp    0x23
  17:	e8 58 e2 18 f8       	callq  0xf818e274
  1c:	0f 00 2d 91 ed d1 00 	verw   0xd1ed91(%rip)        # 0xd1edb4
  23:	e8 4c e2 18 f8       	callq  0xf818e274
  28:	fb                   	sti
  29:	f4                   	hlt
* 2a:	9c                   	pushfq <-- trapping instruction
  2b:	5b                   	pop    %rbx
  2c:	81 e3 00 02 00 00    	and    $0x200,%ebx
  32:	fa                   	cli
  33:	31 ff                	xor    %edi,%edi
  35:	48 89 de             	mov    %rbx,%rsi
  38:	e8 c7 e4 18 f8       	callq  0xf818e504
  3d:	48 85 db             	test   %rbx,%rbx


Tested on:

commit:         91265a6d Add linux-next specific files for 20220303
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=11658846080000
kernel config:  https://syzkaller.appspot.com/x/.config?x=617f79440a35673a
dashboard link: https://syzkaller.appspot.com/bug?extid=eec403943a2a2455adaa
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=161e281e080000



* Re: [syzbot] INFO: rcu detected stall in gc_worker (3)
       [not found]   ` <20220321080843.3060-1-hdanton@sina.com>
@ 2022-03-21 21:01     ` Paul E. McKenney
  0 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2022-03-21 21:01 UTC
  To: Hillf Danton; +Cc: syzbot, tglx, Dmitry Vyukov, linux-kernel, syzkaller-bugs

On Mon, Mar 21, 2022 at 04:08:43PM +0800, Hillf Danton wrote:
> Hi Paul and tglx
> 
> On Sun, 20 Mar 2022 23:43:07 -0700
> > Hello,
> > 
> > syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> > BUG: soft lockup in smp_call_function
> > 
> > watchdog: BUG: soft lockup - CPU#0 stuck for 143s! [kworker/u4:5:1244]
> > Modules linked in:
> > irq event stamp: 595274
> > hardirqs last  enabled at (595273): [<ffffffff89600c02>] asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638
> > hardirqs last disabled at (595274): [<ffffffff894c14ab>] sysvec_apic_timer_interrupt+0xb/0xc0 arch/x86/kernel/apic/apic.c:1097
> > softirqs last  enabled at (588534): [<ffffffff81474343>] invoke_softirq kernel/softirq.c:432 [inline]
> > softirqs last  enabled at (588534): [<ffffffff81474343>] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637
> > softirqs last disabled at (588525): [<ffffffff81474343>] invoke_softirq kernel/softirq.c:432 [inline]
> > softirqs last disabled at (588525): [<ffffffff81474343>] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637
> > CPU: 0 PID: 1244 Comm: kworker/u4:5 Not tainted 5.17.0-syzkaller-00083-gf443e374ae13-dirty #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > Workqueue: events_unbound toggle_allocation_gate
> > RIP: 0010:__sanitizer_cov_trace_pc+0x5c/0x60 kernel/kcov.c:210
> > Code: 82 80 15 00 00 83 f8 02 75 20 48 8b 8a 88 15 00 00 8b 92 84 15 00 00 48 8b 01 48 83 c0 01 48 39 c2 76 07 48 89 34 c1 48 89 01 <c3> 0f 1f 00 41 55 41 54 49 89 fc 55 48 bd eb 83 b5 80 46 86 c8 61
> > RSP: 0018:ffffc90005a3f9e8 EFLAGS: 00000293
> > RAX: 0000000000000000 RBX: ffff8880b9d3fec0 RCX: 0000000000000000
> > RDX: ffff88801cf21d00 RSI: ffffffff816d3654 RDI: 0000000000000003
> > RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000001
> > R10: ffffffff816d367a R11: 0000000000000000 R12: ffffed10173a7fd9
> > R13: 0000000000000001 R14: ffff8880b9d3fec8 R15: 0000000000000001
> > FS:  0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007ffe833fd960 CR3: 000000000b88e000 CR4: 00000000003506f0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> >  <TASK>
> >  rep_nop arch/x86/include/asm/vdso/processor.h:13 [inline]
> >  cpu_relax arch/x86/include/asm/vdso/processor.h:18 [inline]
> >  csd_lock_wait kernel/smp.c:440 [inline]
> 
> Given soft lockup detected on CPU0 waiting for IPI echo, is this report
> the evidence for hardware glitch wrt handling IPI? IOW can we rule out
> software ones in this case?

There are several possible causes:

1.  Hardware/firmware delays as you say.
2.  Some part of the kernel disabling irqs for too long.
3.  Some specific IPI handler is running for too long.
4.  So many IPIs are being sent to this CPU that it cannot keep up.

Tracing should help distinguish between these.

							Thanx, Paul
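
Short of tracing, the register dumps already in a report give a coarse hint about cause 2 or 3: bit 9 (0x200) of the x86 EFLAGS register is the interrupt-enable flag (IF), so a dump with IF clear was taken while that CPU had interrupts disabled. The following is a minimal sketch for reading that bit out of a console log; the helper names are hypothetical, and a single snapshot cannot show how long interrupts stayed off, so it does not replace tracing.

```python
import re

EFLAGS_IF = 0x200  # bit 9 of x86 EFLAGS: interrupt-enable flag

EFLAGS_RE = re.compile(r"EFLAGS:\s*([0-9a-fA-F]+)")


def irqs_enabled(eflags):
    """Return True if the interrupt-enable flag is set in this EFLAGS value."""
    return bool(eflags & EFLAGS_IF)


def eflags_in_log(log):
    """Return (value, irqs_enabled) for every EFLAGS dump found in the log."""
    results = []
    for m in EFLAGS_RE.finditer(log):
        value = int(m.group(1), 16)
        results.append((value, irqs_enabled(value)))
    return results


if __name__ == "__main__":
    # Two register lines copied from the CPU 1 backtrace quoted in this message.
    sample = (
        "RSP: 0018:ffffc90000dc0d40 EFLAGS: 00000002\n"
        "RSP: 0018:ffffc9000289f1e8 EFLAGS: 00000206\n"
    )
    for value, enabled in eflags_in_log(sample):
        print(f"EFLAGS {value:#010x}: IF {'set' if enabled else 'clear'}")
```

On the sample, the first value (taken inside the hrtimer interrupt) has IF clear and the second (the interrupted task context) has IF set, which is what one would expect for a CPU caught inside a hardirq handler.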

> Thanks
> Hillf
> 
> >  smp_call_function_many_cond+0x4e4/0xc90 kernel/smp.c:969
> >  on_each_cpu_cond_mask+0x56/0xa0 kernel/smp.c:1135
> >  on_each_cpu include/linux/smp.h:71 [inline]
> >  text_poke_sync arch/x86/kernel/alternative.c:1112 [inline]
> >  text_poke_bp_batch+0x21d/0x6f0 arch/x86/kernel/alternative.c:1300
> >  text_poke_flush arch/x86/kernel/alternative.c:1470 [inline]
> >  text_poke_flush arch/x86/kernel/alternative.c:1467 [inline]
> >  text_poke_finish+0x16/0x30 arch/x86/kernel/alternative.c:1477
> >  arch_jump_label_transform_apply+0x13/0x20 arch/x86/kernel/jump_label.c:146
> >  jump_label_update+0x32f/0x410 kernel/jump_label.c:830
> >  static_key_enable_cpuslocked+0x1b1/0x260 kernel/jump_label.c:177
> >  static_key_enable+0x16/0x20 kernel/jump_label.c:190
> >  toggle_allocation_gate mm/kfence/core.c:735 [inline]
> >  toggle_allocation_gate+0x100/0x390 mm/kfence/core.c:727
> >  process_one_work+0x9ac/0x1650 kernel/workqueue.c:2307
> >  worker_thread+0x657/0x1110 kernel/workqueue.c:2454
> >  kthread+0x2e9/0x3a0 kernel/kthread.c:377
> >  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
> >  </TASK>
> > Sending NMI from CPU 0 to CPUs 1:
> > NMI backtrace for cpu 1
> > CPU: 1 PID: 4081 Comm: syz-executor201 Not tainted 5.17.0-syzkaller-00083-gf443e374ae13-dirty #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > RIP: 0010:_raw_spin_lock_irqsave+0x3/0x50 kernel/locking/spinlock.c:161
> > Code: 31 d2 31 f6 e8 fe 9b 0e f8 48 89 ef 58 5d e9 04 0f 0f f8 e8 bf 0a 30 f8 eb c9 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 41 54 55 <48> 89 fd 9c 41 5c fa 41 f7 c4 00 02 00 00 75 36 bf 01 00 00 00 e8
> > RSP: 0018:ffffc90000dc0d40 EFLAGS: 00000002
> > RAX: 0000000000001399 RBX: 0000000000058308 RCX: 0000000000000000
> > RDX: 61c8864680b583eb RSI: ffffffff89ae6860 RDI: ffffffff906f8610
> > RBP: ffffffff906f8608 R08: ffffffff906f8610 R09: 0000000000000001
> > R10: ffffffff81681c55 R11: 0000000000000000 R12: dffffc0000000000
> > R13: ffffffff89ae6860 R14: 1ffff920001b81ad R15: ffff888140782b40
> > FS:  0000555555b0b300(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000000020000600 CR3: 000000001e6c1000 CR4: 00000000003506e0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> >  <IRQ>
> >  debug_object_deactivate lib/debugobjects.c:735 [inline]
> >  debug_object_deactivate+0x101/0x300 lib/debugobjects.c:723
> >  debug_hrtimer_deactivate kernel/time/hrtimer.c:425 [inline]
> >  debug_deactivate kernel/time/hrtimer.c:481 [inline]
> >  __run_hrtimer kernel/time/hrtimer.c:1653 [inline]
> >  __hrtimer_run_queues+0x3f8/0xe50 kernel/time/hrtimer.c:1749
> >  hrtimer_interrupt+0x31c/0x790 kernel/time/hrtimer.c:1811
> >  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline]
> >  __sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1103
> >  sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1097
> >  </IRQ>
> >  <TASK>
> >  asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638
> > RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline]
> > RIP: 0010:_raw_spin_unlock_irqrestore+0x38/0x70 kernel/locking/spinlock.c:194
> > Code: 74 24 10 e8 6a 92 0e f8 48 89 ef e8 c2 10 0f f8 81 e3 00 02 00 00 75 25 9c 58 f6 c4 02 75 2d 48 85 db 74 01 fb bf 01 00 00 00 <e8> 73 c8 01 f8 65 8b 05 9c 76 b3 76 85 c0 74 0a 5b 5d c3 e8 b0 04
> > RSP: 0018:ffffc9000289f1e8 EFLAGS: 00000206
> > RAX: 0000000000000012 RBX: 0000000000000200 RCX: 1ffffffff2001bce
> > RDX: 0000000000000000 RSI: 0000000000000202 RDI: 0000000000000001
> > RBP: ffff888140782ae8 R08: 0000000000000001 R09: ffffffff8ffc5a07
> > R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
> > R13: ffff888140782ae8 R14: 0000000000000246 R15: ffff888140782800
> >  spin_unlock_irqrestore include/linux/spinlock.h:404 [inline]
> >  taprio_change+0x2f0c/0x4050 net/sched/sch_taprio.c:1606
> >  taprio_init+0x52e/0x670 net/sched/sch_taprio.c:1738
> >  qdisc_create.constprop.0+0x44a/0x10f0 net/sched/sch_api.c:1253
> >  tc_modify_qdisc+0x4c5/0x1a00 net/sched/sch_api.c:1660
> >  rtnetlink_rcv_msg+0x413/0xb80 net/core/rtnetlink.c:5596
> >  netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2494
> >  netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
> >  netlink_unicast+0x539/0x7e0 net/netlink/af_netlink.c:1343
> >  netlink_sendmsg+0x904/0xe00 net/netlink/af_netlink.c:1919
> >  sock_sendmsg_nosec net/socket.c:705 [inline]
> >  sock_sendmsg+0xcf/0x120 net/socket.c:725
> >  ____sys_sendmsg+0x6e8/0x810 net/socket.c:2413
> >  ___sys_sendmsg+0xf3/0x170 net/socket.c:2467
> >  __sys_sendmsg+0xe5/0x1b0 net/socket.c:2496
> >  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> >  do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
> >  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > RIP: 0033:0x7f5f768be729
> > Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 81 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
> > RSP: 002b:00007ffe833c99c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> > RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f5f768be729
> > RDX: 0000000000000000 RSI: 00000000200007c0 RDI: 0000000000000004
> > RBP: 0000000000000000 R08: 00007ffe833c99f0 R09: 00007ffe833c99f0
> > R10: 00007ffe833c99f0 R11: 0000000000000246 R12: 00007ffe833c99ec
> > R13: 00007ffe833c9a00 R14: 00007ffe833c9a40 R15: 0000000000000000
> >  </TASK>
> > ----------------
> > Code disassembly (best guess), 1 bytes skipped:
> >    0:	80 15 00 00 83 f8 02 	adcb   $0x2,-0x77d0000(%rip)        # 0xf8830007
> >    7:	75 20                	jne    0x29
> >    9:	48 8b 8a 88 15 00 00 	mov    0x1588(%rdx),%rcx
> >   10:	8b 92 84 15 00 00    	mov    0x1584(%rdx),%edx
> >   16:	48 8b 01             	mov    (%rcx),%rax
> >   19:	48 83 c0 01          	add    $0x1,%rax
> >   1d:	48 39 c2             	cmp    %rax,%rdx
> >   20:	76 07                	jbe    0x29
> >   22:	48 89 34 c1          	mov    %rsi,(%rcx,%rax,8)
> >   26:	48 89 01             	mov    %rax,(%rcx)
> > * 29:	c3                   	retq <-- trapping instruction
> >   2a:	0f 1f 00             	nopl   (%rax)
> >   2d:	41 55                	push   %r13
> >   2f:	41 54                	push   %r12
> >   31:	49 89 fc             	mov    %rdi,%r12
> >   34:	55                   	push   %rbp
> >   35:	48 bd eb 83 b5 80 46 	movabs $0x61c8864680b583eb,%rbp
> >   3c:	86 c8 61
> > 
> > 
> > Tested on:
> > 
> > commit:         f443e374 Linux 5.17
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=157ae133700000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=19ca6f72fd444749
> > dashboard link: https://syzkaller.appspot.com/bug?extid=eec403943a2a2455adaa
> > compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> > patch:          https://syzkaller.appspot.com/x/patch.diff?x=15e78333700000


* Re: [syzbot] INFO: rcu detected stall in gc_worker (3)
       [not found] <20220321063053.2877-1-hdanton@sina.com>
@ 2022-03-21  6:43 ` syzbot
       [not found]   ` <20220321080843.3060-1-hdanton@sina.com>
  0 siblings, 1 reply; 11+ messages in thread
From: syzbot @ 2022-03-21  6:43 UTC (permalink / raw)
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
BUG: soft lockup in smp_call_function

watchdog: BUG: soft lockup - CPU#0 stuck for 143s! [kworker/u4:5:1244]
Modules linked in:
irq event stamp: 595274
hardirqs last  enabled at (595273): [<ffffffff89600c02>] asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638
hardirqs last disabled at (595274): [<ffffffff894c14ab>] sysvec_apic_timer_interrupt+0xb/0xc0 arch/x86/kernel/apic/apic.c:1097
softirqs last  enabled at (588534): [<ffffffff81474343>] invoke_softirq kernel/softirq.c:432 [inline]
softirqs last  enabled at (588534): [<ffffffff81474343>] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637
softirqs last disabled at (588525): [<ffffffff81474343>] invoke_softirq kernel/softirq.c:432 [inline]
softirqs last disabled at (588525): [<ffffffff81474343>] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637
CPU: 0 PID: 1244 Comm: kworker/u4:5 Not tainted 5.17.0-syzkaller-00083-gf443e374ae13-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events_unbound toggle_allocation_gate
RIP: 0010:__sanitizer_cov_trace_pc+0x5c/0x60 kernel/kcov.c:210
Code: 82 80 15 00 00 83 f8 02 75 20 48 8b 8a 88 15 00 00 8b 92 84 15 00 00 48 8b 01 48 83 c0 01 48 39 c2 76 07 48 89 34 c1 48 89 01 <c3> 0f 1f 00 41 55 41 54 49 89 fc 55 48 bd eb 83 b5 80 46 86 c8 61
RSP: 0018:ffffc90005a3f9e8 EFLAGS: 00000293
RAX: 0000000000000000 RBX: ffff8880b9d3fec0 RCX: 0000000000000000
RDX: ffff88801cf21d00 RSI: ffffffff816d3654 RDI: 0000000000000003
RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000001
R10: ffffffff816d367a R11: 0000000000000000 R12: ffffed10173a7fd9
R13: 0000000000000001 R14: ffff8880b9d3fec8 R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffe833fd960 CR3: 000000000b88e000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 rep_nop arch/x86/include/asm/vdso/processor.h:13 [inline]
 cpu_relax arch/x86/include/asm/vdso/processor.h:18 [inline]
 csd_lock_wait kernel/smp.c:440 [inline]
 smp_call_function_many_cond+0x4e4/0xc90 kernel/smp.c:969
 on_each_cpu_cond_mask+0x56/0xa0 kernel/smp.c:1135
 on_each_cpu include/linux/smp.h:71 [inline]
 text_poke_sync arch/x86/kernel/alternative.c:1112 [inline]
 text_poke_bp_batch+0x21d/0x6f0 arch/x86/kernel/alternative.c:1300
 text_poke_flush arch/x86/kernel/alternative.c:1470 [inline]
 text_poke_flush arch/x86/kernel/alternative.c:1467 [inline]
 text_poke_finish+0x16/0x30 arch/x86/kernel/alternative.c:1477
 arch_jump_label_transform_apply+0x13/0x20 arch/x86/kernel/jump_label.c:146
 jump_label_update+0x32f/0x410 kernel/jump_label.c:830
 static_key_enable_cpuslocked+0x1b1/0x260 kernel/jump_label.c:177
 static_key_enable+0x16/0x20 kernel/jump_label.c:190
 toggle_allocation_gate mm/kfence/core.c:735 [inline]
 toggle_allocation_gate+0x100/0x390 mm/kfence/core.c:727
 process_one_work+0x9ac/0x1650 kernel/workqueue.c:2307
 worker_thread+0x657/0x1110 kernel/workqueue.c:2454
 kthread+0x2e9/0x3a0 kernel/kthread.c:377
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
 </TASK>
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 4081 Comm: syz-executor201 Not tainted 5.17.0-syzkaller-00083-gf443e374ae13-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:_raw_spin_lock_irqsave+0x3/0x50 kernel/locking/spinlock.c:161
Code: 31 d2 31 f6 e8 fe 9b 0e f8 48 89 ef 58 5d e9 04 0f 0f f8 e8 bf 0a 30 f8 eb c9 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 41 54 55 <48> 89 fd 9c 41 5c fa 41 f7 c4 00 02 00 00 75 36 bf 01 00 00 00 e8
RSP: 0018:ffffc90000dc0d40 EFLAGS: 00000002
RAX: 0000000000001399 RBX: 0000000000058308 RCX: 0000000000000000
RDX: 61c8864680b583eb RSI: ffffffff89ae6860 RDI: ffffffff906f8610
RBP: ffffffff906f8608 R08: ffffffff906f8610 R09: 0000000000000001
R10: ffffffff81681c55 R11: 0000000000000000 R12: dffffc0000000000
R13: ffffffff89ae6860 R14: 1ffff920001b81ad R15: ffff888140782b40
FS:  0000555555b0b300(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000600 CR3: 000000001e6c1000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 debug_object_deactivate lib/debugobjects.c:735 [inline]
 debug_object_deactivate+0x101/0x300 lib/debugobjects.c:723
 debug_hrtimer_deactivate kernel/time/hrtimer.c:425 [inline]
 debug_deactivate kernel/time/hrtimer.c:481 [inline]
 __run_hrtimer kernel/time/hrtimer.c:1653 [inline]
 __hrtimer_run_queues+0x3f8/0xe50 kernel/time/hrtimer.c:1749
 hrtimer_interrupt+0x31c/0x790 kernel/time/hrtimer.c:1811
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline]
 __sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1103
 sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1097
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0x38/0x70 kernel/locking/spinlock.c:194
Code: 74 24 10 e8 6a 92 0e f8 48 89 ef e8 c2 10 0f f8 81 e3 00 02 00 00 75 25 9c 58 f6 c4 02 75 2d 48 85 db 74 01 fb bf 01 00 00 00 <e8> 73 c8 01 f8 65 8b 05 9c 76 b3 76 85 c0 74 0a 5b 5d c3 e8 b0 04
RSP: 0018:ffffc9000289f1e8 EFLAGS: 00000206
RAX: 0000000000000012 RBX: 0000000000000200 RCX: 1ffffffff2001bce
RDX: 0000000000000000 RSI: 0000000000000202 RDI: 0000000000000001
RBP: ffff888140782ae8 R08: 0000000000000001 R09: ffffffff8ffc5a07
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
R13: ffff888140782ae8 R14: 0000000000000246 R15: ffff888140782800
 spin_unlock_irqrestore include/linux/spinlock.h:404 [inline]
 taprio_change+0x2f0c/0x4050 net/sched/sch_taprio.c:1606
 taprio_init+0x52e/0x670 net/sched/sch_taprio.c:1738
 qdisc_create.constprop.0+0x44a/0x10f0 net/sched/sch_api.c:1253
 tc_modify_qdisc+0x4c5/0x1a00 net/sched/sch_api.c:1660
 rtnetlink_rcv_msg+0x413/0xb80 net/core/rtnetlink.c:5596
 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2494
 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
 netlink_unicast+0x539/0x7e0 net/netlink/af_netlink.c:1343
 netlink_sendmsg+0x904/0xe00 net/netlink/af_netlink.c:1919
 sock_sendmsg_nosec net/socket.c:705 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:725
 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2413
 ___sys_sendmsg+0xf3/0x170 net/socket.c:2467
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2496
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f5f768be729
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 81 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffe833c99c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f5f768be729
RDX: 0000000000000000 RSI: 00000000200007c0 RDI: 0000000000000004
RBP: 0000000000000000 R08: 00007ffe833c99f0 R09: 00007ffe833c99f0
R10: 00007ffe833c99f0 R11: 0000000000000246 R12: 00007ffe833c99ec
R13: 00007ffe833c9a00 R14: 00007ffe833c9a40 R15: 0000000000000000
 </TASK>
----------------
Code disassembly (best guess), 1 bytes skipped:
   0:	80 15 00 00 83 f8 02 	adcb   $0x2,-0x77d0000(%rip)        # 0xf8830007
   7:	75 20                	jne    0x29
   9:	48 8b 8a 88 15 00 00 	mov    0x1588(%rdx),%rcx
  10:	8b 92 84 15 00 00    	mov    0x1584(%rdx),%edx
  16:	48 8b 01             	mov    (%rcx),%rax
  19:	48 83 c0 01          	add    $0x1,%rax
  1d:	48 39 c2             	cmp    %rax,%rdx
  20:	76 07                	jbe    0x29
  22:	48 89 34 c1          	mov    %rsi,(%rcx,%rax,8)
  26:	48 89 01             	mov    %rax,(%rcx)
* 29:	c3                   	retq <-- trapping instruction
  2a:	0f 1f 00             	nopl   (%rax)
  2d:	41 55                	push   %r13
  2f:	41 54                	push   %r12
  31:	49 89 fc             	mov    %rdi,%r12
  34:	55                   	push   %rbp
  35:	48 bd eb 83 b5 80 46 	movabs $0x61c8864680b583eb,%rbp
  3c:	86 c8 61


Tested on:

commit:         f443e374 Linux 5.17
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=157ae133700000
kernel config:  https://syzkaller.appspot.com/x/.config?x=19ca6f72fd444749
dashboard link: https://syzkaller.appspot.com/bug?extid=eec403943a2a2455adaa
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=15e78333700000



* Re: [syzbot] INFO: rcu detected stall in gc_worker (3)
       [not found] <20220321031535.2804-1-hdanton@sina.com>
@ 2022-03-21  3:23 ` syzbot
  0 siblings, 0 replies; 11+ messages in thread
From: syzbot @ 2022-03-21  3:23 UTC (permalink / raw)
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot tried to test the proposed patch but the build/boot failed:

failed to create VM pool: failed to create GCE image: create image operation failed: &{Code:PERMISSIONS_ERROR Location: Message:Required 'read' permission for 'disks/ci-upstream-linux-next-kasan-gce-root-test-job-test-job-image.tar.gz' ForceSendFields:[] NullFields:[]}.


Tested on:

commit:         91265a6d Add linux-next specific files for 20220303
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/
kernel config:  https://syzkaller.appspot.com/x/.config?x=617f79440a35673a
dashboard link: https://syzkaller.appspot.com/bug?extid=eec403943a2a2455adaa
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=17ce6425700000



* Re: [syzbot] INFO: rcu detected stall in gc_worker (3)
       [not found] <20220321012046.2729-1-hdanton@sina.com>
@ 2022-03-21  1:32 ` syzbot
  0 siblings, 0 replies; 11+ messages in thread
From: syzbot @ 2022-03-21  1:32 UTC (permalink / raw)
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: rcu detected stall in corrupted

IPv6: ADDRCONF(NETDEV_CHANGE): veth0_to_batadv: link becomes ready
IPv6: ADDRCONF(NETDEV_CHANGE): veth1_to_batadv: link becomes ready
rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-... } 2628 jiffies s: 2065 root: 0x1/.
rcu: blocking rcu_node structures (internal RCU debug):
Task dump for CPU 0:
task:syz-executor354 state:R  running task     stack:27224 pid: 4078 ppid:  4061 flags:0x0000400e
Call Trace:
 <TASK>
 </TASK>


Tested on:

commit:         91265a6d Add linux-next specific files for 20220303
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/
console output: https://syzkaller.appspot.com/x/log.txt?x=16355271700000
kernel config:  https://syzkaller.appspot.com/x/.config?x=617f79440a35673a
dashboard link: https://syzkaller.appspot.com/bug?extid=eec403943a2a2455adaa
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=17b1f271700000



* Re: [syzbot] INFO: rcu detected stall in gc_worker (3)
       [not found] <20220320152154.2662-1-hdanton@sina.com>
@ 2022-03-20 15:33 ` syzbot
  0 siblings, 0 replies; 11+ messages in thread
From: syzbot @ 2022-03-20 15:33 UTC (permalink / raw)
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: rcu detected stall in corrupted

rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 1-... } 2642 jiffies s: 2057 root: 0x2/.
rcu: blocking rcu_node structures (internal RCU debug):
Task dump for CPU 1:
task:syz-executor212 state:R  running task     stack:26424 pid: 4080 ppid:  4063 flags:0x0000000e
Call Trace:
 <TASK>
 </TASK>


Tested on:

commit:         91265a6d Add linux-next specific files for 20220303
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/
console output: https://syzkaller.appspot.com/x/log.txt?x=13f1becb700000
kernel config:  https://syzkaller.appspot.com/x/.config?x=617f79440a35673a
dashboard link: https://syzkaller.appspot.com/bug?extid=eec403943a2a2455adaa
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=11d0caeb700000



end of thread

Thread overview: 11+ messages
2022-03-20 12:02 [syzbot] INFO: rcu detected stall in gc_worker (3) syzbot
2023-09-24 10:59 ` [syzbot] [netfilter?] " syzbot
2024-03-06 22:38 ` syzbot
     [not found] <20220320152154.2662-1-hdanton@sina.com>
2022-03-20 15:33 ` [syzbot] " syzbot
     [not found] <20220321012046.2729-1-hdanton@sina.com>
2022-03-21  1:32 ` syzbot
     [not found] <20220321031535.2804-1-hdanton@sina.com>
2022-03-21  3:23 ` syzbot
     [not found] <20220321063053.2877-1-hdanton@sina.com>
2022-03-21  6:43 ` syzbot
     [not found]   ` <20220321080843.3060-1-hdanton@sina.com>
2022-03-21 21:01     ` Paul E. McKenney
     [not found] <20220731074633.519-1-hdanton@sina.com>
2022-07-31  8:03 ` syzbot
     [not found] <20220731081548.659-1-hdanton@sina.com>
2022-07-31  8:31 ` syzbot
     [not found] <20220731094805.847-1-hdanton@sina.com>
2022-07-31 10:09 ` syzbot
