All of lore.kernel.org
 help / color / mirror / Atom feed
* [syzbot] INFO: rcu detected stall in ext4_file_write_iter (4)
@ 2021-12-22 17:33 syzbot
       [not found] ` <20211225005253.1962-1-hdanton@sina.com>
  0 siblings, 1 reply; 4+ messages in thread
From: syzbot @ 2021-12-22 17:33 UTC (permalink / raw)
  To: fweisbec, linux-kernel, mingo, syzkaller-bugs, tglx

Hello,

syzbot found the following issue on:

HEAD commit:    9eaa88c7036e Merge tag 'libata-5.16-rc6' of git://git.kern..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=142d4143b00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=e3bdfd29b408d1b6
dashboard link: https://syzkaller.appspot.com/bug?extid=03464269af631f4a4bdf
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11b1537db00000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=125de549b00000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+03464269af631f4a4bdf@syzkaller.appspotmail.com

rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
	(detected by 0, t=10502 jiffies, g=5509, q=170)
rcu: All QSes seen, last rcu_preempt kthread activity 8642 (4295000735-4294992093), jiffies_till_next_fqs=1, root ->qsmask 0x0
rcu: rcu_preempt kthread starved for 8643 jiffies! g5509 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt     state:R  running task     stack:28208 pid:   14 ppid:     2 flags:0x00004000
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:4972 [inline]
 context_switch kernel/sched/core.c:4972 [inline] kernel/sched/core.c:6253
 __schedule+0xa9a/0x4940 kernel/sched/core.c:6253 kernel/sched/core.c:6253
 schedule+0xd2/0x260 kernel/sched/core.c:6326 kernel/sched/core.c:6326
 schedule_timeout+0x14a/0x2a0 kernel/time/timer.c:1881 kernel/time/timer.c:1881
 rcu_gp_fqs_loop+0x186/0x810 kernel/rcu/tree.c:1955 kernel/rcu/tree.c:1955
 rcu_gp_kthread+0x1de/0x320 kernel/rcu/tree.c:2128 kernel/rcu/tree.c:2128
 kthread+0x405/0x4f0 kernel/kthread.c:327 kernel/kthread.c:327
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 arch/x86/entry/entry_64.S:295
 </TASK>
rcu: Stack dump where RCU GP kthread last ran:
NMI backtrace for cpu 0
CPU: 0 PID: 3672 Comm: syz-executor549 Not tainted 5.16.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:88 [inline]
 __dump_stack lib/dump_stack.c:88 [inline] lib/dump_stack.c:106
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 lib/dump_stack.c:106
 nmi_cpu_backtrace.cold+0x47/0x144 lib/nmi_backtrace.c:111 lib/nmi_backtrace.c:111
 nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62 lib/nmi_backtrace.c:62
 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline] kernel/rcu/tree_stall.h:481
 rcu_check_gp_kthread_starvation.cold+0x1fb/0x200 kernel/rcu/tree_stall.h:481 kernel/rcu/tree_stall.h:481
 print_other_cpu_stall kernel/rcu/tree_stall.h:586 [inline]
 check_cpu_stall kernel/rcu/tree_stall.h:729 [inline]
 rcu_pending kernel/rcu/tree.c:3878 [inline]
 print_other_cpu_stall kernel/rcu/tree_stall.h:586 [inline] kernel/rcu/tree.c:2597
 check_cpu_stall kernel/rcu/tree_stall.h:729 [inline] kernel/rcu/tree.c:2597
 rcu_pending kernel/rcu/tree.c:3878 [inline] kernel/rcu/tree.c:2597
 rcu_sched_clock_irq+0x2125/0x2200 kernel/rcu/tree.c:2597 kernel/rcu/tree.c:2597
 update_process_times+0x16d/0x200 kernel/time/timer.c:1785 kernel/time/timer.c:1785
 tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:226 kernel/time/tick-sched.c:226
 tick_sched_timer+0x1b0/0x2d0 kernel/time/tick-sched.c:1428 kernel/time/tick-sched.c:1428
 __run_hrtimer kernel/time/hrtimer.c:1685 [inline]
 __run_hrtimer kernel/time/hrtimer.c:1685 [inline] kernel/time/hrtimer.c:1749
 __hrtimer_run_queues+0x1c0/0xe50 kernel/time/hrtimer.c:1749 kernel/time/hrtimer.c:1749
 hrtimer_interrupt+0x31c/0x790 kernel/time/hrtimer.c:1811 kernel/time/hrtimer.c:1811
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline]
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline] arch/x86/kernel/apic/apic.c:1103
 __sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1103 arch/x86/kernel/apic/apic.c:1103
 sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1097 arch/x86/kernel/apic/apic.c:1097
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638 arch/x86/include/asm/idtentry.h:638
RIP: 0010:owner_on_cpu kernel/locking/rwsem.c:667 [inline]
RIP: 0010:owner_on_cpu kernel/locking/rwsem.c:667 [inline] kernel/locking/rwsem.c:750
RIP: 0010:rwsem_spin_on_owner+0x1aa/0x290 kernel/locking/rwsem.c:750 kernel/locking/rwsem.c:750
Code: a8 08 75 33 49 8d 7c 24 34 48 89 f8 48 c1 e8 03 42 0f b6 14 28 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 b0 00 00 00 <41> 8b 44 24 34 85 c0 0f 85 14 ff ff ff b8 08 00 00 00 48 83 c4 08
RSP: 0018:ffffc90002817448 EFLAGS: 00000246
RAX: 0000000000000007 RBX: ffff88806da34680 RCX: ffffffff815a9743
RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff88807d2000b4
RBP: ffff88806da34688 R08: 0000000000000000 R09: ffff888078b322c7
R10: ffffed100f166458 R11: 0000000000000001 R12: ffff88807d200080
R13: dffffc0000000000 R14: ffff888078b322c0 R15: ffffed100f166458
 rwsem_optimistic_spin kernel/locking/rwsem.c:808 [inline]
 rwsem_optimistic_spin kernel/locking/rwsem.c:808 [inline] kernel/locking/rwsem.c:1068
 rwsem_down_write_slowpath+0x1a8/0x1110 kernel/locking/rwsem.c:1068 kernel/locking/rwsem.c:1068
 __down_write_common kernel/locking/rwsem.c:1268 [inline]
 __down_write_common kernel/locking/rwsem.c:1265 [inline]
 __down_write kernel/locking/rwsem.c:1277 [inline]
 __down_write_common kernel/locking/rwsem.c:1268 [inline] kernel/locking/rwsem.c:1524
 __down_write_common kernel/locking/rwsem.c:1265 [inline] kernel/locking/rwsem.c:1524
 __down_write kernel/locking/rwsem.c:1277 [inline] kernel/locking/rwsem.c:1524
 down_write+0x135/0x150 kernel/locking/rwsem.c:1524 kernel/locking/rwsem.c:1524
 inode_lock include/linux/fs.h:783 [inline]
 inode_lock include/linux/fs.h:783 [inline] fs/ext4/file.c:263
 ext4_buffered_write_iter+0xb6/0x3b0 fs/ext4/file.c:263 fs/ext4/file.c:263
 ext4_file_write_iter+0x423/0x14e0 fs/ext4/file.c:681 fs/ext4/file.c:681
 call_write_iter include/linux/fs.h:2162 [inline]
 call_write_iter include/linux/fs.h:2162 [inline] fs/read_write.c:725
 do_iter_readv_writev+0x472/0x750 fs/read_write.c:725 fs/read_write.c:725
 do_iter_write+0x188/0x710 fs/read_write.c:851 fs/read_write.c:851
 vfs_iter_write+0x70/0xa0 fs/read_write.c:892 fs/read_write.c:892
 iter_file_splice_write+0x6fa/0xc10 fs/splice.c:689 fs/splice.c:689
 do_splice_from fs/splice.c:767 [inline]
 do_splice_from fs/splice.c:767 [inline] fs/splice.c:936
 direct_splice_actor+0x110/0x180 fs/splice.c:936 fs/splice.c:936
 splice_direct_to_actor+0x34b/0x8c0 fs/splice.c:891 fs/splice.c:891
 do_splice_direct+0x1b3/0x280 fs/splice.c:979 fs/splice.c:979
 do_sendfile+0xaf2/0x1250 fs/read_write.c:1245 fs/read_write.c:1245
 __do_sys_sendfile64 fs/read_write.c:1310 [inline]
 __se_sys_sendfile64 fs/read_write.c:1296 [inline]
 __do_sys_sendfile64 fs/read_write.c:1310 [inline] fs/read_write.c:1296
 __se_sys_sendfile64 fs/read_write.c:1296 [inline] fs/read_write.c:1296
 __x64_sys_sendfile64+0x1cc/0x210 fs/read_write.c:1296 fs/read_write.c:1296
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_x64 arch/x86/entry/common.c:50 [inline] arch/x86/entry/common.c:80
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f34d1e3f169
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 b1 14 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffc49095728 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
RAX: ffffffffffffffda RBX: 00000000000f4240 RCX: 00007f34d1e3f169
RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000004
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000001
R10: 00008400fffffffb R11: 0000000000000246 R12: 0000000000076d4d
R13: 00007ffc4909573c R14: 00007ffc49095750 R15: 00007ffc49095740
 </TASK>
----------------
Code disassembly (best guess):
   0:	a8 08                	test   $0x8,%al
   2:	75 33                	jne    0x37
   4:	49 8d 7c 24 34       	lea    0x34(%r12),%rdi
   9:	48 89 f8             	mov    %rdi,%rax
   c:	48 c1 e8 03          	shr    $0x3,%rax
  10:	42 0f b6 14 28       	movzbl (%rax,%r13,1),%edx
  15:	48 89 f8             	mov    %rdi,%rax
  18:	83 e0 07             	and    $0x7,%eax
  1b:	83 c0 03             	add    $0x3,%eax
  1e:	38 d0                	cmp    %dl,%al
  20:	7c 08                	jl     0x2a
  22:	84 d2                	test   %dl,%dl
  24:	0f 85 b0 00 00 00    	jne    0xda
* 2a:	41 8b 44 24 34       	mov    0x34(%r12),%eax <-- trapping instruction
  2f:	85 c0                	test   %eax,%eax
  31:	0f 85 14 ff ff ff    	jne    0xffffff4b
  37:	b8 08 00 00 00       	mov    $0x8,%eax
  3c:	48 83 c4 08          	add    $0x8,%rsp
----------------
Code disassembly (best guess):
   0:	a8 08                	test   $0x8,%al
   2:	75 33                	jne    0x37
   4:	49 8d 7c 24 34       	lea    0x34(%r12),%rdi
   9:	48 89 f8             	mov    %rdi,%rax
   c:	48 c1 e8 03          	shr    $0x3,%rax
  10:	42 0f b6 14 28       	movzbl (%rax,%r13,1),%edx
  15:	48 89 f8             	mov    %rdi,%rax
  18:	83 e0 07             	and    $0x7,%eax
  1b:	83 c0 03             	add    $0x3,%eax
  1e:	38 d0                	cmp    %dl,%al
  20:	7c 08                	jl     0x2a
  22:	84 d2                	test   %dl,%dl
  24:	0f 85 b0 00 00 00    	jne    0xda
* 2a:	41 8b 44 24 34       	mov    0x34(%r12),%eax <-- trapping instruction
  2f:	85 c0                	test   %eax,%eax
  31:	0f 85 14 ff ff ff    	jne    0xffffff4b
  37:	b8 08 00 00 00       	mov    $0x8,%eax
  3c:	48 83 c4 08          	add    $0x8,%rsp


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [syzbot] INFO: rcu detected stall in ext4_file_write_iter (4)
       [not found] ` <20211225005253.1962-1-hdanton@sina.com>
@ 2021-12-28  3:14   ` Waiman Long
  2021-12-29 21:29     ` Theodore Ts'o
  0 siblings, 1 reply; 4+ messages in thread
From: Waiman Long @ 2021-12-28  3:14 UTC (permalink / raw)
  To: Hillf Danton, syzbot; +Cc: linux-kernel, Peter Zijlstra, syzkaller-bugs


On 12/24/21 19:52, Hillf Danton wrote:
> On Wed, 22 Dec 2021 09:33:26 -0800
>> syzbot found the following issue on:
>>
>> HEAD commit:    9eaa88c7036e Merge tag 'libata-5.16-rc6' of git://git.kern..
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=142d4143b00000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=e3bdfd29b408d1b6
>> dashboard link: https://syzkaller.appspot.com/bug?extid=03464269af631f4a4bdf
>> compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11b1537db00000
>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=125de549b00000
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+03464269af631f4a4bdf@syzkaller.appspotmail.com
>>
>> rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
>> 	(detected by 0, t=10502 jiffies, g=5509, q=170)
>> rcu: All QSes seen, last rcu_preempt kthread activity 8642 (4295000735-4294992093), jiffies_till_next_fqs=1, root ->qsmask 0x0
>> rcu: rcu_preempt kthread starved for 8643 jiffies! g5509 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
>> rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
>> rcu: RCU grace-period kthread stack dump:
>> task:rcu_preempt     state:R  running task     stack:28208 pid:   14 ppid:     2 flags:0x00004000
>> Call Trace:
>>   <TASK>
>>   context_switch kernel/sched/core.c:4972 [inline]
>>   context_switch kernel/sched/core.c:4972 [inline] kernel/sched/core.c:6253
>>   __schedule+0xa9a/0x4940 kernel/sched/core.c:6253 kernel/sched/core.c:6253
>>   schedule+0xd2/0x260 kernel/sched/core.c:6326 kernel/sched/core.c:6326
>>   schedule_timeout+0x14a/0x2a0 kernel/time/timer.c:1881 kernel/time/timer.c:1881
>>   rcu_gp_fqs_loop+0x186/0x810 kernel/rcu/tree.c:1955 kernel/rcu/tree.c:1955
>>   rcu_gp_kthread+0x1de/0x320 kernel/rcu/tree.c:2128 kernel/rcu/tree.c:2128
>>   kthread+0x405/0x4f0 kernel/kthread.c:327 kernel/kthread.c:327
>>   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 arch/x86/entry/entry_64.S:295
>>   </TASK>
>> rcu: Stack dump where RCU GP kthread last ran:
>> NMI backtrace for cpu 0
>> CPU: 0 PID: 3672 Comm: syz-executor549 Not tainted 5.16.0-rc5-syzkaller #0
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>> Call Trace:
>>   <IRQ>
>>   __dump_stack lib/dump_stack.c:88 [inline]
>>   __dump_stack lib/dump_stack.c:88 [inline] lib/dump_stack.c:106
>>   dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 lib/dump_stack.c:106
>>   nmi_cpu_backtrace.cold+0x47/0x144 lib/nmi_backtrace.c:111 lib/nmi_backtrace.c:111
>>   nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62 lib/nmi_backtrace.c:62
>>   trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
>>   trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline] kernel/rcu/tree_stall.h:481
>>   rcu_check_gp_kthread_starvation.cold+0x1fb/0x200 kernel/rcu/tree_stall.h:481 kernel/rcu/tree_stall.h:481
>>   print_other_cpu_stall kernel/rcu/tree_stall.h:586 [inline]
>>   check_cpu_stall kernel/rcu/tree_stall.h:729 [inline]
>>   rcu_pending kernel/rcu/tree.c:3878 [inline]
>>   print_other_cpu_stall kernel/rcu/tree_stall.h:586 [inline] kernel/rcu/tree.c:2597
>>   check_cpu_stall kernel/rcu/tree_stall.h:729 [inline] kernel/rcu/tree.c:2597
>>   rcu_pending kernel/rcu/tree.c:3878 [inline] kernel/rcu/tree.c:2597
>>   rcu_sched_clock_irq+0x2125/0x2200 kernel/rcu/tree.c:2597 kernel/rcu/tree.c:2597
>>   update_process_times+0x16d/0x200 kernel/time/timer.c:1785 kernel/time/timer.c:1785
>>   tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:226 kernel/time/tick-sched.c:226
>>   tick_sched_timer+0x1b0/0x2d0 kernel/time/tick-sched.c:1428 kernel/time/tick-sched.c:1428
>>   __run_hrtimer kernel/time/hrtimer.c:1685 [inline]
>>   __run_hrtimer kernel/time/hrtimer.c:1685 [inline] kernel/time/hrtimer.c:1749
>>   __hrtimer_run_queues+0x1c0/0xe50 kernel/time/hrtimer.c:1749 kernel/time/hrtimer.c:1749
>>   hrtimer_interrupt+0x31c/0x790 kernel/time/hrtimer.c:1811 kernel/time/hrtimer.c:1811
>>   local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline]
>>   local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline] arch/x86/kernel/apic/apic.c:1103
>>   __sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1103 arch/x86/kernel/apic/apic.c:1103
>>   sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1097 arch/x86/kernel/apic/apic.c:1097
>>   </IRQ>
>>   <TASK>
>>   asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638 arch/x86/include/asm/idtentry.h:638
>> RIP: 0010:owner_on_cpu kernel/locking/rwsem.c:667 [inline]
> The lock acquirer was spinning long enough to make RCU stall, and the
> question is, if this is not reproducible, how was the report triggered?
> Anything that syzbot is special in this case?
>
> Thanks
> 		Hillf

The test was running on a CONFIG_PREEMPT kernel. So if the syzkaller 
kthread is running at a higher priority than the rcu_preempt kthread, it 
is possible for the rcu_preempt kthread to be starved of cpu time. The 
rwsem optimistic spinning code will relinquish the cpu if there is a 
higher priority thread running. Since rcu_preempt kthread did not seem 
to be able to get the cpu, I suspect that it is probably caused by the 
syzkaller thread having a higher priority.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [syzbot] INFO: rcu detected stall in ext4_file_write_iter (4)
  2021-12-28  3:14   ` Waiman Long
@ 2021-12-29 21:29     ` Theodore Ts'o
       [not found]       ` <20211230125018.2272-1-hdanton@sina.com>
  0 siblings, 1 reply; 4+ messages in thread
From: Theodore Ts'o @ 2021-12-29 21:29 UTC (permalink / raw)
  To: Waiman Long
  Cc: Hillf Danton, syzbot, linux-kernel, Peter Zijlstra, syzkaller-bugs

On Mon, Dec 27, 2021 at 10:14:23PM -0500, Waiman Long wrote:
> 
> The test was running on a CONFIG_PREEMPT kernel. So if the syzkaller kthread
> is running at a higher priority than the rcu_preempt kthread, it is possible
> for the rcu_preempt kthread to be starved of cpu time. The rwsem optimistic
> spinning code will relinquish the cpu if there is a higher priority thread
> running. Since rcu_preempt kthread did not seem to be able to get the cpu, I
> suspect that it is probably caused by the syzkaller thread having a higher
> priority.

It's even worse than that.  The Syzkaller reproducer is calling
sched_setattr():

  *(uint32_t*)0x20000080 = 0x38;    // sched_attr.sched_size
  *(uint32_t*)0x20000084 = 1;       // sched_attr.sched_policy == SCHED_FIFO
  *(uint64_t*)0x20000088 = 0;       // sched_attr.sched_flags
  *(uint32_t*)0x20000090 = 0;       // sched_attr.sched_nice
  *(uint32_t*)0x20000094 = 1;       // sched_attr.sched_priority
  *(uint64_t*)0x20000098 = 0;       // ...
  *(uint64_t*)0x200000a0 = 0;
  *(uint64_t*)0x200000a8 = 0;
  *(uint32_t*)0x200000b0 = 0;
  *(uint32_t*)0x200000b4 = 0;
  syscall(__NR_sched_setattr, 0, 0x20000080ul, 0ul);

So one or more of the syzkaller threads is SCHED_FIFO, and SCHED_FIFO
threads will *never* relinquish the CPU in favor of SCHED_OTHER
threads (which in practice will include all kernel threads unless
special measures are taken by someone who knows what they are doing)
so long as it they are runable.

See the discussion at:

    https://lore.kernel.org/all/Yb5RMWRsJl5TMk8H@casper.infradead.org/

I recommend that kernel developers simply ignore any syzkaller report
that relates to tasks being hung or rcu detected and where the
reproducer is trying to set a real-time priority (e.g., sched_policy
of SCHED_FIFO or SCHED_RR), since the number of ways that
sched_setattr can be used as a foot-gun are near infinite....

Syzkaller reports that include sched_setattr are great for inflating
the OMG!  There are tons of unhandled syzkaller reports, "companies
need to fund more engineering headcount to fix syzkaller bugs" slide
decks.  But IMHO, they are not good for much else.

	      	     	       		    - Ted

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [syzbot] INFO: rcu detected stall in ext4_file_write_iter (4)
       [not found]       ` <20211230125018.2272-1-hdanton@sina.com>
@ 2022-02-08 10:32         ` Aleksandr Nogikh
  0 siblings, 0 replies; 4+ messages in thread
From: Aleksandr Nogikh @ 2022-02-08 10:32 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Theodore Ts'o, Waiman Long, syzbot, LKML, Peter Zijlstra,
	syzkaller-bugs

Closing the bug. Syzkaller now is much more careful with sched_setattr
and perf_event_open, so, hopefully, we'll see fewer such false
positive reports in the future.

#syz invalid

On Thu, Dec 30, 2021 at 1:50 PM Hillf Danton <hdanton@sina.com> wrote:
>
> On Wed, 29 Dec 2021 16:29:33 -0500 Theodore Ts'o wrote:
> > On Mon, Dec 27, 2021 at 10:14:23PM -0500, Waiman Long wrote:
> > >
> > > The test was running on a CONFIG_PREEMPT kernel. So if the syzkaller kthread
> > > is running at a higher priority than the rcu_preempt kthread, it is possible
> > > for the rcu_preempt kthread to be starved of cpu time. The rwsem optimistic
> > > spinning code will relinquish the cpu if there is a higher priority thread
> > > running. Since rcu_preempt kthread did not seem to be able to get the cpu, I
> > > suspect that it is probably caused by the syzkaller thread having a higher
> > > priority.
> >
> > It's even worse than that.  The Syzkaller reproducer is calling
> > sched_setattr():
> >
> >   *(uint32_t*)0x20000080 = 0x38;    // sched_attr.sched_size
> >   *(uint32_t*)0x20000084 = 1;       // sched_attr.sched_policy == SCHED_FIFO
> >   *(uint64_t*)0x20000088 = 0;       // sched_attr.sched_flags
> >   *(uint32_t*)0x20000090 = 0;       // sched_attr.sched_nice
> >   *(uint32_t*)0x20000094 = 1;       // sched_attr.sched_priority
> >   *(uint64_t*)0x20000098 = 0;       // ...
> >   *(uint64_t*)0x200000a0 = 0;
> >   *(uint64_t*)0x200000a8 = 0;
> >   *(uint32_t*)0x200000b0 = 0;
> >   *(uint32_t*)0x200000b4 = 0;
> >   syscall(__NR_sched_setattr, 0, 0x20000080ul, 0ul);
> >
> > So one or more of the syzkaller threads is SCHED_FIFO, and SCHED_FIFO
> > threads will *never* relinquish the CPU in favor of SCHED_OTHER
> > threads (which in practice will include all kernel threads unless
> > special measures are taken by someone who knows what they are doing)
> > so long as it they are runable.
> >
> > See the discussion at:
> >
> >     https://lore.kernel.org/all/Yb5RMWRsJl5TMk8H@casper.infradead.org/
> >
> > I recommend that kernel developers simply ignore any syzkaller report
> > that relates to tasks being hung or rcu detected and where the
> > reproducer is trying to set a real-time priority (e.g., sched_policy
> > of SCHED_FIFO or SCHED_RR), since the number of ways that
> > sched_setattr can be used as a foot-gun are near infinite....
> >
> > Syzkaller reports that include sched_setattr are great for inflating
> > the OMG!  There are tons of unhandled syzkaller reports, "companies
> > need to fund more engineering headcount to fix syzkaller bugs" slide
> > decks.  But IMHO, they are not good for much else.
> >
> >                                           - Ted
> >
>
> On the other hand, this report suggests IMHO the need for setting the
> deadline, a couple of ticks by default, for spinners, to cut the chance
> for FIFO tasks to make trouble in scenarios like the report.
>
> Mutex needs the same mechanism if it makes sense.
>
> Thanks
>                 Hillf
>
>
> +++ x/kernel/locking/rwsem.c
> @@ -716,6 +716,7 @@ rwsem_spin_on_owner(struct rw_semaphore
>         struct task_struct *new, *owner;
>         unsigned long flags, new_flags;
>         enum owner_state state;
> +       unsigned long deadline;
>
>         lockdep_assert_preemption_disabled();
>
> @@ -724,6 +725,10 @@ rwsem_spin_on_owner(struct rw_semaphore
>         if (state != OWNER_WRITER)
>                 return state;
>
> +       /* avoid spinning long enough to make rcu stall
> +        * particularly in case of FIFO spinner
> +        */
> +       deadline = jiffies + 2;
>         for (;;) {
>                 /*
>                  * When a waiting writer set the handoff flag, it may spin
> @@ -747,7 +752,8 @@ rwsem_spin_on_owner(struct rw_semaphore
>                  */
>                 barrier();
>
> -               if (need_resched() || !owner_on_cpu(owner)) {
> +               if (need_resched() || !owner_on_cpu(owner) ||
> +                   time_after(jiffies, deadline)) {
>                         state = OWNER_NONSPINNABLE;
>                         break;
>                 }
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/20211230125018.2272-1-hdanton%40sina.com.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-02-08 11:27 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-22 17:33 [syzbot] INFO: rcu detected stall in ext4_file_write_iter (4) syzbot
     [not found] ` <20211225005253.1962-1-hdanton@sina.com>
2021-12-28  3:14   ` Waiman Long
2021-12-29 21:29     ` Theodore Ts'o
     [not found]       ` <20211230125018.2272-1-hdanton@sina.com>
2022-02-08 10:32         ` Aleksandr Nogikh

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.