All of lore.kernel.org
 help / color / mirror / Atom feed
* [syzbot] INFO: rcu detected stall in ext4_file_write_iter (6)
@ 2023-02-08 10:37 syzbot
  2023-02-09 22:25 ` Thomas Gleixner
  0 siblings, 1 reply; 10+ messages in thread
From: syzbot @ 2023-02-08 10:37 UTC (permalink / raw)
  To: jstultz, linux-kernel, sboyd, syzkaller-bugs, tglx

Hello,

syzbot found the following issue on:

HEAD commit:    4fafd96910ad Add linux-next specific files for 20230203
git tree:       linux-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=17005023480000
kernel config:  https://syzkaller.appspot.com/x/.config?x=1d2fba7d42502ca4
dashboard link: https://syzkaller.appspot.com/bug?extid=b9564ba6e8e00694511b
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=110c6175480000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15ef5ad9480000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/348cc2da441a/disk-4fafd969.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/e2dedc500f12/vmlinux-4fafd969.xz
kernel image: https://storage.googleapis.com/syzbot-assets/fae710d9ebd8/bzImage-4fafd969.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+b9564ba6e8e00694511b@syzkaller.appspotmail.com

rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 	(detected by 1, t=10502 jiffies, g=5605, q=143 ncpus=2)
rcu: All QSes seen, last rcu_preempt kthread activity 10502 (4294955582-4294945080), jiffies_till_next_fqs=1, root ->qsmask 0x0
rcu: rcu_preempt kthread starved for 10502 jiffies! g5605 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt     state:R  running task     stack:28680 pid:16    ppid:2      flags:0x00004000
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:5301 [inline]
 __schedule+0x245a/0x5a60 kernel/sched/core.c:6619
 schedule+0xde/0x1a0 kernel/sched/core.c:6695
 schedule_timeout+0x14e/0x2b0 kernel/time/timer.c:2167
 rcu_gp_fqs_loop+0x190/0x910 kernel/rcu/tree.c:1609
 rcu_gp_kthread+0x23a/0x360 kernel/rcu/tree.c:1808
 kthread+0x2e8/0x3a0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
 </TASK>
rcu: Stack dump where RCU GP kthread last ran:
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 5108 Comm: syz-executor192 Not tainted 6.2.0-rc6-next-20230203-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/12/2023
RIP: 0010:__sanitizer_cov_trace_const_cmp1+0x0/0x20 kernel/kcov.c:289
Code: 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 8b 0c 24 48 89 f2 48 89 fe bf 06 00 00 00 e9 f8 fe ff ff 0f 1f 84 00 00 00 00 00 <f3> 0f 1e fa 48 8b 0c 24 40 0f b6 d6 40 0f b6 f7 bf 01 00 00 00 e9
RSP: 0018:ffffc90000007dd8 EFLAGS: 00000046
RAX: 0000000000000001 RBX: 0000000000000001 RCX: ffffffff81757653
RDX: fffffbfff1cea2eb RSI: 0000000000000001 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff8e751757
R10: fffffbfff1cea2ea R11: 0000000000000000 R12: 0000002a9fc7f679
R13: 0000000000000001 R14: ffffffff91cb6e28 R15: ffff8880775e6f60
FS:  0000555557516300(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8f88861130 CR3: 0000000072b77000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 trace_alarmtimer_fired include/trace/events/alarmtimer.h:73 [inline]
 alarmtimer_fired+0x327/0x670 kernel/time/alarmtimer.c:220
 __run_hrtimer kernel/time/hrtimer.c:1685 [inline]
 __hrtimer_run_queues+0x600/0xcf0 kernel/time/hrtimer.c:1749
 hrtimer_interrupt+0x320/0x7b0 kernel/time/hrtimer.c:1811
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1096 [inline]
 __sysvec_apic_timer_interrupt+0x180/0x660 arch/x86/kernel/apic/apic.c:1113
 sysvec_apic_timer_interrupt+0x92/0xc0 arch/x86/kernel/apic/apic.c:1107
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
RIP: 0010:lock_is_held_type+0x103/0x140 kernel/locking/lockdep.c:5767
Code: 00 00 b8 ff ff ff ff 65 0f c1 05 48 c8 ef 75 83 f8 01 75 29 9c 58 f6 c4 02 75 3d 48 f7 04 24 00 02 00 00 74 01 fb 48 83 c4 08 <44> 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 45 31 ed eb b9 0f 0b 48
RSP: 0018:ffffc90003d0f3c8 EFLAGS: 00000282
RAX: 0000000000000046 RBX: 0000000000000002 RCX: 0000000000000001
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffffffff8c794680 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff8880773d8000
R13: 0000000000000000 R14: 00000000ffffffff R15: ffff8880773d8a60
 lock_is_held include/linux/lockdep.h:288 [inline]
 rcu_read_lock_sched_held+0x3e/0x70 kernel/rcu/update.c:125
 trace_block_dirty_buffer include/trace/events/block.h:58 [inline]
 mark_buffer_dirty+0x445/0x570 fs/buffer.c:1126
 __block_commit_write.constprop.0.isra.0+0x19d/0x330 fs/buffer.c:2108
 block_write_end+0x53/0x220 fs/buffer.c:2187
 generic_write_end+0x8d/0x440 fs/buffer.c:2201
 ext4_da_write_end+0x1f5/0xa50 fs/ext4/inode.c:3174
 generic_perform_write+0x316/0x570 mm/filemap.c:3781
 ext4_buffered_write_iter+0x15b/0x460 fs/ext4/file.c:285
 ext4_file_write_iter+0x8bf/0x1710 fs/ext4/file.c:700
 call_write_iter include/linux/fs.h:1851 [inline]
 do_iter_readv_writev+0x20b/0x3b0 fs/read_write.c:735
 do_iter_write+0x182/0x700 fs/read_write.c:861
 vfs_iter_write+0x74/0xa0 fs/read_write.c:902
 iter_file_splice_write+0x745/0xc90 fs/splice.c:686
 do_splice_from fs/splice.c:764 [inline]
 direct_splice_actor+0x114/0x180 fs/splice.c:931
 splice_direct_to_actor+0x335/0x8a0 fs/splice.c:886
 do_splice_direct+0x1ab/0x280 fs/splice.c:974
 do_sendfile+0xb19/0x12c0 fs/read_write.c:1255
 __do_sys_sendfile64 fs/read_write.c:1317 [inline]
 __se_sys_sendfile64 fs/read_write.c:1309 [inline]
 __x64_sys_sendfile64+0x14d/0x210 fs/read_write.c:1309
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f8f887f06a9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 14 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffccec967c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
RAX: ffffffffffffffda RBX: 0000000000000062 RCX: 00007f8f887f06a9
RDX: 00000000200001c0 RSI: 0000000000000003 RDI: 0000000000000003
RBP: 0000000000000000 R08: 00007ffccec967f0 R09: 00007ffccec967f0
R10: 000000000200a198 R11: 0000000000000246 R12: 00007ffccec967ec
R13: 00007ffccec96820 R14: 00007ffccec96800 R15: 0000000000000003
 </TASK>
INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 1.925 msecs


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [syzbot] INFO: rcu detected stall in ext4_file_write_iter (6)
  2023-02-08 10:37 [syzbot] INFO: rcu detected stall in ext4_file_write_iter (6) syzbot
@ 2023-02-09 22:25 ` Thomas Gleixner
  2023-02-13 22:28   ` John Stultz
  2023-02-14 10:27   ` [tip: timers/urgent] alarmtimer: Prevent starvation by small intervals and SIG_IGN tip-bot2 for Thomas Gleixner
  0 siblings, 2 replies; 10+ messages in thread
From: Thomas Gleixner @ 2023-02-09 22:25 UTC (permalink / raw)
  To: syzbot, jstultz, linux-kernel, sboyd, syzkaller-bugs,
	Eric W. Biederman, Peter Zijlstra, Oleg Nesterov

On Wed, Feb 08 2023 at 02:37, syzbot wrote:
> syzbot found the following issue on:
>
> HEAD commit:    4fafd96910ad Add linux-next specific files for 20230203
> git tree:       linux-next
> console+strace: https://syzkaller.appspot.com/x/log.txt?x=17005023480000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=1d2fba7d42502ca4
> dashboard link: https://syzkaller.appspot.com/bug?extid=b9564ba6e8e00694511b
> compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=110c6175480000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15ef5ad9480000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/348cc2da441a/disk-4fafd969.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/e2dedc500f12/vmlinux-4fafd969.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/fae710d9ebd8/bzImage-4fafd969.xz

<SNIP>

> NMI backtrace for cpu 0
> Call Trace:
>  <IRQ>
>  trace_alarmtimer_fired include/trace/events/alarmtimer.h:73 [inline]
>  alarmtimer_fired+0x327/0x670 kernel/time/alarmtimer.c:220
>  __run_hrtimer kernel/time/hrtimer.c:1685 [inline]
>  __hrtimer_run_queues+0x600/0xcf0 kernel/time/hrtimer.c:1749
>  hrtimer_interrupt+0x320/0x7b0 kernel/time/hrtimer.c:1811
>  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1096 [inline]
>  __sysvec_apic_timer_interrupt+0x180/0x660 arch/x86/kernel/apic/apic.c:1113
>  sysvec_apic_timer_interrupt+0x92/0xc0 arch/x86/kernel/apic/apic.c:1107
>  </IRQ>
>  <TASK>
>  asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645

Looking at the reproducer:

  *(uint64_t*)0x2006b000 = 0;
  *(uint64_t*)0x2006b008 = 8;
  *(uint64_t*)0x2006b010 = 0;
  *(uint64_t*)0x2006b018 = 9;
  syscall(__NR_timer_settime, 0, 0ul, 0x2006b000ul, 0ul);

This arms the alarm timer with a relative expiry of 8ns and an interval
of 9ns. Not a problem per se, but that's an issue when the signal is
ignored because then the timer is immediately rearmed because there is
no way to delay that rearming to the signal delivery path.
See posix_timer_fn() and commit 58229a189942 ("posix-timers: Prevent
softirq starvation by small intervals and SIG_IGN") for details.

The reproducer does not set SIG_IGN explicitely, but it does something
nasty which has the same effect:

  *(uint64_t*)0x20000000 = 0;
  *(uint32_t*)0x20000008 = 0x12;
  *(uint32_t*)0x2000000c = 0;
  *(uint32_t*)0x20000010 = 0;
  syscall(__NR_timer_create, 9ul, 0x20000000ul, 0x20000180ul);

0x20000008 is the signal number, which decodes to SIGCONT. SIGCONT is
ignored if there is no handler set and the task is not ptraced.

The log clearly shows that:

   [pid  5102] --- SIGCONT {si_signo=SIGCONT, si_code=SI_TIMER, si_timerid=0, si_overrun=316014, si_int=0, si_ptr=NULL} ---

It works because the tasks are traced and therefore the signal is queued
so the tracer can see it, which delays the restart of the timer
and the restart of the timer is delayed.

   [pid  5087] kill(-5102, SIGKILL <unfinished ...>

Here the tracer is killed

   [pid  5107] --- SIGCONT {si_signo=SIGCONT, si_code=SI_TIMER, si_timerid=0, si_overrun=167448, si_int=0, si_ptr=NULL} ---
   [pid  5104] --- SIGCONT {si_signo=SIGCONT, si_code=SI_TIMER, si_timerid=0, si_overrun=157138, si_int=0, si_ptr=NULL} ---
   [pid  5103] --- SIGCONT {si_signo=SIGCONT, si_code=SI_TIMER, si_timerid=0, si_overrun=187150, si_int=0, si_ptr=NULL} ---
   [pid  5106] --- SIGCONT {si_signo=SIGCONT, si_code=SI_TIMER, si_timerid=0, si_overrun=190249, si_int=0, si_ptr=NULL} ---
   [pid  5102] --- stopped by SIGCONT ---
   ./strace-static-x86_64: ptrace(PTRACE_LISTEN,pid:5102,sig:0): Input/output error
   ./strace-static-x86_64: Process 5083 detached
   ./strace-static-x86_64: Process 5084 detached
   ./strace-static-x86_64: Process 5086 detached
   ./strace-static-x86_64: Process 5085 detached
   ./strace-static-x86_64: Process 5088 detached
   ./strace-static-x86_64: Process 5089 detached
   ./strace-static-x86_64: Process 5087 detached
   ./strace-static-x86_64: Process 5102 detached
   ./strace-static-x86_64: Process 5103 detached
   ./strace-static-x86_64: Process 5104 detached
   ./strace-static-x86_64: Process 5105 detached
   ./strace-static-x86_64: Process 5106 detached
   ./strace-static-x86_64: Process 5107 detached
   
and after it's gone the drama starts:
   
   syzkaller login: [   79.439102][    C0] hrtimer: interrupt took 68471 ns
   [  184.460538][    C1] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
   ...
   [  184.658237][    C1] rcu: Stack dump where RCU GP kthread last ran:
   [  184.664574][    C1] Sending NMI from CPU 1 to CPUs 0:
   [  184.669821][    C0] NMI backtrace for cpu 0
   [  184.669831][    C0] CPU: 0 PID: 5108 Comm: syz-executor192 Not tainted 6.2.0-rc6-next-20230203-syzkaller #0
   ...
   [  184.670036][    C0] Call Trace:
   [  184.670041][    C0]  <IRQ>
   [  184.670045][    C0]  alarmtimer_fired+0x327/0x670

posix_timer_fn() prevents that by checking whether the interval for
timers which have the signal ignored is smaller than a jiffie and
artifically delay it by shifting the next expiry out by a jiffie. That's
accurate vs. the overrun accounting, but slightly inaccurate
vs. timer_gettimer().

The comment in that function says what needs to be done and there was a
fix available for the regular userspace induced SIG_IGN mechanism, but
that did not work due to the implicit ignore for SIGCONT. I gave up at
that point and put it onto the ever growing todo list...

So the only available workaround right now is to do exactly what
posix_timer_fn() does.

Untested patch below.

Thanks,

        tglx
---
diff --git a/kernel/time/alarmtimer.c b/kernel/time/alarmtimer.c
index 5897828b9d7e..ff4ff2cab0c0 100644
--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -470,11 +470,35 @@ u64 alarm_forward(struct alarm *alarm, ktime_t now, ktime_t interval)
 }
 EXPORT_SYMBOL_GPL(alarm_forward);
 
-u64 alarm_forward_now(struct alarm *alarm, ktime_t interval)
+static u64 __alarm_forward_now(struct alarm *alarm, ktime_t interval, bool throttle)
 {
 	struct alarm_base *base = &alarm_bases[alarm->type];
+	ktime_t now = base->get_ktime();
+
+	if (IS_ENABLED(CONFIG_HIGH_RES_TIMERS) && throttle) {
+		/*
+		 * Same issue as with posix_timer_fn(). Timers which are
+		 * periodic but the signal is ignored can starve the system
+		 * with a very small interval. The real fix which was
+		 * promised in the context of posix_timer_fn() never
+		 * materialized, but someone should really work on it.
+		 *
+		 * To prevent DOS fake @now to be 1 jiffie out which keeps
+		 * the overrun accounting correct but creates an
+		 * inconsistency vs. timer_gettime(2).
+		 */
+		ktime_t kj = NSEC_PER_SEC / HZ;
+
+		if (interval < kj)
+			now = ktime_add(now, kj);
+	}
+
+	return alarm_forward(alarm, now, interval);
+}
 
-	return alarm_forward(alarm, base->get_ktime(), interval);
+u64 alarm_forward_now(struct alarm *alarm, ktime_t interval)
+{
+	return __alarm_forward_now(alarm, interval, false);
 }
 EXPORT_SYMBOL_GPL(alarm_forward_now);
 
@@ -551,9 +575,10 @@ static enum alarmtimer_restart alarm_handle_timer(struct alarm *alarm,
 	if (posix_timer_event(ptr, si_private) && ptr->it_interval) {
 		/*
 		 * Handle ignored signals and rearm the timer. This will go
-		 * away once we handle ignored signals proper.
+		 * away once we handle ignored signals proper. Ensure that
+		 * small intervals cannot starve the system.
 		 */
-		ptr->it_overrun += alarm_forward_now(alarm, ptr->it_interval);
+		ptr->it_overrun += __alarm_forward_now(alarm, ptr->it_interval, true);
 		++ptr->it_requeue_pending;
 		ptr->it_active = 1;
 		result = ALARMTIMER_RESTART;

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [syzbot] INFO: rcu detected stall in ext4_file_write_iter (6)
  2023-02-09 22:25 ` Thomas Gleixner
@ 2023-02-13 22:28   ` John Stultz
  2023-02-14 10:27   ` [tip: timers/urgent] alarmtimer: Prevent starvation by small intervals and SIG_IGN tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 10+ messages in thread
From: John Stultz @ 2023-02-13 22:28 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: syzbot, linux-kernel, sboyd, syzkaller-bugs, Eric W. Biederman,
	Peter Zijlstra, Oleg Nesterov

On Thu, Feb 9, 2023 at 2:25 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Wed, Feb 08 2023 at 02:37, syzbot wrote:
> > syzbot found the following issue on:
> >
> > HEAD commit:    4fafd96910ad Add linux-next specific files for 20230203
> > git tree:       linux-next
> > console+strace: https://syzkaller.appspot.com/x/log.txt?x=17005023480000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=1d2fba7d42502ca4
> > dashboard link: https://syzkaller.appspot.com/bug?extid=b9564ba6e8e00694511b
> > compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=110c6175480000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15ef5ad9480000
> >
> > Downloadable assets:
> > disk image: https://storage.googleapis.com/syzbot-assets/348cc2da441a/disk-4fafd969.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/e2dedc500f12/vmlinux-4fafd969.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/fae710d9ebd8/bzImage-4fafd969.xz
>
> <SNIP>
>
> > NMI backtrace for cpu 0
> > Call Trace:
> >  <IRQ>
> >  trace_alarmtimer_fired include/trace/events/alarmtimer.h:73 [inline]
> >  alarmtimer_fired+0x327/0x670 kernel/time/alarmtimer.c:220
> >  __run_hrtimer kernel/time/hrtimer.c:1685 [inline]
> >  __hrtimer_run_queues+0x600/0xcf0 kernel/time/hrtimer.c:1749
> >  hrtimer_interrupt+0x320/0x7b0 kernel/time/hrtimer.c:1811
> >  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1096 [inline]
> >  __sysvec_apic_timer_interrupt+0x180/0x660 arch/x86/kernel/apic/apic.c:1113
> >  sysvec_apic_timer_interrupt+0x92/0xc0 arch/x86/kernel/apic/apic.c:1107
> >  </IRQ>
> >  <TASK>
> >  asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
>
> Looking at the reproducer:
>
>   *(uint64_t*)0x2006b000 = 0;
>   *(uint64_t*)0x2006b008 = 8;
>   *(uint64_t*)0x2006b010 = 0;
>   *(uint64_t*)0x2006b018 = 9;
>   syscall(__NR_timer_settime, 0, 0ul, 0x2006b000ul, 0ul);
>
> This arms the alarm timer with a relative expiry of 8ns and an interval
> of 9ns. Not a problem per se, but that's an issue when the signal is
> ignored because then the timer is immediately rearmed because there is
> no way to delay that rearming to the signal delivery path.
> See posix_timer_fn() and commit 58229a189942 ("posix-timers: Prevent
> softirq starvation by small intervals and SIG_IGN") for details.
>
> The reproducer does not set SIG_IGN explicitely, but it does something
> nasty which has the same effect:
>
>   *(uint64_t*)0x20000000 = 0;
>   *(uint32_t*)0x20000008 = 0x12;
>   *(uint32_t*)0x2000000c = 0;
>   *(uint32_t*)0x20000010 = 0;
>   syscall(__NR_timer_create, 9ul, 0x20000000ul, 0x20000180ul);
>
> 0x20000008 is the signal number, which decodes to SIGCONT. SIGCONT is
> ignored if there is no handler set and the task is not ptraced.
>
> The log clearly shows that:
>
>    [pid  5102] --- SIGCONT {si_signo=SIGCONT, si_code=SI_TIMER, si_timerid=0, si_overrun=316014, si_int=0, si_ptr=NULL} ---
>
> It works because the tasks are traced and therefore the signal is queued
> so the tracer can see it, which delays the restart of the timer
> and the restart of the timer is delayed.
>
>    [pid  5087] kill(-5102, SIGKILL <unfinished ...>
>
> Here the tracer is killed
>
>    [pid  5107] --- SIGCONT {si_signo=SIGCONT, si_code=SI_TIMER, si_timerid=0, si_overrun=167448, si_int=0, si_ptr=NULL} ---
>    [pid  5104] --- SIGCONT {si_signo=SIGCONT, si_code=SI_TIMER, si_timerid=0, si_overrun=157138, si_int=0, si_ptr=NULL} ---
>    [pid  5103] --- SIGCONT {si_signo=SIGCONT, si_code=SI_TIMER, si_timerid=0, si_overrun=187150, si_int=0, si_ptr=NULL} ---
>    [pid  5106] --- SIGCONT {si_signo=SIGCONT, si_code=SI_TIMER, si_timerid=0, si_overrun=190249, si_int=0, si_ptr=NULL} ---
>    [pid  5102] --- stopped by SIGCONT ---
>    ./strace-static-x86_64: ptrace(PTRACE_LISTEN,pid:5102,sig:0): Input/output error
>    ./strace-static-x86_64: Process 5083 detached
>    ./strace-static-x86_64: Process 5084 detached
>    ./strace-static-x86_64: Process 5086 detached
>    ./strace-static-x86_64: Process 5085 detached
>    ./strace-static-x86_64: Process 5088 detached
>    ./strace-static-x86_64: Process 5089 detached
>    ./strace-static-x86_64: Process 5087 detached
>    ./strace-static-x86_64: Process 5102 detached
>    ./strace-static-x86_64: Process 5103 detached
>    ./strace-static-x86_64: Process 5104 detached
>    ./strace-static-x86_64: Process 5105 detached
>    ./strace-static-x86_64: Process 5106 detached
>    ./strace-static-x86_64: Process 5107 detached
>
> and after it's gone the drama starts:
>
>    syzkaller login: [   79.439102][    C0] hrtimer: interrupt took 68471 ns
>    [  184.460538][    C1] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
>    ...
>    [  184.658237][    C1] rcu: Stack dump where RCU GP kthread last ran:
>    [  184.664574][    C1] Sending NMI from CPU 1 to CPUs 0:
>    [  184.669821][    C0] NMI backtrace for cpu 0
>    [  184.669831][    C0] CPU: 0 PID: 5108 Comm: syz-executor192 Not tainted 6.2.0-rc6-next-20230203-syzkaller #0
>    ...
>    [  184.670036][    C0] Call Trace:
>    [  184.670041][    C0]  <IRQ>
>    [  184.670045][    C0]  alarmtimer_fired+0x327/0x670
>
> posix_timer_fn() prevents that by checking whether the interval for
> timers which have the signal ignored is smaller than a jiffie and
> artifically delay it by shifting the next expiry out by a jiffie. That's
> accurate vs. the overrun accounting, but slightly inaccurate
> vs. timer_gettimer().
>
> The comment in that function says what needs to be done and there was a
> fix available for the regular userspace induced SIG_IGN mechanism, but
> that did not work due to the implicit ignore for SIGCONT. I gave up at
> that point and put it onto the ever growing todo list...
>
> So the only available workaround right now is to do exactly what
> posix_timer_fn() does.
>
> Untested patch below.

I ran your test for a bit with the reproducer and didn't see any
issues. Also ran it through the kselftests/timers
run_destructive_tests and didn't see any issues.

So I think this is ok. Haven't really dug through all the implications
for adding an extra jiffy to the expiration there, but it sounds
reasonable to me.

Acked-by: John Stultz <jstultz@google.com>

thanks
-john

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [tip: timers/urgent] alarmtimer: Prevent starvation by small intervals and SIG_IGN
  2023-02-09 22:25 ` Thomas Gleixner
  2023-02-13 22:28   ` John Stultz
@ 2023-02-14 10:27   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 10+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2023-02-14 10:27 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: syzbot+b9564ba6e8e00694511b, Thomas Gleixner, John Stultz,
	stable, x86, linux-kernel

The following commit has been merged into the timers/urgent branch of tip:

Commit-ID:     d125d1349abeb46945dc5e98f7824bf688266f13
Gitweb:        https://git.kernel.org/tip/d125d1349abeb46945dc5e98f7824bf688266f13
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 09 Feb 2023 23:25:49 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 14 Feb 2023 11:18:35 +01:00

alarmtimer: Prevent starvation by small intervals and SIG_IGN

syzbot reported a RCU stall which is caused by setting up an alarmtimer
with a very small interval and ignoring the signal. The reproducer arms the
alarm timer with a relative expiry of 8ns and an interval of 9ns. Not a
problem per se, but that's an issue when the signal is ignored because then
the timer is immediately rearmed because there is no way to delay that
rearming to the signal delivery path.  See posix_timer_fn() and commit
58229a189942 ("posix-timers: Prevent softirq starvation by small intervals
and SIG_IGN") for details.

The reproducer does not set SIG_IGN explicitely, but it sets up the timers
signal with SIGCONT. That has the same effect as explicitely setting
SIG_IGN for a signal as SIGCONT is ignored if there is no handler set and
the task is not ptraced.

The log clearly shows that:

   [pid  5102] --- SIGCONT {si_signo=SIGCONT, si_code=SI_TIMER, si_timerid=0, si_overrun=316014, si_int=0, si_ptr=NULL} ---

It works because the tasks are traced and therefore the signal is queued so
the tracer can see it, which delays the restart of the timer to the signal
delivery path. But then the tracer is killed:

   [pid  5087] kill(-5102, SIGKILL <unfinished ...>
   ...
   ./strace-static-x86_64: Process 5107 detached

and after it's gone the stall can be observed:

   syzkaller login: [   79.439102][    C0] hrtimer: interrupt took 68471 ns
   [  184.460538][    C1] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
   ...
   [  184.658237][    C1] rcu: Stack dump where RCU GP kthread last ran:
   [  184.664574][    C1] Sending NMI from CPU 1 to CPUs 0:
   [  184.669821][    C0] NMI backtrace for cpu 0
   [  184.669831][    C0] CPU: 0 PID: 5108 Comm: syz-executor192 Not tainted 6.2.0-rc6-next-20230203-syzkaller #0
   ...
   [  184.670036][    C0] Call Trace:
   [  184.670041][    C0]  <IRQ>
   [  184.670045][    C0]  alarmtimer_fired+0x327/0x670

posix_timer_fn() prevents that by checking whether the interval for
timers which have the signal ignored is smaller than a jiffie and
artifically delay it by shifting the next expiry out by a jiffie. That's
accurate vs. the overrun accounting, but slightly inaccurate
vs. timer_gettimer(2).

The comment in that function says what needs to be done and there was a fix
available for the regular userspace induced SIG_IGN mechanism, but that did
not work due to the implicit ignore for SIGCONT and similar signals. This
needs to be worked on, but for now the only available workaround is to do
exactly what posix_timer_fn() does:

Increase the interval of self-rearming timers, which have their signal
ignored, to at least a jiffie.

Interestingly this has been fixed before via commit ff86bf0c65f1
("alarmtimer: Rate limit periodic intervals") already, but that fix got
lost in a later rework.

Reported-by: syzbot+b9564ba6e8e00694511b@syzkaller.appspotmail.com
Fixes: f2c45807d399 ("alarmtimer: Switch over to generic set/get/rearm routine")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87k00q1no2.ffs@tglx
---
 kernel/time/alarmtimer.c | 33 +++++++++++++++++++++++++++++----
 1 file changed, 29 insertions(+), 4 deletions(-)

diff --git a/kernel/time/alarmtimer.c b/kernel/time/alarmtimer.c
index 5897828..7e5dff6 100644
--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -470,11 +470,35 @@ u64 alarm_forward(struct alarm *alarm, ktime_t now, ktime_t interval)
 }
 EXPORT_SYMBOL_GPL(alarm_forward);
 
-u64 alarm_forward_now(struct alarm *alarm, ktime_t interval)
+static u64 __alarm_forward_now(struct alarm *alarm, ktime_t interval, bool throttle)
 {
 	struct alarm_base *base = &alarm_bases[alarm->type];
+	ktime_t now = base->get_ktime();
+
+	if (IS_ENABLED(CONFIG_HIGH_RES_TIMERS) && throttle) {
+		/*
+		 * Same issue as with posix_timer_fn(). Timers which are
+		 * periodic but the signal is ignored can starve the system
+		 * with a very small interval. The real fix which was
+		 * promised in the context of posix_timer_fn() never
+		 * materialized, but someone should really work on it.
+		 *
+		 * To prevent DOS fake @now to be 1 jiffie out which keeps
+		 * the overrun accounting correct but creates an
+		 * inconsistency vs. timer_gettime(2).
+		 */
+		ktime_t kj = NSEC_PER_SEC / HZ;
+
+		if (interval < kj)
+			now = ktime_add(now, kj);
+	}
+
+	return alarm_forward(alarm, now, interval);
+}
 
-	return alarm_forward(alarm, base->get_ktime(), interval);
+u64 alarm_forward_now(struct alarm *alarm, ktime_t interval)
+{
+	return __alarm_forward_now(alarm, interval, false);
 }
 EXPORT_SYMBOL_GPL(alarm_forward_now);
 
@@ -551,9 +575,10 @@ static enum alarmtimer_restart alarm_handle_timer(struct alarm *alarm,
 	if (posix_timer_event(ptr, si_private) && ptr->it_interval) {
 		/*
 		 * Handle ignored signals and rearm the timer. This will go
-		 * away once we handle ignored signals proper.
+		 * away once we handle ignored signals proper. Ensure that
+		 * small intervals cannot starve the system.
 		 */
-		ptr->it_overrun += alarm_forward_now(alarm, ptr->it_interval);
+		ptr->it_overrun += __alarm_forward_now(alarm, ptr->it_interval, true);
 		++ptr->it_requeue_pending;
 		ptr->it_active = 1;
 		result = ALARMTIMER_RESTART;

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [syzbot] INFO: rcu detected stall in ext4_file_write_iter (6)
       [not found] <20230210065053.2385-1-hdanton@sina.com>
@ 2023-02-10  7:16 ` syzbot
  0 siblings, 0 replies; 10+ messages in thread
From: syzbot @ 2023-02-10  7:16 UTC (permalink / raw)
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+b9564ba6e8e00694511b@syzkaller.appspotmail.com

Tested on:

commit:         4fafd969 Add linux-next specific files for 20230203
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=11ee1b83480000
kernel config:  https://syzkaller.appspot.com/x/.config?x=1d2fba7d42502ca4
dashboard link: https://syzkaller.appspot.com/bug?extid=b9564ba6e8e00694511b
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=117f7c0b480000

Note: testing is done by a robot and is best-effort only.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [syzbot] INFO: rcu detected stall in ext4_file_write_iter (6)
       [not found] <20230209073104.2276-1-hdanton@sina.com>
@ 2023-02-09  7:50 ` syzbot
  0 siblings, 0 replies; 10+ messages in thread
From: syzbot @ 2023-02-09  7:50 UTC (permalink / raw)
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: rcu detected stall in corrupted

rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 	Tasks blocked on level-0 rcu_node (CPUs 0-1): P12/1:b..l
rcu: 	(detected by 1, t=10502 jiffies, g=11233, q=127 ncpus=2)
task:kworker/u4:1    state:R  running task     stack:24944 pid:12    ppid:2      flags:0x00004000
Workqueue: bat_events batadv_nc_worker
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:5301 [inline]
 __schedule+0x245a/0x5a60 kernel/sched/core.c:6619
 preempt_schedule_notrace+0x5f/0xd0 kernel/sched/core.c:6881
 preempt_schedule_notrace_thunk+0x1a/0x20 arch/x86/entry/thunk_64.S:35
 rcu_read_unlock_sched_notrace include/linux/rcupdate.h:895 [inline]
 trace_lock_acquire+0x1e7/0x2b0 include/trace/events/lock.h:24
 lock_acquire+0x32/0xc0 kernel/locking/lockdep.c:5662
 rcu_lock_acquire include/linux/rcupdate.h:327 [inline]
 rcu_read_lock include/linux/rcupdate.h:773 [inline]
 batadv_nc_purge_orig_hash net/batman-adv/network-coding.c:408 [inline]
 batadv_nc_worker+0x131/0xfe0 net/batman-adv/network-coding.c:719
 process_one_work+0x9bf/0x1820 kernel/workqueue.c:2390
 worker_thread+0x669/0x1090 kernel/workqueue.c:2537
 kthread+0x2e8/0x3a0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
 </TASK>
rcu: rcu_preempt kthread starved for 10510 jiffies! g11233 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1
rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt     state:R  running task     stack:28680 pid:16    ppid:2      flags:0x00004000
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:5301 [inline]
 __schedule+0x245a/0x5a60 kernel/sched/core.c:6619
 schedule+0xde/0x1a0 kernel/sched/core.c:6695
 schedule_timeout+0x14e/0x2b0 kernel/time/timer.c:2167
 rcu_gp_fqs_loop+0x190/0x910 kernel/rcu/tree.c:1609
 rcu_gp_kthread+0x23a/0x360 kernel/rcu/tree.c:1808
 kthread+0x2e8/0x3a0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
 </TASK>
rcu: Stack dump where RCU GP kthread last ran:
CPU: 1 PID: 5741 Comm: syz-executor.2 Not tainted 6.2.0-rc6-next-20230203-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023
RIP: 0010:lock_acquire.part.0+0x156/0x370 kernel/locking/lockdep.c:5656
Code: b8 ff ff ff ff 65 0f c1 05 07 84 9c 7e 83 f8 01 0f 85 c7 01 00 00 9c 58 f6 c4 02 0f 85 da 01 00 00 48 85 ed 0f 85 ad 01 00 00 <48> b8 00 00 00 00 00 fc ff df 48 01 c3 48 c7 03 00 00 00 00 48 c7
RSP: 0018:ffffc90005a8f238 EFLAGS: 00000206
RAX: 0000000000000046 RBX: 1ffff92000b51e48 RCX: ffffffff8165aaa6
RDX: 1ffff11005511896 RSI: 0000000000000001 RDI: 0000000000000000
RBP: 0000000000000200 R08: 0000000000000000 R09: ffffffff913e4c1f
R10: fffffbfff227c983 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: ffff88802a48a0d8
FS:  00007f2908784700(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f2907b15829 CR3: 0000000028780000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 __raw_write_lock include/linux/rwlock_api_smp.h:209 [inline]
 _raw_write_lock+0x2e/0x40 kernel/locking/spinlock.c:300
 jbd2_log_start_commit+0x23/0x50 fs/jbd2/journal.c:521
 jbd2_journal_stop+0x53b/0xfd0 fs/jbd2/transaction.c:1932
 __ext4_journal_stop+0xe2/0x1f0 fs/ext4/ext4_jbd2.c:133
 ext4_dirty_inode+0xf1/0x130 fs/ext4/inode.c:6111
 __mark_inode_dirty+0x247/0x1250 fs/fs-writeback.c:2421
 mark_inode_dirty include/linux/fs.h:2127 [inline]
 generic_write_end+0x354/0x440 fs/buffer.c:2227
 ext4_da_write_end+0x1f5/0xa50 fs/ext4/inode.c:3174
 generic_perform_write+0x316/0x570 mm/filemap.c:3781
 ext4_buffered_write_iter+0x15b/0x460 fs/ext4/file.c:285
 ext4_file_write_iter+0x8bf/0x1710 fs/ext4/file.c:700
 call_write_iter include/linux/fs.h:1851 [inline]
 do_iter_readv_writev+0x20b/0x3b0 fs/read_write.c:735
 do_iter_write+0x182/0x700 fs/read_write.c:861
 vfs_iter_write+0x74/0xa0 fs/read_write.c:902
 iter_file_splice_write+0x745/0xc90 fs/splice.c:686
 do_splice_from fs/splice.c:764 [inline]
 direct_splice_actor+0x114/0x180 fs/splice.c:931
 splice_direct_to_actor+0x335/0x8a0 fs/splice.c:886
 do_splice_direct+0x1ab/0x280 fs/splice.c:974
 do_sendfile+0xb19/0x12c0 fs/read_write.c:1255
 __do_sys_sendfile64 fs/read_write.c:1317 [inline]
 __se_sys_sendfile64 fs/read_write.c:1309 [inline]
 __x64_sys_sendfile64+0x14d/0x210 fs/read_write.c:1309
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f2907a8c0c9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f2908784168 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
RAX: ffffffffffffffda RBX: 00007f2907babf80 RCX: 00007f2907a8c0c9
RDX: 00000000200001c0 RSI: 0000000000000003 RDI: 0000000000000003
RBP: 00007f2907ae7ae9 R08: 0000000000000000 R09: 0000000000000000
R10: 000000000200a198 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fffcafd348f R14: 00007f2908784300 R15: 0000000000022000
 </TASK>


Tested on:

commit:         4fafd969 Add linux-next specific files for 20230203
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=17a0d91f480000
kernel config:  https://syzkaller.appspot.com/x/.config?x=1d2fba7d42502ca4
dashboard link: https://syzkaller.appspot.com/bug?extid=b9564ba6e8e00694511b
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=12e562fb480000


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [syzbot] INFO: rcu detected stall in ext4_file_write_iter (6)
       [not found] <20230209072740.2224-1-hdanton@sina.com>
@ 2023-02-09  7:34 ` syzbot
  0 siblings, 0 replies; 10+ messages in thread
From: syzbot @ 2023-02-09  7:34 UTC (permalink / raw)
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot tried to test the proposed patch but the build/boot failed:

kernel/time/tick-sched.c:1498:40: error: expected ')' before ';' token
kernel/time/tick-sched.c:1498:2: error: too few arguments to function 'hrtimer_forward_now'
kernel/time/tick-sched.c:1500:25: error: expected ';' before '}' token
kernel/time/tick-sched.c:1501:1: error: control reaches end of non-void function [-Werror=return-type]


Tested on:

commit:         4fafd969 Add linux-next specific files for 20230203
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
dashboard link: https://syzkaller.appspot.com/bug?extid=b9564ba6e8e00694511b
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=17347093480000


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [syzbot] INFO: rcu detected stall in ext4_file_write_iter (6)
       [not found] <20230209061936.2127-1-hdanton@sina.com>
@ 2023-02-09  7:16 ` syzbot
  0 siblings, 0 replies; 10+ messages in thread
From: syzbot @ 2023-02-09  7:16 UTC (permalink / raw)
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot tried to test the proposed patch but the build/boot failed:

ExtendedConfig ASPM ClockPM MSI]
[    2.704971][    T1] acpi PNP0A03:00: fail to add MMCONFIG information, can't access extended configuration space under this bridge
[    2.721274][    T1] PCI host bridge to bus 0000:00
[    2.722257][    T1] pci_bus 0000:00: Unknown NUMA node; performance will be reduced
[    2.723618][    T1] pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7 window]
[    2.725109][    T1] pci_bus 0000:00: root bus resource [io  0x0d00-0xffff window]
[    2.726984][    T1] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
[    2.727782][    T1] pci_bus 0000:00: root bus resource [mem 0xc0000000-0xfebfefff window]
[    2.729339][    T1] pci_bus 0000:00: root bus resource [bus 00-ff]
[    2.730963][    T1] pci 0000:00:00.0: [8086:1237] type 00 class 0x060000
[    2.739900][    T1] pci 0000:00:01.0: [8086:7110] type 00 class 0x060100
[    2.771630][    T1] pci 0000:00:01.3: [8086:7113] type 00 class 0x068000
[    2.799785][    T1] pci 0000:00:01.3: quirk: [io  0xb000-0xb03f] claimed by PIIX4 ACPI
[    2.805032][    T1] pci 0000:00:03.0: [1af4:1004] type 00 class 0x000000
[    2.817461][    T1] pci 0000:00:03.0: reg 0x10: [io  0xc000-0xc03f]
[    2.824197][    T1] pci 0000:00:03.0: reg 0x14: [mem 0xfe800000-0xfe80007f]
[    2.850249][    T1] pci 0000:00:04.0: [1af4:1000] type 00 class 0x020000
[    2.860971][    T1] pci 0000:00:04.0: reg 0x10: [io  0xc040-0xc07f]
[    2.869561][    T1] pci 0000:00:04.0: reg 0x14: [mem 0xfe801000-0xfe80107f]
[    2.893212][    T1] pci 0000:00:05.0: [1ae0:a002] type 00 class 0x030000
[    2.907509][    T1] pci 0000:00:05.0: reg 0x10: [mem 0xfe000000-0xfe7fffff]
[    2.938174][    T1] pci 0000:00:05.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
[    2.944947][    T1] pci 0000:00:06.0: [1af4:1002] type 00 class 0x00ff00
[    2.955123][    T1] pci 0000:00:06.0: reg 0x10: [io  0xc080-0xc09f]
[    2.962448][    T1] pci 0000:00:06.0: reg 0x14: [mem 0xfe802000-0xfe80207f]
[    2.988501][    T1] pci 0000:00:07.0: [1af4:1005] type 00 class 0x00ff00
[    2.999322][    T1] pci 0000:00:07.0: reg 0x10: [io  0xc0a0-0xc0bf]
[    3.006863][    T1] pci 0000:00:07.0: reg 0x14: [mem 0xfe803000-0xfe80303f]
[    3.054017][    T1] ACPI: PCI: Interrupt link LNKA configured for IRQ 10
[    3.061314][    T1] ACPI: PCI: Interrupt link LNKB configured for IRQ 10
[    3.070348][    T1] ACPI: PCI: Interrupt link LNKC configured for IRQ 11
[    3.078101][    T1] ACPI: PCI: Interrupt link LNKD configured for IRQ 11
[    3.084571][    T1] ACPI: PCI: Interrupt link LNKS configured for IRQ 9
[    3.100822][    T1] iommu: Default domain type: Translated 
[    3.102824][    T1] iommu: DMA domain TLB invalidation policy: lazy mode 
[    3.110275][    T1] SCSI subsystem initialized
[    3.114447][    T1] ACPI: bus type USB registered
[    3.116457][    T1] usbcore: registered new interface driver usbfs
[    3.117993][    T1] usbcore: registered new interface driver hub
[    3.119852][    T1] usbcore: registered new device driver usb
[    3.122466][    T1] mc: Linux media interface: v0.10
[    3.123533][    T1] videodev: Linux video capture interface: v2.00
[    3.125526][    T1] pps_core: LinuxPPS API ver. 1 registered
[    3.126906][    T1] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[    3.127854][    T1] PTP clock support registered
[    3.130020][    T1] EDAC MC: Ver: 3.0.0
[    3.135116][    T1] Advanced Linux Sound Architecture Driver Initialized.
[    3.143220][    T1] Bluetooth: Core ver 2.22
[    3.144600][    T1] NET: Registered PF_BLUETOOTH protocol family
[    3.145633][    T1] Bluetooth: HCI device and connection manager initialized
[    3.147881][    T1] Bluetooth: HCI socket layer initialized
[    3.149539][    T1] Bluetooth: L2CAP socket layer initialized
[    3.150683][    T1] Bluetooth: SCO socket layer initialized
[    3.151990][    T1] NET: Registered PF_ATMPVC protocol family
[    3.153007][    T1] NET: Registered PF_ATMSVC protocol family
[    3.154497][    T1] NetLabel: Initializing
[    3.155425][    T1] NetLabel:  domain hash size = 128
[    3.156282][    T1] NetLabel:  protocols = UNLABELED CIPSOv4 CALIPSO
[    3.158252][    T1] NetLabel:  unlabeled traffic allowed by default
[    3.160111][    T1] nfc: nfc_init: NFC Core ver 0.1
[    3.160111][    T1] NET: Registered PF_NFC protocol family
[    3.160111][    T1] PCI: Using ACPI for IRQ routing
[    3.160111][    T1] pci 0000:00:05.0: vgaarb: setting as boot VGA device
[    3.160111][    T1] pci 0000:00:05.0: vgaarb: bridge control possible
[    3.161322][    T1] pci 0000:00:05.0: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
[    3.163054][    T1] vgaarb: loaded
[    3.177757][    T1] clocksource: Switched to clocksource kvm-clock
[    3.187013][    T1] VFS: Disk quotas dquot_6.6.0
[    3.187770][    C0] ------------[ cut here ]------------
[    3.188101][    T1] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[    3.188708][    C0] jts hrtimer tick_sched_timer+0x0/0x120 runs more than 20 ticks
[    3.190632][    T1] FS-Cache: Loaded
[    3.191689][    C0] WARNING: CPU: 0 PID: 10 at kernel/time/hrtimer.c:1693 __hrtimer_run_queues+0xf76/0x1200
[    3.193672][    T1] CacheFiles: Loaded
[    3.193967][    C0] Modules linked in:
[    3.195296][    C0] CPU: 0 PID: 10 Comm: kworker/u4:0 Not tainted 6.2.0-rc6-next-20230203-syzkaller #0
[    3.195308][    T1] TOMOYO: 2.6.0
[    3.196864][    C0] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023
[    3.196881][    C0] Workqueue: eval_map_wq eval_map_work_func
[    3.197481][    T1] Mandatory Access Control activated.
[    3.201064][    T1] AppArmor: AppArmor Filesystem Enabled
[    3.197760][    C0] 
[    3.197760][    C0] RIP: 0010:__hrtimer_run_queues+0xf76/0x1200
[    3.202884][    T1] pnp: PnP ACPI init
[    3.197760][    C0] Code: 96 10 00 e9 1e f3 ff ff bd 01 00 00 00 e8 72 9a 10 00 48 8b 74 24 28 48 c7 c7 c0 d5 4e 8a c6 05 5e e1 eb 0c 01 e8 3a 2b d8 ff <0f> 0b e8 53 9a 10 00 31 ff 89 ee e8 5a 96 10 00 40 84 ed 0f 84 e3
[    3.197760][    C0] RSP: 0000:ffffc90000007de0 EFLAGS: 00010082
[    3.197760][    C0] RAX: 0000000000000000 RBX: ffffffff8e7519c0 RCX: 0000000000000000
[    3.197760][    C0] RDX: ffff888016699d40 RSI: ffffffff814c0217 RDI: 0000000000000001
[    3.197760][    C0] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
[    3.197760][    C0] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8880b982b968
[    3.197760][    C0] R13: ffff8880b982b880 R14: ffff8880b982c120 R15: ffff8880b982b800
[    3.197760][    C0] FS:  0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
[    3.197760][    C0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.197760][    C0] CR2: ffff88823ffff000 CR3: 000000000c56f000 CR4: 00000000003506f0
[    3.197760][    C0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    3.197760][    C0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    3.197760][    C0] Call Trace:
[    3.197760][    C0]  <IRQ>
[    3.197760][    C0]  ? tick_sched_do_timer+0x470/0x470
[    3.223647][    T1] pnp: PnP ACPI: found 7 devices
[    3.197760][    C0]  ? retrigger_next_event+0x3b0/0x3b0
[    3.197760][    C0]  ? ktime_get_update_offsets_now+0x3ef/0x5c0
[    3.197760][    C0]  hrtimer_interrupt+0x320/0x7b0
[    3.197760][    C0]  __sysvec_apic_timer_interrupt+0x180/0x660
[    3.197760][    C0]  sysvec_apic_timer_interrupt+0x92/0xc0
[    3.197760][    C0]  </IRQ>
[    3.197760][    C0]  <TASK>
[    3.197760][    C0]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
[    3.197760][    C0] RIP: 0010:trace_event_eval_update+0x3a8/0xf70
[    3.197760][    C0] Code: f7 ff 45 84 ed 0f 85 81 05 00 00 e8 e2 d8 f7 ff 89 de bf 5f 00 00 00 e8 e6 d4 f7 ff 80 fb 5f 0f 84 67 05 00 00 e8 c8 d8 f7 ff <4c> 8d 6d 01 4c 89 e8 4c 89 ea 48 c1 e8 03 83 e2 07 42 0f b6 04 38
[    3.197760][    C0] RSP: 0000:ffffc900000f7c40 EFLAGS: 00000293
[    3.237854][    C0] RAX: 0000000000000000 RBX: 0000000000000020 RCX: 0000000000000000
[    3.237854][    C0] RDX: ffff888016699d40 RSI: ffffffff818c96e8 RDI: 0000000000000001
[    3.237854][    C0] RBP: ffffffff8cf1cb8c R08: 0000000000000001 R09: 000000000000005f
[    3.237854][    C0] R10: 0000000000000020 R11: 495353415022202c R12: 0000000000000029
[    3.237854][    C0] R13: 0000000000000000 R14: 0000000000000000 R15: dffffc0000000000
[    3.237854][    C0]  ? trace_event_eval_update+0x3a8/0xf70
[    3.237854][    C0]  eval_map_work_func+0x3c/0x50
[    3.237854][    C0]  process_one_work+0x9bf/0x1820
[    3.237854][    C0]  ? pwq_dec_nr_in_flight+0x2a0/0x2a0
[    3.237854][    C0]  ? rcu_read_lock_sched_held+0x3e/0x70
[    3.237854][    C0]  ? spin_bug+0x1c0/0x1c0
[    3.237854][    C0]  ? lock_acquire+0x32/0xc0
[    3.237854][    C0]  ? worker_thread+0x16d/0x1090
[    3.237854][    C0]  worker_thread+0x669/0x1090
[    3.237854][    C0]  ? process_one_work+0x1820/0x1820
[    3.237854][    C0]  kthread+0x2e8/0x3a0
[    3.237854][    C0]  ? kthread_complete_and_exit+0x40/0x40
[    3.237854][    C0]  ret_from_fork+0x1f/0x30
[    3.237854][    C0]  </TASK>
[    3.237854][    C0] Kernel panic - not syncing: kernel: panic_on_warn set ...
[    3.237854][    C0] CPU: 0 PID: 10 Comm: kworker/u4:0 Not tainted 6.2.0-rc6-next-20230203-syzkaller #0
[    3.237854][    C0] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023
[    3.237854][    C0] Workqueue: eval_map_wq eval_map_work_func
[    3.237854][    C0] Call Trace:
[    3.237854][    C0]  <IRQ>
[    3.237854][    C0]  dump_stack_lvl+0xd9/0x150
[    3.237854][    C0]  panic+0x61b/0x6c0
[    3.237854][    C0]  ? panic_smp_self_stop+0x90/0x90
[    3.237854][    C0]  ? show_trace_log_lvl+0x285/0x390
[    3.237854][    C0]  ? __hrtimer_run_queues+0xf76/0x1200
[    3.237854][    C0]  check_panic_on_warn+0xb1/0xc0
[    3.237854][    C0]  __warn+0xf2/0x4f0
[    3.237854][    C0]  ? __hrtimer_run_queues+0xf76/0x1200
[    3.237854][    C0]  report_bug+0x2da/0x500
[    3.237854][    C0]  handle_bug+0x3c/0x70
[    3.237854][    C0]  exc_invalid_op+0x18/0x50
[    3.237854][    C0]  asm_exc_invalid_op+0x1a/0x20
[    3.237854][    C0] RIP: 0010:__hrtimer_run_queues+0xf76/0x1200
[    3.237854][    C0] Code: 96 10 00 e9 1e f3 ff ff bd 01 00 00 00 e8 72 9a 10 00 48 8b 74 24 28 48 c7 c7 c0 d5 4e 8a c6 05 5e e1 eb 0c 01 e8 3a 2b d8 ff <0f> 0b e8 53 9a 10 00 31 ff 89 ee e8 5a 96 10 00 40 84 ed 0f 84 e3
[    3.237854][    C0] RSP: 0000:ffffc90000007de0 EFLAGS: 00010082
[    3.237854][    C0] RAX: 0000000000000000 RBX: ffffffff8e7519c0 RCX: 0000000000000000
[    3.237854][    C0] RDX: ffff888016699d40 RSI: ffffffff814c0217 RDI: 0000000000000001
[    3.237854][    C0] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
[    3.237854][    C0] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8880b982b968
[    3.237854][    C0] R13: ffff8880b982b880 R14: ffff8880b982c120 R15: ffff8880b982b800
[    3.237854][    C0]  ? __warn_printk+0x187/0x310
[    3.237854][    C0]  ? tick_sched_do_timer+0x470/0x470
[    3.237854][    C0]  ? retrigger_next_event+0x3b0/0x3b0
[    3.237854][    C0]  ? ktime_get_update_offsets_now+0x3ef/0x5c0
[    3.237854][    C0]  hrtimer_interrupt+0x320/0x7b0
[    3.237854][    C0]  __sysvec_apic_timer_interrupt+0x180/0x660
[    3.237854][    C0]  sysvec_apic_timer_interrupt+0x92/0xc0
[    3.237854][    C0]  </IRQ>
[    3.237854][    C0]  <TASK>
[    3.237854][    C0]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
[    3.237854][    C0] RIP: 0010:trace_event_eval_update+0x3a8/0xf70
[    3.237854][    C0] Code: f7 ff 45 84 ed 0f 85 81 05 00 00 e8 e2 d8 f7 ff 89 de bf 5f 00 00 00 e8 e6 d4 f7 ff 80 fb 5f 0f 84 67 05 00 00 e8 c8 d8 f7 ff <4c> 8d 6d 01 4c 89 e8 4c 89 ea 48 c1 e8 03 83 e2 07 42 0f b6 04 38
[    3.237854][    C0] RSP: 0000:ffffc900000f7c40 EFLAGS: 00000293
[    3.237854][    C0] RAX: 0000000000000000 RBX: 0000000000000020 RCX: 0000000000000000
[    3.237854][    C0] RDX: ffff888016699d40 RSI: ffffffff818c96e8 RDI: 0000000000000001
[    3.237854][    C0] RBP: ffffffff8cf1cb8c R08: 0000000000000001 R09: 000000000000005f
[    3.237854][    C0] R10: 0000000000000020 R11: 495353415022202c R12: 0000000000000029
[    3.237854][    C0] R13: 0000000000000000 R14: 0000000000000000 R15: dffffc0000000000
[    3.237854][    C0]  ? trace_event_eval_update+0x3a8/0xf70
[    3.237854][    C0]  eval_map_work_func+0x3c/0x50
[    3.237854][    C0]  process_one_work+0x9bf/0x1820
[    3.237854][    C0]  ? pwq_dec_nr_in_flight+0x2a0/0x2a0
[    3.237854][    C0]  ? rcu_read_lock_sched_held+0x3e/0x70
[    3.237854][    C0]  ? spin_bug+0x1c0/0x1c0
[    3.237854][    C0]  ? lock_acquire+0x32/0xc0
[    3.237854][    C0]  ? worker_thread+0x16d/0x1090
[    3.237854][    C0]  worker_thread+0x669/0x1090
[    3.237854][    C0]  ? process_one_work+0x1820/0x1820
[    3.237854][    C0]  kthread+0x2e8/0x3a0
[    3.237854][    C0]  ? kthread_complete_and_exit+0x40/0x40
[    3.237854][    C0]  ret_from_fork+0x1f/0x30
[    3.237854][    C0]  </TASK>
[    3.237854][    C0] Rebooting in 86400 seconds..


syzkaller build log:
go env (err=<nil>)
GO111MODULE="auto"
GOARCH="amd64"
GOBIN=""
GOCACHE="/syzkaller/.cache/go-build"
GOENV="/syzkaller/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/syzkaller/jobs-2/linux/gopath/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/syzkaller/jobs-2/linux/gopath"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.17"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/syzkaller/jobs-2/linux/gopath/src/github.com/google/syzkaller/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build909161844=/tmp/go-build -gno-record-gcc-switches"

git status (err=<nil>)
HEAD detached at 1b2f701aa
nothing to commit, working tree clean


tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:32: run command via tools/syz-env for best compatibility, see:
Makefile:33: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
go list -f '{{.Stale}}' ./sys/syz-sysgen | grep -q false || go install ./sys/syz-sysgen
make .descriptions
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
bin/syz-sysgen
touch .descriptions
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=1b2f701aa9a17abb7e27c7c1170d26398febf247 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20230203-112454'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-fuzzer github.com/google/syzkaller/syz-fuzzer
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=1b2f701aa9a17abb7e27c7c1170d26398febf247 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20230203-112454'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-execprog github.com/google/syzkaller/tools/syz-execprog
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=1b2f701aa9a17abb7e27c7c1170d26398febf247 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20230203-112454'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-stress github.com/google/syzkaller/tools/syz-stress
mkdir -p ./bin/linux_amd64
gcc -o ./bin/linux_amd64/syz-executor executor/executor.cc \
	-m64 -O2 -pthread -Wall -Werror -Wparentheses -Wunused-const-variable -Wframe-larger-than=16384 -Wno-stringop-overflow -Wno-array-bounds -Wno-format-overflow -Wno-unused-but-set-variable -Wno-unused-command-line-argument -static-pie -fpermissive -w -DGOOS_linux=1 -DGOARCH_amd64=1 \
	-DHOSTGOOS_linux=1 -DGIT_REVISION=\"1b2f701aa9a17abb7e27c7c1170d26398febf247\"


Error text is too large and was truncated, full error text is at:
https://syzkaller.appspot.com/x/error.txt?x=16d60afb480000


Tested on:

commit:         4fafd969 Add linux-next specific files for 20230203
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
kernel config:  https://syzkaller.appspot.com/x/.config?x=1d2fba7d42502ca4
dashboard link: https://syzkaller.appspot.com/bug?extid=b9564ba6e8e00694511b
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1343af13480000


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [syzbot] INFO: rcu detected stall in ext4_file_write_iter (6)
       [not found] <20230209010659.2012-1-hdanton@sina.com>
@ 2023-02-09  6:07 ` syzbot
  0 siblings, 0 replies; 10+ messages in thread
From: syzbot @ 2023-02-09  6:07 UTC (permalink / raw)
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: rcu detected stall in ext4_file_write_iter

rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 	(detected by 0, t=10502 jiffies, g=10829, q=65 ncpus=2)
rcu: All QSes seen, last rcu_preempt kthread activity 10502 (4294957416-4294946914), jiffies_till_next_fqs=1, root ->qsmask 0x0
rcu: rcu_preempt kthread starved for 10502 jiffies! g10829 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1
rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt     state:R  running task     stack:28680 pid:16    ppid:2      flags:0x00004000
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:5301 [inline]
 __schedule+0x245a/0x5a60 kernel/sched/core.c:6619
 schedule+0xde/0x1a0 kernel/sched/core.c:6695
 schedule_timeout+0x14e/0x2b0 kernel/time/timer.c:2167
 rcu_gp_fqs_loop+0x190/0x910 kernel/rcu/tree.c:1609
 rcu_gp_kthread+0x23a/0x360 kernel/rcu/tree.c:1808
 kthread+0x2e8/0x3a0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
 </TASK>
rcu: Stack dump where RCU GP kthread last ran:
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 5793 Comm: syz-executor.0 Not tainted 6.2.0-rc6-next-20230203-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023
RIP: 0010:__run_hrtimer kernel/time/hrtimer.c:1691 [inline]
RIP: 0010:__hrtimer_run_queues+0x237/0x1200 kernel/time/hrtimer.c:1759
Code: 74 08 3c 03 0f 8e 5b 0f 00 00 48 8b 44 24 70 4c 89 f7 c7 80 20 0a 00 00 01 00 00 00 48 8b 44 24 28 ff d0 48 c7 c2 00 f2 40 8c <89> 44 24 5c 48 c1 ea 03 48 b8 00 00 00 00 00 fc ff df 80 3c 02 00
RSP: 0018:ffffc900001e0de0 EFLAGS: 00000082
RAX: 0000000000000001 RBX: ffffffff8e7519c0 RCX: 0000000000000000
RDX: ffffffff8c40f200 RSI: ffffffff81757b84 RDI: 0000000000000001
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8880b992b968
R13: ffff8880b992b980 R14: ffff88802a5a0828 R15: ffff8880b992b800
FS:  00007f24a25b3700(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f24a1915829 CR3: 000000001fee8000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 hrtimer_interrupt+0x320/0x7b0 kernel/time/hrtimer.c:1821
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1096 [inline]
 __sysvec_apic_timer_interrupt+0x180/0x660 arch/x86/kernel/apic/apic.c:1113
 sysvec_apic_timer_interrupt+0x92/0xc0 arch/x86/kernel/apic/apic.c:1107
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
RIP: 0010:copy_page_from_iter_atomic+0xb13/0x1180 lib/iov_iter.c:821
Code: 03 44 01 e3 42 80 3c 28 00 0f 85 1b 06 00 00 4c 8d 45 08 89 d8 c1 eb 0c 89 de 4c 89 c2 25 ff 0f 00 00 48 c1 e6 06 48 03 75 00 <48> c1 ea 03 48 b9 00 00 00 00 00 16 00 00 42 0f b6 14 2a 48 01 ce
RSP: 0018:ffffc9000647f448 EFLAGS: 00000282
RAX: 00000000000006c0 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff88807a22a408 RSI: ffffea0001943640 RDI: ffff88807a22a40c
RBP: ffff88807a22a400 R08: ffff88807a22a408 R09: 0000000000000002
R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000
R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000020
 generic_perform_write+0x2ca/0x570 mm/filemap.c:3778
 ext4_buffered_write_iter+0x15b/0x460 fs/ext4/file.c:285
 ext4_file_write_iter+0x8bf/0x1710 fs/ext4/file.c:700
 call_write_iter include/linux/fs.h:1851 [inline]
 do_iter_readv_writev+0x20b/0x3b0 fs/read_write.c:735
 do_iter_write+0x182/0x700 fs/read_write.c:861
 vfs_iter_write+0x74/0xa0 fs/read_write.c:902
 iter_file_splice_write+0x745/0xc90 fs/splice.c:686
 do_splice_from fs/splice.c:764 [inline]
 direct_splice_actor+0x114/0x180 fs/splice.c:931
 splice_direct_to_actor+0x335/0x8a0 fs/splice.c:886
 do_splice_direct+0x1ab/0x280 fs/splice.c:974
 do_sendfile+0xb19/0x12c0 fs/read_write.c:1255
 __do_sys_sendfile64 fs/read_write.c:1317 [inline]
 __se_sys_sendfile64 fs/read_write.c:1309 [inline]
 __x64_sys_sendfile64+0x14d/0x210 fs/read_write.c:1309
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f24a188c0c9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f24a25b3168 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
RAX: ffffffffffffffda RBX: 00007f24a19abf80 RCX: 00007f24a188c0c9
RDX: 00000000200001c0 RSI: 0000000000000003 RDI: 0000000000000003
RBP: 00007f24a18e7ae9 R08: 0000000000000000 R09: 0000000000000000
R10: 000000000200a198 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fffc18815df R14: 00007f24a25b3300 R15: 0000000000022000
 </TASK>
------------[ cut here ]------------
jts hrtimer tick_sched_timer+0x0/0x120 kernel/time/tick-sched.c:212 runs more than 20 ticks
WARNING: CPU: 0 PID: 5791 at kernel/time/hrtimer.c:1693 __run_hrtimer kernel/time/hrtimer.c:1693 [inline]
WARNING: CPU: 0 PID: 5791 at kernel/time/hrtimer.c:1693 __hrtimer_run_queues+0xf76/0x1200 kernel/time/hrtimer.c:1759
Modules linked in:
CPU: 0 PID: 5791 Comm: syz-executor.1 Not tainted 6.2.0-rc6-next-20230203-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023
RIP: 0010:__run_hrtimer kernel/time/hrtimer.c:1693 [inline]
RIP: 0010:__hrtimer_run_queues+0xf76/0x1200 kernel/time/hrtimer.c:1759
Code: 96 10 00 e9 1e f3 ff ff bd 01 00 00 00 e8 72 9a 10 00 48 8b 74 24 28 48 c7 c7 c0 d5 4e 8a c6 05 5e e1 eb 0c 01 e8 3a 2b d8 ff <0f> 0b e8 53 9a 10 00 31 ff 89 ee e8 5a 96 10 00 40 84 ed 0f 84 e3
RSP: 0018:ffffc90000007de0 EFLAGS: 00010082
RAX: 0000000000000000 RBX: ffffffff8e7519c0 RCX: 0000000000000000
RDX: ffff88807d998000 RSI: ffffffff814c0217 RDI: 0000000000000001
RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8880b982b968
R13: ffff8880b982b880 R14: ffff8880b982c120 R15: ffff8880b982b800
FS:  00007fee9873f700(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fee97b15829 CR3: 0000000017dc5000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 hrtimer_interrupt+0x320/0x7b0 kernel/time/hrtimer.c:1821
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1096 [inline]
 __sysvec_apic_timer_interrupt+0x180/0x660 arch/x86/kernel/apic/apic.c:1113
 sysvec_apic_timer_interrupt+0x92/0xc0 arch/x86/kernel/apic/apic.c:1107
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
RIP: 0010:deref_stack_reg arch/x86/kernel/unwind_orc.c:379 [inline]
RIP: 0010:unwind_next_frame+0xa9b/0x1f60 arch/x86/kernel/unwind_orc.c:558
Code: 89 ef 48 89 4c 24 30 4c 89 de 4c 89 44 24 28 4c 89 5c 24 20 e8 a6 f3 ff ff 4d 8d 4d 34 84 c0 0f 84 07 f9 ff ff 4c 8b 5c 24 20 <48> 89 ee 48 ba 00 00 00 00 00 fc ff df 48 c1 ee 03 4c 89 df e8 8c
RSP: 0018:ffffc900064ef588 EFLAGS: 00000202
RAX: ffffc900064f0001 RBX: 0000000000000001 RCX: 0000000000000001
RDX: ffffc900064efd01 RSI: ffffc900064efd80 RDI: ffffc900064ef608
RBP: ffffc900064ef640 R08: ffffffff8f09573c R09: ffffc900064ef62c
R10: 0000000000000001 R11: ffffc900064efd80 R12: ffffc900064ef62d
R13: ffffc900064ef5f8 R14: ffffc900064efd88 R15: ffffffff8f095740
 arch_stack_walk+0x81/0xf0 arch/x86/kernel/stacktrace.c:25
 stack_trace_save+0x90/0xc0 kernel/stacktrace.c:122
 kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
 kasan_set_track+0x25/0x30 mm/kasan/common.c:52
 ____kasan_kmalloc mm/kasan/common.c:374 [inline]
 ____kasan_kmalloc mm/kasan/common.c:333 [inline]
 __kasan_kmalloc+0xa2/0xb0 mm/kasan/common.c:383
 kasan_kmalloc include/linux/kasan.h:196 [inline]
 __do_kmalloc_node mm/slab_common.c:967 [inline]
 __kmalloc+0x5a/0xd0 mm/slab_common.c:980
 kmalloc_array include/linux/slab.h:636 [inline]
 kcalloc include/linux/slab.h:667 [inline]
 iter_file_splice_write+0x16c/0xc90 fs/splice.c:625
 do_splice_from fs/splice.c:764 [inline]
 direct_splice_actor+0x114/0x180 fs/splice.c:931
 splice_direct_to_actor+0x335/0x8a0 fs/splice.c:886
 do_splice_direct+0x1ab/0x280 fs/splice.c:974
 do_sendfile+0xb19/0x12c0 fs/read_write.c:1255
 __do_sys_sendfile64 fs/read_write.c:1317 [inline]
 __se_sys_sendfile64 fs/read_write.c:1309 [inline]
 __x64_sys_sendfile64+0x14d/0x210 fs/read_write.c:1309
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fee97a8c0c9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fee9873f168 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
RAX: ffffffffffffffda RBX: 00007fee97babf80 RCX: 00007fee97a8c0c9
RDX: 00000000200001c0 RSI: 0000000000000003 RDI: 0000000000000003
RBP: 00007fee97ae7ae9 R08: 0000000000000000 R09: 0000000000000000
R10: 000000000200a198 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffebfc242ff R14: 00007fee9873f300 R15: 0000000000022000
 </TASK>


Tested on:

commit:         4fafd969 Add linux-next specific files for 20230203
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=1666c52f480000
kernel config:  https://syzkaller.appspot.com/x/.config?x=1d2fba7d42502ca4
dashboard link: https://syzkaller.appspot.com/bug?extid=b9564ba6e8e00694511b
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1097b5a3480000


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [syzbot] INFO: rcu detected stall in ext4_file_write_iter (6)
       [not found] <20230208132556.1881-1-hdanton@sina.com>
@ 2023-02-08 14:15 ` syzbot
  0 siblings, 0 replies; 10+ messages in thread
From: syzbot @ 2023-02-08 14:15 UTC (permalink / raw)
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: rcu detected stall in corrupted

rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 	(detected by 0, t=10502 jiffies, g=11593, q=32 ncpus=2)
rcu: All QSes seen, last rcu_preempt kthread activity 10501 (4294957866-4294947365), jiffies_till_next_fqs=1, root ->qsmask 0x0
rcu: rcu_preempt kthread starved for 10502 jiffies! g11593 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt     state:R  running task     stack:28680 pid:16    ppid:2      flags:0x00004000
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:5301 [inline]
 __schedule+0x245a/0x5a60 kernel/sched/core.c:6619
 schedule+0xde/0x1a0 kernel/sched/core.c:6695
 schedule_timeout+0x14e/0x2b0 kernel/time/timer.c:2167
 rcu_gp_fqs_loop+0x190/0x910 kernel/rcu/tree.c:1609
 rcu_gp_kthread+0x23a/0x360 kernel/rcu/tree.c:1808
 kthread+0x2e8/0x3a0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
 </TASK>
rcu: Stack dump where RCU GP kthread last ran:
CPU: 0 PID: 5807 Comm: syz-executor.3 Not tainted 6.2.0-rc6-next-20230203-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/12/2023
RIP: 0010:copy_page_from_iter_atomic+0xabb/0x1180 lib/iov_iter.c:821
Code: c1 ea 03 80 3c 02 00 0f 85 c8 05 00 00 31 c0 45 8b 66 08 4c 89 74 24 30 49 bd 00 00 00 00 00 fc ff df 49 89 c6 e8 95 b9 6c fd <48> 8d 7d 0c 48 89 f8 48 c1 e8 03 42 0f b6 14 28 48 89 f8 83 e0 07
RSP: 0018:ffffc90005ff7448 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 0000000000000020 RCX: 0000000000000000
RDX: ffff888021211d40 RSI: ffffffff8417b10b RDI: 0000000000000001
RBP: ffff88802b8abe00 R08: 0000000000000001 R09: 0000000000000002
R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000
R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000020
FS:  00007f37e0a3e700(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f37dfd15829 CR3: 000000007d1dd000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 generic_perform_write+0x2ca/0x570 mm/filemap.c:3778
 ext4_buffered_write_iter+0x15b/0x460 fs/ext4/file.c:285
 ext4_file_write_iter+0x8bf/0x1710 fs/ext4/file.c:700
 call_write_iter include/linux/fs.h:1851 [inline]
 do_iter_readv_writev+0x20b/0x3b0 fs/read_write.c:735
 do_iter_write+0x182/0x700 fs/read_write.c:861
 vfs_iter_write+0x74/0xa0 fs/read_write.c:902
 iter_file_splice_write+0x745/0xc90 fs/splice.c:686
 do_splice_from fs/splice.c:764 [inline]
 direct_splice_actor+0x114/0x180 fs/splice.c:931
 splice_direct_to_actor+0x335/0x8a0 fs/splice.c:886
 do_splice_direct+0x1ab/0x280 fs/splice.c:974
 do_sendfile+0xb19/0x12c0 fs/read_write.c:1255
 __do_sys_sendfile64 fs/read_write.c:1317 [inline]
 __se_sys_sendfile64 fs/read_write.c:1309 [inline]
 __x64_sys_sendfile64+0x14d/0x210 fs/read_write.c:1309
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f37dfc8c0c9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f37e0a3e168 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
RAX: ffffffffffffffda RBX: 00007f37dfdabf80 RCX: 00007f37dfc8c0c9
RDX: 00000000200001c0 RSI: 0000000000000003 RDI: 0000000000000003
RBP: 00007f37dfce7ae9 R08: 0000000000000000 R09: 0000000000000000
R10: 000000000200a198 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffc507f411f R14: 00007f37e0a3e300 R15: 0000000000022000
 </TASK>


Tested on:

commit:         4fafd969 Add linux-next specific files for 20230203
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=11e5f41f480000
kernel config:  https://syzkaller.appspot.com/x/.config?x=1d2fba7d42502ca4
dashboard link: https://syzkaller.appspot.com/bug?extid=b9564ba6e8e00694511b
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=13d62847480000


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-02-14 10:27 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-08 10:37 [syzbot] INFO: rcu detected stall in ext4_file_write_iter (6) syzbot
2023-02-09 22:25 ` Thomas Gleixner
2023-02-13 22:28   ` John Stultz
2023-02-14 10:27   ` [tip: timers/urgent] alarmtimer: Prevent starvation by small intervals and SIG_IGN tip-bot2 for Thomas Gleixner
     [not found] <20230208132556.1881-1-hdanton@sina.com>
2023-02-08 14:15 ` [syzbot] INFO: rcu detected stall in ext4_file_write_iter (6) syzbot
     [not found] <20230209010659.2012-1-hdanton@sina.com>
2023-02-09  6:07 ` syzbot
     [not found] <20230209061936.2127-1-hdanton@sina.com>
2023-02-09  7:16 ` syzbot
     [not found] <20230209072740.2224-1-hdanton@sina.com>
2023-02-09  7:34 ` syzbot
     [not found] <20230209073104.2276-1-hdanton@sina.com>
2023-02-09  7:50 ` syzbot
     [not found] <20230210065053.2385-1-hdanton@sina.com>
2023-02-10  7:16 ` syzbot

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.