All of lore.kernel.org
 help / color / mirror / Atom feed
* [syzbot] [kernel?] possible deadlock in console_flush_all (2)
@ 2023-10-08  6:28 syzbot
  2023-10-17 10:06 ` Aleksandr Nogikh
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: syzbot @ 2023-10-08  6:28 UTC (permalink / raw)
  To: linux-kernel, syzkaller-bugs, tglx

Hello,

syzbot found the following issue on:

HEAD commit:    7d730f1bf6f3 Add linux-next specific files for 20231005
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=172d36e9680000
kernel config:  https://syzkaller.appspot.com/x/.config?x=f532286be4fff4b5
dashboard link: https://syzkaller.appspot.com/bug?extid=f78380e4eae53c64125c
compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/1d7f28a4398f/disk-7d730f1b.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/d454d124268e/vmlinux-7d730f1b.xz
kernel image: https://storage.googleapis.com/syzbot-assets/dbca966175cb/bzImage-7d730f1b.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+f78380e4eae53c64125c@syzkaller.appspotmail.com

EEVDF scheduling fail, picking leftmost
======================================================
WARNING: possible circular locking dependency detected
6.6.0-rc4-next-20231005-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor.0/5232 is trying to acquire lock:
ffffffff8cab8560 (console_owner){....}-{0:0}, at: console_lock_spinning_enable kernel/printk/printk.c:1855 [inline]
ffffffff8cab8560 (console_owner){....}-{0:0}, at: console_emit_next_record kernel/printk/printk.c:2888 [inline]
ffffffff8cab8560 (console_owner){....}-{0:0}, at: console_flush_all+0x49a/0xd50 kernel/printk/printk.c:2960

but task is already holding lock:
ffff8880b993c718 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x29/0x130 kernel/sched/core.c:558

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #4 (&rq->__lock){-.-.}-{2:2}:
       _raw_spin_lock_nested+0x31/0x40 kernel/locking/spinlock.c:378
       raw_spin_rq_lock_nested+0x29/0x130 kernel/sched/core.c:558
       raw_spin_rq_lock kernel/sched/sched.h:1357 [inline]
       rq_lock kernel/sched/sched.h:1671 [inline]
       task_fork_fair+0x70/0x240 kernel/sched/fair.c:12399
       sched_cgroup_fork+0x3cf/0x510 kernel/sched/core.c:4799
       copy_process+0x4580/0x74b0 kernel/fork.c:2609
       kernel_clone+0xfd/0x920 kernel/fork.c:2907
       user_mode_thread+0xb4/0xf0 kernel/fork.c:2985
       rest_init+0x27/0x2b0 init/main.c:691
       arch_call_rest_init+0x13/0x30 init/main.c:823
       start_kernel+0x39f/0x480 init/main.c:1068
       x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:556
       x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:537
       secondary_startup_64_no_verify+0x166/0x16b

-> #3 (&p->pi_lock){-.-.}-{2:2}:
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
       class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:518 [inline]
       try_to_wake_up+0xb0/0x15d0 kernel/sched/core.c:4213
       __wake_up_common+0x140/0x5a0 kernel/sched/wait.c:107
       __wake_up_common_lock+0xd6/0x140 kernel/sched/wait.c:138
       tty_port_default_wakeup+0x2a/0x40 drivers/tty/tty_port.c:69
       serial8250_tx_chars+0x542/0xf60 drivers/tty/serial/8250/8250_port.c:1842
       serial8250_handle_irq+0x606/0xbe0 drivers/tty/serial/8250/8250_port.c:1949
       serial8250_default_handle_irq+0x94/0x210 drivers/tty/serial/8250/8250_port.c:1969
       serial8250_interrupt+0xfc/0x200 drivers/tty/serial/8250/8250_core.c:127
       __handle_irq_event_percpu+0x22a/0x740 kernel/irq/handle.c:158
       handle_irq_event_percpu kernel/irq/handle.c:193 [inline]
       handle_irq_event+0xab/0x1e0 kernel/irq/handle.c:210
       handle_edge_irq+0x261/0xcf0 kernel/irq/chip.c:831
       generic_handle_irq_desc include/linux/irqdesc.h:161 [inline]
       handle_irq arch/x86/kernel/irq.c:238 [inline]
       __common_interrupt+0xdb/0x240 arch/x86/kernel/irq.c:257
       common_interrupt+0xa9/0xd0 arch/x86/kernel/irq.c:247
       asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:636
       __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:151 [inline]
       _raw_spin_unlock_irqrestore+0x31/0x70 kernel/locking/spinlock.c:194
       spin_unlock_irqrestore include/linux/spinlock.h:406 [inline]
       uart_port_unlock_irqrestore include/linux/serial_core.h:667 [inline]
       serial_port_runtime_resume+0x2ae/0x330 drivers/tty/serial/serial_port.c:41
       __rpm_callback+0xc5/0x4c0 drivers/base/power/runtime.c:392
       rpm_callback+0x1da/0x220 drivers/base/power/runtime.c:446
       rpm_resume+0xdb5/0x1980 drivers/base/power/runtime.c:912
       pm_runtime_work+0x12e/0x180 drivers/base/power/runtime.c:977
       process_one_work+0x884/0x15c0 kernel/workqueue.c:2630
       process_scheduled_works kernel/workqueue.c:2703 [inline]
       worker_thread+0x8b9/0x1290 kernel/workqueue.c:2784
       kthread+0x33c/0x440 kernel/kthread.c:388
       ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304

-> #2 (&tty->write_wait){-...}-{2:2}:
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
       __wake_up_common_lock+0xbb/0x140 kernel/sched/wait.c:137
       tty_port_default_wakeup+0x2a/0x40 drivers/tty/tty_port.c:69
       serial8250_tx_chars+0x542/0xf60 drivers/tty/serial/8250/8250_port.c:1842
       serial8250_handle_irq+0x606/0xbe0 drivers/tty/serial/8250/8250_port.c:1949
       serial8250_default_handle_irq+0x94/0x210 drivers/tty/serial/8250/8250_port.c:1969
       serial8250_interrupt+0xfc/0x200 drivers/tty/serial/8250/8250_core.c:127
       __handle_irq_event_percpu+0x22a/0x740 kernel/irq/handle.c:158
       handle_irq_event_percpu kernel/irq/handle.c:193 [inline]
       handle_irq_event+0xab/0x1e0 kernel/irq/handle.c:210
       handle_edge_irq+0x261/0xcf0 kernel/irq/chip.c:831
       generic_handle_irq_desc include/linux/irqdesc.h:161 [inline]
       handle_irq arch/x86/kernel/irq.c:238 [inline]
       __common_interrupt+0xdb/0x240 arch/x86/kernel/irq.c:257
       common_interrupt+0xa9/0xd0 arch/x86/kernel/irq.c:247
       asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:636
       __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:151 [inline]
       _raw_spin_unlock_irqrestore+0x31/0x70 kernel/locking/spinlock.c:194
       spin_unlock_irqrestore include/linux/spinlock.h:406 [inline]
       uart_port_unlock_irqrestore include/linux/serial_core.h:667 [inline]
       serial_port_runtime_resume+0x2ae/0x330 drivers/tty/serial/serial_port.c:41
       __rpm_callback+0xc5/0x4c0 drivers/base/power/runtime.c:392
       rpm_callback+0x1da/0x220 drivers/base/power/runtime.c:446
       rpm_resume+0xdb5/0x1980 drivers/base/power/runtime.c:912
       pm_runtime_work+0x12e/0x180 drivers/base/power/runtime.c:977
       process_one_work+0x884/0x15c0 kernel/workqueue.c:2630
       process_scheduled_works kernel/workqueue.c:2703 [inline]
       worker_thread+0x8b9/0x1290 kernel/workqueue.c:2784
       kthread+0x33c/0x440 kernel/kthread.c:388
       ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304

-> #1 (&port_lock_key){-...}-{2:2}:
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
       uart_port_lock_irqsave include/linux/serial_core.h:616 [inline]
       serial8250_console_write+0xa7c/0x1060 drivers/tty/serial/8250/8250_port.c:3410
       console_emit_next_record kernel/printk/printk.c:2894 [inline]
       console_flush_all+0x4d5/0xd50 kernel/printk/printk.c:2960
       console_unlock+0x10c/0x260 kernel/printk/printk.c:3029
       vprintk_emit+0x17f/0x5f0 kernel/printk/printk.c:2304
       vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
       _printk+0xc8/0x100 kernel/printk/printk.c:2329
       register_console+0xb30/0x1210 kernel/printk/printk.c:3535
       univ8250_console_init+0x35/0x50 drivers/tty/serial/8250/8250_core.c:717
       console_init+0xba/0x5c0 kernel/printk/printk.c:3681
       start_kernel+0x25a/0x480 init/main.c:1004
       x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:556
       x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:537
       secondary_startup_64_no_verify+0x166/0x16b

-> #0 (console_owner){....}-{0:0}:
       check_prev_add kernel/locking/lockdep.c:3134 [inline]
       check_prevs_add kernel/locking/lockdep.c:3253 [inline]
       validate_chain kernel/locking/lockdep.c:3868 [inline]
       __lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5136
       lock_acquire kernel/locking/lockdep.c:5753 [inline]
       lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5718
       console_lock_spinning_enable kernel/printk/printk.c:1855 [inline]
       console_emit_next_record kernel/printk/printk.c:2888 [inline]
       console_flush_all+0x4af/0xd50 kernel/printk/printk.c:2960
       console_unlock+0x10c/0x260 kernel/printk/printk.c:3029
       vprintk_emit+0x17f/0x5f0 kernel/printk/printk.c:2304
       vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
       _printk+0xc8/0x100 kernel/printk/printk.c:2329
       pick_eevdf kernel/sched/fair.c:963 [inline]
       pick_next_entity kernel/sched/fair.c:5247 [inline]
       pick_next_task_fair kernel/sched/fair.c:8244 [inline]
       __pick_next_task_fair+0xe2/0x540 kernel/sched/fair.c:8297
       __pick_next_task kernel/sched/core.c:6003 [inline]
       __pick_next_task kernel/sched/core.c:5972 [inline]
       pick_next_task kernel/sched/core.c:6061 [inline]
       __schedule+0x38d5/0x5a00 kernel/sched/core.c:6640
       __schedule_loop kernel/sched/core.c:6753 [inline]
       schedule+0xe7/0x270 kernel/sched/core.c:6768
       futex_wait_queue+0xfe/0x1f0 kernel/futex/waitwake.c:369
       __futex_wait+0x292/0x3c0 kernel/futex/waitwake.c:668
       futex_wait+0xea/0x380 kernel/futex/waitwake.c:696
       do_futex+0x227/0x350 kernel/futex/syscalls.c:102
       __do_sys_futex kernel/futex/syscalls.c:179 [inline]
       __se_sys_futex kernel/futex/syscalls.c:160 [inline]
       __x64_sys_futex+0x1e1/0x4c0 kernel/futex/syscalls.c:160
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:81
       entry_SYSCALL_64_after_hwframe+0x63/0xcd

other info that might help us debug this:

Chain exists of:
  console_owner --> &p->pi_lock --> &rq->__lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&rq->__lock);
                               lock(&p->pi_lock);
                               lock(&rq->__lock);
  lock(console_owner);

 *** DEADLOCK ***

3 locks held by syz-executor.0/5232:
 #0: ffff8880b993c718 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x29/0x130 kernel/sched/core.c:558
 #1: ffffffff8cb989a0 (console_lock){+.+.}-{0:0}, at: console_trylock_spinning kernel/printk/printk.c:1924 [inline]
 #1: ffffffff8cb989a0 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0x162/0x5f0 kernel/printk/printk.c:2303
 #2: ffffffff8cb98a10 (console_srcu){....}-{0:0}, at: rcu_lock_acquire include/linux/rcupdate.h:303 [inline]
 #2: ffffffff8cb98a10 (console_srcu){....}-{0:0}, at: srcu_read_lock_nmisafe include/linux/srcu.h:232 [inline]
 #2: ffffffff8cb98a10 (console_srcu){....}-{0:0}, at: console_srcu_read_lock kernel/printk/printk.c:286 [inline]
 #2: ffffffff8cb98a10 (console_srcu){....}-{0:0}, at: console_flush_all+0x12a/0xd50 kernel/printk/printk.c:2952

stack backtrace:
CPU: 1 PID: 5232 Comm: syz-executor.0 Not tainted 6.6.0-rc4-next-20231005-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/06/2023
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
 check_noncircular+0x311/0x3f0 kernel/locking/lockdep.c:2187
 check_prev_add kernel/locking/lockdep.c:3134 [inline]
 check_prevs_add kernel/locking/lockdep.c:3253 [inline]
 validate_chain kernel/locking/lockdep.c:3868 [inline]
 __lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5136
 lock_acquire kernel/locking/lockdep.c:5753 [inline]
 lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5718
 console_lock_spinning_enable kernel/printk/printk.c:1855 [inline]
 console_emit_next_record kernel/printk/printk.c:2888 [inline]
 console_flush_all+0x4af/0xd50 kernel/printk/printk.c:2960
 console_unlock+0x10c/0x260 kernel/printk/printk.c:3029
 vprintk_emit+0x17f/0x5f0 kernel/printk/printk.c:2304
 vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
 _printk+0xc8/0x100 kernel/printk/printk.c:2329
 pick_eevdf kernel/sched/fair.c:963 [inline]
 pick_next_entity kernel/sched/fair.c:5247 [inline]
 pick_next_task_fair kernel/sched/fair.c:8244 [inline]
 __pick_next_task_fair+0xe2/0x540 kernel/sched/fair.c:8297
 __pick_next_task kernel/sched/core.c:6003 [inline]
 __pick_next_task kernel/sched/core.c:5972 [inline]
 pick_next_task kernel/sched/core.c:6061 [inline]
 __schedule+0x38d5/0x5a00 kernel/sched/core.c:6640
 __schedule_loop kernel/sched/core.c:6753 [inline]
 schedule+0xe7/0x270 kernel/sched/core.c:6768
 futex_wait_queue+0xfe/0x1f0 kernel/futex/waitwake.c:369
 __futex_wait+0x292/0x3c0 kernel/futex/waitwake.c:668
 futex_wait+0xea/0x380 kernel/futex/waitwake.c:696
 do_futex+0x227/0x350 kernel/futex/syscalls.c:102
 __do_sys_futex kernel/futex/syscalls.c:179 [inline]
 __se_sys_futex kernel/futex/syscalls.c:160 [inline]
 __x64_sys_futex+0x1e1/0x4c0 kernel/futex/syscalls.c:160
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f7af7e7cae9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f7af8af7178 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: ffffffffffffffda RBX: 00007f7af7f9bf88 RCX: 00007f7af7e7cae9
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f7af7f9bf88
RBP: 00007f7af7f9bf80 R08: 00007f7af8af76c0 R09: 00007f7af8af76c0
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f7af7f9bf8c
R13: 000000000000000b R14: 00007fff1e139c80 R15: 00007fff1e139d68
 </TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the bug is already fixed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite bug's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the bug is a duplicate of another bug, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [syzbot] [kernel?] possible deadlock in console_flush_all (2)
  2023-10-08  6:28 [syzbot] [kernel?] possible deadlock in console_flush_all (2) syzbot
@ 2023-10-17 10:06 ` Aleksandr Nogikh
  2023-10-17 14:13   ` Peter Zijlstra
  2023-10-17 13:07 ` [syzbot] [ext4?] " syzbot
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Aleksandr Nogikh @ 2023-10-17 10:06 UTC (permalink / raw)
  To: syzbot, Benjamin Segall, Peter Zijlstra
  Cc: linux-kernel, syzkaller-bugs, tglx, Dmitry Vyukov

This is currently causing more than 50% daily crashes on syzbot:

https://syzkaller.appspot.com/upstream/graph/crashes

It began to appear after next-20231004. Judging by the diff between
next-20231003 and next-20231004, it looks like quite a few changes to
kernel/sched were merged in between. Could it be caused by any of
those?

On Sun, Oct 8, 2023 at 8:28 AM syzbot
<syzbot+f78380e4eae53c64125c@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    7d730f1bf6f3 Add linux-next specific files for 20231005
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=172d36e9680000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=f532286be4fff4b5
> dashboard link: https://syzkaller.appspot.com/bug?extid=f78380e4eae53c64125c
> compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/1d7f28a4398f/disk-7d730f1b.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/d454d124268e/vmlinux-7d730f1b.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/dbca966175cb/bzImage-7d730f1b.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+f78380e4eae53c64125c@syzkaller.appspotmail.com
>
> EEVDF scheduling fail, picking leftmost
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.6.0-rc4-next-20231005-syzkaller #0 Not tainted
> ------------------------------------------------------
> syz-executor.0/5232 is trying to acquire lock:
> ffffffff8cab8560 (console_owner){....}-{0:0}, at: console_lock_spinning_enable kernel/printk/printk.c:1855 [inline]
> ffffffff8cab8560 (console_owner){....}-{0:0}, at: console_emit_next_record kernel/printk/printk.c:2888 [inline]
> ffffffff8cab8560 (console_owner){....}-{0:0}, at: console_flush_all+0x49a/0xd50 kernel/printk/printk.c:2960
>
> but task is already holding lock:
> ffff8880b993c718 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x29/0x130 kernel/sched/core.c:558
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #4 (&rq->__lock){-.-.}-{2:2}:
>        _raw_spin_lock_nested+0x31/0x40 kernel/locking/spinlock.c:378
>        raw_spin_rq_lock_nested+0x29/0x130 kernel/sched/core.c:558
>        raw_spin_rq_lock kernel/sched/sched.h:1357 [inline]
>        rq_lock kernel/sched/sched.h:1671 [inline]
>        task_fork_fair+0x70/0x240 kernel/sched/fair.c:12399
>        sched_cgroup_fork+0x3cf/0x510 kernel/sched/core.c:4799
>        copy_process+0x4580/0x74b0 kernel/fork.c:2609
>        kernel_clone+0xfd/0x920 kernel/fork.c:2907
>        user_mode_thread+0xb4/0xf0 kernel/fork.c:2985
>        rest_init+0x27/0x2b0 init/main.c:691
>        arch_call_rest_init+0x13/0x30 init/main.c:823
>        start_kernel+0x39f/0x480 init/main.c:1068
>        x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:556
>        x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:537
>        secondary_startup_64_no_verify+0x166/0x16b
>
> -> #3 (&p->pi_lock){-.-.}-{2:2}:
>        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>        _raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
>        class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:518 [inline]
>        try_to_wake_up+0xb0/0x15d0 kernel/sched/core.c:4213
>        __wake_up_common+0x140/0x5a0 kernel/sched/wait.c:107
>        __wake_up_common_lock+0xd6/0x140 kernel/sched/wait.c:138
>        tty_port_default_wakeup+0x2a/0x40 drivers/tty/tty_port.c:69
>        serial8250_tx_chars+0x542/0xf60 drivers/tty/serial/8250/8250_port.c:1842
>        serial8250_handle_irq+0x606/0xbe0 drivers/tty/serial/8250/8250_port.c:1949
>        serial8250_default_handle_irq+0x94/0x210 drivers/tty/serial/8250/8250_port.c:1969
>        serial8250_interrupt+0xfc/0x200 drivers/tty/serial/8250/8250_core.c:127
>        __handle_irq_event_percpu+0x22a/0x740 kernel/irq/handle.c:158
>        handle_irq_event_percpu kernel/irq/handle.c:193 [inline]
>        handle_irq_event+0xab/0x1e0 kernel/irq/handle.c:210
>        handle_edge_irq+0x261/0xcf0 kernel/irq/chip.c:831
>        generic_handle_irq_desc include/linux/irqdesc.h:161 [inline]
>        handle_irq arch/x86/kernel/irq.c:238 [inline]
>        __common_interrupt+0xdb/0x240 arch/x86/kernel/irq.c:257
>        common_interrupt+0xa9/0xd0 arch/x86/kernel/irq.c:247
>        asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:636
>        __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:151 [inline]
>        _raw_spin_unlock_irqrestore+0x31/0x70 kernel/locking/spinlock.c:194
>        spin_unlock_irqrestore include/linux/spinlock.h:406 [inline]
>        uart_port_unlock_irqrestore include/linux/serial_core.h:667 [inline]
>        serial_port_runtime_resume+0x2ae/0x330 drivers/tty/serial/serial_port.c:41
>        __rpm_callback+0xc5/0x4c0 drivers/base/power/runtime.c:392
>        rpm_callback+0x1da/0x220 drivers/base/power/runtime.c:446
>        rpm_resume+0xdb5/0x1980 drivers/base/power/runtime.c:912
>        pm_runtime_work+0x12e/0x180 drivers/base/power/runtime.c:977
>        process_one_work+0x884/0x15c0 kernel/workqueue.c:2630
>        process_scheduled_works kernel/workqueue.c:2703 [inline]
>        worker_thread+0x8b9/0x1290 kernel/workqueue.c:2784
>        kthread+0x33c/0x440 kernel/kthread.c:388
>        ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
>        ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304
>
> -> #2 (&tty->write_wait){-...}-{2:2}:
>        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>        _raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
>        __wake_up_common_lock+0xbb/0x140 kernel/sched/wait.c:137
>        tty_port_default_wakeup+0x2a/0x40 drivers/tty/tty_port.c:69
>        serial8250_tx_chars+0x542/0xf60 drivers/tty/serial/8250/8250_port.c:1842
>        serial8250_handle_irq+0x606/0xbe0 drivers/tty/serial/8250/8250_port.c:1949
>        serial8250_default_handle_irq+0x94/0x210 drivers/tty/serial/8250/8250_port.c:1969
>        serial8250_interrupt+0xfc/0x200 drivers/tty/serial/8250/8250_core.c:127
>        __handle_irq_event_percpu+0x22a/0x740 kernel/irq/handle.c:158
>        handle_irq_event_percpu kernel/irq/handle.c:193 [inline]
>        handle_irq_event+0xab/0x1e0 kernel/irq/handle.c:210
>        handle_edge_irq+0x261/0xcf0 kernel/irq/chip.c:831
>        generic_handle_irq_desc include/linux/irqdesc.h:161 [inline]
>        handle_irq arch/x86/kernel/irq.c:238 [inline]
>        __common_interrupt+0xdb/0x240 arch/x86/kernel/irq.c:257
>        common_interrupt+0xa9/0xd0 arch/x86/kernel/irq.c:247
>        asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:636
>        __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:151 [inline]
>        _raw_spin_unlock_irqrestore+0x31/0x70 kernel/locking/spinlock.c:194
>        spin_unlock_irqrestore include/linux/spinlock.h:406 [inline]
>        uart_port_unlock_irqrestore include/linux/serial_core.h:667 [inline]
>        serial_port_runtime_resume+0x2ae/0x330 drivers/tty/serial/serial_port.c:41
>        __rpm_callback+0xc5/0x4c0 drivers/base/power/runtime.c:392
>        rpm_callback+0x1da/0x220 drivers/base/power/runtime.c:446
>        rpm_resume+0xdb5/0x1980 drivers/base/power/runtime.c:912
>        pm_runtime_work+0x12e/0x180 drivers/base/power/runtime.c:977
>        process_one_work+0x884/0x15c0 kernel/workqueue.c:2630
>        process_scheduled_works kernel/workqueue.c:2703 [inline]
>        worker_thread+0x8b9/0x1290 kernel/workqueue.c:2784
>        kthread+0x33c/0x440 kernel/kthread.c:388
>        ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
>        ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304
>
> -> #1 (&port_lock_key){-...}-{2:2}:
>        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>        _raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
>        uart_port_lock_irqsave include/linux/serial_core.h:616 [inline]
>        serial8250_console_write+0xa7c/0x1060 drivers/tty/serial/8250/8250_port.c:3410
>        console_emit_next_record kernel/printk/printk.c:2894 [inline]
>        console_flush_all+0x4d5/0xd50 kernel/printk/printk.c:2960
>        console_unlock+0x10c/0x260 kernel/printk/printk.c:3029
>        vprintk_emit+0x17f/0x5f0 kernel/printk/printk.c:2304
>        vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
>        _printk+0xc8/0x100 kernel/printk/printk.c:2329
>        register_console+0xb30/0x1210 kernel/printk/printk.c:3535
>        univ8250_console_init+0x35/0x50 drivers/tty/serial/8250/8250_core.c:717
>        console_init+0xba/0x5c0 kernel/printk/printk.c:3681
>        start_kernel+0x25a/0x480 init/main.c:1004
>        x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:556
>        x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:537
>        secondary_startup_64_no_verify+0x166/0x16b
>
> -> #0 (console_owner){....}-{0:0}:
>        check_prev_add kernel/locking/lockdep.c:3134 [inline]
>        check_prevs_add kernel/locking/lockdep.c:3253 [inline]
>        validate_chain kernel/locking/lockdep.c:3868 [inline]
>        __lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5136
>        lock_acquire kernel/locking/lockdep.c:5753 [inline]
>        lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5718
>        console_lock_spinning_enable kernel/printk/printk.c:1855 [inline]
>        console_emit_next_record kernel/printk/printk.c:2888 [inline]
>        console_flush_all+0x4af/0xd50 kernel/printk/printk.c:2960
>        console_unlock+0x10c/0x260 kernel/printk/printk.c:3029
>        vprintk_emit+0x17f/0x5f0 kernel/printk/printk.c:2304
>        vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
>        _printk+0xc8/0x100 kernel/printk/printk.c:2329
>        pick_eevdf kernel/sched/fair.c:963 [inline]
>        pick_next_entity kernel/sched/fair.c:5247 [inline]
>        pick_next_task_fair kernel/sched/fair.c:8244 [inline]
>        __pick_next_task_fair+0xe2/0x540 kernel/sched/fair.c:8297
>        __pick_next_task kernel/sched/core.c:6003 [inline]
>        __pick_next_task kernel/sched/core.c:5972 [inline]
>        pick_next_task kernel/sched/core.c:6061 [inline]
>        __schedule+0x38d5/0x5a00 kernel/sched/core.c:6640
>        __schedule_loop kernel/sched/core.c:6753 [inline]
>        schedule+0xe7/0x270 kernel/sched/core.c:6768
>        futex_wait_queue+0xfe/0x1f0 kernel/futex/waitwake.c:369
>        __futex_wait+0x292/0x3c0 kernel/futex/waitwake.c:668
>        futex_wait+0xea/0x380 kernel/futex/waitwake.c:696
>        do_futex+0x227/0x350 kernel/futex/syscalls.c:102
>        __do_sys_futex kernel/futex/syscalls.c:179 [inline]
>        __se_sys_futex kernel/futex/syscalls.c:160 [inline]
>        __x64_sys_futex+0x1e1/0x4c0 kernel/futex/syscalls.c:160
>        do_syscall_x64 arch/x86/entry/common.c:51 [inline]
>        do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:81
>        entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> other info that might help us debug this:
>
> Chain exists of:
>   console_owner --> &p->pi_lock --> &rq->__lock
>
>  Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(&rq->__lock);
>                                lock(&p->pi_lock);
>                                lock(&rq->__lock);
>   lock(console_owner);
>
>  *** DEADLOCK ***
>
> 3 locks held by syz-executor.0/5232:
>  #0: ffff8880b993c718 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x29/0x130 kernel/sched/core.c:558
>  #1: ffffffff8cb989a0 (console_lock){+.+.}-{0:0}, at: console_trylock_spinning kernel/printk/printk.c:1924 [inline]
>  #1: ffffffff8cb989a0 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0x162/0x5f0 kernel/printk/printk.c:2303
>  #2: ffffffff8cb98a10 (console_srcu){....}-{0:0}, at: rcu_lock_acquire include/linux/rcupdate.h:303 [inline]
>  #2: ffffffff8cb98a10 (console_srcu){....}-{0:0}, at: srcu_read_lock_nmisafe include/linux/srcu.h:232 [inline]
>  #2: ffffffff8cb98a10 (console_srcu){....}-{0:0}, at: console_srcu_read_lock kernel/printk/printk.c:286 [inline]
>  #2: ffffffff8cb98a10 (console_srcu){....}-{0:0}, at: console_flush_all+0x12a/0xd50 kernel/printk/printk.c:2952
>
> stack backtrace:
> CPU: 1 PID: 5232 Comm: syz-executor.0 Not tainted 6.6.0-rc4-next-20231005-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/06/2023
> Call Trace:
>  <TASK>
>  __dump_stack lib/dump_stack.c:88 [inline]
>  dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
>  check_noncircular+0x311/0x3f0 kernel/locking/lockdep.c:2187
>  check_prev_add kernel/locking/lockdep.c:3134 [inline]
>  check_prevs_add kernel/locking/lockdep.c:3253 [inline]
>  validate_chain kernel/locking/lockdep.c:3868 [inline]
>  __lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5136
>  lock_acquire kernel/locking/lockdep.c:5753 [inline]
>  lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5718
>  console_lock_spinning_enable kernel/printk/printk.c:1855 [inline]
>  console_emit_next_record kernel/printk/printk.c:2888 [inline]
>  console_flush_all+0x4af/0xd50 kernel/printk/printk.c:2960
>  console_unlock+0x10c/0x260 kernel/printk/printk.c:3029
>  vprintk_emit+0x17f/0x5f0 kernel/printk/printk.c:2304
>  vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
>  _printk+0xc8/0x100 kernel/printk/printk.c:2329
>  pick_eevdf kernel/sched/fair.c:963 [inline]
>  pick_next_entity kernel/sched/fair.c:5247 [inline]
>  pick_next_task_fair kernel/sched/fair.c:8244 [inline]
>  __pick_next_task_fair+0xe2/0x540 kernel/sched/fair.c:8297
>  __pick_next_task kernel/sched/core.c:6003 [inline]
>  __pick_next_task kernel/sched/core.c:5972 [inline]
>  pick_next_task kernel/sched/core.c:6061 [inline]
>  __schedule+0x38d5/0x5a00 kernel/sched/core.c:6640
>  __schedule_loop kernel/sched/core.c:6753 [inline]
>  schedule+0xe7/0x270 kernel/sched/core.c:6768
>  futex_wait_queue+0xfe/0x1f0 kernel/futex/waitwake.c:369
>  __futex_wait+0x292/0x3c0 kernel/futex/waitwake.c:668
>  futex_wait+0xea/0x380 kernel/futex/waitwake.c:696
>  do_futex+0x227/0x350 kernel/futex/syscalls.c:102
>  __do_sys_futex kernel/futex/syscalls.c:179 [inline]
>  __se_sys_futex kernel/futex/syscalls.c:160 [inline]
>  __x64_sys_futex+0x1e1/0x4c0 kernel/futex/syscalls.c:160
>  do_syscall_x64 arch/x86/entry/common.c:51 [inline]
>  do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:81
>  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x7f7af7e7cae9
> Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007f7af8af7178 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
> RAX: ffffffffffffffda RBX: 00007f7af7f9bf88 RCX: 00007f7af7e7cae9
> RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f7af7f9bf88
> RBP: 00007f7af7f9bf80 R08: 00007f7af8af76c0 R09: 00007f7af8af76c0
> R10: 0000000000000000 R11: 0000000000000246 R12: 00007f7af7f9bf8c
> R13: 000000000000000b R14: 00007fff1e139c80 R15: 00007fff1e139d68
>  </TASK>
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
> If the bug is already fixed, let syzbot know by replying with:
> #syz fix: exact-commit-title
>
> If you want to overwrite bug's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
>
> If the bug is a duplicate of another bug, reply with:
> #syz dup: exact-subject-of-another-report
>
> If you want to undo deduplication, reply with:
> #syz undup
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/000000000000e40a2906072e9567%40google.com.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [syzbot] [ext4?] possible deadlock in console_flush_all (2)
  2023-10-08  6:28 [syzbot] [kernel?] possible deadlock in console_flush_all (2) syzbot
  2023-10-17 10:06 ` Aleksandr Nogikh
@ 2023-10-17 13:07 ` syzbot
  2023-10-17 14:16   ` Peter Zijlstra
  2024-03-05 14:05 ` [syzbot] [kernel?] " Tetsuo Handa
  2024-03-20  6:50 ` [syzbot] [kernel?] " Tetsuo Handa
  3 siblings, 1 reply; 16+ messages in thread
From: syzbot @ 2023-10-17 13:07 UTC (permalink / raw)
  To: adilger.kernel, bsegall, dvyukov, linux-ext4, linux-fsdevel,
	linux-kernel, nogikh, peterz, syzkaller-bugs, tglx, tytso

syzbot has found a reproducer for the following issue on:

HEAD commit:    213f891525c2 Merge tag 'probes-fixes-v6.6-rc6' of git://gi..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=154b2f55680000
kernel config:  https://syzkaller.appspot.com/x/.config?x=4471dbb58008c9f2
dashboard link: https://syzkaller.appspot.com/bug?extid=f78380e4eae53c64125c
compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1711f94d680000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1267424d680000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/2865a8ce1485/disk-213f8915.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/9250aaf914b2/vmlinux-213f8915.xz
kernel image: https://storage.googleapis.com/syzbot-assets/a38e8bb73fa3/bzImage-213f8915.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/1f53082eac44/mount_0.gz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+f78380e4eae53c64125c@syzkaller.appspotmail.com

list_add corruption. next->prev should be prev (ffff8880b993d228), but was caff904900000000. (next=ffff8880783659f8).
======================================================
WARNING: possible circular locking dependency detected
6.6.0-rc6-syzkaller-00029-g213f891525c2 #0 Not tainted
------------------------------------------------------
syz-executor221/5041 is trying to acquire lock:
ffffffff8cab86c0 (console_owner){-...}-{0:0}, at: console_lock_spinning_enable kernel/printk/printk.c:1858 [inline]
ffffffff8cab86c0 (console_owner){-...}-{0:0}, at: console_emit_next_record kernel/printk/printk.c:2904 [inline]
ffffffff8cab86c0 (console_owner){-...}-{0:0}, at: console_flush_all+0x4ac/0xfb0 kernel/printk/printk.c:2966

but task is already holding lock:
ffff8880b993c718 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x29/0x130 kernel/sched/core.c:558

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #4 (&rq->__lock){-.-.}-{2:2}:
       _raw_spin_lock_nested+0x31/0x40 kernel/locking/spinlock.c:378
       raw_spin_rq_lock_nested+0x29/0x130 kernel/sched/core.c:558
       raw_spin_rq_lock kernel/sched/sched.h:1372 [inline]
       rq_lock kernel/sched/sched.h:1681 [inline]
       task_fork_fair+0x70/0x240 kernel/sched/fair.c:12416
       sched_cgroup_fork+0x3cf/0x510 kernel/sched/core.c:4816
       copy_process+0x45ec/0x73f0 kernel/fork.c:2609
       kernel_clone+0xfd/0x920 kernel/fork.c:2909
       user_mode_thread+0xb4/0xf0 kernel/fork.c:2987
       rest_init+0x27/0x2b0 init/main.c:691
       arch_call_rest_init+0x13/0x30 init/main.c:823
       start_kernel+0x39f/0x480 init/main.c:1068
       x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:556
       x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:537
       secondary_startup_64_no_verify+0x166/0x16b

-> #3 (&p->pi_lock){-.-.}-{2:2}:
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
       class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:518 [inline]
       try_to_wake_up+0xb0/0x15b0 kernel/sched/core.c:4230
       __wake_up_common+0x140/0x5a0 kernel/sched/wait.c:107
       __wake_up_common_lock+0xd6/0x140 kernel/sched/wait.c:138
       tty_port_default_wakeup+0x2a/0x40 drivers/tty/tty_port.c:69
       serial8250_tx_chars+0x542/0xf60 drivers/tty/serial/8250/8250_port.c:1843
       serial8250_handle_irq+0x606/0xbe0 drivers/tty/serial/8250/8250_port.c:1950
       serial8250_default_handle_irq+0x94/0x210 drivers/tty/serial/8250/8250_port.c:1970
       serial8250_interrupt+0xfc/0x200 drivers/tty/serial/8250/8250_core.c:127
       __handle_irq_event_percpu+0x22a/0x740 kernel/irq/handle.c:158
       handle_irq_event_percpu kernel/irq/handle.c:193 [inline]
       handle_irq_event+0xab/0x1e0 kernel/irq/handle.c:210
       handle_edge_irq+0x261/0xcf0 kernel/irq/chip.c:831
       generic_handle_irq_desc include/linux/irqdesc.h:161 [inline]
       handle_irq arch/x86/kernel/irq.c:238 [inline]
       __common_interrupt+0xdb/0x240 arch/x86/kernel/irq.c:257
       common_interrupt+0xa9/0xd0 arch/x86/kernel/irq.c:247
       asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:636
       native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
       arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
       acpi_safe_halt+0x1b/0x20 drivers/acpi/processor_idle.c:112
       acpi_idle_enter+0xc5/0x160 drivers/acpi/processor_idle.c:707
       cpuidle_enter_state+0x82/0x500 drivers/cpuidle/cpuidle.c:267
       cpuidle_enter+0x4e/0xa0 drivers/cpuidle/cpuidle.c:388
       cpuidle_idle_call kernel/sched/idle.c:215 [inline]
       do_idle+0x315/0x3f0 kernel/sched/idle.c:282
       cpu_startup_entry+0x50/0x60 kernel/sched/idle.c:380
       rest_init+0x16f/0x2b0 init/main.c:726
       arch_call_rest_init+0x13/0x30 init/main.c:823
       start_kernel+0x39f/0x480 init/main.c:1068
       x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:556
       x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:537
       secondary_startup_64_no_verify+0x166/0x16b

-> #2 (&tty->write_wait){-.-.}-{2:2}:
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
       __wake_up_common_lock+0xbb/0x140 kernel/sched/wait.c:137
       tty_port_default_wakeup+0x2a/0x40 drivers/tty/tty_port.c:69
       serial8250_tx_chars+0x542/0xf60 drivers/tty/serial/8250/8250_port.c:1843
       serial8250_handle_irq+0x606/0xbe0 drivers/tty/serial/8250/8250_port.c:1950
       serial8250_default_handle_irq+0x94/0x210 drivers/tty/serial/8250/8250_port.c:1970
       serial8250_interrupt+0xfc/0x200 drivers/tty/serial/8250/8250_core.c:127
       __handle_irq_event_percpu+0x22a/0x740 kernel/irq/handle.c:158
       handle_irq_event_percpu kernel/irq/handle.c:193 [inline]
       handle_irq_event+0xab/0x1e0 kernel/irq/handle.c:210
       handle_edge_irq+0x261/0xcf0 kernel/irq/chip.c:831
       generic_handle_irq_desc include/linux/irqdesc.h:161 [inline]
       handle_irq arch/x86/kernel/irq.c:238 [inline]
       __common_interrupt+0xdb/0x240 arch/x86/kernel/irq.c:257
       common_interrupt+0xa9/0xd0 arch/x86/kernel/irq.c:247
       asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:636
       native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
       arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
       acpi_safe_halt+0x1b/0x20 drivers/acpi/processor_idle.c:112
       acpi_idle_enter+0xc5/0x160 drivers/acpi/processor_idle.c:707
       cpuidle_enter_state+0x82/0x500 drivers/cpuidle/cpuidle.c:267
       cpuidle_enter+0x4e/0xa0 drivers/cpuidle/cpuidle.c:388
       cpuidle_idle_call kernel/sched/idle.c:215 [inline]
       do_idle+0x315/0x3f0 kernel/sched/idle.c:282
       cpu_startup_entry+0x50/0x60 kernel/sched/idle.c:380
       rest_init+0x16f/0x2b0 init/main.c:726
       arch_call_rest_init+0x13/0x30 init/main.c:823
       start_kernel+0x39f/0x480 init/main.c:1068
       x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:556
       x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:537
       secondary_startup_64_no_verify+0x166/0x16b

-> #1 (&port_lock_key){-.-.}-{2:2}:
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
       serial8250_console_write+0x57e/0x1060 drivers/tty/serial/8250/8250_port.c:3411
       console_emit_next_record kernel/printk/printk.c:2910 [inline]
       console_flush_all+0x4eb/0xfb0 kernel/printk/printk.c:2966
       console_unlock+0x10c/0x260 kernel/printk/printk.c:3035
       vprintk_emit+0x17f/0x5f0 kernel/printk/printk.c:2307
       vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
       _printk+0xc8/0x100 kernel/printk/printk.c:2332
       register_console+0xa67/0x10d0 kernel/printk/printk.c:3524
       univ8250_console_init+0x35/0x50 drivers/tty/serial/8250/8250_core.c:717
       console_init+0xba/0x5c0 kernel/printk/printk.c:3667
       start_kernel+0x25a/0x480 init/main.c:1004
       x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:556
       x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:537
       secondary_startup_64_no_verify+0x166/0x16b

-> #0 (console_owner){-...}-{0:0}:
       check_prev_add kernel/locking/lockdep.c:3134 [inline]
       check_prevs_add kernel/locking/lockdep.c:3253 [inline]
       validate_chain kernel/locking/lockdep.c:3868 [inline]
       __lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5136
       lock_acquire kernel/locking/lockdep.c:5753 [inline]
       lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5718
       console_lock_spinning_enable kernel/printk/printk.c:1858 [inline]
       console_emit_next_record kernel/printk/printk.c:2904 [inline]
       console_flush_all+0x4c1/0xfb0 kernel/printk/printk.c:2966
       console_unlock+0x10c/0x260 kernel/printk/printk.c:3035
       vprintk_emit+0x17f/0x5f0 kernel/printk/printk.c:2307
       vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
       _printk+0xc8/0x100 kernel/printk/printk.c:2332
       __list_add_valid_or_report+0xa2/0x100 lib/list_debug.c:29
       __list_add_valid include/linux/list.h:88 [inline]
       __list_add include/linux/list.h:150 [inline]
       list_add include/linux/list.h:169 [inline]
       account_entity_enqueue kernel/sched/fair.c:3534 [inline]
       enqueue_entity+0x97b/0x1490 kernel/sched/fair.c:5117
       enqueue_task_fair+0x15b/0xbc0 kernel/sched/fair.c:6536
       enqueue_task kernel/sched/core.c:2102 [inline]
       activate_task kernel/sched/core.c:2132 [inline]
       ttwu_do_activate+0x214/0xd90 kernel/sched/core.c:3787
       ttwu_queue kernel/sched/core.c:4029 [inline]
       try_to_wake_up+0x8e7/0x15b0 kernel/sched/core.c:4346
       autoremove_wake_function+0x16/0x150 kernel/sched/wait.c:424
       __wake_up_common+0x140/0x5a0 kernel/sched/wait.c:107
       __wake_up_common_lock+0xd6/0x140 kernel/sched/wait.c:138
       wake_up_klogd_work_func kernel/printk/printk.c:3840 [inline]
       wake_up_klogd_work_func+0x90/0xa0 kernel/printk/printk.c:3829
       irq_work_single+0x1b5/0x260 kernel/irq_work.c:221
       irq_work_run_list kernel/irq_work.c:252 [inline]
       irq_work_run_list+0x92/0xc0 kernel/irq_work.c:235
       update_process_times+0x1d5/0x220 kernel/time/timer.c:2074
       tick_sched_handle+0x8e/0x170 kernel/time/tick-sched.c:254
       tick_sched_timer+0xe9/0x110 kernel/time/tick-sched.c:1492
       __run_hrtimer kernel/time/hrtimer.c:1688 [inline]
       __hrtimer_run_queues+0x647/0xc10 kernel/time/hrtimer.c:1752
       hrtimer_interrupt+0x31b/0x800 kernel/time/hrtimer.c:1814
       local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1063 [inline]
       __sysvec_apic_timer_interrupt+0x105/0x3f0 arch/x86/kernel/apic/apic.c:1080
       sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1074
       asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
       memmove+0x50/0x1b0 arch/x86/lib/memmove_64.S:70
       ext4_ext_rm_leaf fs/ext4/extents.c:2736 [inline]
       ext4_ext_remove_space+0x1dd1/0x4390 fs/ext4/extents.c:2958
       ext4_punch_hole+0xe10/0x1040 fs/ext4/inode.c:4011
       ext4_fallocate+0xe21/0x3c30 fs/ext4/extents.c:4706
       vfs_fallocate+0x46c/0xe80 fs/open.c:324
       ioctl_preallocate+0x1a4/0x220 fs/ioctl.c:291
       file_ioctl fs/ioctl.c:334 [inline]
       do_vfs_ioctl+0x158c/0x1920 fs/ioctl.c:850
       __do_sys_ioctl fs/ioctl.c:869 [inline]
       __se_sys_ioctl fs/ioctl.c:857 [inline]
       __x64_sys_ioctl+0x112/0x210 fs/ioctl.c:857
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd

other info that might help us debug this:

Chain exists of:
  console_owner --> &p->pi_lock --> &rq->__lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&rq->__lock);
                               lock(&p->pi_lock);
                               lock(&rq->__lock);
  lock(console_owner);

 *** DEADLOCK ***

9 locks held by syz-executor221/5041:
 #0: ffff88807e30e410 (sb_writers#4){.+.+}-{0:0}, at: ioctl_preallocate+0x1a4/0x220 fs/ioctl.c:291
 #1: ffff888077854a00 (&sb->s_type->i_mutex_key#8){++++}-{3:3}, at: inode_lock include/linux/fs.h:802 [inline]
 #1: ffff888077854a00 (&sb->s_type->i_mutex_key#8){++++}-{3:3}, at: ext4_punch_hole+0x1bd/0x1040 fs/ext4/inode.c:3916
 #2: ffff888077854ba0 (mapping.invalidate_lock){++++}-{3:3}, at: filemap_invalidate_lock include/linux/fs.h:847 [inline]
 #2: ffff888077854ba0 (mapping.invalidate_lock){++++}-{3:3}, at: ext4_punch_hole+0x370/0x1040 fs/ext4/inode.c:3963
 #3: ffff888077854888 (&ei->i_data_sem){++++}-{3:3}, at: ext4_punch_hole+0xd94/0x1040 fs/ext4/inode.c:4004
 #4: ffffffff8cb98958 (log_wait.lock){-...}-{2:2}, at: __wake_up_common_lock+0xbb/0x140 kernel/sched/wait.c:137
 #5: ffff88807eea0a00 (&p->pi_lock){-.-.}-{2:2}, at: class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:518 [inline]
 #5: ffff88807eea0a00 (&p->pi_lock){-.-.}-{2:2}, at: try_to_wake_up+0xb0/0x15b0 kernel/sched/core.c:4230
 #6: ffff8880b993c718 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x29/0x130 kernel/sched/core.c:558
 #7: ffffffff8cb98b00 (console_lock){+.+.}-{0:0}, at: console_trylock_spinning kernel/printk/printk.c:1927 [inline]
 #7: ffffffff8cb98b00 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0x162/0x5f0 kernel/printk/printk.c:2306
 #8: ffffffff8cb98b70 (console_srcu){....}-{0:0}, at: rcu_lock_acquire include/linux/rcupdate.h:303 [inline]
 #8: ffffffff8cb98b70 (console_srcu){....}-{0:0}, at: srcu_read_lock_nmisafe include/linux/srcu.h:232 [inline]
 #8: ffffffff8cb98b70 (console_srcu){....}-{0:0}, at: console_srcu_read_lock kernel/printk/printk.c:292 [inline]
 #8: ffffffff8cb98b70 (console_srcu){....}-{0:0}, at: console_flush_all+0x12d/0xfb0 kernel/printk/printk.c:2958

stack backtrace:
CPU: 0 PID: 5041 Comm: syz-executor221 Not tainted 6.6.0-rc6-syzkaller-00029-g213f891525c2 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/06/2023
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
 check_noncircular+0x311/0x3f0 kernel/locking/lockdep.c:2187
 check_prev_add kernel/locking/lockdep.c:3134 [inline]
 check_prevs_add kernel/locking/lockdep.c:3253 [inline]
 validate_chain kernel/locking/lockdep.c:3868 [inline]
 __lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5136
 lock_acquire kernel/locking/lockdep.c:5753 [inline]
 lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5718
 console_lock_spinning_enable kernel/printk/printk.c:1858 [inline]
 console_emit_next_record kernel/printk/printk.c:2904 [inline]
 console_flush_all+0x4c1/0xfb0 kernel/printk/printk.c:2966
 console_unlock+0x10c/0x260 kernel/printk/printk.c:3035
 vprintk_emit+0x17f/0x5f0 kernel/printk/printk.c:2307
 vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
 _printk+0xc8/0x100 kernel/printk/printk.c:2332
 __list_add_valid_or_report+0xa2/0x100 lib/list_debug.c:29
 __list_add_valid include/linux/list.h:88 [inline]
 __list_add include/linux/list.h:150 [inline]
 list_add include/linux/list.h:169 [inline]
 account_entity_enqueue kernel/sched/fair.c:3534 [inline]
 enqueue_entity+0x97b/0x1490 kernel/sched/fair.c:5117
 enqueue_task_fair+0x15b/0xbc0 kernel/sched/fair.c:6536
 enqueue_task kernel/sched/core.c:2102 [inline]
 activate_task kernel/sched/core.c:2132 [inline]
 ttwu_do_activate+0x214/0xd90 kernel/sched/core.c:3787
 ttwu_queue kernel/sched/core.c:4029 [inline]
 try_to_wake_up+0x8e7/0x15b0 kernel/sched/core.c:4346
 autoremove_wake_function+0x16/0x150 kernel/sched/wait.c:424
 __wake_up_common+0x140/0x5a0 kernel/sched/wait.c:107
 __wake_up_common_lock+0xd6/0x140 kernel/sched/wait.c:138
 wake_up_klogd_work_func kernel/printk/printk.c:3840 [inline]
 wake_up_klogd_work_func+0x90/0xa0 kernel/printk/printk.c:3829
 irq_work_single+0x1b5/0x260 kernel/irq_work.c:221
 irq_work_run_list kernel/irq_work.c:252 [inline]
 irq_work_run_list+0x92/0xc0 kernel/irq_work.c:235
 update_process_times+0x1d5/0x220 kernel/time/timer.c:2074
 tick_sched_handle+0x8e/0x170 kernel/time/tick-sched.c:254
 tick_sched_timer+0xe9/0x110 kernel/time/tick-sched.c:1492
 __run_hrtimer kernel/time/hrtimer.c:1688 [inline]
 __hrtimer_run_queues+0x647/0xc10 kernel/time/hrtimer.c:1752
 hrtimer_interrupt+0x31b/0x800 kernel/time/hrtimer.c:1814
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1063 [inline]
 __sysvec_apic_timer_interrupt+0x105/0x3f0 arch/x86/kernel/apic/apic.c:1080
 sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1074
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
RIP: 0010:memmove+0x50/0x1b0 arch/x86/lib/memmove_64.S:71
Code: 0f 1f 44 00 00 48 81 fa a8 02 00 00 72 05 40 38 fe 74 47 48 83 ea 20 48 83 ea 20 4c 8b 1e 4c 8b 56 08 4c 8b 4e 10 4c 8b 46 18 <48> 8d 76 20 4c 89 1f 4c 89 57 08 4c 89 4f 10 4c 89 47 18 48 8d 7f
RSP: 0018:ffffc9000431f800 EFLAGS: 00000286
RAX: ffff88807723c078 RBX: ffff88807723c002 RCX: 0000000000000000
RDX: fffffffffec10288 RSI: ffff88807862bd44 RDI: ffff88807862bd38
RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88807723c07c
R13: ffff88807e30c000 R14: dffffc0000000000 R15: dffffc0000000000
 ext4_ext_rm_leaf fs/ext4/extents.c:2736 [inline]
 ext4_ext_remove_space+0x1dd1/0x4390 fs/ext4/extents.c:2958
 ext4_punch_hole+0xe10/0x1040 fs/ext4/inode.c:4011
 ext4_fallocate+0xe21/0x3c30 fs/ext4/extents.c:4706
 vfs_fallocate+0x46c/0xe80 fs/open.c:324
 ioctl_preallocate+0x1a4/0x220 fs/ioctl.c:291
 file_ioctl fs/ioctl.c:334 [inline]
 do_vfs_ioctl+0x158c/0x1920 fs/ioctl.c:850
 __do_sys_ioctl fs/ioctl.c:869 [inline]
 __se_sys_ioctl fs/ioctl.c:857 [inline]
 __x64_sys_ioctl+0x112/0x210 fs/ioctl.c:857
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7ff93a3c62d9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 61 1a 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ff93a361218 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ff93a44d6d8 RCX: 00007ff93a3c62d9
RDX: 0000000020000080 RSI: 000000004030582b RDI: 0000000000000004
RBP: 00007ff93a44d6d0 R08: 00007ffd0d4db467 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ff93a41a578
R13: 0000000000000006 R14: 00007ffd0d4db380 R15: 6f6f6c2f7665642f
 </TASK>
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:29!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 5041 Comm: syz-executor221 Not tainted 6.6.0-rc6-syzkaller-00029-g213f891525c2 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/06/2023
RIP: 0010:__list_add_valid_or_report+0xa2/0x100 lib/list_debug.c:29
Code: c7 c7 20 2d e9 8a e8 3d 31 3f fd 0f 0b 48 c7 c7 c0 2d e9 8a e8 2f 31 3f fd 0f 0b 48 89 d9 48 c7 c7 20 2e e9 8a e8 1e 31 3f fd <0f> 0b 48 89 f1 48 c7 c7 a0 2e e9 8a 48 89 de e8 0a 31 3f fd 0f 0b
RSP: 0018:ffffc900000079c0 EFLAGS: 00010086
RAX: 0000000000000075 RBX: ffff8880783659f8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff816b8952 RDI: 0000000000000005
RBP: ffff88807eea00b8 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000080010004 R11: 0000000000000001 R12: ffff88807eea0120
R13: ffff88807eea3c00 R14: ffff8880b993c820 R15: ffff8880b993d228
FS:  00007ff93a3616c0(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000002002b000 CR3: 0000000020467000 CR4: 0000000000350ef0
Call Trace:
 <IRQ>
 __list_add_valid include/linux/list.h:88 [inline]
 __list_add include/linux/list.h:150 [inline]
 list_add include/linux/list.h:169 [inline]
 account_entity_enqueue kernel/sched/fair.c:3534 [inline]
 enqueue_entity+0x97b/0x1490 kernel/sched/fair.c:5117
 enqueue_task_fair+0x15b/0xbc0 kernel/sched/fair.c:6536
 enqueue_task kernel/sched/core.c:2102 [inline]
 activate_task kernel/sched/core.c:2132 [inline]
 ttwu_do_activate+0x214/0xd90 kernel/sched/core.c:3787
 ttwu_queue kernel/sched/core.c:4029 [inline]
 try_to_wake_up+0x8e7/0x15b0 kernel/sched/core.c:4346
 autoremove_wake_function+0x16/0x150 kernel/sched/wait.c:424
 __wake_up_common+0x140/0x5a0 kernel/sched/wait.c:107
 __wake_up_common_lock+0xd6/0x140 kernel/sched/wait.c:138
 wake_up_klogd_work_func kernel/printk/printk.c:3840 [inline]
 wake_up_klogd_work_func+0x90/0xa0 kernel/printk/printk.c:3829
 irq_work_single+0x1b5/0x260 kernel/irq_work.c:221
 irq_work_run_list kernel/irq_work.c:252 [inline]
 irq_work_run_list+0x92/0xc0 kernel/irq_work.c:235
 update_process_times+0x1d5/0x220 kernel/time/timer.c:2074
 tick_sched_handle+0x8e/0x170 kernel/time/tick-sched.c:254
 tick_sched_timer+0xe9/0x110 kernel/time/tick-sched.c:1492
 __run_hrtimer kernel/time/hrtimer.c:1688 [inline]
 __hrtimer_run_queues+0x647/0xc10 kernel/time/hrtimer.c:1752
 hrtimer_interrupt+0x31b/0x800 kernel/time/hrtimer.c:1814
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1063 [inline]
 __sysvec_apic_timer_interrupt+0x105/0x3f0 arch/x86/kernel/apic/apic.c:1080
 sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1074
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
RIP: 0010:memmove+0x50/0x1b0 arch/x86/lib/memmove_64.S:71
Code: 0f 1f 44 00 00 48 81 fa a8 02 00 00 72 05 40 38 fe 74 47 48 83 ea 20 48 83 ea 20 4c 8b 1e 4c 8b 56 08 4c 8b 4e 10 4c 8b 46 18 <48> 8d 76 20 4c 89 1f 4c 89 57 08 4c 89 4f 10 4c 89 47 18 48 8d 7f
RSP: 0018:ffffc9000431f800 EFLAGS: 00000286
RAX: ffff88807723c078 RBX: ffff88807723c002 RCX: 0000000000000000
RDX: fffffffffec10288 RSI: ffff88807862bd44 RDI: ffff88807862bd38
RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88807723c07c
R13: ffff88807e30c000 R14: dffffc0000000000 R15: dffffc0000000000
 ext4_ext_rm_leaf fs/ext4/extents.c:2736 [inline]
 ext4_ext_remove_space+0x1dd1/0x4390 fs/ext4/extents.c:2958
 ext4_punch_hole+0xe10/0x1040 fs/ext4/inode.c:4011
 ext4_fallocate+0xe21/0x3c30 fs/ext4/extents.c:4706
 vfs_fallocate+0x46c/0xe80 fs/open.c:324
 ioctl_preallocate+0x1a4/0x220 fs/ioctl.c:291
 file_ioctl fs/ioctl.c:334 [inline]
 do_vfs_ioctl+0x158c/0x1920 fs/ioctl.c:850
 __do_sys_ioctl fs/ioctl.c:869 [inline]
 __se_sys_ioctl fs/ioctl.c:857 [inline]
 __x64_sys_ioctl+0x112/0x210 fs/ioctl.c:857
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7ff93a3c62d9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 61 1a 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ff93a361218 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ff93a44d6d8 RCX: 00007ff93a3c62d9
RDX: 0000000020000080 RSI: 000000004030582b RDI: 0000000000000004
RBP: 00007ff93a44d6d0 R08: 00007ffd0d4db467 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ff93a41a578
R13: 0000000000000006 R14: 00007ffd0d4db380 R15: 6f6f6c2f7665642f
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:__list_add_valid_or_report+0xa2/0x100 lib/list_debug.c:29
Code: c7 c7 20 2d e9 8a e8 3d 31 3f fd 0f 0b 48 c7 c7 c0 2d e9 8a e8 2f 31 3f fd 0f 0b 48 89 d9 48 c7 c7 20 2e e9 8a e8 1e 31 3f fd <0f> 0b 48 89 f1 48 c7 c7 a0 2e e9 8a 48 89 de e8 0a 31 3f fd 0f 0b
RSP: 0018:ffffc900000079c0 EFLAGS: 00010086
RAX: 0000000000000075 RBX: ffff8880783659f8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff816b8952 RDI: 0000000000000005
RBP: ffff88807eea00b8 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000080010004 R11: 0000000000000001 R12: ffff88807eea0120
R13: ffff88807eea3c00 R14: ffff8880b993c820 R15: ffff8880b993d228
FS:  00007ff93a3616c0(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000002002b000 CR3: 0000000020467000 CR4: 0000000000350ef0
----------------
Code disassembly (best guess):
   0:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
   5:	48 81 fa a8 02 00 00 	cmp    $0x2a8,%rdx
   c:	72 05                	jb     0x13
   e:	40 38 fe             	cmp    %dil,%sil
  11:	74 47                	je     0x5a
  13:	48 83 ea 20          	sub    $0x20,%rdx
  17:	48 83 ea 20          	sub    $0x20,%rdx
  1b:	4c 8b 1e             	mov    (%rsi),%r11
  1e:	4c 8b 56 08          	mov    0x8(%rsi),%r10
  22:	4c 8b 4e 10          	mov    0x10(%rsi),%r9
  26:	4c 8b 46 18          	mov    0x18(%rsi),%r8
* 2a:	48 8d 76 20          	lea    0x20(%rsi),%rsi <-- trapping instruction
  2e:	4c 89 1f             	mov    %r11,(%rdi)
  31:	4c 89 57 08          	mov    %r10,0x8(%rdi)
  35:	4c 89 4f 10          	mov    %r9,0x10(%rdi)
  39:	4c 89 47 18          	mov    %r8,0x18(%rdi)
  3d:	48                   	rex.W
  3e:	8d                   	.byte 0x8d
  3f:	7f                   	.byte 0x7f


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [syzbot] [kernel?] possible deadlock in console_flush_all (2)
  2023-10-17 10:06 ` Aleksandr Nogikh
@ 2023-10-17 14:13   ` Peter Zijlstra
  0 siblings, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2023-10-17 14:13 UTC (permalink / raw)
  To: Aleksandr Nogikh
  Cc: syzbot, Benjamin Segall, linux-kernel, syzkaller-bugs, tglx,
	Dmitry Vyukov

On Tue, Oct 17, 2023 at 12:06:12PM +0200, Aleksandr Nogikh wrote:
> This is currently causing more than 50% daily crashes on syzbot:
> 
> https://syzkaller.appspot.com/upstream/graph/crashes
> 
> It began to appear after next-20231004. Judging by the diff between
> next-20231003 and next-20231004, it looks like quite a few changes to
> kernel/sched were merged in between. Could it be caused by any of
> those?

> > EEVDF scheduling fail, picking leftmost

That was in next for a little bit, can you still reproduce with current
next?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [syzbot] [ext4?] possible deadlock in console_flush_all (2)
  2023-10-17 13:07 ` [syzbot] [ext4?] " syzbot
@ 2023-10-17 14:16   ` Peter Zijlstra
  2023-10-17 14:28     ` Aleksandr Nogikh
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2023-10-17 14:16 UTC (permalink / raw)
  To: syzbot
  Cc: adilger.kernel, bsegall, dvyukov, linux-ext4, linux-fsdevel,
	linux-kernel, nogikh, syzkaller-bugs, tglx, tytso

On Tue, Oct 17, 2023 at 06:07:50AM -0700, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
> 
> HEAD commit:    213f891525c2 Merge tag 'probes-fixes-v6.6-rc6' of git://gi..

> list_add corruption. next->prev should be prev (ffff8880b993d228), but was caff904900000000. (next=ffff8880783659f8).

Urgh, I've not seen that happen before. How reliable does this trigger?

>  __list_add_valid_or_report+0xa2/0x100 lib/list_debug.c:29
>  __list_add_valid include/linux/list.h:88 [inline]
>  __list_add include/linux/list.h:150 [inline]
>  list_add include/linux/list.h:169 [inline]
>  account_entity_enqueue kernel/sched/fair.c:3534 [inline]
>  enqueue_entity+0x97b/0x1490 kernel/sched/fair.c:5117
>  enqueue_task_fair+0x15b/0xbc0 kernel/sched/fair.c:6536
>  enqueue_task kernel/sched/core.c:2102 [inline]
>  activate_task kernel/sched/core.c:2132 [inline]
>  ttwu_do_activate+0x214/0xd90 kernel/sched/core.c:3787
>  ttwu_queue kernel/sched/core.c:4029 [inline]
>  try_to_wake_up+0x8e7/0x15b0 kernel/sched/core.c:4346
>  autoremove_wake_function+0x16/0x150 kernel/sched/wait.c:424
>  __wake_up_common+0x140/0x5a0 kernel/sched/wait.c:107
>  __wake_up_common_lock+0xd6/0x140 kernel/sched/wait.c:138
>  wake_up_klogd_work_func kernel/printk/printk.c:3840 [inline]
>  wake_up_klogd_work_func+0x90/0xa0 kernel/printk/printk.c:3829
>  irq_work_single+0x1b5/0x260 kernel/irq_work.c:221
>  irq_work_run_list kernel/irq_work.c:252 [inline]
>  irq_work_run_list+0x92/0xc0 kernel/irq_work.c:235
>  update_process_times+0x1d5/0x220 kernel/time/timer.c:2074
>  tick_sched_handle+0x8e/0x170 kernel/time/tick-sched.c:254
>  tick_sched_timer+0xe9/0x110 kernel/time/tick-sched.c:1492
>  __run_hrtimer kernel/time/hrtimer.c:1688 [inline]
>  __hrtimer_run_queues+0x647/0xc10 kernel/time/hrtimer.c:1752
>  hrtimer_interrupt+0x31b/0x800 kernel/time/hrtimer.c:1814
>  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1063 [inline]
>  __sysvec_apic_timer_interrupt+0x105/0x3f0 arch/x86/kernel/apic/apic.c:1080
>  sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1074
>  </IRQ>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [syzbot] [ext4?] possible deadlock in console_flush_all (2)
  2023-10-17 14:16   ` Peter Zijlstra
@ 2023-10-17 14:28     ` Aleksandr Nogikh
  0 siblings, 0 replies; 16+ messages in thread
From: Aleksandr Nogikh @ 2023-10-17 14:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: syzbot, adilger.kernel, bsegall, dvyukov, linux-ext4,
	linux-fsdevel, linux-kernel, syzkaller-bugs, tglx, tytso

Thank you for the information!

I've looked closer -- syzbot is currently fuzzing a somewhat old -next
version now (20231005), it could not upgrade after that because of a
-next boot error
(https://syzkaller.appspot.com/bug?extid=6867a9777f4b8dc4e256, already
has a patch). It can explain why we're still seeing these crashes.

Once the fix commit for "linux-next boot error: KASAN:
slab-out-of-bounds Write in vhci_setup" reaches next, it should all be
fine.

And it looks like we should just automatically stop -next fuzzing if
we cannot upgrade the kernel for more than several days in a row..

On Tue, Oct 17, 2023 at 4:16 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Oct 17, 2023 at 06:07:50AM -0700, syzbot wrote:
> > syzbot has found a reproducer for the following issue on:
> >
> > HEAD commit:    213f891525c2 Merge tag 'probes-fixes-v6.6-rc6' of git://gi..
>
> > list_add corruption. next->prev should be prev (ffff8880b993d228), but was caff904900000000. (next=ffff8880783659f8).
>
> Urgh, I've not seen that happen before. How reliable does this trigger?
>
> >  __list_add_valid_or_report+0xa2/0x100 lib/list_debug.c:29
> >  __list_add_valid include/linux/list.h:88 [inline]
> >  __list_add include/linux/list.h:150 [inline]
> >  list_add include/linux/list.h:169 [inline]
> >  account_entity_enqueue kernel/sched/fair.c:3534 [inline]
> >  enqueue_entity+0x97b/0x1490 kernel/sched/fair.c:5117
> >  enqueue_task_fair+0x15b/0xbc0 kernel/sched/fair.c:6536
> >  enqueue_task kernel/sched/core.c:2102 [inline]
> >  activate_task kernel/sched/core.c:2132 [inline]
> >  ttwu_do_activate+0x214/0xd90 kernel/sched/core.c:3787
> >  ttwu_queue kernel/sched/core.c:4029 [inline]
> >  try_to_wake_up+0x8e7/0x15b0 kernel/sched/core.c:4346
> >  autoremove_wake_function+0x16/0x150 kernel/sched/wait.c:424
> >  __wake_up_common+0x140/0x5a0 kernel/sched/wait.c:107
> >  __wake_up_common_lock+0xd6/0x140 kernel/sched/wait.c:138
> >  wake_up_klogd_work_func kernel/printk/printk.c:3840 [inline]
> >  wake_up_klogd_work_func+0x90/0xa0 kernel/printk/printk.c:3829
> >  irq_work_single+0x1b5/0x260 kernel/irq_work.c:221
> >  irq_work_run_list kernel/irq_work.c:252 [inline]
> >  irq_work_run_list+0x92/0xc0 kernel/irq_work.c:235
> >  update_process_times+0x1d5/0x220 kernel/time/timer.c:2074
> >  tick_sched_handle+0x8e/0x170 kernel/time/tick-sched.c:254
> >  tick_sched_timer+0xe9/0x110 kernel/time/tick-sched.c:1492
> >  __run_hrtimer kernel/time/hrtimer.c:1688 [inline]
> >  __hrtimer_run_queues+0x647/0xc10 kernel/time/hrtimer.c:1752
> >  hrtimer_interrupt+0x31b/0x800 kernel/time/hrtimer.c:1814
> >  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1063 [inline]
> >  __sysvec_apic_timer_interrupt+0x105/0x3f0 arch/x86/kernel/apic/apic.c:1080
> >  sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1074
> >  </IRQ>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [syzbot] [kernel?] possible deadlock in console_flush_all (2)
  2023-10-08  6:28 [syzbot] [kernel?] possible deadlock in console_flush_all (2) syzbot
  2023-10-17 10:06 ` Aleksandr Nogikh
  2023-10-17 13:07 ` [syzbot] [ext4?] " syzbot
@ 2024-03-05 14:05 ` Tetsuo Handa
  2024-03-05 15:21   ` [syzbot] [serial?] " syzbot
  2024-03-20  6:50 ` [syzbot] [kernel?] " Tetsuo Handa
  3 siblings, 1 reply; 16+ messages in thread
From: Tetsuo Handa @ 2024-03-05 14:05 UTC (permalink / raw)
  To: syzbot, linux-kernel, syzkaller-bugs

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 001fe047bd5d..237bcab72004 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -48,6 +48,7 @@
 #include <linux/mutex_api.h>
 #include <linux/plist.h>
 #include <linux/poll.h>
+#include <linux/printk.h>
 #include <linux/proc_fs.h>
 #include <linux/profile.h>
 #include <linux/psi.h>
@@ -1680,6 +1681,7 @@ static inline void
 rq_lock_irqsave(struct rq *rq, struct rq_flags *rf)
 	__acquires(rq->lock)
 {
+	printk_deferred_enter();
 	raw_spin_rq_lock_irqsave(rq, rf->flags);
 	rq_pin_lock(rq, rf);
 }
@@ -1688,6 +1690,7 @@ static inline void
 rq_lock_irq(struct rq *rq, struct rq_flags *rf)
 	__acquires(rq->lock)
 {
+	printk_deferred_enter();
 	raw_spin_rq_lock_irq(rq);
 	rq_pin_lock(rq, rf);
 }
@@ -1696,6 +1699,7 @@ static inline void
 rq_lock(struct rq *rq, struct rq_flags *rf)
 	__acquires(rq->lock)
 {
+	printk_deferred_enter();
 	raw_spin_rq_lock(rq);
 	rq_pin_lock(rq, rf);
 }
@@ -1706,6 +1710,7 @@ rq_unlock_irqrestore(struct rq *rq, struct rq_flags *rf)
 {
 	rq_unpin_lock(rq, rf);
 	raw_spin_rq_unlock_irqrestore(rq, rf->flags);
+	printk_deferred_exit();
 }
 
 static inline void
@@ -1714,6 +1719,7 @@ rq_unlock_irq(struct rq *rq, struct rq_flags *rf)
 {
 	rq_unpin_lock(rq, rf);
 	raw_spin_rq_unlock_irq(rq);
+	printk_deferred_exit();
 }
 
 static inline void
@@ -1722,6 +1728,7 @@ rq_unlock(struct rq *rq, struct rq_flags *rf)
 {
 	rq_unpin_lock(rq, rf);
 	raw_spin_rq_unlock(rq);
+	printk_deferred_exit();
 }
 
 DEFINE_LOCK_GUARD_1(rq_lock, struct rq,


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [syzbot] [serial?] possible deadlock in console_flush_all (2)
  2024-03-05 14:05 ` [syzbot] [kernel?] " Tetsuo Handa
@ 2024-03-05 15:21   ` syzbot
  0 siblings, 0 replies; 16+ messages in thread
From: syzbot @ 2024-03-05 15:21 UTC (permalink / raw)
  To: linux-kernel, penguin-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
general protection fault in locks_remove_posix

general protection fault, probably for non-canonical address 0xdffffc0020008013: 0000 [#1] PREEMPT SMP KASAN
KASAN: probably user-memory-access in range [0x0000000100040098-0x000000010004009f]
CPU: 1 PID: 5487 Comm: dhcpcd-run-hook Not tainted 6.8.0-rc6-syzkaller-g8f50d5c42355-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
RIP: 0010:list_empty include/linux/list.h:373 [inline]
RIP: 0010:locks_remove_posix+0xff/0x440 fs/locks.c:2554
Code: 4d 8b be 20 02 00 00 4d 85 ff 0f 84 9f 01 00 00 e8 16 ea 7b ff 49 8d 47 50 48 ba 00 00 00 00 00 fc ff df 48 89 c1 48 c1 e9 03 <80> 3c 11 00 0f 85 09 03 00 00 49 8b 57 50 48 39 d0 0f 84 6e 01 00
RSP: 0018:ffffc9000388f978 EFLAGS: 00010202
RAX: 0000000100040098 RBX: ffff888019b36500 RCX: 0000000020008013
RDX: dffffc0000000000 RSI: ffffffff820b2e1a RDI: ffff888076090730
RBP: 1ffff92000711f30 R08: 0000000000000003 R09: 0000000000004000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffc9000388f9b0
R13: ffff88802a7e0b40 R14: ffff888076090510 R15: 0000000100040048
FS:  0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000100040340 CR3: 00000000183e4000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 filp_flush+0x11b/0x1a0 fs/open.c:1518
 filp_close+0x1c/0x30 fs/open.c:1527
 close_files fs/file.c:432 [inline]
 put_files_struct fs/file.c:447 [inline]
 put_files_struct+0x1df/0x360 fs/file.c:444
 exit_files+0x82/0xb0 fs/file.c:464
 do_exit+0xa56/0x2ad0 kernel/exit.c:866
 do_group_exit+0xd4/0x2a0 kernel/exit.c:1020
 get_signal+0x23b5/0x2790 kernel/signal.c:2893
 arch_do_signal_or_restart+0x90/0x7f0 arch/x86/kernel/signal.c:310
 exit_to_user_mode_loop kernel/entry/common.c:105 [inline]
 exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
 irqentry_exit_to_user_mode+0x13c/0x280 kernel/entry/common.c:225
 asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:570
RIP: 0033:0x7fbec7157428
Code: Unable to access opcode bytes at 0x7fbec71573fe.
RSP: 002b:00007fff8928a8a0 EFLAGS: 00010246
RAX: 0000000100040048 RBX: 0000555b854932c0 RCX: 00007fbec7205098
RDX: 0000000000000000 RSI: 0000555b85495d30 RDI: 0000555b85494930
RBP: 0000555b85494930 R08: 0000000000000000 R09: 0000000000000000
R10: 00007fbec7205088 R11: 0000000000000246 R12: 0000000000000000
R13: 0000555b85495d30 R14: 0000000000000000 R15: 0000000000000000
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:list_empty include/linux/list.h:373 [inline]
RIP: 0010:locks_remove_posix+0xff/0x440 fs/locks.c:2554
Code: 4d 8b be 20 02 00 00 4d 85 ff 0f 84 9f 01 00 00 e8 16 ea 7b ff 49 8d 47 50 48 ba 00 00 00 00 00 fc ff df 48 89 c1 48 c1 e9 03 <80> 3c 11 00 0f 85 09 03 00 00 49 8b 57 50 48 39 d0 0f 84 6e 01 00
RSP: 0018:ffffc9000388f978 EFLAGS: 00010202
RAX: 0000000100040098 RBX: ffff888019b36500 RCX: 0000000020008013
RDX: dffffc0000000000 RSI: ffffffff820b2e1a RDI: ffff888076090730
RBP: 1ffff92000711f30 R08: 0000000000000003 R09: 0000000000004000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffc9000388f9b0
R13: ffff88802a7e0b40 R14: ffff888076090510 R15: 0000000100040048
FS:  0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000100040340 CR3: 00000000183e4000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess):
   0:	4d 8b be 20 02 00 00 	mov    0x220(%r14),%r15
   7:	4d 85 ff             	test   %r15,%r15
   a:	0f 84 9f 01 00 00    	je     0x1af
  10:	e8 16 ea 7b ff       	call   0xff7bea2b
  15:	49 8d 47 50          	lea    0x50(%r15),%rax
  19:	48 ba 00 00 00 00 00 	movabs $0xdffffc0000000000,%rdx
  20:	fc ff df
  23:	48 89 c1             	mov    %rax,%rcx
  26:	48 c1 e9 03          	shr    $0x3,%rcx
* 2a:	80 3c 11 00          	cmpb   $0x0,(%rcx,%rdx,1) <-- trapping instruction
  2e:	0f 85 09 03 00 00    	jne    0x33d
  34:	49 8b 57 50          	mov    0x50(%r15),%rdx
  38:	48 39 d0             	cmp    %rdx,%rax
  3b:	0f                   	.byte 0xf
  3c:	84 6e 01             	test   %ch,0x1(%rsi)


Tested on:

commit:         8f50d5c4 Merge branch 'Allow struct_ops maps with a la..
git tree:       bpf-next
console output: https://syzkaller.appspot.com/x/log.txt?x=15c2d486180000
kernel config:  https://syzkaller.appspot.com/x/.config?x=7e085fc648b10d1e
dashboard link: https://syzkaller.appspot.com/bug?extid=f78380e4eae53c64125c
compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=121c92fe180000


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [syzbot] [kernel?] possible deadlock in console_flush_all (2)
  2023-10-08  6:28 [syzbot] [kernel?] possible deadlock in console_flush_all (2) syzbot
                   ` (2 preceding siblings ...)
  2024-03-05 14:05 ` [syzbot] [kernel?] " Tetsuo Handa
@ 2024-03-20  6:50 ` Tetsuo Handa
  2024-03-20  6:56   ` Alexei Starovoitov
  3 siblings, 1 reply; 16+ messages in thread
From: Tetsuo Handa @ 2024-03-20  6:50 UTC (permalink / raw)
  To: syzbot, syzkaller-bugs, Alexei Starovoitov, Daniel Borkmann,
	John Fastabend, Petr Mladek, Steven Rostedt, John Ogness,
	Sergey Senozhatsky
  Cc: tglx, linux-kernel, bpf

Hello, bpf developers.

syzbot is reporting circular locking dependency because bpf program is hitting
printk() path. To avoid this dependency, something like
https://syzkaller.appspot.com/text?tag=Patch&x=121c92fe180000 is needed.

Is it possible to call printk_deferred_{enter,exit}() from the bpf side (e.g.
bpf_prog_run()) so that we don't need to add overhead on the scheduler side
when bpf programs are not registered?

printk_deferred_{enter,exit}() requires that printk_deferred_exit() is called
on a CPU which printk_deferred_enter() was called, for percpu counter is used.
Is migrate_{disable,enable}() also needed around printk_deferred_{enter,exit}() ?

On 2023/10/08 15:28, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    7d730f1bf6f3 Add linux-next specific files for 20231005
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=172d36e9680000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=f532286be4fff4b5
> dashboard link: https://syzkaller.appspot.com/bug?extid=f78380e4eae53c64125c
> compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [syzbot] [kernel?] possible deadlock in console_flush_all (2)
  2024-03-20  6:50 ` [syzbot] [kernel?] " Tetsuo Handa
@ 2024-03-20  6:56   ` Alexei Starovoitov
  2024-03-20  7:04     ` Tetsuo Handa
  0 siblings, 1 reply; 16+ messages in thread
From: Alexei Starovoitov @ 2024-03-20  6:56 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: syzbot, syzkaller-bugs, Alexei Starovoitov, Daniel Borkmann,
	John Fastabend, Petr Mladek, Steven Rostedt, John Ogness,
	Sergey Senozhatsky, Thomas Gleixner, LKML, bpf

On Tue, Mar 19, 2024 at 11:50 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> Hello, bpf developers.
>
> syzbot is reporting circular locking dependency because bpf program is hitting
> printk() path. To avoid this dependency, something like
> https://syzkaller.appspot.com/text?tag=Patch&x=121c92fe180000 is needed.
>
> Is it possible to call printk_deferred_{enter,exit}() from the bpf side (e.g.
> bpf_prog_run()) so that we don't need to add overhead on the scheduler side
> when bpf programs are not registered?
>
> printk_deferred_{enter,exit}() requires that printk_deferred_exit() is called
> on a CPU which printk_deferred_enter() was called, for percpu counter is used.
> Is migrate_{disable,enable}() also needed around printk_deferred_{enter,exit}() ?
>
> On 2023/10/08 15:28, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit:    7d730f1bf6f3 Add linux-next specific files for 20231005
> > git tree:       linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=172d36e9680000

This has nothing to do with bpf.
bpf never calls printk().

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [syzbot] [kernel?] possible deadlock in console_flush_all (2)
  2024-03-20  6:56   ` Alexei Starovoitov
@ 2024-03-20  7:04     ` Tetsuo Handa
  2024-03-20  7:12       ` Alexei Starovoitov
  0 siblings, 1 reply; 16+ messages in thread
From: Tetsuo Handa @ 2024-03-20  7:04 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: syzbot, syzkaller-bugs, Alexei Starovoitov, Daniel Borkmann,
	John Fastabend, Petr Mladek, Steven Rostedt, John Ogness,
	Sergey Senozhatsky, Thomas Gleixner, LKML, bpf

On 2024/03/20 15:56, Alexei Starovoitov wrote:
> This has nothing to do with bpf.
> bpf never calls printk().

Please see the Sample crash report in the dashboard.
bpf program is hitting printk() via report_bug().

-> #0 (console_owner){....}-{0:0}:
       check_prev_add kernel/locking/lockdep.c:3134 [inline]
       check_prevs_add kernel/locking/lockdep.c:3253 [inline]
       validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
       __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
       lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
       console_lock_spinning_enable kernel/printk/printk.c:1873 [inline]
       console_emit_next_record kernel/printk/printk.c:2901 [inline]
       console_flush_all+0x810/0xfd0 kernel/printk/printk.c:2973
       console_unlock+0x13b/0x4d0 kernel/printk/printk.c:3042
       vprintk_emit+0x5a6/0x770 kernel/printk/printk.c:2342
       _printk+0xd5/0x120 kernel/printk/printk.c:2367
       __report_bug lib/bug.c:195 [inline]
       report_bug+0x346/0x500 lib/bug.c:219
       handle_bug+0x3e/0x70 arch/x86/kernel/traps.c:239
       exc_invalid_op+0x1a/0x50 arch/x86/kernel/traps.c:260
       asm_exc_invalid_op+0x1a/0x20 arch/x86/include/asm/idtentry.h:621
       __local_bh_enable_ip+0x1be/0x200 kernel/softirq.c:362
       spin_unlock_bh include/linux/spinlock.h:396 [inline]
       __sock_map_delete net/core/sock_map.c:424 [inline]
       sock_map_delete_elem+0xca/0x140 net/core/sock_map.c:446
       bpf_prog_2c29ac5cdc6b1842+0x42/0x46
       bpf_dispatcher_nop_func include/linux/bpf.h:1234 [inline]
       __bpf_prog_run include/linux/filter.h:657 [inline]
       bpf_prog_run include/linux/filter.h:664 [inline]
       __bpf_trace_run kernel/trace/bpf_trace.c:2381 [inline]
       bpf_trace_run4+0x25a/0x490 kernel/trace/bpf_trace.c:2422
       trace_sched_switch include/trace/events/sched.h:222 [inline]
       __schedule+0x250b/0x49d0 kernel/sched/core.c:6733
       __schedule_loop kernel/sched/core.c:6813 [inline]
       schedule+0x14b/0x320 kernel/sched/core.c:6828
       ptrace_stop+0x5b4/0x940 kernel/signal.c:2358
       ptrace_do_notify kernel/signal.c:2395 [inline]
       ptrace_notify+0x255/0x380 kernel/signal.c:2407
       ptrace_report_syscall include/linux/ptrace.h:415 [inline]
       ptrace_report_syscall_exit include/linux/ptrace.h:477 [inline]
       syscall_exit_work+0xbd/0x170 kernel/entry/common.c:167
       syscall_exit_to_user_mode_prepare kernel/entry/common.c:194 [inline]
       __syscall_exit_to_user_mode_work kernel/entry/common.c:199 [inline]
       syscall_exit_to_user_mode+0x273/0x360 kernel/entry/common.c:212
       do_syscall_64+0x10a/0x240 arch/x86/entry/common.c:89
       entry_SYSCALL_64_after_hwframe+0x6d/0x75


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [syzbot] [kernel?] possible deadlock in console_flush_all (2)
  2024-03-20  7:04     ` Tetsuo Handa
@ 2024-03-20  7:12       ` Alexei Starovoitov
  2024-03-20  7:30         ` Tetsuo Handa
  0 siblings, 1 reply; 16+ messages in thread
From: Alexei Starovoitov @ 2024-03-20  7:12 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: syzbot, syzkaller-bugs, Alexei Starovoitov, Daniel Borkmann,
	John Fastabend, Petr Mladek, Steven Rostedt, John Ogness,
	Sergey Senozhatsky, Thomas Gleixner, LKML, bpf

On Wed, Mar 20, 2024 at 12:05 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2024/03/20 15:56, Alexei Starovoitov wrote:
> > This has nothing to do with bpf.
> > bpf never calls printk().
>
> Please see the Sample crash report in the dashboard.
> bpf program is hitting printk() via report_bug().

Exactly. local_bh_neable is simply asking for a splat.
_this_ bug is in printk.
It's a generic issue.

sockmap bug is orthogonal.
It's already being looked at.

> -> #0 (console_owner){....}-{0:0}:
>        check_prev_add kernel/locking/lockdep.c:3134 [inline]
>        check_prevs_add kernel/locking/lockdep.c:3253 [inline]
>        validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
>        __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
>        lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
>        console_lock_spinning_enable kernel/printk/printk.c:1873 [inline]
>        console_emit_next_record kernel/printk/printk.c:2901 [inline]
>        console_flush_all+0x810/0xfd0 kernel/printk/printk.c:2973
>        console_unlock+0x13b/0x4d0 kernel/printk/printk.c:3042
>        vprintk_emit+0x5a6/0x770 kernel/printk/printk.c:2342
>        _printk+0xd5/0x120 kernel/printk/printk.c:2367
>        __report_bug lib/bug.c:195 [inline]
>        report_bug+0x346/0x500 lib/bug.c:219
>        handle_bug+0x3e/0x70 arch/x86/kernel/traps.c:239
>        exc_invalid_op+0x1a/0x50 arch/x86/kernel/traps.c:260
>        asm_exc_invalid_op+0x1a/0x20 arch/x86/include/asm/idtentry.h:621
>        __local_bh_enable_ip+0x1be/0x200 kernel/softirq.c:362
>        spin_unlock_bh include/linux/spinlock.h:396 [inline]
>        __sock_map_delete net/core/sock_map.c:424 [inline]
>        sock_map_delete_elem+0xca/0x140 net/core/sock_map.c:446
>        bpf_prog_2c29ac5cdc6b1842+0x42/0x46
>        bpf_dispatcher_nop_func include/linux/bpf.h:1234 [inline]
>        __bpf_prog_run include/linux/filter.h:657 [inline]
>        bpf_prog_run include/linux/filter.h:664 [inline]
>        __bpf_trace_run kernel/trace/bpf_trace.c:2381 [inline]
>        bpf_trace_run4+0x25a/0x490 kernel/trace/bpf_trace.c:2422
>        trace_sched_switch include/trace/events/sched.h:222 [inline]
>        __schedule+0x250b/0x49d0 kernel/sched/core.c:6733
>        __schedule_loop kernel/sched/core.c:6813 [inline]
>        schedule+0x14b/0x320 kernel/sched/core.c:6828
>        ptrace_stop+0x5b4/0x940 kernel/signal.c:2358
>        ptrace_do_notify kernel/signal.c:2395 [inline]
>        ptrace_notify+0x255/0x380 kernel/signal.c:2407
>        ptrace_report_syscall include/linux/ptrace.h:415 [inline]
>        ptrace_report_syscall_exit include/linux/ptrace.h:477 [inline]
>        syscall_exit_work+0xbd/0x170 kernel/entry/common.c:167
>        syscall_exit_to_user_mode_prepare kernel/entry/common.c:194 [inline]
>        __syscall_exit_to_user_mode_work kernel/entry/common.c:199 [inline]
>        syscall_exit_to_user_mode+0x273/0x360 kernel/entry/common.c:212
>        do_syscall_64+0x10a/0x240 arch/x86/entry/common.c:89
>        entry_SYSCALL_64_after_hwframe+0x6d/0x75
>
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [syzbot] [kernel?] possible deadlock in console_flush_all (2)
  2024-03-20  7:12       ` Alexei Starovoitov
@ 2024-03-20  7:30         ` Tetsuo Handa
  2024-03-20  8:47           ` Alexei Starovoitov
  0 siblings, 1 reply; 16+ messages in thread
From: Tetsuo Handa @ 2024-03-20  7:30 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: syzbot, syzkaller-bugs, Alexei Starovoitov, Daniel Borkmann,
	John Fastabend, Petr Mladek, Steven Rostedt, John Ogness,
	Sergey Senozhatsky, Thomas Gleixner, LKML, bpf

On 2024/03/20 16:12, Alexei Starovoitov wrote:
> On Wed, Mar 20, 2024 at 12:05 AM Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
>>
>> On 2024/03/20 15:56, Alexei Starovoitov wrote:
>>> This has nothing to do with bpf.
>>> bpf never calls printk().
>>
>> Please see the Sample crash report in the dashboard.
>> bpf program is hitting printk() via report_bug().
> 
> Exactly. local_bh_neable is simply asking for a splat.
> _this_ bug is in printk.
> It's a generic issue.

I can't catch. printk() is called due to report_bug().

If the reason report_bug() is called is that spin_unlock_bh() is bad,
this is a bug in sock_map_delete_elem() rather than a bug in printk(), isn't it.

> 
> sockmap bug is orthogonal.
> It's already being looked at.

Then, can we expect that this bug is also fixed shortly?

> 
>> -> #0 (console_owner){....}-{0:0}:
>>        check_prev_add kernel/locking/lockdep.c:3134 [inline]
>>        check_prevs_add kernel/locking/lockdep.c:3253 [inline]
>>        validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
>>        __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
>>        lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
>>        console_lock_spinning_enable kernel/printk/printk.c:1873 [inline]
>>        console_emit_next_record kernel/printk/printk.c:2901 [inline]
>>        console_flush_all+0x810/0xfd0 kernel/printk/printk.c:2973
>>        console_unlock+0x13b/0x4d0 kernel/printk/printk.c:3042
>>        vprintk_emit+0x5a6/0x770 kernel/printk/printk.c:2342
>>        _printk+0xd5/0x120 kernel/printk/printk.c:2367
>>        __report_bug lib/bug.c:195 [inline]
>>        report_bug+0x346/0x500 lib/bug.c:219
>>        handle_bug+0x3e/0x70 arch/x86/kernel/traps.c:239
>>        exc_invalid_op+0x1a/0x50 arch/x86/kernel/traps.c:260
>>        asm_exc_invalid_op+0x1a/0x20 arch/x86/include/asm/idtentry.h:621
>>        __local_bh_enable_ip+0x1be/0x200 kernel/softirq.c:362
>>        spin_unlock_bh include/linux/spinlock.h:396 [inline]
>>        __sock_map_delete net/core/sock_map.c:424 [inline]
>>        sock_map_delete_elem+0xca/0x140 net/core/sock_map.c:446
>>        bpf_prog_2c29ac5cdc6b1842+0x42/0x46
>>        bpf_dispatcher_nop_func include/linux/bpf.h:1234 [inline]
>>        __bpf_prog_run include/linux/filter.h:657 [inline]
>>        bpf_prog_run include/linux/filter.h:664 [inline]


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [syzbot] [kernel?] possible deadlock in console_flush_all (2)
  2024-03-20  7:30         ` Tetsuo Handa
@ 2024-03-20  8:47           ` Alexei Starovoitov
  2024-03-27 11:05             ` John Ogness
  0 siblings, 1 reply; 16+ messages in thread
From: Alexei Starovoitov @ 2024-03-20  8:47 UTC (permalink / raw)
  To: Tetsuo Handa, John Fastabend
  Cc: syzbot, syzkaller-bugs, Alexei Starovoitov, Daniel Borkmann,
	Petr Mladek, Steven Rostedt, John Ogness, Sergey Senozhatsky,
	Thomas Gleixner, LKML, bpf

On Wed, Mar 20, 2024 at 12:30 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2024/03/20 16:12, Alexei Starovoitov wrote:
> > On Wed, Mar 20, 2024 at 12:05 AM Tetsuo Handa
> > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> >>
> >> On 2024/03/20 15:56, Alexei Starovoitov wrote:
> >>> This has nothing to do with bpf.
> >>> bpf never calls printk().
> >>
> >> Please see the Sample crash report in the dashboard.
> >> bpf program is hitting printk() via report_bug().
> >
> > Exactly. local_bh_neable is simply asking for a splat.
> > _this_ bug is in printk.
> > It's a generic issue.
>
> I can't catch. printk() is called due to report_bug().
>
> If the reason report_bug() is called is that spin_unlock_bh() is bad,
> this is a bug in sock_map_delete_elem() rather than a bug in printk(), isn't it.

There are two bugs.
The one you've started the thread about is in printk.

The sockmap bug that triggered report_bug() is separate.

>
> >
> > sockmap bug is orthogonal.
> > It's already being looked at.
>
> Then, can we expect that this bug is also fixed shortly?

cc-ing John.
I don't think it's hi-pri.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [syzbot] [kernel?] possible deadlock in console_flush_all (2)
  2024-03-20  8:47           ` Alexei Starovoitov
@ 2024-03-27 11:05             ` John Ogness
  2024-03-27 14:25               ` Tetsuo Handa
  0 siblings, 1 reply; 16+ messages in thread
From: John Ogness @ 2024-03-27 11:05 UTC (permalink / raw)
  To: Alexei Starovoitov, Tetsuo Handa, John Fastabend
  Cc: syzbot, syzkaller-bugs, Alexei Starovoitov, Daniel Borkmann,
	Petr Mladek, Steven Rostedt, Sergey Senozhatsky, Thomas Gleixner,
	LKML, bpf

On 2024-03-20, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> On Wed, Mar 20, 2024 at 12:30 AM Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
>>
>> On 2024/03/20 16:12, Alexei Starovoitov wrote:
>> > On Wed, Mar 20, 2024 at 12:05 AM Tetsuo Handa
>> > <penguin-kernel@i-love.sakura.ne.jp> wrote:
>> >>
>> >> On 2024/03/20 15:56, Alexei Starovoitov wrote:
>> >>> This has nothing to do with bpf.
>> >>> bpf never calls printk().
>> >>
>> >> Please see the Sample crash report in the dashboard.
>> >> bpf program is hitting printk() via report_bug().
>> >
>> > Exactly. local_bh_neable is simply asking for a splat.
>> > _this_ bug is in printk.
>> > It's a generic issue.
>>
>> I can't catch. printk() is called due to report_bug().
>>
>> If the reason report_bug() is called is that spin_unlock_bh() is bad,
>> this is a bug in sock_map_delete_elem() rather than a bug in printk(), isn't it.
>
> There are two bugs.
> The one you've started the thread about is in printk.

The printk rework (which is not yet fully mainline) will correctly
handle this context.

As to the patch [0] you suggested, it would be more appropriate to
perform deferred_enter/_exit *within* the locked critical section. But
we really only want these whack-a-mole workarounds for cases that can
occur in a non-bug situation. IMHO this is not such a case and falls
into the category of "known problem, the rework will handle it".

John Ogness

[0] https://syzkaller.appspot.com/text?tag=Patch&x=121c92fe180000

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [syzbot] [kernel?] possible deadlock in console_flush_all (2)
  2024-03-27 11:05             ` John Ogness
@ 2024-03-27 14:25               ` Tetsuo Handa
  0 siblings, 0 replies; 16+ messages in thread
From: Tetsuo Handa @ 2024-03-27 14:25 UTC (permalink / raw)
  To: John Ogness, Alexei Starovoitov, John Fastabend
  Cc: syzbot, syzkaller-bugs, Alexei Starovoitov, Daniel Borkmann,
	Petr Mladek, Steven Rostedt, Sergey Senozhatsky, Thomas Gleixner,
	LKML, bpf

On 2024/03/27 20:05, John Ogness wrote:
> The printk rework (which is not yet fully mainline) will correctly
> handle this context.
> 
> As to the patch [0] you suggested, it would be more appropriate to
> perform deferred_enter/_exit *within* the locked critical section. But
> we really only want these whack-a-mole workarounds for cases that can
> occur in a non-bug situation. IMHO this is not such a case and falls
> into the category of "known problem, the rework will handle it".
> 
> John Ogness
> 
> [0] https://syzkaller.appspot.com/text?tag=Patch&x=121c92fe180000
> 

Since the cause of current flood of lockdep reports is already explained
in https://syzkaller.appspot.com/bug?extid=c4f4d25859c2e5859988 , we don't
need [0] for now. But it is unfortunate that the message which explains
what went wrong cannot be reported due to reporting console_lock dependency.
Therefore, I intend [0] as a workaround for a bug situation. We can revert [0]
after the printk rework completed.


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2024-03-27 14:26 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-08  6:28 [syzbot] [kernel?] possible deadlock in console_flush_all (2) syzbot
2023-10-17 10:06 ` Aleksandr Nogikh
2023-10-17 14:13   ` Peter Zijlstra
2023-10-17 13:07 ` [syzbot] [ext4?] " syzbot
2023-10-17 14:16   ` Peter Zijlstra
2023-10-17 14:28     ` Aleksandr Nogikh
2024-03-05 14:05 ` [syzbot] [kernel?] " Tetsuo Handa
2024-03-05 15:21   ` [syzbot] [serial?] " syzbot
2024-03-20  6:50 ` [syzbot] [kernel?] " Tetsuo Handa
2024-03-20  6:56   ` Alexei Starovoitov
2024-03-20  7:04     ` Tetsuo Handa
2024-03-20  7:12       ` Alexei Starovoitov
2024-03-20  7:30         ` Tetsuo Handa
2024-03-20  8:47           ` Alexei Starovoitov
2024-03-27 11:05             ` John Ogness
2024-03-27 14:25               ` Tetsuo Handa

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.