* BUG: soft lockup - CPU#0 stuck for 26s! with nested KVM on 5.19.x
@ 2022-09-07  9:39 František Šumšal
  2022-09-07 15:08 ` Sean Christopherson
  0 siblings, 1 reply; 8+ messages in thread
From: František Šumšal @ 2022-09-07  9:39 UTC (permalink / raw)
  To: kvm



Hello!

In the Arch Linux part of the upstream systemd CI I recently noticed an uptick in CPU soft lockups when running one of our tests. The test runs several systemd-nspawn containers in succession, and sometimes the underlying VM locks up due to a CPU soft lockup. To clarify, the topology is: CentOS Stream 8 (bare metal, 4.18.0-305.3.1.el8) -> Arch Linux (KVM, 5.19.5-arch1-1/5.19.7-arch1-1) -> Arch Linux (nested KVM or QEMU TCG, it happens with both, 5.19.5-arch1-1/5.19.7-arch1-1) -> nspawn containers. I did some further testing, and it reproduces even when the bare metal host is my local Fedora 36 machine (5.17.12-300.fc36.x86_64).
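For illustration, the layering looks roughly like this (a simplified sketch, not the actual CI command lines; the QEMU invocations and image names below are made up):
```
# L0: bare metal host (CentOS Stream 8 / Fedora 36)

# L1: Arch Linux guest started on L0 with hardware virtualization
qemu-system-x86_64 -enable-kvm -m 4G -smp 4 -nographic \
    -drive file=arch-l1.img,if=virtio

# L2: Arch Linux guest started inside L1, either with nested KVM ...
qemu-system-x86_64 -enable-kvm -m 2G -smp 4 -nographic \
    -drive file=arch-l2.img,if=virtio
# ... or with plain TCG emulation (the lockups happen with both)
qemu-system-x86_64 -accel tcg -m 2G -smp 4 -nographic \
    -drive file=arch-l2.img,if=virtio

# inside L2: the test boots several systemd-nspawn containers in succession
```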

Unfortunately, I can't provide a simple and reliable reproducer: the issue only shows up with that particular test, and even then unpredictably (sometimes on the first iteration, sometimes only after an hour or more). However, I'd be more than glad to collect more information from an affected machine, if possible.
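For reference, the standard knobs for squeezing more detail out of the soft lockup and hung-task detectors look like this (example values only; a sketch, not what the CI currently sets):
```
# dump backtraces on all CPUs when the soft lockup fires, and optionally
# panic so the state is preserved for inspection
sysctl -w kernel.softlockup_all_cpu_backtrace=1
sysctl -w kernel.softlockup_panic=1

# report hung tasks sooner, and panic on them if a crash dump is wanted
sysctl -w kernel.hung_task_timeout_secs=120
sysctl -w kernel.hung_task_panic=0
```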

Some logs gathered from the affected machines:
```
[ 88.654931] watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [systemd:1]
[ 88.655506] Modules linked in: ext4 crc32c_generic crc16 mbcache jbd2 sr_mod cdrom serio_raw atkbd libps2 vivaldi_fmap ata_generic pata_acpi virtio_pci virtio_pci_legacy_dev virtio_pci_modern_dev ata_piix floppy i8042 serio virtio_rng e
[ 88.656219] Unloaded tainted modules: crc32c_intel():1
[ 88.657433] CPU: 0 PID: 1 Comm: systemd Not tainted 5.19.5-arch1-1 #1 bcfbb9540ef8053872a922808497e5b96b6f8fc8
[ 88.657999] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.0-3-3 04/01/2014
[ 88.658186] RIP: 0010:___slab_alloc+0x555/0x6b0
[ 88.658692] Code: 73 08 e9 d7 00 00 00 48 89 fb 48 8b 03 48 85 c0 0f 84 b7 fe ff ff 48 63 40 58 4d 8b 8c c5 d8 00 00 00 48 89 c7 4d 85 c9 74 0c <eb> 3e 49 8b 41 08 49 39 45 10 72 4b 48 8d 7b 10 44 3b 63 18 73 c7
[ 88.658692] RSP: 0018:ffffbb3fc0013b58 EFLAGS: 00000282
[ 88.658692] RAX: 0000000000000000 RBX: ffff9cab6e9e41c0 RCX: 0000000000000000
[ 88.658692] RDX: 0000000000000000 RSI: ffff9cab41041280 RDI: 0000000000000000
[ 88.658692] RBP: 0000000000000dc0 R08: ffff9cab6e9e41c0 R09: ffff9cab41041280
[ 88.658692] R10: ffff9cab4623adc8 R11: ffffbb3fc0013c20 R12: 0000000000000002
[ 88.658692] R13: ffff9cab41042a00 R14: 00000000ffffffff R15: ffff9cab6da38140
[ 88.662511] FS: 00007f8020ae6a40(0000) GS:ffff9cab6da00000(0000) knlGS:0000000000000000
[ 88.662511] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 88.662511] CR2: 00007f654f796445 CR3: 000000002f494000 CR4: 00000000000006f0
[ 88.662511] Call Trace:
[ 88.662511] <TASK>
[ 88.662511] ? cpuset_css_alloc+0x2f/0x150
[ 88.662511] ? cgroup_addrm_files+0x320/0x370
[ 88.662511] ? cpuset_css_alloc+0x2f/0x150
[ 88.662511] kmem_cache_alloc_trace+0x29e/0x310
[ 88.665067] cpuset_css_alloc+0x2f/0x150
[ 88.665067] cgroup_apply_control_enable+0x120/0x340
[ 88.665067] cgroup_subtree_control_write+0x37b/0x400
[ 88.665067] kernfs_fop_write_iter+0x122/0x200
[ 88.665067] new_sync_write+0x13a/0x1c0
[ 88.665067] vfs_write+0x1ef/0x270
[ 88.665067] ksys_write+0x6f/0xf0
[ 88.665067] do_syscall_64+0x5f/0x90
[ 88.665067] ? vfs_write+0x178/0x270
[ 88.665067] ? vfs_write+0x178/0x270
[ 88.665067] ? ksys_write+0x6f/0xf0
[ 88.665067] ? syscall_exit_to_user_mode+0x1b/0x40
[ 88.665067] ? do_syscall_64+0x6b/0x90
[ 88.665067] ? do_syscall_64+0x6b/0x90
[ 88.665067] ? do_syscall_64+0x6b/0x90
[ 88.665067] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 88.665067] RIP: 0033:0x7f8020f0fe94
[ 88.668682] Code: 15 f9 0e 0e 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 80 3d 8d 96 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48
[ 88.668682] RSP: 002b:00007fff61974728 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 88.668682] RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007f8020f0fe94
[ 88.668682] RDX: 0000000000000008 RSI: 000055ef79cdf1c0 RDI: 000000000000003f
[ 88.668682] RBP: 000055ef79cdf1c0 R08: fefefefefefefeff R09: ff736472746f622a
[ 88.668682] R10: 00007f8020ebfe00 R11: 0000000000000202 R12: 0000000000000008
[ 88.668682] R13: 000055ef79cd2130 R14: 0000000000000008 R15: 00007f8020fed940
[ 88.668682] </TASK>
[ 116.654866] watchdog: BUG: soft lockup - CPU#0 stuck for 52s! [systemd:1]
[ 116.655272] Modules linked in: ext4 crc32c_generic crc16 mbcache jbd2 sr_mod cdrom serio_raw atkbd libps2 vivaldi_fmap ata_generic pata_acpi virtio_pci virtio_pci_legacy_dev virtio_pci_modern_dev ata_piix floppy i8042 serio virtio_rng e
[ 116.655401] Unloaded tainted modules: crc32c_intel():1
[ 116.656629] CPU: 0 PID: 1 Comm: systemd Tainted: G L 5.19.5-arch1-1 #1 bcfbb9540ef8053872a922808497e5b96b6f8fc8
[ 116.657113] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.0-3-3 04/01/2014
[ 116.657487] RIP: 0010:___slab_alloc+0x555/0x6b0
[ 116.657733] Code: 73 08 e9 d7 00 00 00 48 89 fb 48 8b 03 48 85 c0 0f 84 b7 fe ff ff 48 63 40 58 4d 8b 8c c5 d8 00 00 00 48 89 c7 4d 85 c9 74 0c <eb> 3e 49 8b 41 08 49 39 45 10 72 4b 48 8d 7b 10 44 3b 63 18 73 c7
[ 116.658185] RSP: 0018:ffffbb3fc0013b58 EFLAGS: 00000282
[ 116.658185] RAX: 0000000000000000 RBX: ffff9cab6e9e41c0 RCX: 0000000000000000
[ 116.659383] RDX: 0000000000000000 RSI: ffff9cab41041280 RDI: 0000000000000000
[ 116.659383] RBP: 0000000000000dc0 R08: ffff9cab6e9e41c0 R09: ffff9cab41041280
[ 116.659383] R10: ffff9cab4623adc8 R11: ffffbb3fc0013c20 R12: 0000000000000002
[ 116.659383] R13: ffff9cab41042a00 R14: 00000000ffffffff R15: ffff9cab6da38140
[ 116.659383] FS: 00007f8020ae6a40(0000) GS:ffff9cab6da00000(0000) knlGS:0000000000000000
[ 116.659383] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 116.659383] CR2: 00007f654f796445 CR3: 000000002f494000 CR4: 00000000000006f0
[ 116.661746] Call Trace:
[ 116.661746] <TASK>
[ 116.661746] ? cpuset_css_alloc+0x2f/0x150
[ 116.661746] ? cgroup_addrm_files+0x320/0x370
[ 116.661746] ? cpuset_css_alloc+0x2f/0x150
[ 116.661746] kmem_cache_alloc_trace+0x29e/0x310
[ 116.661746] cpuset_css_alloc+0x2f/0x150
[ 116.661746] cgroup_apply_control_enable+0x120/0x340
[ 116.661746] cgroup_subtree_control_write+0x37b/0x400
[ 116.661746] kernfs_fop_write_iter+0x122/0x200
[ 116.661746] new_sync_write+0x13a/0x1c0
[ 116.661746] vfs_write+0x1ef/0x270
[ 116.661746] ksys_write+0x6f/0xf0
[ 116.661746] do_syscall_64+0x5f/0x90
[ 116.661746] ? vfs_write+0x178/0x270
[ 116.661746] ? vfs_write+0x178/0x270
[ 116.661746] ? ksys_write+0x6f/0xf0
[ 116.661746] ? syscall_exit_to_user_mode+0x1b/0x40
[ 116.665695] ? do_syscall_64+0x6b/0x90
[ 116.665695] ? do_syscall_64+0x6b/0x90
[ 116.665695] ? do_syscall_64+0x6b/0x90
[ 116.665695] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 116.665695] RIP: 0033:0x7f8020f0fe94
[ 116.665695] Code: 15 f9 0e 0e 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 80 3d 8d 96 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48
[ 116.665695] RSP: 002b:00007fff61974728 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 116.665695] RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007f8020f0fe94
[ 116.665695] RDX: 0000000000000008 RSI: 000055ef79cdf1c0 RDI: 000000000000003f
[ 116.668853] RBP: 000055ef79cdf1c0 R08: fefefefefefefeff R09: ff736472746f622a
[ 116.668853] R10: 00007f8020ebfe00 R11: 0000000000000202 R12: 0000000000000008
[ 116.668853] R13: 000055ef79cd2130 R14: 0000000000000008 R15: 00007f8020fed940
[ 116.668853] </TASK>
[ 122.198410] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 122.198943] (detected by 3, t=18002 jiffies, g=18269, q=147 ncpus=4)
[ 122.199383] rcu: All QSes seen, last rcu_preempt kthread activity 18002 (4294913871-4294895869), jiffies_till_next_fqs=2, root ->qsmask 0x0
[ 122.200142] rcu: rcu_preempt kthread starved for 18002 jiffies! g18269 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[ 122.200601] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[ 122.200969] rcu: RCU grace-period kthread stack dump:
[ 122.201280] task:rcu_preempt state:R running task stack: 0 pid: 16 ppid: 2 flags:0x00004000
[ 122.201590] Call Trace:
[ 122.201590] <TASK>
[ 122.201590] ? rcu_gp_cleanup+0x3c0/0x3c0
[ 122.201590] __schedule+0x356/0x11a0
[ 122.201590] ? __mod_timer+0x289/0x3b0
[ 122.201590] ? rcu_gp_cleanup+0x3c0/0x3c0
[ 122.201590] schedule+0x5e/0xd0
[ 122.201590] schedule_timeout+0x98/0x150
[ 122.201590] ? __bpf_trace_tick_stop+0x10/0x10
[ 122.201590] rcu_gp_fqs_loop+0x113/0x3c0
[ 122.201590] rcu_gp_kthread+0xb0/0x170
[ 122.201590] kthread+0xde/0x110
[ 122.201590] ? kthread_complete_and_exit+0x20/0x20
[ 122.201590] ret_from_fork+0x22/0x30
[ 122.201590] </TASK>
[ 122.201590] rcu: Stack dump where RCU GP kthread last ran:
[ 122.201590] Sending NMI from CPU 3 to CPUs 0:
[ 122.206283] NMI backtrace for cpu 0
[ 122.207931] CPU: 0 PID: 1 Comm: systemd Tainted: G L 5.19.5-arch1-1 #1 bcfbb9540ef8053872a922808497e5b96b6f8fc8
[ 122.208217] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.0-3-3 04/01/2014
[ 122.208217] RIP: 0010:asm_exc_int3+0x0/0x40
[ 122.208217] Code: e9 d5 0a 00 00 0f 1f 44 00 00 f3 0f 1e fa 0f 1f 00 fc 6a ff e8 a1 09 00 00 48 89 c4 48 89 e7 e8 c6 77 fd ff e9 b1 0a 00 00 90 <f3> 0f 1e fa 0f 1f 00 fc 6a ff f6 44 24 10 03 75 18 ff 74 24 28 ff
[ 122.208217] RSP: 0018:ffffbb3fc0013b28 EFLAGS: 00000082
[ 122.208217] RAX: 0000000000000000 RBX: ffff9cab6e9e41c0 RCX: 0000000000000000
[ 122.208217] RDX: 0000000000000000 RSI: ffff9cab41041280 RDI: 0000000000000000
[ 122.208217] RBP: 0000000000000dc0 R08: ffff9cab6e9e41c0 R09: ffff9cab41041280
[ 122.208217] R10: ffff9cab4623adc8 R11: ffffbb3fc0013c20 R12: 0000000000000002
[ 122.208217] R13: ffff9cab41042a00 R14: 00000000ffffffff R15: ffff9cab6da38140
[ 122.208217] FS: 00007f8020ae6a40(0000) GS:ffff9cab6da00000(0000) knlGS:0000000000000000
[ 122.208217] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 122.208217] CR2: 00007f654f796445 CR3: 000000002f494000 CR4: 00000000000006f0
[ 122.208217] Call Trace:
[ 122.208217] <TASK>
[ 122.208217] RIP: 0010:___slab_alloc+0x556/0x6b0
[ 122.208217] Code: 08 e9 d7 00 00 00 48 89 fb 48 8b 03 48 85 c0 0f 84 b7 fe ff ff 48 63 40 58 4d 8b 8c c5 d8 00 00 00 48 89 c7 4d 85 c9 74 0c eb <3e> 49 8b 41 08 49 39 45 10 72 4b 48 8d 7b 10 44 3b 63 18 73 c7 31
[ 122.208217] RSP: 0018:ffffbb3fc0013b58 EFLAGS: 00000282
[ 122.208217] ? ___slab_alloc+0x50d/0x6b0
[ 122.208217] ? cpuset_css_alloc+0x2f/0x150
[ 122.208217] ? cgroup_addrm_files+0x320/0x370
[ 122.208217] ? cpuset_css_alloc+0x2f/0x150
[ 122.208217] kmem_cache_alloc_trace+0x29e/0x310
[ 122.208217] cpuset_css_alloc+0x2f/0x150
[ 122.208217] cgroup_apply_control_enable+0x120/0x340
[ 122.208217] cgroup_subtree_control_write+0x37b/0x400
[ 122.208217] kernfs_fop_write_iter+0x122/0x200
[ 122.208217] new_sync_write+0x13a/0x1c0
[ 122.208217] vfs_write+0x1ef/0x270
[ 122.208217] ksys_write+0x6f/0xf0
[ 122.208217] do_syscall_64+0x5f/0x90
[ 122.208217] ? vfs_write+0x178/0x270
[ 122.208217] ? vfs_write+0x178/0x270
[ 122.208217] ? ksys_write+0x6f/0xf0
[ 122.208217] ? syscall_exit_to_user_mode+0x1b/0x40
[ 122.208217] ? do_syscall_64+0x6b/0x90
[ 122.208217] ? do_syscall_64+0x6b/0x90
[ 122.208217] ? do_syscall_64+0x6b/0x90
[ 122.208217] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 122.208217] RIP: 0033:0x7f8020f0fe94
[ 122.208217] Code: 15 f9 0e 0e 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 80 3d 8d 96 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48
[ 122.208217] RSP: 002b:00007fff61974728 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 122.208217] RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007f8020f0fe94
[ 122.208217] RDX: 0000000000000008 RSI: 000055ef79cdf1c0 RDI: 000000000000003f
[ 122.208217] RBP: 000055ef79cdf1c0 R08: fefefefefefefeff R09: ff736472746f622a
[ 122.208217] R10: 00007f8020ebfe00 R11: 0000000000000202 R12: 0000000000000008
[ 122.208217] R13: 000055ef79cd2130 R14: 0000000000000008 R15: 00007f8020fed940
[ 122.208217] </TASK>
[ 122.208217] INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 6.624 msecs
```

```
[ 491.433151] INFO: task kworker/u8:0:8 blocked for more than 368 seconds.
[ 491.433671] Not tainted 5.19.5-arch1-1 #1
[ 491.433965] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 491.434492] task:kworker/u8:0 state:D stack: 0 pid: 8 ppid: 2 flags:0x00004000
[ 491.435131] Workqueue: writeback wb_workfn (flush-8:0)
[ 491.435522] Call Trace:
[ 491.435734] <TASK>
[ 491.435912] __schedule+0x356/0x11a0
[ 491.436236] ? wb_over_bg_thresh+0x26c/0x2a0
[ 491.436751] schedule+0x5e/0xd0
[ 491.436990] inode_sleep_on_writeback+0xbe/0xd0
[ 491.437374] ? cpuacct_percpu_seq_show+0x20/0x20
[ 491.437749] wb_writeback+0x11f/0x2f0
[ 491.438047] wb_workfn+0x2c9/0x520
[ 491.438300] process_one_work+0x1c7/0x380
[ 491.438605] worker_thread+0x51/0x390
[ 491.438904] ? rescuer_thread+0x3b0/0x3b0
[ 491.439212] kthread+0xde/0x110
[ 491.439471] ? kthread_complete_and_exit+0x20/0x20
[ 491.439989] ret_from_fork+0x22/0x30
[ 491.440281] </TASK>
[ 491.440504] INFO: task kworker/u8:2:45 blocked for more than 368 seconds.
[ 491.440960] Not tainted 5.19.5-arch1-1 #1
[ 491.441279] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 491.441875] task:kworker/u8:2 state:D stack: 0 pid: 45 ppid: 2 flags:0x00004000
[ 491.442454] Workqueue: writeback wb_workfn (flush-8:0)
[ 491.442855] Call Trace:
[ 491.453066] <TASK>
[ 491.453251] __schedule+0x356/0x11a0
[ 491.453494] ? kmem_cache_alloc+0x16a/0x300
[ 491.453830] schedule+0x5e/0xd0
[ 491.454090] io_schedule+0x46/0x70
[ 491.454307] blk_mq_get_tag+0x11e/0x2a0
[ 491.454604] ? cpuacct_percpu_seq_show+0x20/0x20
[ 491.454965] __blk_mq_alloc_requests+0x184/0x2e0
[ 491.455300] blk_mq_submit_bio+0x2b4/0x5b0
[ 491.455653] __submit_bio+0xf5/0x180
[ 491.455913] submit_bio_noacct_nocheck+0x20e/0x2c0
[ 491.456227] ext4_io_submit+0x24/0x40 [ext4 469a96554db3024f50e2546d8f81b9874cc9bfc2]
[ 491.457089] ext4_writepages+0x4b6/0x1200 [ext4 469a96554db3024f50e2546d8f81b9874cc9bfc2]
[ 491.457870] do_writepages+0xc4/0x1d0
[ 491.458172] ? wb_get_create+0x62/0x6a0
[ 491.458449] __writeback_single_inode+0x3d/0x360
[ 491.458818] writeback_sb_inodes+0x1ed/0x4a0
[ 491.459154] __writeback_inodes_wb+0x4c/0xf0
[ 491.459484] wb_writeback+0x204/0x2f0
[ 491.459964] wb_workfn+0x36f/0x520
[ 491.460268] process_one_work+0x1c7/0x380
[ 491.460544] worker_thread+0x51/0x390
[ 491.460816] ? rescuer_thread+0x3b0/0x3b0
[ 491.461083] kthread+0xde/0x110
[ 491.461341] ? kthread_complete_and_exit+0x20/0x20
[ 491.461757] ret_from_fork+0x22/0x30
[ 491.462047] </TASK>
[ 491.462305] INFO: task kworker/u8:5:894 blocked for more than 368 seconds.
[ 491.462820] Not tainted 5.19.5-arch1-1 #1
[ 491.473057] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 491.473639] task:kworker/u8:5 state:D stack: 0 pid: 894 ppid: 2 flags:0x00004000
[ 491.474228] Workqueue: writeback wb_workfn (flush-8:0)
[ 491.474612] Call Trace:
[ 491.474825] <TASK>
[ 491.474981] __schedule+0x356/0x11a0
[ 491.475257] ? rwb_trace_step+0x90/0x90
[ 491.475566] ? wbt_done+0xc0/0xc0
[ 491.475805] schedule+0x5e/0xd0
[ 491.476036] io_schedule+0x46/0x70
[ 491.476444] rq_qos_wait+0xc0/0x140
[ 491.476740] ? karma_partition+0x2a0/0x2a0
[ 491.477071] ? rwb_trace_step+0x90/0x90
[ 491.477390] wbt_wait+0xa9/0x110
[ 491.477626] __rq_qos_throttle+0x27/0x40
[ 491.477884] blk_mq_submit_bio+0x289/0x5b0
[ 491.478176] __submit_bio+0xf5/0x180
[ 491.478483] submit_bio_noacct_nocheck+0x20e/0x2c0
[ 491.478850] submit_bh_wbc+0x12b/0x150
[ 491.479136] __block_write_full_page+0x1fe/0x500
[ 491.479484] ? bh_uptodate_or_lock+0x80/0x80
[ 491.480016] ? blkdev_llseek+0x60/0x60
[ 491.480308] __writepage+0x1d/0x70
[ 491.480542] write_cache_pages+0x187/0x4c0
[ 491.480884] ? dirty_background_bytes_handler+0x30/0x30
[ 491.481255] generic_writepages+0x7f/0xb0
[ 491.481566] do_writepages+0xc4/0x1d0
[ 491.481857] __writeback_single_inode+0x3d/0x360
[ 491.482191] writeback_sb_inodes+0x1ed/0x4a0
[ 491.482560] __writeback_inodes_wb+0x4c/0xf0
[ 491.482859] wb_writeback+0x204/0x2f0
[ 491.493105] wb_workfn+0x36f/0x520
[ 491.493339] ? check_preempt_curr+0x61/0x70
[ 491.493670] ? ttwu_do_wakeup+0x17/0x170
[ 491.493970] process_one_work+0x1c7/0x380
[ 491.494261] worker_thread+0x51/0x390
[ 491.494571] ? rescuer_thread+0x3b0/0x3b0
[ 491.494856] kthread+0xde/0x110
[ 491.495092] ? kthread_complete_and_exit+0x20/0x20
[ 491.495460] ret_from_fork+0x22/0x30
[ 491.495779] </TASK>
[ 527.728135] systemd[1]: systemd-udevd.service: Got notification message from PID 226 (WATCHDOG=1)
[ 558.726681] systemd[1]: systemd-journald.service: State 'final-sigterm' timed out. Killing.
[ 558.727927] systemd[1]: systemd-journald.service: Killing process 207 (8) with signal SIGKILL.
[ 558.729607] systemd[1]: systemd-journald.service: Changed final-sigterm -> final-sigkill
```

```
[   89.850385] testsuite-13.sh[264]: + systemd-nspawn --register=no -D /var/lib/machines/testsuite-13.unified-yes-cgns-yes-api-vfs-writable-no --private-network -U -b
[   89.959415] testsuite-13.sh[1014]: varlink: Setting state idle-client
[   89.961250] testsuite-13.sh[1014]: /run/systemd/userdb/io.systemd.Machine: Sending message: {"method":"io.systemd.UserDatabase.GetUserRecord","parameters":{"uid":749600768,"service":"io.systemd.Machine"}}
[  116.670448] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [systemd-nspawn:1014]
[  116.670817] Modules linked in: ext4 crc32c_generic crc16 mbcache jbd2 sr_mod cdrom serio_raw atkbd libps2 vivaldi_fmap ata_generic pata_acpi virtio_pci virtio_pci_legacy_dev virtio_pci_modern_dev ata_piix i8042 serio floppy virtio_rng e
[  116.671381] Unloaded tainted modules: crc32c_intel():1
[  116.672249] CPU: 1 PID: 1014 Comm: systemd-nspawn Not tainted 5.19.7-arch1-1 #1 e547dd5fb390dbeba7568a7a787b4abde4680315
[  116.672831] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.0-3-3 04/01/2014
[  116.673274] RIP: 0010:___slab_alloc+0x555/0x6b0
[  116.674053] Code: 73 08 e9 d7 00 00 00 48 89 fb 48 8b 03 48 85 c0 0f 84 b7 fe ff ff 48 63 40 58 4d 8b 8c c5 d8 00 00 00 48 89 c7 4d 85 c9 74 0c <66> 90 49 8b 41 08 49 39 45 10 72 4b 48 8d 7b 10 44 3b 63 18 73 c7
[  116.674053] RSP: 0018:ffffa893805bbbc8 EFLAGS: 00000282
[  116.674779] RAX: 0000000000000000 RBX: ffff9adf2f2e11c0 RCX: 0000000000000000
[  116.674779] RDX: 0000000000000000 RSI: ffff9adf011c7140 RDI: 0000000000000000
[  116.674779] RBP: 0000000000000cc0 R08: ffff9adf2f2e11c0 R09: ffff9adf011c7140
[  116.674779] R10: 00007fa6f4cea000 R11: ffff9adf2f4aea28 R12: 0000000000000002
[  116.674779] R13: ffff9adf011c6a00 R14: 00000000ffffffff R15: ffff9adf2e479720
[  116.674779] FS:  00007fa6f3d9bfc0(0000) GS:ffff9adf2e440000(0000) knlGS:0000000000000000
[  116.674779] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  116.674779] CR2: 00007fa6f4cea000 CR3: 0000000003bca000 CR4: 00000000000006e0
[  116.674779] Call Trace:
[  116.674779]  <TASK>
[  116.674779]  ? vm_area_alloc+0x1e/0x60
[  116.674779]  ? filename_lookup+0xe8/0x1f0
[  116.674779]  ? vm_area_alloc+0x1e/0x60
[  116.674779]  kmem_cache_alloc+0x29c/0x300
[  116.674779]  vm_area_alloc+0x1e/0x60
[  116.674779]  mmap_region+0x38b/0x630
[  116.674779]  do_mmap+0x341/0x530
[  116.674779]  ? security_mmap_file+0x7f/0xf0
[  116.674779]  vm_mmap_pgoff+0xea/0x190
[  116.674779]  ksys_mmap_pgoff+0x185/0x1f0
[  116.674779]  do_syscall_64+0x5f/0x90
[  116.674779]  ? syscall_exit_to_user_mode+0x1b/0x40
[  116.674779]  ? do_syscall_64+0x6b/0x90
[  116.674779]  ? rcu_core+0x1eb/0x4d0
[  116.674779]  ? native_x2apic_icr_read+0x20/0x20
[  116.674779]  ? __do_softirq+0x187/0x2c9
[  116.674779]  ? __irq_exit_rcu+0x4b/0xf0
[  116.674779]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  116.674779] RIP: 0033:0x7fa6f4719707
[  116.674779] Code: ff ff eb e7 67 e8 a9 6e 01 00 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 41 89 ca 41 f7 c1 ff 0f 00 00 75 10 b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 21 c3 48 8b 05 51 76 0d 00 64 c7 00 16 00 00
[  116.674779] RSP: 002b:00007ffe906dfe98 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
[  116.674779] RAX: ffffffffffffffda RBX: 0000000000000014 RCX: 00007fa6f4719707
[  116.674779] RDX: 0000000000000001 RSI: 0000000000000017 RDI: 0000000000000000
[  116.674779] RBP: 00007ffe906dff80 R08: 0000000000000014 R09: 0000000000000000
[  116.674779] R10: 0000000000000002 R11: 0000000000000246 R12: 0000559683183f20
[  116.674779] R13: 00007ffe906dfeb0 R14: 000000000000000b R15: 0000000000000001
[  116.674779]  </TASK>
[  116.923792] watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [kworker/u12:2:57]
[  116.924132] Modules linked in: ext4 crc32c_generic crc16 mbcache jbd2 sr_mod cdrom serio_raw atkbd libps2 vivaldi_fmap ata_generic pata_acpi virtio_pci virtio_pci_legacy_dev virtio_pci_modern_dev ata_piix i8042 serio floppy virtio_rng e
[  116.924233] Unloaded tainted modules: crc32c_intel():1
[  116.925160] CPU: 4 PID: 57 Comm: kworker/u12:2 Tainted: G             L    5.19.7-arch1-1 #1 e547dd5fb390dbeba7568a7a787b4abde4680315
[  116.925553] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.0-3-3 04/01/2014
[  116.926028] Workqueue: writeback wb_workfn (flush-8:0)
[  116.926319] RIP: 0010:___slab_alloc+0x555/0x6b0
[  116.926542] Code: 73 08 e9 d7 00 00 00 48 89 fb 48 8b 03 48 85 c0 0f 84 b7 fe ff ff 48 63 40 58 4d 8b 8c c5 d8 00 00 00 48 89 c7 4d 85 c9 74 0c <66> 90 49 8b 41 08 49 39 45 10 72 4b 48 8d 7b 10 44 3b 63 18 73 c7
[  116.927086] RSP: 0018:ffffa8938020f670 EFLAGS: 00000286
[  116.927086] RAX: 0000000000000000 RBX: ffff9adf2f2e11c0 RCX: 0000000000000000
[  116.927086] RDX: 0000000000000000 RSI: ffff9adf011c72c0 RDI: 0000000000000000
[  116.927086] RBP: 0000000000000d40 R08: ffff9adf2f2e11c0 R09: ffff9adf011c72c0
[  116.927086] R10: ffff9adf0393a120 R11: 0000000000000000 R12: 0000000000000002
[  116.927086] R13: ffff9adf011cb300 R14: 00000000ffffffff R15: ffff9adf2e53a820
[  116.927086] FS:  0000000000000000(0000) GS:ffff9adf2e500000(0000) knlGS:0000000000000000
[  116.927086] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  116.927086] CR2: 00007f4d6e017000 CR3: 0000000001d2a000 CR4: 00000000000006e0
[  116.927086] Call Trace:
[  116.927086]  <TASK>
[  116.927086]  ? ext4_mb_new_blocks+0x828/0x10a0 [ext4 ecafeb703c7cda2228e75743a922c49a9b98d25b]
[  116.927086]  ? ext4_mb_new_blocks+0x828/0x10a0 [ext4 ecafeb703c7cda2228e75743a922c49a9b98d25b]
[  116.927086]  kmem_cache_alloc+0x29c/0x300
[  116.927086]  ext4_mb_new_blocks+0x828/0x10a0 [ext4 ecafeb703c7cda2228e75743a922c49a9b98d25b]
[  116.927086]  ? ext4_find_extent+0x3c7/0x420 [ext4 ecafeb703c7cda2228e75743a922c49a9b98d25b]
[  116.927086]  ext4_ext_map_blocks+0x5c8/0x19d0 [ext4 ecafeb703c7cda2228e75743a922c49a9b98d25b]
[  116.927086]  ? release_pages+0x171/0x630
[  116.927086]  ? find_get_pages_range_tag+0x1b2/0x240
[  116.927086]  ? __pagevec_release+0x1f/0x60
[  116.927086]  ext4_map_blocks+0x251/0x620 [ext4 ecafeb703c7cda2228e75743a922c49a9b98d25b]
[  116.927086]  ext4_writepages+0x89e/0x1200 [ext4 ecafeb703c7cda2228e75743a922c49a9b98d25b]
[  116.927086]  do_writepages+0xc4/0x1d0
[  116.927086]  __writeback_single_inode+0x3d/0x360
[  116.927086]  ? wbc_detach_inode+0x101/0x220
[  116.927086]  writeback_sb_inodes+0x1ed/0x4a0
[  116.927086]  __writeback_inodes_wb+0x4c/0xf0
[  116.927086]  wb_writeback+0x204/0x2f0
[  116.927086]  wb_workfn+0x2c9/0x520
[  116.927086]  process_one_work+0x1c7/0x380
[  116.927086]  worker_thread+0x51/0x390
[  116.927086]  ? rescuer_thread+0x3b0/0x3b0
[  116.927086]  kthread+0xde/0x110
[  116.927086]  ? kthread_complete_and_exit+0x20/0x20
[  116.927086]  ret_from_fork+0x22/0x30
[  116.927086]  </TASK>
```

Also, in one instance, the machine died with:
```
[ 1413.878040] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 1413.879468] #PF: supervisor instruction fetch in kernel mode
[ 1413.880495] #PF: error_code(0x0010) - not-present page
[ 1413.881421] PGD 0 P4D 0
[ 1413.881884] Oops: 0010 [#1] PREEMPT SMP PTI
[ 1413.882629] CPU: 64 PID: 67693 Comm: qemu-system-x86 Tainted: G W 5.19.5-arch1-1 #1 bcfbb9540ef8053872a922808497e5b96b6f8fc8
[ 1413.884738] Hardware name: Red Hat KVM, BIOS 1.16.0-1.module_el8.7.0+1140+ff0772f9 04/01/2014
[ 1413.886181] RIP: 0010:0x0
[ 1413.886652] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
[ 1413.887818] RSP: 0018:ffffb29bc1004ee8 EFLAGS: 00010202
[ 1413.888736] RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff8c57cbc33c78
[ 1413.889957] RDX: 0000000000000000 RSI: ffffb29bc1004ef8 RDI: ffff8c2c45f6edc0
[ 1413.891174] RBP: ffff8c57cbc33b40 R08: 0000000000000000 R09: 0000000000000100
[ 1413.892417] R10: 0000000000000200 R11: ffffffff8aa060c0 R12: ffff8c2dd7ef8000
[ 1413.893634] R13: 0000000000000000 R14: 000000000000000a R15: 0000000000000000
[ 1413.894851] FS: 00007f4819ffb6c0(0000) GS:ffff8c57cbc00000(0000) knlGS:0000000000000000
[ 1413.896224] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1413.897198] CR2: ffffffffffffffd6 CR3: 00000002a9a94002 CR4: 0000000000770ee0
[ 1413.898422] PKRU: 55555554
[ 1413.898890] Call Trace:
[ 1413.899338] <IRQ>
[ 1413.899693] rcu_do_batch+0x199/0x4e0
[ 1413.900336] rcu_core+0x1ae/0x4d0
[ 1413.900912] __do_softirq+0xd0/0x2c9
[ 1413.901544] __irq_exit_rcu+0xbb/0xf0
[ 1413.902186] sysvec_apic_timer_interrupt+0x72/0x90
[ 1413.903016] </IRQ>
[ 1413.903400] <TASK>
[ 1413.903772] asm_sysvec_apic_timer_interrupt+0x1a/0x20
[ 1413.904653] RIP: 0010:syscall_enter_from_user_mode+0x16/0x60
[ 1413.905631] Code: 00 65 8b 05 3c 1e 65 76 83 f8 01 75 e6 eb db 0f 1f 44 00 00 66 0f 1f 00 55 48 89 fd 53 48 89 f3 66 90 66 90 fb 0f 1f 44 00 00 <65> 48 8b 04 25 c0 0b 02 00 48 8b 70 08 40 f6 c6 3f 75 1c 48 89 d8
[ 1413.908745] RSP: 0018:ffffb29bc40f3ea8 EFLAGS: 00000286
[ 1413.909644] RAX: ffffb29bc40f3ec0 RBX: 00000000000000ca RCX: 0000000000000048
[ 1413.910850] RDX: 0000000000000000 RSI: 00000000000000ca RDI: ffffb29bc40f3f58
[ 1413.912067] RBP: ffffb29bc40f3f58 R08: 0000000000000000 R09: 0000000000000000
[ 1413.913285] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 1413.914492] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1413.915723] do_syscall_64+0x3a/0x90
[ 1413.916352] ? syscall_exit_to_user_mode+0x1b/0x40
[ 1413.917165] ? do_syscall_64+0x6b/0x90
[ 1413.917822] ? syscall_exit_to_user_mode+0x1b/0x40
[ 1413.918664] ? do_syscall_64+0x6b/0x90
[ 1413.919318] ? do_syscall_64+0x6b/0x90
[ 1413.919960] ? do_syscall_64+0x6b/0x90
[ 1413.920609] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 1413.921483] RIP: 0033:0x7f49fcc315e7
[ 1413.922106] Code: 81 00 00 00 b8 ca 00 00 00 0f 05 c3 0f 1f 80 00 00 00 00 f3 0f 1e fa 40 80 f6 81 45 31 d2 ba 01 00 00 00 b8 ca 00 00 00 0f 05 <c3> 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 8b 05 7d 49 15 00 48 89
[ 1413.925240] RSP: 002b:00007f4819ffa5f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
[ 1413.926524] RAX: ffffffffffffffda RBX: 000055aad47ae340 RCX: 00007f49fcc315e7
[ 1413.927741] RDX: 0000000000000001 RSI: 0000000000000081 RDI: 000055aad47ae340
[ 1413.928966] RBP: 0000000000010000 R08: ffffbfce02ebb000 R09: 00007f480817c6c8
[ 1413.930177] R10: 0000000000000000 R11: 0000000000000246 R12: 000055aad52936b0
[ 1413.931399] R13: 0000000000000000 R14: 00007f4819ffa790 R15: 00007f48197fb000
[ 1413.932625] </TASK>
[ 1413.933023] Modules linked in: sit bonding tls veth psample sch_tbf sch_sfq sch_sfb sch_qfq sch_netem sch_ingress sch_htb sch_hhf sch_gred sch_fq_pie sch_pie sch_fq sch_ets sch_drr sch_codel sch_cake tcp_dctcp l2tp_ip l2tp_eth ifb fou xfrm_interface xfrm6_tunnel tunnel4 tunnel6 ip_gre ip_tunnel gre wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha vxcan vcan vrf ipvtap tap ipvlan batman_adv bridge bareudp dummy tun xt_nat xt_addrtype vfat xt_tcpudp fat xt_MASQUERADE iptable_nat nft_fib_ipv6 loop nft_masq nft_nat nft_fib_ipv4 nft_fib nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables libcrc32c nfnetlink macsec l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache netfs intel_rapl_msr intel_rapl_common intel_uncore_frequency_common isst_if_common nfit kvm_intel kvm irqbypass crct10dif_pclmul ghash_clmulni_intel 8021q rapl garp mrp
[ 1413.933098] joydev stp llc mousedev intel_agp psmouse cirrus pcspkr i2c_piix4 intel_gtt mac_hid cfg80211 rfkill dm_multipath dm_mod crypto_user fuse bpf_preload qemu_fw_cfg ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 virtio_net serio_raw virtio_rng net_failover ata_generic atkbd rng_core failover virtio_blk virtio_balloon pata_acpi libps2 vivaldi_fmap crc32_pclmul crc32c_intel aesni_intel virtio_pci crypto_simd virtio_pci_legacy_dev i8042 cryptd virtio_pci_modern_dev ata_piix floppy serio
[ 1413.947630] Unloaded tainted modules: sch_teql():1 netdevsim():1 netdevsim():1 intel_uncore_frequency():1 isst_if_mbox_msr():1 intel_uncore_frequency():1 skx_edac():1 isst_if_mbox_msr():1 intel_uncore_frequency():1 intel_cstate():1 isst_if_mbox_msr():1 intel_uncore_frequency():1 skx_edac():1 intel_uncore():1 intel_cstate():1 intel_uncore_frequency():1 isst_if_mbox_msr():1 skx_edac():1 intel_uncore_frequency():1 intel_uncore():1 skx_edac():1 isst_if_mbox_msr():1 intel_cstate():1 isst_if_mbox_msr():1 intel_uncore_frequency():1 intel_uncore():1 skx_edac():1 intel_cstate():1 skx_edac():1 isst_if_mbox_msr():1 intel_uncore():1 intel_cstate():1 intel_uncore_frequency():1 intel_cstate():1 intel_uncore():1 intel_uncore_frequency():1 skx_edac():1 isst_if_mbox_msr():1 intel_uncore_frequency():1 intel_uncore():1 isst_if_mbox_msr():1 skx_edac():1 intel_cstate():1 intel_uncore_frequency():1 intel_cstate():1 isst_if_mbox_msr():1 intel_uncore():1 skx_edac():1 intel_uncore_frequency():1 intel_cstate():1
[ 1413.955104] isst_if_mbox_msr():1 skx_edac():1 intel_uncore():1 intel_uncore():1 isst_if_mbox_msr():1 skx_edac():1 intel_uncore_frequency():1 intel_cstate():1 skx_edac():1 intel_cstate():1 intel_uncore_frequency():1 isst_if_mbox_msr():1 intel_uncore():1 intel_uncore():1 pcc_cpufreq():1 isst_if_mbox_msr():1 skx_edac():1 intel_uncore_frequency():1 intel_cstate():1 pcc_cpufreq():1 skx_edac():1 intel_uncore():1 intel_uncore_frequency():1 isst_if_mbox_msr():1 intel_cstate():1 acpi_cpufreq():1 isst_if_mbox_msr():1 pcc_cpufreq():1 intel_uncore_frequency():1 intel_cstate():1 intel_uncore():1 skx_edac():1 pcc_cpufreq():1 acpi_cpufreq():1 skx_edac():1 intel_uncore_frequency():1 intel_uncore():1 intel_cstate():1 isst_if_mbox_msr():1 pcc_cpufreq():1 intel_cstate():1 isst_if_mbox_msr():1 pcc_cpufreq():1 skx_edac():1 acpi_cpufreq():1 intel_uncore_frequency():1 intel_uncore():1 acpi_cpufreq():1 isst_if_mbox_msr():1 skx_edac():1 intel_cstate():1 intel_uncore():1 pcc_cpufreq():1
[ 1413.969669] intel_uncore_frequency():1 intel_uncore():1 acpi_cpufreq():1 pcc_cpufreq():1 intel_cstate():1 skx_edac():1 acpi_cpufreq():1 intel_uncore_frequency():1 isst_if_mbox_msr():1 pcc_cpufreq():1 intel_uncore():1 acpi_cpufreq():1 isst_if_mbox_msr():1 pcc_cpufreq():1 intel_cstate():1 intel_uncore_frequency():1 skx_edac():1 acpi_cpufreq():1 skx_edac():1 intel_uncore():1 pcc_cpufreq():1 intel_uncore_frequency():1 intel_cstate():1 intel_uncore_frequency():1 isst_if_mbox_msr():1 pcc_cpufreq():1 acpi_cpufreq():1 intel_uncore_frequency():1 pcc_cpufreq():1 intel_uncore():1 intel_cstate():1 acpi_cpufreq():1 pcc_cpufreq():1 isst_if_mbox_msr():1 intel_uncore_frequency():1 skx_edac():1 isst_if_mbox_msr():1 acpi_cpufreq():1 pcc_cpufreq():1 intel_uncore_frequency():1 intel_uncore():1 isst_if_mbox_msr():1 skx_edac():1 acpi_cpufreq():1 pcc_cpufreq():1 intel_cstate():1 acpi_cpufreq():1 intel_uncore_frequency():1 isst_if_mbox_msr():1 acpi_cpufreq():1 pcc_cpufreq():1 intel_uncore():1
[ 1413.983860] intel_uncore_frequency():1 skx_edac():1 intel_cstate():1 acpi_cpufreq():1 pcc_cpufreq():1 isst_if_mbox_msr():1 intel_uncore_frequency():1 acpi_cpufreq():1 intel_uncore():1 acpi_cpufreq():1 skx_edac():1 isst_if_mbox_msr():1 intel_uncore_frequency():1 intel_cstate():1 pcc_cpufreq():1 acpi_cpufreq():1 intel_uncore_frequency():1 isst_if_mbox_msr():1 intel_cstate():1 intel_uncore():1 pcc_cpufreq():1 skx_edac():1 intel_cstate():1 acpi_cpufreq():1 isst_if_mbox_msr():1 intel_uncore_frequency():1 intel_uncore():1 skx_edac():1 pcc_cpufreq():1 skx_edac():1 pcc_cpufreq():1 intel_uncore_frequency():1 intel_cstate():1 intel_uncore():1 acpi_cpufreq():1 isst_if_mbox_msr():1 intel_uncore_frequency():1 skx_edac():1 isst_if_mbox_msr():1 pcc_cpufreq():1 intel_cstate():1 intel_uncore():1 acpi_cpufreq():1 isst_if_mbox_msr():1 intel_uncore_frequency():1 acpi_cpufreq():1 intel_cstate():1 intel_uncore_frequency():1 pcc_cpufreq():1 intel_uncore():1 skx_edac():1 acpi_cpufreq():1 isst_if_mbox_msr():1
[ 1413.998221] intel_uncore():1 pcc_cpufreq():1 skx_edac():1 acpi_cpufreq():1 intel_uncore_frequency():1 intel_cstate():1 pcc_cpufreq():1 isst_if_mbox_msr():1 skx_edac():1 intel_cstate():1 intel_uncore_frequency():1 acpi_cpufreq():1 intel_uncore():1 skx_edac():1 isst_if_mbox_msr():1 pcc_cpufreq():1 intel_cstate():1 acpi_cpufreq():1 skx_edac():1 isst_if_mbox_msr():1 intel_uncore():1 intel_uncore_frequency():1 skx_edac():1 pcc_cpufreq():1 acpi_cpufreq():1 intel_cstate():1 intel_uncore_frequency():1 intel_uncore():1 isst_if_mbox_msr():1 skx_edac():1 acpi_cpufreq():1 intel_cstate():1 pcc_cpufreq():1 isst_if_mbox_msr():1 intel_uncore_frequency():1 intel_uncore():1 skx_edac():1 intel_uncore_frequency():1 acpi_cpufreq():1 isst_if_mbox_msr():1 intel_uncore():1 intel_cstate():1 pcc_cpufreq():1 skx_edac():1 acpi_cpufreq():1 intel_uncore():1 intel_uncore_frequency():1 skx_edac():1 isst_if_mbox_msr():1 intel_cstate():1 pcc_cpufreq():1 intel_uncore_frequency():1 isst_if_mbox_msr():1 skx_edac():1
[ 1414.012813] intel_cstate():1 intel_uncore():1 intel_uncore_frequency():1 pcc_cpufreq():1 skx_edac():1 isst_if_mbox_msr():1 acpi_cpufreq():1 isst_if_mbox_msr():1 intel_uncore():1 intel_uncore_frequency():1 pcc_cpufreq():1 acpi_cpufreq():1 skx_edac():1 intel_cstate():1 intel_uncore_frequency():1 skx_edac():1 pcc_cpufreq():1 intel_uncore_frequency():1 skx_edac():1 intel_cstate():1 isst_if_mbox_msr():1 acpi_cpufreq():1 intel_uncore():1 intel_cstate():1 pcc_cpufreq():1 intel_uncore_frequency():1 isst_if_mbox_msr():1 intel_cstate():1 pcc_cpufreq():1 acpi_cpufreq():1 intel_uncore():1 skx_edac():1 isst_if_mbox_msr():1 acpi_cpufreq():1 pcc_cpufreq():1 intel_uncore_frequency():1 acpi_cpufreq():1 isst_if_mbox_msr():1 intel_uncore():1 intel_cstate():1 skx_edac():1 intel_uncore():1 skx_edac():1 intel_cstate():1 acpi_cpufreq():1 intel_uncore_frequency():1 isst_if_mbox_msr():1 pcc_cpufreq():1 intel_uncore():1 acpi_cpufreq():1 intel_cstate():1 isst_if_mbox_msr():1 intel_uncore_frequency():1
[ 1414.027301] pcc_cpufreq():1 skx_edac():1 intel_uncore_frequency():1 intel_uncore():1 pcc_cpufreq():1 skx_edac():1 intel_cstate():1 isst_if_mbox_msr():1 acpi_cpufreq():1 isst_if_mbox_msr():1 intel_uncore_frequency():1 intel_uncore():1 skx_edac():1 intel_cstate():1 pcc_cpufreq():1 intel_cstate():1 skx_edac():1 isst_if_mbox_msr():1 pcc_cpufreq():1 skx_edac():1 intel_uncore_frequency():1 intel_cstate():1 acpi_cpufreq():1 intel_uncore():1 acpi_cpufreq():1 isst_if_mbox_msr():1 skx_edac():1 pcc_cpufreq():1 intel_uncore():1 isst_if_mbox_msr():1 intel_uncore():1 intel_uncore_frequency():1 acpi_cpufreq():1 isst_if_mbox_msr():1 intel_uncore_frequency():1 intel_uncore():1 pcc_cpufreq():1 skx_edac():1 intel_cstate():1 isst_if_mbox_msr():1 acpi_cpufreq():1 intel_uncore_frequency():1 skx_edac():1 pcc_cpufreq():1 intel_uncore_frequency():1 isst_if_mbox_msr():1 acpi_cpufreq():1 intel_uncore():1 intel_cstate():1 intel_uncore_frequency():1 skx_edac():1 pcc_cpufreq():1 intel_uncore():1 skx_edac():1
[ 1414.041739] isst_if_mbox_msr():1 acpi_cpufreq():1 pcc_cpufreq():1 intel_cstate():1 isst_if_mbox_msr():1 skx_edac():1 pcc_cpufreq():1 intel_cstate():1 isst_if_mbox_msr():1 acpi_cpufreq():1 pcc_cpufreq():1 skx_edac():1 intel_uncore():1 intel_uncore_frequency():1 intel_cstate():1 intel_uncore_frequency():1 acpi_cpufreq():1 intel_cstate():1 intel_uncore():1 skx_edac():1 pcc_cpufreq():1 isst_if_mbox_msr():1 skx_edac():1 intel_cstate():1 acpi_cpufreq():1 isst_if_mbox_msr():1 intel_uncore():1 pcc_cpufreq():1 intel_uncore_frequency():1 skx_edac():1 intel_uncore_frequency():1 intel_cstate():1 pcc_cpufreq():1 acpi_cpufreq():1 intel_uncore():1 isst_if_mbox_msr():1 intel_uncore():1 acpi_cpufreq():1 isst_if_mbox_msr():1 intel_cstate():1 pcc_cpufreq():1 skx_edac():1 intel_uncore_frequency():1 intel_cstate():1 acpi_cpufreq():1 intel_uncore():1 pcc_cpufreq():1 intel_uncore_frequency():1 isst_if_mbox_msr():1 skx_edac():1 skx_edac():1 pcc_cpufreq():1 intel_cstate():1 intel_uncore_frequency():1
[ 1414.056263] acpi_cpufreq():1 isst_if_mbox_msr():1 skx_edac():1 intel_uncore():1 pcc_cpufreq():1 intel_uncore_frequency():1 acpi_cpufreq():1 intel_cstate():1 skx_edac():1 isst_if_mbox_msr():1 pcc_cpufreq():1 intel_uncore():1 acpi_cpufreq():1 pcc_cpufreq():1 intel_uncore_frequency():1 skx_edac():1 acpi_cpufreq():1 isst_if_mbox_msr():1 intel_uncore():1 pcc_cpufreq():1 intel_cstate():1 intel_uncore_frequency():1 isst_if_mbox_msr():1 intel_uncore():1 acpi_cpufreq():1 intel_cstate():1 pcc_cpufreq():1 skx_edac():1 intel_cstate():1 pcc_cpufreq():1 intel_uncore():1 skx_edac():1 intel_uncore_frequency():1 acpi_cpufreq():1 isst_if_mbox_msr():1 intel_cstate():1 pcc_cpufreq():1 isst_if_mbox_msr():1 intel_uncore():1 skx_edac():1 acpi_cpufreq():1 acpi_cpufreq():1 intel_uncore():1 pcc_cpufreq():1 skx_edac():1 intel_cstate():1 pcc_cpufreq():1 intel_uncore():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 intel_cstate():1 pcc_cpufreq():1 acpi_cpufreq():1 intel_uncore():1 pcc_cpufreq():1
[ 1414.070664] intel_uncore():1 acpi_cpufreq():1 intel_cstate():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 intel_cstate():1 intel_uncore():1 acpi_cpufreq():1 pcc_cpufreq():1 intel_cstate():1 pcc_cpufreq():1 acpi_cpufreq():1 intel_uncore():1 intel_cstate():1 acpi_cpufreq():1 intel_uncore():1 pcc_cpufreq():1 intel_cstate():1 acpi_cpufreq():1 intel_uncore():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 intel_cstate():1 intel_uncore():1 intel_uncore():1 acpi_cpufreq():1 [last unloaded: sch_teql]
[ 1414.092501] CR2: 0000000000000000
[ 1414.093086] ---[ end trace 0000000000000000 ]---
[ 1414.093888] RIP: 0010:0x0
[ 1414.094360] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
[ 1414.095527] RSP: 0018:ffffb29bc1004ee8 EFLAGS: 00010202
[ 1414.096423] RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff8c57cbc33c78
[ 1414.097630] RDX: 0000000000000000 RSI: ffffb29bc1004ef8 RDI: ffff8c2c45f6edc0
[ 1414.098857] RBP: ffff8c57cbc33b40 R08: 0000000000000000 R09: 0000000000000100
[ 1414.100053] R10: 0000000000000200 R11: ffffffff8aa060c0 R12: ffff8c2dd7ef8000
[ 1414.101256] R13: 0000000000000000 R14: 000000000000000a R15: 0000000000000000
[ 1414.102490] FS: 00007f4819ffb6c0(0000) GS:ffff8c57cbc00000(0000) knlGS:0000000000000000
[ 1414.103848] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1414.104833] CR2: ffffffffffffffd6 CR3: 00000002a9a94002 CR4: 0000000000770ee0
[ 1414.106056] PKRU: 55555554
[ 1414.106543] Kernel panic - not syncing: Fatal exception in interrupt
[ 1414.108473] Kernel Offset: 0x7e00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 1414.109741] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
```

Not sure if it's related.

I originally reported this to the Arch Linux issue tracker (https://bugs.archlinux.org/task/75828), but it seemed like a good idea to post it here as well to get more eyes on it.

Thank you.

-- 
PGP Key ID: 0xFB738CE27B634E4B



* Re: BUG: soft lockup - CPU#0 stuck for 26s! with nested KVM on 5.19.x
  2022-09-07  9:39 BUG: soft lockup - CPU#0 stuck for 26s! with nested KVM on 5.19.x František Šumšal
@ 2022-09-07 15:08 ` Sean Christopherson
  2022-09-07 15:11   ` František Šumšal
  0 siblings, 1 reply; 8+ messages in thread
From: Sean Christopherson @ 2022-09-07 15:08 UTC (permalink / raw)
  To: František Šumšal; +Cc: kvm

On Wed, Sep 07, 2022, František Šumšal wrote:
> Hello!
> 
> In our Arch Linux part of the upstream systemd CI I recently noticed an
> uptrend in CPU soft lockups when running one of our tests. This test runs
> several systemd-nspawn containers in succession and sometimes the underlying
> VM locks up due to a CPU soft lockup

By "underlying VM", do you mean L1 or L2?  Where

    L0 == Bare Metal
    L1 == Arch Linux (KVM, 5.19.5-arch1-1/5.19.7-arch1-1)
    L2 == Arch Linux (nested KVM or QEMU TCG, 5.19.5-arch1-1/5.19.7-arch1-1)

> (just to clarify, the topology is: CentOS Stream 8 (baremetal,
> 4.18.0-305.3.1.el8) -> Arch Linux (KVM, 5.19.5-arch1-1/5.19.7-arch1-1) ->
> Arch Linux (nested KVM or QEMU TCG, happens with both,
> 5.19.5-arch1-1/5.19.7-arch1-1) -> nspawn containers).

Since this repros with TCG, that rules out nested KVM as the culprit.

> I did some further testing, and it reproduces even when the baremetal is my
> local Fedora 36 machine (5.17.12-300.fc36.x86_64).
> 
> Unfortunately, I can't provide a simple and reliable reproducer, as I can
> reproduce it only with that particular test and not reliably (sometimes it's
> the first iteration, sometimes it takes an hour or more to reproduce).
> However, I'd be more than glad to collect more information from one such
> machine, if possible.

...

> Also, in one instance, the machine died with:

Probably unrelated, but same question as above: which layer does "the machine"
refer to?


* Re: BUG: soft lockup - CPU#0 stuck for 26s! with nested KVM on 5.19.x
  2022-09-07 15:08 ` Sean Christopherson
@ 2022-09-07 15:11   ` František Šumšal
  2022-09-07 15:23     ` Sean Christopherson
  0 siblings, 1 reply; 8+ messages in thread
From: František Šumšal @ 2022-09-07 15:11 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm



On 9/7/22 17:08, Sean Christopherson wrote:
> On Wed, Sep 07, 2022, František Šumšal wrote:
>> Hello!
>>
>> In our Arch Linux part of the upstream systemd CI I recently noticed an
>> uptrend in CPU soft lockups when running one of our tests. This test runs
>> several systemd-nspawn containers in succession and sometimes the underlying
>> VM locks up due to a CPU soft lockup
> 
> By "underlying VM", do you mean L1 or L2?  Where
> 
>      L0 == Bare Metal
>      L1 == Arch Linux (KVM, 5.19.5-arch1-1/5.19.7-arch1-1)
>      L2 == Arch Linux (nested KVM or QEMU TCG, 5.19.5-arch1-1/5.19.7-arch1-1)

I mean L2.

> 
>> (just to clarify, the topology is: CentOS Stream 8 (baremetal,
>> 4.18.0-305.3.1.el8) -> Arch Linux (KVM, 5.19.5-arch1-1/5.19.7-arch1-1) ->
>> Arch Linux (nested KVM or QEMU TCG, happens with both,
>> 5.19.5-arch1-1/5.19.7-arch1-1) -> nspawn containers).
> 
> Since this repros with TCG, that rules out nested KVM as the culprit.

Ah, that's a good point, thanks.

> 
>> I did some further testing, and it reproduces even when the baremetal is my
>> local Fedora 36 machine (5.17.12-300.fc36.x86_64).
>>
>> Unfortunately, I can't provide a simple and reliable reproducer, as I can
>> reproduce it only with that particular test and not reliably (sometimes it's
>> the first iteration, sometimes it takes an hour or more to reproduce).
>> However, I'd be more than glad to collect more information from one such
>> machine, if possible.
> 
> ...
> 
>> Also, in one instance, the machine died with:
> 
> Probably unrelated, but same question as above: which layer does "the machine"
> refer to?


Same as in the previous case - it's the L2.


-- 
PGP Key ID: 0xFB738CE27B634E4B



* Re: BUG: soft lockup - CPU#0 stuck for 26s! with nested KVM on 5.19.x
  2022-09-07 15:11   ` František Šumšal
@ 2022-09-07 15:23     ` Sean Christopherson
  2022-09-08  8:30       ` František Šumšal
  0 siblings, 1 reply; 8+ messages in thread
From: Sean Christopherson @ 2022-09-07 15:23 UTC (permalink / raw)
  To: František Šumšal; +Cc: kvm

On Wed, Sep 07, 2022, František Šumšal wrote:
> On 9/7/22 17:08, Sean Christopherson wrote:
> > On Wed, Sep 07, 2022, František Šumšal wrote:
> > > Hello!
> > > 
> > > In our Arch Linux part of the upstream systemd CI I recently noticed an
> > > uptrend in CPU soft lockups when running one of our tests. This test runs
> > > several systemd-nspawn containers in succession and sometimes the underlying
> > > VM locks up due to a CPU soft lockup
> > 
> > By "underlying VM", do you mean L1 or L2?  Where
> > 
> >      L0 == Bare Metal
> >      L1 == Arch Linux (KVM, 5.19.5-arch1-1/5.19.7-arch1-1)
> >      L2 == Arch Linux (nested KVM or QEMU TCG, 5.19.5-arch1-1/5.19.7-arch1-1)
> 
> I mean L2.

Is there anything interesting in the L1 or L0 logs?  A failure in a lower level
can manifest as a soft lockup and/or stall in the VM, e.g. because a vCPU isn't
run by the host for whatever reason.
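E.g., one quick (purely hypothetical) way to sanity check that is to watch the vCPU threads of the affected guest's QEMU process from the layer below while the lockup is being reported; if those threads aren't getting CPU time, the problem is below the guest:

```
# run on the lower layer (L0 or L1); the process name is an assumption,
# adjust it to however the guest is actually launched
top -H -p "$(pgrep -o -f qemu-system-x86_64)"
# or log per-thread CPU usage over time (pidstat is from sysstat)
pidstat -t -p "$(pgrep -o -f qemu-system-x86_64)" 1
```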

Does the bug repro with an older version of QEMU?  If it's difficult to roll back
the QEMU version, then we can punt on this question for now.

Is it possible to run the nspawn tests in L1?  If the bug repros there, that would
greatly shrink the size of the haystack.


* Re: BUG: soft lockup - CPU#0 stuck for 26s! with nested KVM on 5.19.x
  2022-09-07 15:23     ` Sean Christopherson
@ 2022-09-08  8:30       ` František Šumšal
  2022-09-08 15:08         ` Sean Christopherson
  0 siblings, 1 reply; 8+ messages in thread
From: František Šumšal @ 2022-09-08  8:30 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm



On 9/7/22 17:23, Sean Christopherson wrote:
> On Wed, Sep 07, 2022, František Šumšal wrote:
>> On 9/7/22 17:08, Sean Christopherson wrote:
>>> On Wed, Sep 07, 2022, František Šumšal wrote:
>>>> Hello!
>>>>
>>>> In our Arch Linux part of the upstream systemd CI I recently noticed an
>>>> uptrend in CPU soft lockups when running one of our tests. This test runs
>>>> several systemd-nspawn containers in succession and sometimes the underlying
>>>> VM locks up due to a CPU soft lockup
>>>
>>> By "underlying VM", do you mean L1 or L2?  Where
>>>
>>>       L0 == Bare Metal
>>>       L1 == Arch Linux (KVM, 5.19.5-arch1-1/5.19.7-arch1-1)
>>>       L2 == Arch Linux (nested KVM or QEMU TCG, 5.19.5-arch1-1/5.19.7-arch1-1)
>>
>> I mean L2.
> 
> Is there anything interesting in the L1 or L0 logs?  A failure in a lower level
> can manifest as a soft lockup and/or stall in the VM, e.g. because a vCPU isn't
> run by the host for whatever reason.

There's nothing (quite literally) in the L0 logs; the host is silent while running the VM/tests.
As for L1, there doesn't seem to be anything interesting there either. Here are the L1 and L2 logs
for reference: https://mrc0mmand.fedorapeople.org/kernel-kvm-soft-lockup/2022-09-07-logs/

> 
> Does the bug repro with an older version of QEMU?  If it's difficult to roll back
> the QEMU version, then we can punt on this question for now.

> 
> Is it possible to run the nspawn tests in L1?  If the bug repros there, that would
> greatly shrink the size of the haystack.

I've fiddled around with the test and managed to trim it down enough that it's easy to run in both
L1 and L2, and after a couple of hours I managed to reproduce it in both layers. That also somewhat
answers the QEMU question, since L0 uses QEMU 6.2.0 to run L1, and L1 uses QEMU 7.0.0 to run L2.
In both cases I used TCG emulation, since with it the issue appears to reproduce slightly more
often (or maybe I was just unlucky with KVM).
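
Roughly, think of something along these lines (an illustrative sketch based on the nspawn command visible in the L2 log above, not the actual test script; the image path is shortened):
```
# illustrative only -- boot a container, let timeout's SIGTERM make nspawn
# shut it down cleanly after a while, then repeat
while :; do
    timeout 60 systemd-nspawn --register=no \
        -D /var/lib/machines/testsuite-13 --private-network -U -b
done
```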

https://mrc0mmand.fedorapeople.org/kernel-kvm-soft-lockup/2022-09-07-logs-no-L2/L1_console.log

As in the previous case, there's nothing of interest in the L0 logs.

This also raises a question - is this issue still KVM-related, since in the last case there's
just L0 baremetal and L1 QEMU/TCG without KVM involved?

-- 
PGP Key ID: 0xFB738CE27B634E4B



* Re: BUG: soft lockup - CPU#0 stuck for 26s! with nested KVM on 5.19.x
  2022-09-08  8:30       ` František Šumšal
@ 2022-09-08 15:08         ` Sean Christopherson
  2022-09-08 15:24           ` František Šumšal
  0 siblings, 1 reply; 8+ messages in thread
From: Sean Christopherson @ 2022-09-08 15:08 UTC (permalink / raw)
  To: František Šumšal; +Cc: kvm

On Thu, Sep 08, 2022, František Šumšal wrote:
> On 9/7/22 17:23, Sean Christopherson wrote:
> > On Wed, Sep 07, 2022, František Šumšal wrote:
> > > On 9/7/22 17:08, Sean Christopherson wrote:
> > > > On Wed, Sep 07, 2022, František Šumšal wrote:
> > > > > Hello!
> > > > > 
> > > > > In our Arch Linux part of the upstream systemd CI I recently noticed an
> > > > > uptrend in CPU soft lockups when running one of our tests. This test runs
> > > > > several systemd-nspawn containers in succession and sometimes the underlying
> > > > > VM locks up due to a CPU soft lockup
> > > > 
> > > > By "underlying VM", do you mean L1 or L2?  Where
> > > > 
> > > >       L0 == Bare Metal
> > > >       L1 == Arch Linux (KVM, 5.19.5-arch1-1/5.19.7-arch1-1)
> > > >       L2 == Arch Linux (nested KVM or QEMU TCG, 5.19.5-arch1-1/5.19.7-arch1-1)
> > > 
> > > I mean L2.
> > 
> > Is there anything interesting in the L1 or L0 logs?  A failure in a lower level
> > can manifest as a soft lockup and/or stall in the VM, e.g. because a vCPU isn't
> > run by the host for whatever reason.
> 
> There's nothing (quite literally) in the L0 logs, the host is silent when running the VM/tests.
> As for L1, there doesn't seem to be anything interesting as well. Here are the L1 and L2 logs
> for reference: https://mrc0mmand.fedorapeople.org/kernel-kvm-soft-lockup/2022-09-07-logs/
> 
> > 
> > Does the bug repro with an older version of QEMU?  If it's difficult to roll back
> > the QEMU version, then we can punt on this question for now.
> 
> > 
> > Is it possible to run the nspawn tests in L1?  If the bug repros there, that would
> > greatly shrink the size of the haystack.
> 
> I've fiddled around with the test and managed to trim it down enough so it's easy to run in both
> L1 and L2, and after couple of hours I managed to reproduce it in both layers. That also somewhat
> answers the QEMU question, since L0 uses QEMU 6.2.0 to run L1, and L1 uses QEMU 7.0.0 to run L2.
> In both cases I used TCG emulation, since with it the issue appears to reproduce slightly more
> often (or maybe I was just unlucky with KVM).
> 
> https://mrc0mmand.fedorapeople.org/kernel-kvm-soft-lockup/2022-09-07-logs-no-L2/L1_console.log
> 
> As in the previous case, there's nothing of interest in the L0 logs.
> 
> This also raises a question - is this issue still KVM-related, since in the last case there's
> just L0 baremetal and L1 QEMU/TCG without KVM involved?

Ya, unless there's a latent bug in KVM that's present in your L0 kernel, which is
extremely unlikely given that the bug repros with 4.18 and 5.17 as the bare metal
kernel, this isn't a KVM problem.

The mm, ext4, and scheduler subsystems are all likely candidates.  I'm not familiar
enough with their gory details to point fingers though.

Do you think it's possible to bisect the L1 kernel using the QEMU/TCG configuration?
That'd probably be the least awful way to get a root cause.
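
Something along these lines, for example (a rough sketch; the good/bad endpoints are guesses and would need adjusting to whenever your CI actually started seeing this):

```
# in the L1 kernel source tree
git bisect start
git bisect bad v5.19          # assumed bad (the lockups are seen on 5.19.x)
git bisect good v5.18         # assumed good -- adjust if 5.18 also reproduces

# at each step: build the kernel, boot it as L1 under QEMU/TCG, run the
# trimmed nspawn loop for a while, then report the verdict
git bisect good               # no lockup after a long enough soak
git bisect bad                # soft lockup reproduced
git bisect reset              # when done
```

Since the repro is probabilistic, a "good" verdict needs a fairly long soak before it can be
trusted, otherwise the bisection will go off the rails.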


* Re: BUG: soft lockup - CPU#0 stuck for 26s! with nested KVM on 5.19.x
  2022-09-08 15:08         ` Sean Christopherson
@ 2022-09-08 15:24           ` František Šumšal
  2022-09-09  8:33             ` František Šumšal
  0 siblings, 1 reply; 8+ messages in thread
From: František Šumšal @ 2022-09-08 15:24 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm





On 9/8/22 17:08, Sean Christopherson wrote:
> On Thu, Sep 08, 2022, František Šumšal wrote:
>> On 9/7/22 17:23, Sean Christopherson wrote:
>>> On Wed, Sep 07, 2022, František Šumšal wrote:
>>>> On 9/7/22 17:08, Sean Christopherson wrote:
>>>>> On Wed, Sep 07, 2022, František Šumšal wrote:
>>>>>> Hello!
>>>>>>
>>>>>> In our Arch Linux part of the upstream systemd CI I recently noticed an
>>>>>> uptrend in CPU soft lockups when running one of our tests. This test runs
>>>>>> several systemd-nspawn containers in succession and sometimes the underlying
>>>>>> VM locks up due to a CPU soft lockup
>>>>>
>>>>> By "underlying VM", do you mean L1 or L2?  Where
>>>>>
>>>>>        L0 == Bare Metal
>>>>>        L1 == Arch Linux (KVM, 5.19.5-arch1-1/5.19.7-arch1-1)
>>>>>        L2 == Arch Linux (nested KVM or QEMU TCG, 5.19.5-arch1-1/5.19.7-arch1-1)
>>>>
>>>> I mean L2.
>>>
>>> Is there anything interesting in the L1 or L0 logs?  A failure in a lower level
>>> can manifest as a soft lockup and/or stall in the VM, e.g. because a vCPU isn't
>>> run by the host for whatever reason.
>>
>> There's nothing (quite literally) in the L0 logs, the host is silent when running the VM/tests.
>> As for L1, there doesn't seem to be anything interesting as well. Here are the L1 and L2 logs
>> for reference: https://mrc0mmand.fedorapeople.org/kernel-kvm-soft-lockup/2022-09-07-logs/
>>
>>>
>>> Does the bug repro with an older version of QEMU?  If it's difficult to roll back
>>> the QEMU version, then we can punt on this question for now.
>>
>>>
>>> Is it possible to run the nspawn tests in L1?  If the bug repros there, that would
>>> greatly shrink the size of the haystack.
>>
>> I've fiddled around with the test and managed to trim it down enough so it's easy to run in both
>> L1 and L2, and after couple of hours I managed to reproduce it in both layers. That also somewhat
>> answers the QEMU question, since L0 uses QEMU 6.2.0 to run L1, and L1 uses QEMU 7.0.0 to run L2.
>> In both cases I used TCG emulation, since with it the issue appears to reproduce slightly more
>> often (or maybe I was just unlucky with KVM).
>>
>> https://mrc0mmand.fedorapeople.org/kernel-kvm-soft-lockup/2022-09-07-logs-no-L2/L1_console.log
>>
>> As in the previous case, there's nothing of interest in the L0 logs.
>>
>> This also raises a question - is this issue still KVM-related, since in the last case there's
>> just L0 baremetal and L1 QEMU/TCG without KVM involved?
> 
> Ya, unless there's a latent bug in KVM that's present in your L0 kernel, which is
> extremely unlikely given that the bug repros with 4.18 and 5.17 as the bare metal
> kernel, this isn't a KVM problem.
> 
> The mm, ext4, and scheduler subsystems are all likely candidates.  I'm not familiar
> enough with their gory details to point fingers though.
> 
> Do you think it's possible to bisect the L1 kernel using the QEMU/TCG configuration?
> That'd probably be the least awful way to get a root cause.

Yeah, I can try, but it might take some time. Nevertheless, it does sound like the least awful way
to debug this further, as you said. I'll report back if/when I find something interesting.

Thanks for the tips!

-- 
PGP Key ID: 0xFB738CE27B634E4B



* Re: BUG: soft lockup - CPU#0 stuck for 26s! with nested KVM on 5.19.x
  2022-09-08 15:24           ` František Šumšal
@ 2022-09-09  8:33             ` František Šumšal
  0 siblings, 0 replies; 8+ messages in thread
From: František Šumšal @ 2022-09-09  8:33 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm



On 9/8/22 17:24, František Šumšal wrote:
<snip.>
>>
>> Ya, unless there's a latent bug in KVM that's present in your L0 kernel, which is
>> extremely unlikely given that the bug repros with 4.18 and 5.17 as the bare metal
>> kernel, this isn't a KVM problem.
>>
>> The mm, ext4, and scheduler subsystems are all likely candidates.  I'm not familiar
>> enough with their gory details to point fingers though.
>>
>> Do you think it's possible to bisect the L1 kernel using the QEMU/TCG configuration?
>> That'd probably be the least awful way to get a root cause.
> 
> Yeah, I can try, but it might take some time. Nevertheless, as you said, it sounds like the least
> awful way to debug this further. I'll report back if/when I find something interesting.
> 
> Thanks for the tips!
> 

Out of curiosity I tried to reproduce it on Fedora Rawhide (as both L0 and L1, with QEMU/TCG), which
currently has kernel 6.0.0-0.rc4.20220906git53e99dcff61e.32.fc38.x86_64, and hit another soft lockup during boot:

```
[  228.311561] watchdog: BUG: soft lockup - CPU#0 stuck for 39s! [zram-generator:614]
[  228.312307] Modules linked in: fuse zram ip_tables crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel virtio_blk serio_raw ata_generic pata_acpi qemu_fw_cfg
[  228.312307] irq event stamp: 17888
[  228.324256] hardirqs last  enabled at (17887): [<ffffffff95000f26>] asm_sysvec_call_function_single+0x16/0x20
[  228.324619] hardirqs last disabled at (17888): [<ffffffff94f180ca>] sysvec_apic_timer_interrupt+0xa/0xc0
[  228.333763] softirqs last  enabled at (17878): [<ffffffff940ff749>] __irq_exit_rcu+0xf9/0x170
[  228.334446] softirqs last disabled at (17867): [<ffffffff940ff749>] __irq_exit_rcu+0xf9/0x170
[  228.337785] CPU: 0 PID: 614 Comm: zram-generator Not tainted 6.0.0-0.rc4.20220906git53e99dcff61e.32.fc38.x86_64 #1
[  228.348634] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-2.fc37 04/01/2014
[  228.349359] RIP: 0010:_raw_spin_unlock_irqrestore+0x36/0x60
[  228.356249] Code: 74 24 10 48 89 fb 48 83 c7 18 e8 b5 0d 25 ff 48 89 df e8 8d 4c 25 ff f7 c5 00 02 00 00 74 0b e8 a0 ef 32 ff fb 0f 1f 44 00 00 <bf> 01 00 00 00 e8 a0 5f 20 ff 65 8b 05 79 43 0f 6b 85 c0 74 07 5b
[  228.363915] RSP: 0018:ff73ffef41403c28 EFLAGS: 00000206
[  228.366966] RAX: 00000000000045db RBX: ff203b9ab2dfe140 RCX: 0000000000000006
[  228.367334] RDX: 0000000000000000 RSI: ffffffff9597e547 RDI: ffffffff958ead16
[  228.367334] RBP: 0000000000000282 R08: 0000000000000001 R09: 0000000000000001
[  228.367334] R10: 0000000000000001 R11: 0000000000000000 R12: ff203b9ab2dfe140
[  228.367334] R13: 0000000000000064 R14: ff203b9ab2dfe198 R15: ffbfc6fb04274888
[  228.384777] FS:  00007f8c463af780(0000) GS:ff203b9ab2c00000(0000) knlGS:0000000000000000
[  228.384777] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  228.384777] CR2: 0000556d5c18a098 CR3: 000000010bed6000 CR4: 0000000000751ef0
[  228.384777] PKRU: 55555554
[  228.384777] Call Trace:
[  228.384777]  <TASK>
[  228.384777]  __alloc_pages_bulk+0x3cb/0x730
[  228.384777]  __vmalloc_node_range+0x253/0x820
[  228.384777]  __vmalloc_node+0x4a/0x60
[  228.384777]  ? disksize_store+0x7a/0x140 [zram]
[  228.384777]  disksize_store+0x7a/0x140 [zram]
[  228.384777]  kernfs_fop_write_iter+0x161/0x210
[  228.384777]  vfs_write+0x22b/0x4b0
[  228.384777]  ksys_write+0x5c/0xe0
[  228.384777]  do_syscall_64+0x5b/0x80
[  228.384777]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
[  228.384777]  ? lockdep_hardirqs_on+0x7d/0x100
[  228.384777]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  228.384777] RIP: 0033:0x7f8c465a91f4
[  228.384777]
[  228.384777] ================================
[  228.384777] WARNING: inconsistent lock state
[  228.384777] 6.0.0-0.rc4.20220906git53e99dcff61e.32.fc38.x86_64 #1 Not tainted
[  228.384777] --------------------------------
[  228.384777] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
[  228.384777] zram-generator/614 [HC1[1]:SC0[0]:HE0:SE1] takes:
[  228.384777] ffffffff9626f5d8 (vmap_area_lock){?.+.}-{2:2}, at: find_vmap_area+0x16/0x60
[  228.384777] {HARDIRQ-ON-W} state was registered at:
[  228.384777]   lock_acquire+0xce/0x2d0
[  228.384777]   _raw_spin_lock+0x33/0x80
[  228.384777]   alloc_vmap_area+0x3e6/0x890
[  228.384777]   __get_vm_area_node+0xb1/0x150
[  228.384777]   get_vm_area_caller+0x3a/0x50
[  228.384777]   __ioremap_caller+0x145/0x320
[  228.384777]   acpi_os_map_iomem+0x1c9/0x1e0
[  228.384777]   acpi_tb_acquire_table+0x39/0x62
[  228.384777]   acpi_tb_validate_table+0x44/0x7c
[  228.384777]   acpi_tb_verify_temp_table+0x42/0x30a
[  228.384777]   acpi_reallocate_root_table+0x129/0x143
[  228.384777]   acpi_early_init+0x4b/0xde
[  228.384777]   start_kernel+0x8da/0x998
[  228.384777]   secondary_startup_64_no_verify+0xe5/0xeb
[  228.384777] irq event stamp: 17888
[  228.384777] hardirqs last  enabled at (17887): [<ffffffff95000f26>] asm_sysvec_call_function_single+0x16/0x20
[  228.384777] hardirqs last disabled at (17888): [<ffffffff94f180ca>] sysvec_apic_timer_interrupt+0xa/0xc0
[  228.384777] softirqs last  enabled at (17878): [<ffffffff940ff749>] __irq_exit_rcu+0xf9/0x170
[  228.384777] softirqs last disabled at (17867): [<ffffffff940ff749>] __irq_exit_rcu+0xf9/0x170
[  228.384777]
[  228.384777] other info that might help us debug this:
[  228.384777]  Possible unsafe locking scenario:
[  228.384777]
[  228.384777]        CPU0
[  228.384777]        ----
[  228.384777]   lock(vmap_area_lock);
[  228.384777]   <Interrupt>
[  228.384777]     lock(vmap_area_lock);
[  228.384777]
[  228.384777]  *** DEADLOCK ***
[  228.384777]
[  228.384777] 4 locks held by zram-generator/614:
[  228.384777]  #0: ff203b9885d30498 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0x5c/0xe0
[  228.384777]  #1: ff203b98839a9890 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x114/0x210
[  228.384777]  #2: ff203b988be40a58 (kn->active#90){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x11d/0x210
[  228.384777]  #3: ff203b988b8a7298 (&zram->init_lock){+.+.}-{3:3}, at: disksize_store+0x49/0x140 [zram]
[  228.384777]
[  228.384777] stack backtrace:
[  228.384777] CPU: 0 PID: 614 Comm: zram-generator Not tainted 6.0.0-0.rc4.20220906git53e99dcff61e.32.fc38.x86_64 #1
[  228.384777] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-2.fc37 04/01/2014
[  228.384777] Call Trace:
[  228.384777]  <IRQ>
[  228.384777]  dump_stack_lvl+0x5b/0x77
[  228.384777]  mark_lock.cold+0x48/0xe1
[  228.384777]  ? sched_clock_local+0xe/0x80
[  228.384777]  ? __lock_acquire+0x388/0x1ef0
[  228.384777]  ? sched_clock_local+0xe/0x80
[  228.384777]  __lock_acquire+0x948/0x1ef0
[  228.384777]  lock_acquire+0xce/0x2d0
[  228.384777]  ? find_vmap_area+0x16/0x60
[  228.384777]  ? lock_release+0x14f/0x460
[  228.384777]  _raw_spin_lock+0x33/0x80
[  228.384777]  ? find_vmap_area+0x16/0x60
[  228.384777]  find_vmap_area+0x16/0x60
[  228.384777]  __check_object_size+0x161/0x290
[  228.384777]  copy_from_user_nmi+0x61/0x90
[  228.384777]  show_opcodes+0x5d/0xc0
[  228.384777]  ? irq_work_queue+0xa/0x50
[  228.384777]  show_iret_regs+0x12/0x37
[  228.384777]  __show_regs+0x24/0x40
[  228.384777]  ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  228.384777]  ? unwind_next_frame+0x3a7/0x620
[  228.384777]  show_trace_log_lvl+0x2e0/0x32f
[  228.384777]  ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  228.384777]  ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  228.384777]  watchdog_timer_fn+0x224/0x290
[  228.384777]  ? softlockup_fn+0x70/0x70
[  228.384777]  __hrtimer_run_queues+0x217/0x530
[  228.384777]  hrtimer_interrupt+0xfe/0x220
[  228.384777]  __sysvec_apic_timer_interrupt+0xa2/0x2b0
[  228.384777]  sysvec_apic_timer_interrupt+0x99/0xc0
[  228.384777]  </IRQ>
[  228.384777]  <TASK>
[  228.384777]  asm_sysvec_apic_timer_interrupt+0x16/0x20
[  228.384777] RIP: 0010:_raw_spin_unlock_irqrestore+0x36/0x60
[  228.384777] Code: 74 24 10 48 89 fb 48 83 c7 18 e8 b5 0d 25 ff 48 89 df e8 8d 4c 25 ff f7 c5 00 02 00 00 74 0b e8 a0 ef 32 ff fb 0f 1f 44 00 00 <bf> 01 00 00 00 e8 a0 5f 20 ff 65 8b 05 79 43 0f 6b 85 c0 74 07 5b
[  228.384777] RSP: 0018:ff73ffef41403c28 EFLAGS: 00000206
[  228.384777] RAX: 00000000000045db RBX: ff203b9ab2dfe140 RCX: 0000000000000006
[  228.384777] RDX: 0000000000000000 RSI: ffffffff9597e547 RDI: ffffffff958ead16
[  228.384777] RBP: 0000000000000282 R08: 0000000000000001 R09: 0000000000000001
[  228.384777] R10: 0000000000000001 R11: 0000000000000000 R12: ff203b9ab2dfe140
[  228.384777] R13: 0000000000000064 R14: ff203b9ab2dfe198 R15: ffbfc6fb04274888
[  228.384777]  ? _raw_spin_unlock_irqrestore+0x30/0x60
[  228.384777]  __alloc_pages_bulk+0x3cb/0x730
[  228.384777]  __vmalloc_node_range+0x253/0x820
[  228.384777]  __vmalloc_node+0x4a/0x60
[  228.384777]  ? disksize_store+0x7a/0x140 [zram]
[  228.384777]  disksize_store+0x7a/0x140 [zram]
[  228.384777]  kernfs_fop_write_iter+0x161/0x210
[  228.384777]  vfs_write+0x22b/0x4b0
[  228.384777]  ksys_write+0x5c/0xe0
[  228.384777]  do_syscall_64+0x5b/0x80
[  228.384777]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
[  228.384777]  ? lockdep_hardirqs_on+0x7d/0x100
[  228.384777]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  228.384777] RIP: 0033:0x7f8c465a91f4
[  228.384777] Code: 15 11 7c 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 80 3d ed 03 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48
[  228.384777] RSP: 002b:00007fffa0d4b658 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[  228.384777] RAX: ffffffffffffffda RBX: 000000000000000a RCX: 00007f8c465a91f4
[  228.384777] RDX: 000000000000000a RSI: 0000556d5c1866a0 RDI: 0000000000000004
[  228.384777] RBP: 0000000000000019 R08: 0000556d5c1854d0 R09: fefefefefefefeff
[  228.384777] R10: 00000000000001b6 R11: 0000000000000202 R12: 0000556d5c1866a0
[  228.384777] R13: 7fffffffffffffff R14: 0000000000000004 R15: 00007f8c465a91e0
[  228.384777]  </TASK>
[  228.384777] Code: 15 11 7c 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 80 3d ed 03 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48
[  228.384777] RSP: 002b:00007fffa0d4b658 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[  228.384777] RAX: ffffffffffffffda RBX: 000000000000000a RCX: 00007f8c465a91f4
[  228.384777] RDX: 000000000000000a RSI: 0000556d5c1866a0 RDI: 0000000000000004
[  228.384777] RBP: 0000000000000019 R08: 0000556d5c1854d0 R09: fefefefefefefeff
[  228.384777] R10: 00000000000001b6 R11: 0000000000000202 R12: 0000556d5c1866a0
[  228.384777] R13: 7fffffffffffffff R14: 0000000000000004 R15: 00007f8c465a91e0
```
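
The lockdep part of this looks like a side effect of printing the soft lockup report itself: the
watchdog fires in hardirq context and show_opcodes() -> copy_from_user_nmi() -> __check_object_size()
-> find_vmap_area() ends up taking vmap_area_lock, which is otherwise taken with IRQs enabled. For
reference, here's a minimal sketch of that "inconsistent lock state" pattern - the module and all the
demo_* names are made up for illustration, this is not the actual kernel code:

```
// SPDX-License-Identifier: GPL-2.0
/*
 * Hypothetical demo module: the same spinlock is taken with IRQs enabled
 * in process context (HARDIRQ-ON-W) and from hardirq context (IN-HARDIRQ-W).
 * If the interrupt fires while the lock is already held, the CPU deadlocks
 * on itself, which is exactly what lockdep's report above warns about.
 */
#include <linux/module.h>
#include <linux/spinlock.h>
#include <linux/interrupt.h>

static DEFINE_SPINLOCK(demo_lock);

/* Process-context user: lockdep records HARDIRQ-ON-W for demo_lock. */
static void demo_process_path(void)
{
	spin_lock(&demo_lock);
	/* ... touch state shared with the interrupt handler ... */
	spin_unlock(&demo_lock);
}

/*
 * Hardirq-context user: lockdep records IN-HARDIRQ-W for demo_lock, and the
 * combination with the usage above triggers the "inconsistent lock state"
 * report.  The usual fix is spin_lock_irqsave() in the process path, or not
 * touching the lock from interrupt context at all.  (Registering this
 * handler with request_irq() is omitted from the sketch.)
 */
static irqreturn_t __maybe_unused demo_irq_handler(int irq, void *dev_id)
{
	spin_lock(&demo_lock);
	/* ... same shared state ... */
	spin_unlock(&demo_lock);
	return IRQ_HANDLED;
}

static int __init demo_init(void)
{
	demo_process_path();
	return 0;
}
module_init(demo_init);

static void __exit demo_exit(void) { }
module_exit(demo_exit);

MODULE_LICENSE("GPL");
```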

Unfortunately, given the stack trace, I don't think this one is related to the original issue,
so the search continues.

-- 
PGP Key ID: 0xFB738CE27B634E4B

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-09-09  8:33 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-07  9:39 BUG: soft lockup - CPU#0 stuck for 26s! with nested KVM on 5.19.x František Šumšal
2022-09-07 15:08 ` Sean Christopherson
2022-09-07 15:11   ` František Šumšal
2022-09-07 15:23     ` Sean Christopherson
2022-09-08  8:30       ` František Šumšal
2022-09-08 15:08         ` Sean Christopherson
2022-09-08 15:24           ` František Šumšal
2022-09-09  8:33             ` František Šumšal
