From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 193A3E95A90 for ; Sun, 8 Oct 2023 22:13:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344768AbjJHWNK (ORCPT ); Sun, 8 Oct 2023 18:13:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56482 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344757AbjJHWNK (ORCPT ); Sun, 8 Oct 2023 18:13:10 -0400 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 78A71BA for ; Sun, 8 Oct 2023 15:13:05 -0700 (PDT) Received: by smtp.kernel.org (Postfix) with ESMTPS id 088E7C433B6 for ; Sun, 8 Oct 2023 22:13:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1696803185; bh=Nlmhv2kN0p4YSBMweJBxNg7VZoeLLrJVB/i6FhIGBxo=; h=From:To:Subject:Date:In-Reply-To:References:From; b=pVI4lDnWhK6xMWTPWUtFfXFNU5KYaDZWTggivAFSxOVI9qOS3Hhuu2knoD+Z+OAyB WaB46NhroXBY4wSzFNq+70aQtyofLhyTo6yX7+l75a7onCj2vAHVb+wKkEqU7Aa3gZ OUTiIy660v3FFO0mMuSxqTlT4a/sqmL64ZsqB02FqROfdmB7dMVthmGW2UW5J9tL6s JnfCSwPzF0e9iK9CHl6Bhxzdhv6ok6Sr03omzjUbxU7vNG4TGblxPhyaYXOj6nczls cqQ53uvD7dTiTl5XmhZgmOPYk2YmCBycIE9Gv1/kGeM3bVp80N/swvFz3/6SR3o835 zPx5hHgkSX6qg== Received: by aws-us-west-2-korg-bugzilla-1.web.codeaurora.org (Postfix, from userid 48) id EBB9EC53BD0; Sun, 8 Oct 2023 22:13:04 +0000 (UTC) From: bugzilla-daemon@kernel.org To: linux-xfs@vger.kernel.org Subject: [Bug 217572] Initial blocked tasks causing deterioration over hours until (nearly) complete system lockup and data loss with PostgreSQL 13 Date: Sun, 08 Oct 2023 22:13:04 +0000 X-Bugzilla-Reason: None X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: AssignedTo filesystem_xfs@kernel-bugs.kernel.org X-Bugzilla-Product: File System X-Bugzilla-Component: XFS X-Bugzilla-Version: 2.5 X-Bugzilla-Keywords: X-Bugzilla-Severity: high X-Bugzilla-Who: mironov.ivan@gmail.com X-Bugzilla-Status: NEW X-Bugzilla-Resolution: X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: filesystem_xfs@kernel-bugs.kernel.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugzilla.kernel.org/ Auto-Submitted: auto-generated MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org https://bugzilla.kernel.org/show_bug.cgi?id=3D217572 --- Comment #17 from Ivan Mironov (mironov.ivan@gmail.com) --- More of it: [ 0.000000] Linux version 6.5.5-200.fc38.x86_64 (mockbuild@d4d01d62c9c942e59de1ef4aa94df5a2) (gcc (GCC) 13.2.1 20230728 (Red Hat 13.2.1-1), GNU ld version 2.39-9.fc38) #1 SMP PREEMPT_DYNAMIC Sun Sep 24 15:52:44 UTC 2023 [ 0.000000] Command line: BOOT_IMAGE=3D(md/boot)/vmlinuz-6.5.5-200.fc38.= x86_64 root=3D/dev/mapper/vg--bmsolv-root ro rd.md.uuid=3D216337a3:789c28b0:81fbad29:6f190e56 rd.lvm.lv=3Dvg-bmsolv/root rd.md.uuid=3D252001b9:2095e731:f1dd5baa:8b672d56 clocksource=3Dtsc tsc=3Dre= liable amd_pstate=3Dactive rhgb quiet ... [13024.332817] watchdog: BUG: soft lockup - CPU#8 stuck for 26s! [rocksdb:low:6331] [13024.332841] Modules linked in: nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reje= ct wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfki= ll tcp_bbr ip_set nf_tables nfnetlink tun nct6775 nct6775_core hwmon_vid ipmi_= ssif vfat fat intel_rapl_msr intel_rapl_common snd_hda_intel edac_mce_amd snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec kvm_amd snd_hda_core kvm snd_hwdep snd_pcm snd_timer irqbypass acpi_ipmi rapl snd wmi_bmof cdc_ether ipmi_si usbnet soundcore ipmi_devintf mii k10temp i2c_piix4 ipmi_msghandler joydev fuse loop xfs raid1 igb ast dca i2c_algo_bit crct10dif_pclmul crc32_pclmul crc32c_intel nvme polyval_clmulni polyval_generic nvme_core ghash_clmulni_intel ccp sha512_ssse3 wmi sp5100_tco nvme_common [13024.332904] CPU: 8 PID: 6331 Comm: rocksdb:low Not tainted 6.5.5-200.fc38.x86_64 #1 [13024.332906] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570D4U, BIOS P1.20 05/19/2021 [13024.332908] RIP: 0010:xas_load+0x45/0x50 [13024.332914] Code: 3d 00 10 00 00 77 07 5b 5d e9 77 77 02 00 0f b6 4b 10 = 48 8d 68 fe 38 48 fe 72 ec 48 89 ee 48 89 df e8 af fd ff ff 80 7d 00 00 <75> c= 7 eb d9 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 [13024.332917] RSP: 0018:ffff9c6541a1fbc0 EFLAGS: 00000206 [13024.332919] RAX: ffff8ee4773b8b6a RBX: ffff9c6541a1fbd8 RCX: 0000000000000002 [13024.332921] RDX: 0000000000000000 RSI: ffff8ed9bb019ff0 RDI: ffff9c6541a1fbd8 [13024.332923] RBP: ffff8ed9bb019ff0 R08: 0000000000000000 R09: 000000000000131c [13024.332924] R10: 0000000000000000 R11: ffff8ed9e3a9f538 R12: 0000000000009010 [13024.332925] R13: ffff8eee64944900 R14: 000000000000900f R15: ffff9c6541a1fe70 [13024.332927] FS: 00007f32871ff6c0(0000) GS:ffff8ef8fec00000(0000) knlGS:0000000000000000 [13024.332929] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [13024.332931] CR2: 00007ef3cac01000 CR3: 0000000101d7e000 CR4: 0000000000750ee0 [13024.332933] PKRU: 55555554 [13024.332934] Call Trace: [13024.332937] [13024.332941] ? watchdog_timer_fn+0x1b8/0x220 [13024.332946] ? __pfx_watchdog_timer_fn+0x10/0x10 [13024.332949] ? __hrtimer_run_queues+0x112/0x2b0 [13024.332954] ? hrtimer_interrupt+0xf8/0x230 [13024.332957] ? __sysvec_apic_timer_interrupt+0x61/0x130 [13024.332961] ? sysvec_apic_timer_interrupt+0x6d/0x90 [13024.332965] [13024.332966] [13024.332968] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 [13024.332974] ? xas_load+0x45/0x50 [13024.332976] filemap_get_read_batch+0x16e/0x250 [13024.332981] filemap_get_pages+0xa6/0x630 [13024.332984] ? srso_alias_return_thunk+0x5/0x7f [13024.332988] ? srso_alias_return_thunk+0x5/0x7f [13024.332990] ? touch_atime+0x48/0x1b0 [13024.332994] ? srso_alias_return_thunk+0x5/0x7f [13024.332996] ? filemap_read+0x329/0x350 [13024.332999] filemap_read+0xd9/0x350 [13024.333005] xfs_file_buffered_read+0x52/0xd0 [xfs] [13024.333107] xfs_file_read_iter+0x77/0xe0 [xfs] [13024.333216] vfs_read+0x201/0x350 [13024.333225] __x64_sys_pread64+0x98/0xd0 [13024.333229] do_syscall_64+0x60/0x90 [13024.333232] ? srso_alias_return_thunk+0x5/0x7f [13024.333236] ? __irq_exit_rcu+0x4b/0xc0 [13024.333241] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [13024.333245] RIP: 0033:0x7f32aa721105 [13024.333264] Code: e8 48 89 75 f0 89 7d f8 48 89 4d e0 e8 d4 99 f8 ff 4c = 8b 55 e0 48 8b 55 e8 41 89 c0 48 8b 75 f0 8b 7d f8 b8 11 00 00 00 0f 05 <48> 3= d 00 f0 ff ff 77 2b 44 89 c7 48 89 45 f8 e8 27 9a f8 ff 48 8b [13024.333266] RSP: 002b:00007f32871f90d0 EFLAGS: 00000293 ORIG_RAX: 0000000000000011 [13024.333269] RAX: ffffffffffffffda RBX: 00007f32871f9220 RCX: 00007f32aa721105 [13024.333270] RDX: 000000000000131c RSI: 00007f3283e6fc00 RDI: 00000000000005c9 [13024.333272] RBP: 00007f32871f90f0 R08: 0000000000000000 R09: 00007f32871f9268 [13024.333273] R10: 000000000900f060 R11: 0000000000000293 R12: 000000000900f060 [13024.333275] R13: 000000000000131c R14: 00007f3283e6fc00 R15: 00007f3299010580 [13024.333280] [13052.332283] watchdog: BUG: soft lockup - CPU#8 stuck for 52s! [rocksdb:low:6331] [13052.332303] Modules linked in: nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reje= ct wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfki= ll tcp_bbr ip_set nf_tables nfnetlink tun nct6775 nct6775_core hwmon_vid ipmi_= ssif vfat fat intel_rapl_msr intel_rapl_common snd_hda_intel edac_mce_amd snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec kvm_amd snd_hda_core kvm snd_hwdep snd_pcm snd_timer irqbypass acpi_ipmi rapl snd wmi_bmof cdc_ether ipmi_si usbnet soundcore ipmi_devintf mii k10temp i2c_piix4 ipmi_msghandler joydev fuse loop xfs raid1 igb ast dca i2c_algo_bit crct10dif_pclmul crc32_pclmul crc32c_intel nvme polyval_clmulni polyval_generic nvme_core ghash_clmulni_intel ccp sha512_ssse3 wmi sp5100_tco nvme_common [13052.332364] CPU: 8 PID: 6331 Comm: rocksdb:low Tainted: G L= =20=20=20=20 6.5.5-200.fc38.x86_64 #1 [13052.332367] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570D4U, BIOS P1.20 05/19/2021 [13052.332368] RIP: 0010:xas_descend+0x3/0x90 [13052.332374] Code: 00 48 8b 57 10 48 89 07 48 c1 e8 20 48 89 57 08 e9 c2 = 79 02 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f b6 0e <48> 8= b 57 08 48 d3 ea 83 e2 3f 89 d0 48 83 c0 04 48 8b 44 c6 08 48 [13052.332375] RSP: 0018:ffff9c6541a1fbb8 EFLAGS: 00000206 [13052.332377] RAX: ffff8ed9e8c56da2 RBX: ffff9c6541a1fbd8 RCX: 000000000000000c [13052.332378] RDX: 0000000000000002 RSI: ffff8ed9e8c56da0 RDI: ffff9c6541a1fbd8 [13052.332379] RBP: ffff8ed9e8c56da0 R08: 0000000000000000 R09: 000000000000131c [13052.332380] R10: 0000000000000000 R11: ffff8ed9e3a9f538 R12: 0000000000009010 [13052.332381] R13: ffff8eee64944900 R14: 000000000000900f R15: ffff9c6541a1fe70 [13052.332383] FS: 00007f32871ff6c0(0000) GS:ffff8ef8fec00000(0000) knlGS:0000000000000000 [13052.332384] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [13052.332385] CR2: 00007ef3cac01000 CR3: 0000000101d7e000 CR4: 0000000000750ee0 [13052.332387] PKRU: 55555554 [13052.332387] Call Trace: [13052.332389] [13052.332391] ? watchdog_timer_fn+0x1b8/0x220 [13052.332395] ? __pfx_watchdog_timer_fn+0x10/0x10 [13052.332398] ? __hrtimer_run_queues+0x112/0x2b0 [13052.332402] ? hrtimer_interrupt+0xf8/0x230 [13052.332404] ? __sysvec_apic_timer_interrupt+0x61/0x130 [13052.332407] ? sysvec_apic_timer_interrupt+0x6d/0x90 [13052.332410] [13052.332411] [13052.332412] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 [13052.332418] ? xas_descend+0x3/0x90 [13052.332420] ? srso_alias_return_thunk+0x5/0x7f [13052.332423] xas_load+0x41/0x50 [13052.332426] filemap_get_read_batch+0x16e/0x250 [13052.332431] filemap_get_pages+0xa6/0x630 [13052.332433] ? srso_alias_return_thunk+0x5/0x7f [13052.332436] ? srso_alias_return_thunk+0x5/0x7f [13052.332438] ? touch_atime+0x48/0x1b0 [13052.332441] ? srso_alias_return_thunk+0x5/0x7f [13052.332443] ? filemap_read+0x329/0x350 [13052.332445] filemap_read+0xd9/0x350 [13052.332451] xfs_file_buffered_read+0x52/0xd0 [xfs] [13052.332548] xfs_file_read_iter+0x77/0xe0 [xfs] [13052.332633] vfs_read+0x201/0x350 [13052.332641] __x64_sys_pread64+0x98/0xd0 [13052.332643] do_syscall_64+0x60/0x90 [13052.332646] ? srso_alias_return_thunk+0x5/0x7f [13052.332649] ? __irq_exit_rcu+0x4b/0xc0 [13052.332652] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [13052.332655] RIP: 0033:0x7f32aa721105 [13052.332671] Code: e8 48 89 75 f0 89 7d f8 48 89 4d e0 e8 d4 99 f8 ff 4c = 8b 55 e0 48 8b 55 e8 41 89 c0 48 8b 75 f0 8b 7d f8 b8 11 00 00 00 0f 05 <48> 3= d 00 f0 ff ff 77 2b 44 89 c7 48 89 45 f8 e8 27 9a f8 ff 48 8b [13052.332673] RSP: 002b:00007f32871f90d0 EFLAGS: 00000293 ORIG_RAX: 0000000000000011 [13052.332675] RAX: ffffffffffffffda RBX: 00007f32871f9220 RCX: 00007f32aa721105 [13052.332676] RDX: 000000000000131c RSI: 00007f3283e6fc00 RDI: 00000000000005c9 [13052.332677] RBP: 00007f32871f90f0 R08: 0000000000000000 R09: 00007f32871f9268 [13052.332678] R10: 000000000900f060 R11: 0000000000000293 R12: 000000000900f060 [13052.332679] R13: 000000000000131c R14: 00007f3283e6fc00 R15: 00007f3299010580 [13052.332683] [13059.632827] rcu: INFO: rcu_preempt self-detected stall on CPU [13059.632832] rcu: 8-....: (60000 ticks this GP) idle=3D5d8c/1/0x4000000000000000 softirq=3D3001365/3001365 fqs=3D27154 [13059.632836] rcu: (t=3D60001 jiffies g=3D4535761 q=3D1170639 ncpus=3D= 32) [13059.632838] CPU: 8 PID: 6331 Comm: rocksdb:low Tainted: G L= =20=20=20=20 6.5.5-200.fc38.x86_64 #1 [13059.632840] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570D4U, BIOS P1.20 05/19/2021 [13059.632841] RIP: 0010:xas_load+0x2d/0x50 [13059.632847] Code: fa 55 53 48 89 fb e8 22 ff ff ff 48 89 c2 83 e2 03 48 = 83 fa 02 75 08 48 3d 00 10 00 00 77 07 5b 5d e9 77 77 02 00 0f b6 4b 10 <48> 8= d 68 fe 38 48 fe 72 ec 48 89 ee 48 89 df e8 af fd ff ff 80 7d [13059.632849] RSP: 0018:ffff9c6541a1fbc0 EFLAGS: 00000282 [13059.632851] RAX: ffff8ed9bb019ff2 RBX: ffff9c6541a1fbd8 RCX: 0000000000000000 [13059.632852] RDX: 0000000000000002 RSI: ffff8ed9e8c56da0 RDI: ffff9c6541a1fbd8 [13059.632853] RBP: ffff8ed9e8c56da0 R08: 0000000000000000 R09: 000000000000131c [13059.632854] R10: 0000000000000000 R11: ffff8ed9e3a9f538 R12: 0000000000009010 [13059.632855] R13: ffff8eee64944900 R14: 000000000000900f R15: ffff9c6541a1fe70 [13059.632856] FS: 00007f32871ff6c0(0000) GS:ffff8ef8fec00000(0000) knlGS:0000000000000000 [13059.632858] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [13059.632859] CR2: 00007ef3cac01000 CR3: 0000000101d7e000 CR4: 0000000000750ee0 [13059.632860] PKRU: 55555554 [13059.632861] Call Trace: [13059.632863] [13059.632865] ? rcu_dump_cpu_stacks+0xc4/0x100 [13059.632869] ? rcu_sched_clock_irq+0x4f2/0x1170 [13059.632872] ? srso_alias_return_thunk+0x5/0x7f [13059.632875] ? task_tick_fair+0x2fc/0x3f0 [13059.632879] ? srso_alias_return_thunk+0x5/0x7f [13059.632881] ? srso_alias_return_thunk+0x5/0x7f [13059.632883] ? trigger_load_balance+0x73/0x390 [13059.632885] ? srso_alias_return_thunk+0x5/0x7f [13059.632888] ? update_process_times+0x74/0xb0 [13059.632891] ? tick_sched_handle+0x21/0x60 [13059.632894] ? tick_sched_timer+0x6f/0x90 [13059.632896] ? __pfx_tick_sched_timer+0x10/0x10 [13059.632897] ? __hrtimer_run_queues+0x112/0x2b0 [13059.632901] ? hrtimer_interrupt+0xf8/0x230 [13059.632903] ? __sysvec_apic_timer_interrupt+0x61/0x130 [13059.632906] ? sysvec_apic_timer_interrupt+0x6d/0x90 [13059.632909] [13059.632910] [13059.632911] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 [13059.632916] ? xas_load+0x2d/0x50 [13059.632918] filemap_get_read_batch+0x16e/0x250 [13059.632923] filemap_get_pages+0xa6/0x630 [13059.632925] ? srso_alias_return_thunk+0x5/0x7f [13059.632928] ? srso_alias_return_thunk+0x5/0x7f [13059.632929] ? touch_atime+0x48/0x1b0 [13059.632933] ? srso_alias_return_thunk+0x5/0x7f [13059.632935] ? filemap_read+0x329/0x350 [13059.632937] filemap_read+0xd9/0x350 [13059.632944] xfs_file_buffered_read+0x52/0xd0 [xfs] [13059.633041] xfs_file_read_iter+0x77/0xe0 [xfs] [13059.633121] vfs_read+0x201/0x350 [13059.633127] __x64_sys_pread64+0x98/0xd0 [13059.633130] do_syscall_64+0x60/0x90 [13059.633133] ? srso_alias_return_thunk+0x5/0x7f [13059.633135] ? __irq_exit_rcu+0x4b/0xc0 [13059.633139] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [13059.633142] RIP: 0033:0x7f32aa721105 [13059.633158] Code: e8 48 89 75 f0 89 7d f8 48 89 4d e0 e8 d4 99 f8 ff 4c = 8b 55 e0 48 8b 55 e8 41 89 c0 48 8b 75 f0 8b 7d f8 b8 11 00 00 00 0f 05 <48> 3= d 00 f0 ff ff 77 2b 44 89 c7 48 89 45 f8 e8 27 9a f8 ff 48 8b [13059.633159] RSP: 002b:00007f32871f90d0 EFLAGS: 00000293 ORIG_RAX: 0000000000000011 [13059.633161] RAX: ffffffffffffffda RBX: 00007f32871f9220 RCX: 00007f32aa721105 [13059.633162] RDX: 000000000000131c RSI: 00007f3283e6fc00 RDI: 00000000000005c9 [13059.633163] RBP: 00007f32871f90f0 R08: 0000000000000000 R09: 00007f32871f9268 [13059.633164] R10: 000000000900f060 R11: 0000000000000293 R12: 000000000900f060 [13059.633165] R13: 000000000000131c R14: 00007f3283e6fc00 R15: 00007f3299010580 [13059.633169] --=20 You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug.=