From: Brian Foster <bfoster@redhat.com>
To: linux-xfs@vger.kernel.org
Subject: [BUG] generic/475 recovery failure(s)
Date: Thu, 10 Jun 2021 11:14:32 -0400
Message-ID: <YMIsWJ0Cb2ot/UjG@bfoster>
Hi all,

I'm seeing what looks like at least one new generic/475 failure on
current for-next. (I've seen one related to an attr buffer that seems to
be older and harder to reproduce.) The test devices are a couple of ~15GB
lvm devices formatted with mkfs defaults. I'm still trying to establish
reproducibility, but so far a failure seems fairly reliable within ~30
iterations.
The first failure [1] looks like a log recovery failure while processing
an EFI. The second variant [2] looks like it passes log recovery, but
then fails the mount in the COW extent cleanup stage due to a refcountbt
problem. I've also seen one that hits what looks like the same free
space corruption error as [1], but triggered via the COW recovery
codepath in [2], so these could very well be related. A snippet of the
dmesg output for each failed mount is appended below.
Brian
[1]
...
XFS (dm-5): Mounting V5 Filesystem
XFS (dm-5): Starting recovery (logdev: internal)
XFS (dm-5): Internal error ltbno + ltlen > bno at line 1940 of file fs/xfs/libxfs/xfs_alloc.c. Caller xfs_free_ag_extent+0x586/0xa00 [xfs]
CPU: 75 PID: 207978 Comm: mount Tainted: G W I 5.13.0-rc4 #64
Hardware name: Dell Inc. PowerEdge R740/01KPX8, BIOS 1.6.11 11/20/2018
Call Trace:
dump_stack+0x7f/0xa1
xfs_corruption_error+0x81/0x90 [xfs]
? xfs_free_ag_extent+0x586/0xa00 [xfs]
xfs_free_ag_extent+0x5ba/0xa00 [xfs]
? xfs_free_ag_extent+0x586/0xa00 [xfs]
__xfs_free_extent+0xed/0x210 [xfs]
xfs_trans_free_extent+0x55/0x180 [xfs]
xfs_efi_item_recover+0x11b/0x170 [xfs]
xlog_recover_process_intents+0xc5/0x3c0 [xfs]
? xfs_iget+0x7c0/0x10b0 [xfs]
xlog_recover_finish+0x19/0xb0 [xfs]
xfs_log_mount_finish+0x55/0x150 [xfs]
xfs_mountfs+0x552/0x960 [xfs]
xfs_fs_fill_super+0x3af/0x7d0 [xfs]
? xfs_fs_put_super+0xa0/0xa0 [xfs]
get_tree_bdev+0x17f/0x280
vfs_get_tree+0x28/0xc0
? capable+0x3a/0x60
path_mount+0x433/0xb60
__x64_sys_mount+0xe3/0x120
do_syscall_64+0x40/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f457b46e19e
Code: 48 8b 0d dd 1c 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d aa 1c 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffec1895aa8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007ffec1895c20 RCX: 00007f457b46e19e
RDX: 0000562eaa1bb8b0 RSI: 0000562eaa1bb610 RDI: 0000562eaa1ba4e0
RBP: 0000562eaa1b95c0 R08: 0000000000000000 R09: 00007f457b530a60
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000562eaa1ba4e0 R14: 0000562eaa1bb8b0 R15: 0000562eaa1b95c0
XFS (dm-5): Corruption detected. Unmount and run xfs_repair
XFS (dm-5): Internal error xfs_trans_cancel at line 955 of file fs/xfs/xfs_trans.c. Caller xfs_efi_item_recover+0x12d/0x170 [xfs]
CPU: 75 PID: 207978 Comm: mount Tainted: G W I 5.13.0-rc4 #64
Hardware name: Dell Inc. PowerEdge R740/01KPX8, BIOS 1.6.11 11/20/2018
Call Trace:
dump_stack+0x7f/0xa1
xfs_trans_cancel+0x1a1/0x1f0 [xfs]
xfs_efi_item_recover+0x12d/0x170 [xfs]
xlog_recover_process_intents+0xc5/0x3c0 [xfs]
? xfs_iget+0x7c0/0x10b0 [xfs]
xlog_recover_finish+0x19/0xb0 [xfs]
xfs_log_mount_finish+0x55/0x150 [xfs]
xfs_mountfs+0x552/0x960 [xfs]
xfs_fs_fill_super+0x3af/0x7d0 [xfs]
? xfs_fs_put_super+0xa0/0xa0 [xfs]
get_tree_bdev+0x17f/0x280
vfs_get_tree+0x28/0xc0
? capable+0x3a/0x60
path_mount+0x433/0xb60
__x64_sys_mount+0xe3/0x120
do_syscall_64+0x40/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f457b46e19e
Code: 48 8b 0d dd 1c 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d aa 1c 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffec1895aa8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007ffec1895c20 RCX: 00007f457b46e19e
RDX: 0000562eaa1bb8b0 RSI: 0000562eaa1bb610 RDI: 0000562eaa1ba4e0
RBP: 0000562eaa1b95c0 R08: 0000000000000000 R09: 00007f457b530a60
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000562eaa1ba4e0 R14: 0000562eaa1bb8b0 R15: 0000562eaa1b95c0
XFS (dm-5): xfs_do_force_shutdown(0x8) called from line 956 of file fs/xfs/xfs_trans.c. Return address = ffffffffc0a9aa4a
XFS (dm-5): Corruption of in-memory data detected. Shutting down filesystem
XFS (dm-5): Please unmount the filesystem and rectify the problem(s)
XFS (dm-5): Failed to recover intents
XFS (dm-5): log mount finish failed
[2]
...
XFS (dm-5): Mounting V5 Filesystem
XFS (dm-5): Starting recovery (logdev: internal)
XFS (dm-5): Ending recovery (logdev: internal)
XFS: Assertion failed: 0, file: fs/xfs/libxfs/xfs_btree.c, line: 1588
------------[ cut here ]------------
WARNING: CPU: 73 PID: 189091 at fs/xfs/xfs_message.c:112 assfail+0x25/0x28 [xfs]
Modules linked in: rfkill dm_service_time dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod target_core_mod ib_iser libiscsi scsi_transport_iscsi rdma_cm ib_umad iw_cm ib_ipoib intel_rapl_msr ib_cm intel_rapl_common isst_if_common mlx5_ib ib_uverbs skx_edac nfit ib_core libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm mlx5_core ipmi_ssif iTCO_wdt irqbypass intel_pmc_bxt rapl iTCO_vendor_support intel_cstate intel_uncore psample mei_me tg3 acpi_ipmi mlxfw wmi_bmof i2c_i801 pcspkr pci_hyperv_intf mei lpc_ich intel_pch_thermal i2c_smbus ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter fuse zram ip_tables xfs lpfc mgag200 drm_kms_helper nvmet_fc nvmet cec nvme_fc crct10dif_pclmul drm nvme_fabrics crc32_pclmul crc32c_intel nvme_core ghash_clmulni_intel scsi_transport_fc megaraid_sas i2c_algo_bit wmi
CPU: 73 PID: 189091 Comm: mount Tainted: G W I 5.13.0-rc4 #64
Hardware name: Dell Inc. PowerEdge R740/01KPX8, BIOS 1.6.11 11/20/2018
RIP: 0010:assfail+0x25/0x28 [xfs]
Code: ff ff 0f 0b c3 0f 1f 44 00 00 41 89 c8 48 89 d1 48 89 f2 48 c7 c6 18 c9 af c0 e8 cf fa ff ff 80 3d 01 cc 0a 00 00 74 02 0f 0b <0f> 0b c3 48 8d 45 10 48 89 e2 4c 89 e6 48 89 1c 24 48 89 44 24 18
RSP: 0018:ffffb00069057b78 EFLAGS: 00010246
RAX: 00000000ffffffea RBX: ffff9186c6b55880 RCX: 0000000000000000
RDX: 00000000ffffffc0 RSI: 0000000000000000 RDI: ffffffffc0aedee4
RBP: ffffb00069057c98 R08: 0000000000000000 R09: 000000000000000a
R10: 000000000000000a R11: f000000000000000 R12: 0000000000000000
R13: 00000000ffffff8b R14: ffffb00069057c70 R15: 0000000000000001
FS: 00007ff3505eec40(0000) GS:ffff91b5bfd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff350254000 CR3: 00000030f045c001 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
xfs_btree_increment+0x27a/0x3d0 [xfs]
? xfs_refcount_still_have_space+0xb0/0xb0 [xfs]
? xfs_refcount_still_have_space+0xb0/0xb0 [xfs]
xfs_btree_simple_query_range+0x133/0x1d0 [xfs]
? xfs_trans_read_buf_map+0x23f/0x5b0 [xfs]
? xfs_refcount_still_have_space+0xb0/0xb0 [xfs]
xfs_btree_query_range+0xf6/0x110 [xfs]
? kmem_cache_alloc+0x247/0x2d0
? xfs_refcountbt_init_common+0x2b/0xa0 [xfs]
xfs_refcount_recover_cow_leftovers+0x105/0x390 [xfs]
? trace_hardirqs_on+0x1b/0xd0
? lock_acquire+0x15d/0x380
xfs_reflink_recover_cow+0x43/0xa0 [xfs]
xfs_mountfs+0x5e5/0x960 [xfs]
xfs_fs_fill_super+0x3af/0x7d0 [xfs]
? xfs_fs_put_super+0xa0/0xa0 [xfs]
get_tree_bdev+0x17f/0x280
vfs_get_tree+0x28/0xc0
? capable+0x3a/0x60
path_mount+0x433/0xb60
__x64_sys_mount+0xe3/0x120
do_syscall_64+0x40/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7ff35082119e
Code: 48 8b 0d dd 1c 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d aa 1c 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffebc43ea98 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007ffebc43ec10 RCX: 00007ff35082119e
RDX: 000055ba504c98b0 RSI: 000055ba504c9610 RDI: 000055ba504c84e0
RBP: 000055ba504c75c0 R08: 0000000000000000 R09: 00007ff3508e3a60
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000055ba504c84e0 R14: 000055ba504c98b0 R15: 000055ba504c75c0
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffff9e0da3f4>] copy_process+0x754/0x1d00
softirqs last enabled at (0): [<ffffffff9e0da3f4>] copy_process+0x754/0x1d00
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace 3975c06460f0a3d7 ]---
XFS (dm-5): Error -117 recovering leftover CoW allocations.
XFS (dm-5): xfs_do_force_shutdown(0x8) called from line 917 of file fs/xfs/xfs_mount.c. Return address = ffffffffc0a904e5
XFS (dm-5): Corruption of in-memory data detected. Shutting down filesystem
XFS (dm-5): Please unmount the filesystem and rectify the problem(s)
Thread overview: 12+ messages
2021-06-10 15:14 Brian Foster [this message]
2021-06-11 19:02 ` [BUG] generic/475 recovery failure(s) Brian Foster
2021-06-11 22:33 ` Dave Chinner
[not found] ` <YMdMehWQoBJC9l0W@bfoster>
2021-06-14 12:56 ` Brian Foster
2021-06-14 23:41 ` Dave Chinner
2021-06-15 4:39 ` Dave Chinner
2021-06-16 7:05 ` Dave Chinner
2021-06-16 20:33 ` Brian Foster
2021-06-16 21:05 ` Darrick J. Wong
2021-06-16 22:54 ` Dave Chinner
2021-06-17 1:28 ` Darrick J. Wong
2021-06-17 12:52 ` Brian Foster