* filesystem dead, xfs_repair won't help
@ 2017-04-10  9:23 Avi Kivity
  2017-04-10  9:42 ` Avi Kivity
  ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Avi Kivity @ 2017-04-10 9:23 UTC (permalink / raw)
To: linux-xfs

Today my kernel complained that in-memory metadata is corrupt and asked
that I run xfs_repair. But xfs_repair doesn't like the superblock and
isn't able to find a secondary superblock.

Latest Fedora 25 kernel, new Intel NVMe drive (worked for a few weeks
without issue).

Anything I can do to recover the data?

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: filesystem dead, xfs_repair won't help
  2017-04-10  9:23 filesystem dead, xfs_repair won't help Avi Kivity
@ 2017-04-10  9:42 ` Avi Kivity
  2017-04-10 15:35   ` Brian Foster
  2017-04-10  9:43 ` allow mounting w/crc-checking disabled? (was Re: filesystem dead, xfs_repair won't help) L A Walsh
  2017-04-10 15:49 ` filesystem dead, xfs_repair won't help Eric Sandeen
  2 siblings, 1 reply; 32+ messages in thread
From: Avi Kivity @ 2017-04-10 9:42 UTC (permalink / raw)
To: linux-xfs

On 04/10/2017 12:23 PM, Avi Kivity wrote:
> Today my kernel complained that in memory metadata is corrupt and
> asked that I run xfs_repair. But xfs_repair doesn't like the
> superblock and isn't able to find a secondary superblock.
>
> Latest Fedora 25 kernel, new Intel NVMe drive (worked for a few weeks
> without issue).
>
> Anything I can do to recover the data?

Initial error:

Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): Metadata CRC error detected at xfs_agfl_read_verify+0xcd/0x100 [xfs], xfs_agfl block 0x2cb68e13
Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): Unmount and run xfs_repair
Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): First 64 bytes of corrupted metadata buffer:
Apr 10 11:41:20 avi.cloudius-systems.com kernel: ffff9004a5b75400: 23 40 8f 28 5b 50 3a b4 f8 54 1e 31 97 f4 fe ed  #@.([P:..T.1....
Apr 10 11:41:20 avi.cloudius-systems.com kernel: ffff9004a5b75410: 62 87 57 51 ee 9d 31 02 ec 2c 10 46 6c 93 db 09  b.WQ..1..,.Fl...
Apr 10 11:41:20 avi.cloudius-systems.com kernel: ffff9004a5b75420: ae 7a ea b3 91 49 7e d3 99 a4 25 49 11 c5 8b be  .z...I~...%I....
Apr 10 11:41:20 avi.cloudius-systems.com kernel: ffff9004a5b75430: e4 2e 14 d4 8a f8 5f 98 66 d8 67 72 ec c9 1a d5  ......_.f.gr....
Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): metadata I/O error: block 0x2cb68e13 ("xfs_trans_read_buf_map") error 74 numblks 1
Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): xfs_do_force_shutdown(0x8) called from line 236 of file fs/xfs/libxfs/xfs_defer.c.  Return address = 0xffffffffc05bdbc6
Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): Corruption of in-memory data detected.  Shutting down filesystem
Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): Please umount the filesystem and rectify the problem(s)

After restart:

Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Mounting V5 Filesystem
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Starting recovery (logdev: internal)
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Metadata CRC error detected at xfs_agfl_read_verify+0xcd/0x100 [xfs], xfs_agfl block 0x2cb68e13
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Unmount and run xfs_repair
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): First 64 bytes of corrupted metadata buffer:
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ffff9450761d4a00: 23 40 8f 28 5b 50 3a b4 f8 54 1e 31 97 f4 fe ed  #@.([P:..T.1....
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ffff9450761d4a10: 62 87 57 51 ee 9d 31 02 ec 2c 10 46 6c 93 db 09  b.WQ..1..,.Fl...
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ffff9450761d4a20: ae 7a ea b3 91 49 7e d3 99 a4 25 49 11 c5 8b be  .z...I~...%I....
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ffff9450761d4a30: e4 2e 14 d4 8a f8 5f 98 66 d8 67 72 ec c9 1a d5  ......_.f.gr....
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): metadata I/O error: block 0x2cb68e13 ("xfs_trans_read_buf_map") error 74 numblks 1
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Internal error xfs_trans_cancel at line 983 of file fs/xfs/xfs_trans.c.  Caller xfs_efi_recover+0x18e/0x1c0 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: CPU: 3 PID: 1063 Comm: mount Not tainted 4.10.8-200.fc25.x86_64 #1
Apr 10 11:47:58 avi.cloudius-systems.com kernel: Hardware name:  /DH77EB, BIOS EBH7710H.86A.0099.2013.0125.1400 01/25/2013
Apr 10 11:47:58 avi.cloudius-systems.com kernel: Call Trace:
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  dump_stack+0x63/0x86
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  xfs_error_report+0x3c/0x40 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  ? xfs_efi_recover+0x18e/0x1c0 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  xfs_trans_cancel+0xb6/0xe0 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  xfs_efi_recover+0x18e/0x1c0 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  xlog_recover_process_efi+0x2c/0x50 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  xlog_recover_process_intents.isra.42+0x122/0x160 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  ? xfs_reinit_percpu_counters+0x46/0x50 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  xlog_recover_finish+0x23/0xb0 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  xfs_log_mount_finish+0x29/0x50 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  xfs_mountfs+0x6ce/0x930 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  xfs_fs_fill_super+0x3ee/0x570 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  mount_bdev+0x178/0x1b0
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  ? xfs_test_remount_options.isra.14+0x60/0x60 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  xfs_fs_mount+0x15/0x20 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  mount_fs+0x38/0x150
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  ? __alloc_percpu+0x15/0x20
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  vfs_kern_mount+0x67/0x130
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  do_mount+0x1dd/0xc50
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  ? _copy_from_user+0x4e/0x80
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  ? memdup_user+0x4f/0x70
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  SyS_mount+0x83/0xd0
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  do_syscall_64+0x67/0x180
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  entry_SYSCALL64_slow_path+0x25/0x25
Apr 10 11:47:58 avi.cloudius-systems.com kernel: RIP: 0033:0x7f5cb9a626fa
Apr 10 11:47:58 avi.cloudius-systems.com kernel: RSP: 002b:00007ffeffa2c928 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
Apr 10 11:47:58 avi.cloudius-systems.com kernel: RAX: ffffffffffffffda RBX: 000055b59fd6f030 RCX: 00007f5cb9a626fa
Apr 10 11:47:58 avi.cloudius-systems.com kernel: RDX: 000055b59fd6f210 RSI: 000055b59fd6f250 RDI: 000055b59fd6f230
Apr 10 11:47:58 avi.cloudius-systems.com kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000012
Apr 10 11:47:58 avi.cloudius-systems.com kernel: R10: 00000000c0ed0000 R11: 0000000000000246 R12: 000055b59fd6f230
Apr 10 11:47:58 avi.cloudius-systems.com kernel: R13: 000055b59fd6f210 R14: 0000000000000000 R15: 00000000ffffffff
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): xfs_do_force_shutdown(0x8) called from line 984 of file fs/xfs/xfs_trans.c.  Return address = 0xffffffffc056324f
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Corruption of in-memory data detected.  Shutting down filesystem
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Please umount the filesystem and rectify the problem(s)
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Failed to recover intents
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): log mount finish failed

smart (note error at end; there were no kernel I/O errors from the block
layer):

$ sudo smartctl -a /dev/nvme0n1
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.10.8-200.fc25.x86_64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       INTEL SSDPEKKW512G7
Serial Number:                      BTPY6313086D512F
Firmware Version:                   PSF100C
PCI Vendor/Subsystem ID:            0x8086
IEEE OUI Identifier:                0x5cd2e4
Controller ID:                      1
Number of Namespaces:               1
Namespace 1 Size/Capacity:          512,110,190,592 [512 GB]
Namespace 1 Formatted LBA Size:     512
Local Time is:                      Mon Apr 10 12:36:41 2017 IDT
Firmware Updates (0x12):            1 Slot, no Reset required
Optional Admin Commands (0x0006):   Format Frmw_DL
Optional NVM Commands (0x001e):     Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat
Maximum Data Transfer Size:         32 Pages
Warning  Comp. Temp. Threshold:     70 Celsius
Critical Comp. Temp. Threshold:     80 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     9.00W       -        -    0  0  0  0        5       5
 1 +     4.60W       -        -    1  1  1  1       30      30
 2 +     3.80W       -        -    2  2  2  2       30      30
 3 -   0.0700W       -        -    3  3  3  3    10000     300
 4 -   0.0050W       -        -    4  4  4  4     2000   10000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02, NSID 0x1)
Critical Warning:                   0x00
Temperature:                        27 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    8,854,487 [4.53 TB]
Data Units Written:                 5,652,445 [2.89 TB]
Host Read Commands:                 446,901,662
Host Write Commands:                35,627,742
Controller Busy Time:               633
Power Cycles:                       24
Power On Hours:                     987
Unsafe Shutdowns:                   16
Media and Data Integrity Errors:    1
Error Information Log Entries:      1
Warning  Comp. Temperature Time:    11
Critical Comp. Temperature Time:    0

Error Information (NVMe Log 0x01, max 64 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc    LBA  NSID    VS
  0          1     1  0x0000  0x0286      -      0     1     -

^ permalink raw reply	[flat|nested] 32+ messages in thread
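A quick decode of two constants in the logs above (an illustrative sketch, not part of the original mail; the magic value is taken from my reading of the XFS on-disk format, so treat it as an assumption): "error 74" is EBADMSG, the errno that XFS reuses as EFSBADCRC for checksum failures, and a healthy V5 AGFL buffer should begin with the magic bytes "XAFL" rather than the random data shown in the hex dump.

```python
import errno

# "error 74" from the metadata I/O error line: XFS reports CRC
# verification failures as EFSBADCRC, which the kernel aliases to
# EBADMSG (errno 74 on Linux).
print(errno.errorcode[74])  # EBADMSG

# A valid V5 AGFL block starts with the magic "XAFL" (0x5841464c,
# stored big-endian on disk).  The dump above starts with
# 23 40 8f 28, so the buffer contents are garbage, not merely a
# stale checksum over otherwise-valid data.
XFS_AGFL_MAGIC = 0x5841464C
print(XFS_AGFL_MAGIC.to_bytes(4, "big"))  # b'XAFL'
```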
* Re: filesystem dead, xfs_repair won't help
  2017-04-10  9:42 ` Avi Kivity
@ 2017-04-10 15:35   ` Brian Foster
  2017-04-11  7:46     ` Avi Kivity
  0 siblings, 1 reply; 32+ messages in thread
From: Brian Foster @ 2017-04-10 15:35 UTC (permalink / raw)
To: Avi Kivity; +Cc: linux-xfs

On Mon, Apr 10, 2017 at 12:42:33PM +0300, Avi Kivity wrote:
> On 04/10/2017 12:23 PM, Avi Kivity wrote:
> > Today my kernel complained that in memory metadata is corrupt and
> > asked that I run xfs_repair. But xfs_repair doesn't like the
> > superblock and isn't able to find a secondary superblock.
> >
> > Latest Fedora 25 kernel, new Intel NVMe drive (worked for a few weeks
> > without issue).
> >
> > Anything I can do to recover the data?
>

Well I can't explain why you have a checksum error, but what do you mean
that xfs_repair doesn't like the superblock? Can you provide the
xfs_repair output?

It seems strange for xfs_repair to not find the superblock of a
filesystem that can otherwise run log recovery up until it encounters
the buffer with a bad crc.

It also might be useful to find out exactly what that error reported by
smartctl means. Are you aware of whether it pre-existed the filesystem
issue or not?

Brian

>
> Initial error:
>
> Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): Metadata CRC
> error detected at xfs_agfl_read_verify+0xcd/0x100 [xfs], xfs_agfl block
> 0x2cb68e13
> Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): Unmount and
> run xfs_repair
> Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): First 64
> bytes of corrupted metadata buffer:
> Apr 10 11:41:20 avi.cloudius-systems.com kernel: ffff9004a5b75400: 23 40 8f
> 28 5b 50 3a b4 f8 54 1e 31 97 f4 fe ed  #@.([P:..T.1....
> Apr 10 11:41:20 avi.cloudius-systems.com kernel: ffff9004a5b75410: 62 87 57
> 51 ee 9d 31 02 ec 2c 10 46 6c 93 db 09  b.WQ..1..,.Fl...
> Apr 10 11:41:20 avi.cloudius-systems.com kernel: ffff9004a5b75420: ae 7a ea
> b3 91 49 7e d3 99 a4 25 49 11 c5 8b be  .z...I~...%I....
> Apr 10 11:41:20 avi.cloudius-systems.com kernel: ffff9004a5b75430: e4 2e 14
> d4 8a f8 5f 98 66 d8 67 72 ec c9 1a d5  ......_.f.gr....
> Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): metadata I/O
> error: block 0x2cb68e13 ("xfs_trans_read_buf_map") error 74 numblks 1
> Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1):
> xfs_do_force_shutdown(0x8) called from line 236 of file
> fs/xfs/libxfs/xfs_defer.c.  Return address = 0xffffffffc05bdbc6
> Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): Corruption
> of in-memory data detected.  Shutting down filesystem
> Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): Please
> umount the filesystem and rectify the problem(s)
>
>
> After restart:
>
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Mounting V5
> Filesystem
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Starting
> recovery (logdev: internal)
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Metadata CRC
> error detected at xfs_agfl_read_verify+0xcd/0x100 [xfs], xfs_agfl block
> 0x2cb68e13
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Unmount and
> run xfs_repair
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): First 64
> bytes of corrupted metadata buffer:
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: ffff9450761d4a00: 23 40 8f
> 28 5b 50 3a b4 f8 54 1e 31 97 f4 fe ed  #@.([P:..T.1....
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: ffff9450761d4a10: 62 87 57
> 51 ee 9d 31 02 ec 2c 10 46 6c 93 db 09  b.WQ..1..,.Fl...
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: ffff9450761d4a20: ae 7a ea
> b3 91 49 7e d3 99 a4 25 49 11 c5 8b be  .z...I~...%I....
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: ffff9450761d4a30: e4 2e 14
> d4 8a f8 5f 98 66 d8 67 72 ec c9 1a d5  ......_.f.gr....
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): metadata I/O
> error: block 0x2cb68e13 ("xfs_trans_read_buf_map") error 74 numblks 1
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Internal
> error xfs_trans_cancel at line 983 of file fs/xfs/xfs_trans.c.  Caller
> xfs_efi_recover+0x18e/0x1c0 [xfs]
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: CPU: 3 PID: 1063 Comm:
> mount Not tainted 4.10.8-200.fc25.x86_64 #1
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: Hardware name:
>  /DH77EB, BIOS EBH7710H.86A.0099.2013.0125.1400 01/25/2013
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: Call Trace:
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:  dump_stack+0x63/0x86
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:  xfs_error_report+0x3c/0x40
> [xfs]
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:  ?
> xfs_efi_recover+0x18e/0x1c0 [xfs]
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:  xfs_trans_cancel+0xb6/0xe0
> [xfs]
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:  xfs_efi_recover+0x18e/0x1c0
> [xfs]
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:
> xlog_recover_process_efi+0x2c/0x50 [xfs]
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:
> xlog_recover_process_intents.isra.42+0x122/0x160 [xfs]
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:  ?
> xfs_reinit_percpu_counters+0x46/0x50 [xfs]
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:
> xlog_recover_finish+0x23/0xb0 [xfs]
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:
> xfs_log_mount_finish+0x29/0x50 [xfs]
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:  xfs_mountfs+0x6ce/0x930
> [xfs]
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:
> xfs_fs_fill_super+0x3ee/0x570 [xfs]
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:  mount_bdev+0x178/0x1b0
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:  ?
> xfs_test_remount_options.isra.14+0x60/0x60 [xfs]
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:  xfs_fs_mount+0x15/0x20
> [xfs]
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:  mount_fs+0x38/0x150
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:  ? __alloc_percpu+0x15/0x20
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:  vfs_kern_mount+0x67/0x130
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:  do_mount+0x1dd/0xc50
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:  ?
> _copy_from_user+0x4e/0x80
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:  ? memdup_user+0x4f/0x70
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:  SyS_mount+0x83/0xd0
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:  do_syscall_64+0x67/0x180
> Apr 10 11:47:58 avi.cloudius-systems.com kernel:
> entry_SYSCALL64_slow_path+0x25/0x25
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: RIP: 0033:0x7f5cb9a626fa
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: RSP: 002b:00007ffeffa2c928
> EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: RAX: ffffffffffffffda RBX:
> 000055b59fd6f030 RCX: 00007f5cb9a626fa
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: RDX: 000055b59fd6f210 RSI:
> 000055b59fd6f250 RDI: 000055b59fd6f230
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: RBP: 0000000000000000 R08:
> 0000000000000000 R09: 0000000000000012
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: R10: 00000000c0ed0000 R11:
> 0000000000000246 R12: 000055b59fd6f230
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: R13: 000055b59fd6f210 R14:
> 0000000000000000 R15: 00000000ffffffff
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1):
> xfs_do_force_shutdown(0x8) called from line 984 of file fs/xfs/xfs_trans.c.
> Return address = 0xffffffffc056324f
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Corruption
> of in-memory data detected.  Shutting down filesystem
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Please
> umount the filesystem and rectify the problem(s)
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Failed to
> recover intents
> Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): log mount
> finish failed
>
>
> smart (note error at end; there were no kernel I/O errors from the block
> layer):
>
> $ sudo smartctl -a /dev/nvme0n1
> smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.10.8-200.fc25.x86_64] (local
> build)
> Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
>
> === START OF INFORMATION SECTION ===
> Model Number:                       INTEL SSDPEKKW512G7
> Serial Number:                      BTPY6313086D512F
> Firmware Version:                   PSF100C
> PCI Vendor/Subsystem ID:            0x8086
> IEEE OUI Identifier:                0x5cd2e4
> Controller ID:                      1
> Number of Namespaces:               1
> Namespace 1 Size/Capacity:          512,110,190,592 [512 GB]
> Namespace 1 Formatted LBA Size:     512
> Local Time is:                      Mon Apr 10 12:36:41 2017 IDT
> Firmware Updates (0x12):            1 Slot, no Reset required
> Optional Admin Commands (0x0006):   Format Frmw_DL
> Optional NVM Commands (0x001e):     Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat
> Maximum Data Transfer Size:         32 Pages
> Warning  Comp. Temp. Threshold:     70 Celsius
> Critical Comp. Temp. Threshold:     80 Celsius
>
> Supported Power States
> St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
>  0 +     9.00W       -        -    0  0  0  0        5       5
>  1 +     4.60W       -        -    1  1  1  1       30      30
>  2 +     3.80W       -        -    2  2  2  2       30      30
>  3 -   0.0700W       -        -    3  3  3  3    10000     300
>  4 -   0.0050W       -        -    4  4  4  4     2000   10000
>
> Supported LBA Sizes (NSID 0x1)
> Id Fmt  Data  Metadt  Rel_Perf
>  0 +     512       0         0
>
> === START OF SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
>
> SMART/Health Information (NVMe Log 0x02, NSID 0x1)
> Critical Warning:                   0x00
> Temperature:                        27 Celsius
> Available Spare:                    100%
> Available Spare Threshold:          10%
> Percentage Used:                    0%
> Data Units Read:                    8,854,487 [4.53 TB]
> Data Units Written:                 5,652,445 [2.89 TB]
> Host Read Commands:                 446,901,662
> Host Write Commands:                35,627,742
> Controller Busy Time:               633
> Power Cycles:                       24
> Power On Hours:                     987
> Unsafe Shutdowns:                   16
> Media and Data Integrity Errors:    1
> Error Information Log Entries:      1
> Warning  Comp. Temperature Time:    11
> Critical Comp. Temperature Time:    0
>
> Error Information (NVMe Log 0x01, max 64 entries)
> Num   ErrCount  SQId   CmdId  Status  PELoc    LBA  NSID    VS
>   0          1     1  0x0000  0x0286      -      0     1     -
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: filesystem dead, xfs_repair won't help
  2017-04-10 15:35 ` Brian Foster
@ 2017-04-11  7:46   ` Avi Kivity
  2017-04-11 11:30     ` Emmanuel Florac
  0 siblings, 1 reply; 32+ messages in thread
From: Avi Kivity @ 2017-04-11 7:46 UTC (permalink / raw)
To: Brian Foster; +Cc: linux-xfs

On 04/10/2017 06:35 PM, Brian Foster wrote:
> On Mon, Apr 10, 2017 at 12:42:33PM +0300, Avi Kivity wrote:
>> On 04/10/2017 12:23 PM, Avi Kivity wrote:
>>> Today my kernel complained that in memory metadata is corrupt and
>>> asked that I run xfs_repair. But xfs_repair doesn't like the
>>> superblock and isn't able to find a secondary superblock.
>>>
>>> Latest Fedora 25 kernel, new Intel NVMe drive (worked for a few weeks
>>> without issue).
>>>
>>> Anything I can do to recover the data?
> Well I can't explain why you have a checksum error, but what do you mean
> that xfs_repair doesn't like the superblock? Can you provide the
> xfs_repair output?
>
> It seems strange for xfs_repair to not find the superblock of a
> filesystem that can otherwise run log recovery up until it encounters
> the buffer with a bad crc.

Sorry, should have done it earlier.

$ sudo xfs_repair /dev/nvme0n1
Phase 1 - find and verify superblock...
couldn't verify primary superblock - not enough secondary superblocks
with matching geometry !!!

attempting to find secondary superblock...
........................................................................
........................................................................
........................................................................
..............^C

In a previous run, after returning from lunch, xfs_repair did not find
the secondary superblock.

The superblock is there though:

$ sudo file -s /dev/nvme0n1
/dev/nvme0n1: SGI XFS filesystem data (blksz 4096, inosz 512, v2 dirs)

I can provide it if it will help.

> It also might be useful to find out exactly what that error reported by
> smartctl means. Are you aware of whether it pre-existed the filesystem
> issue or not?

I believe I ran it before and did not notice the error. I deal with many
disks, though, so it could have been that I just didn't notice it, or
that I ran it on a different machine.

>> Error Information (NVMe Log 0x01, max 64 entries)
>> Num   ErrCount  SQId   CmdId  Status  PELoc    LBA  NSID    VS
>>   0          1     1  0x0000  0x0286      -      0     1     -

If CmdId is the opcode, then it's a flush (matches the fact that LBA=0),
but I'm guessing it's the tag. 0x0286 is NVME_SC_ACCESS_DENIED, which
doesn't appear to match, though (if I picked the right enum).

^ permalink raw reply	[flat|nested] 32+ messages in thread
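The Status column above can be split into the NVMe status-code-type (SCT) and status-code (SC) fields. A hedged sketch (my own, not from the thread): the layout follows the NVMe completion-entry status field, and whether smartctl's error-log "Status" column also carries the phase tag in bit 0 is an assumption, so both readings are shown.

```python
def decode_status(raw, includes_phase_tag):
    """Split an NVMe completion Status Field into (SCT, SC).

    Per the NVMe completion-entry format, SC occupies the low 8 bits
    of the 15-bit status and SCT the next 3 bits; some tools report
    the raw 16-bit word with the phase tag still in bit 0.
    """
    status = raw >> 1 if includes_phase_tag else raw
    sc = status & 0xFF
    sct = (status >> 8) & 0x7
    return sct, sc

# Read directly, 0x0286 gives SCT 2 (media and data integrity errors)
# and SC 0x86, i.e. NVME_SC_ACCESS_DENIED, matching the guess above.
print(decode_status(0x0286, includes_phase_tag=False))  # (2, 134)
# If bit 0 is a phase tag, it instead gives SCT 1 (command specific)
# and SC 0x43.
print(decode_status(0x0286, includes_phase_tag=True))   # (1, 67)
```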
* Re: filesystem dead, xfs_repair won't help
  2017-04-11  7:46 ` Avi Kivity
@ 2017-04-11 11:30   ` Emmanuel Florac
  2017-04-11 11:40     ` Avi Kivity
  0 siblings, 1 reply; 32+ messages in thread
From: Emmanuel Florac @ 2017-04-11 11:30 UTC (permalink / raw)
To: Avi Kivity; +Cc: Brian Foster, linux-xfs

[-- Attachment #1: Type: text/plain, Size: 1003 bytes --]

Le Tue, 11 Apr 2017 10:46:07 +0300
Avi Kivity <avi@scylladb.com> wrote:

> $ sudo xfs_repair /dev/nvme0n1
> Phase 1 - find and verify superblock...
> couldn't verify primary superblock - not enough secondary superblocks
> with matching geometry !!!

Which version of xfs_repair is this?

Try to export the FS structure with xfs_metadump, something like:

  xfs_metadump /dev/nvme0n1 /some/file.dmp

And check the errors it reports; they may be informative.

In the case where metadump works out fine, you should then try to have
a look at the FS structure using the dump (to avoid wrecking it more
than it already is):

  xfs_db -c "sb 0" /some/file.dmp

-- 
------------------------------------------------------------------------
Emmanuel Florac     |   Direction technique
                    |   Intellique
                    |   <eflorac@intellique.com>
                    |   +33 1 78 94 84 02
------------------------------------------------------------------------

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: filesystem dead, xfs_repair won't help
  2017-04-11 11:30 ` Emmanuel Florac
@ 2017-04-11 11:40   ` Avi Kivity
  2017-04-11 12:00     ` Emmanuel Florac
  0 siblings, 1 reply; 32+ messages in thread
From: Avi Kivity @ 2017-04-11 11:40 UTC (permalink / raw)
To: Emmanuel Florac; +Cc: Brian Foster, linux-xfs

On 04/11/2017 02:30 PM, Emmanuel Florac wrote:
> Le Tue, 11 Apr 2017 10:46:07 +0300
> Avi Kivity <avi@scylladb.com> wrote:
>
>> $ sudo xfs_repair /dev/nvme0n1
>> Phase 1 - find and verify superblock...
>> couldn't verify primary superblock - not enough secondary superblocks
>> with matching geometry !!!
> Which version of xfs_repair is this?

xfsprogs-4.9.0-1.fc25.x86_64

> Try to export the FS structure with xfs_metadump, something like
>
>   xfs_metadump /dev/nvme0n1 /some/file.dmp
>
> And check the errors it reports, they may be informative.

bad magic number
xfs_metadump: cannot read superblock for ag 1
bad magic number
xfs_metadump: cannot read superblock for ag 2
Metadata CRC error detected at xfs_agfl block 0x1dcf0963/0x200
bad magic number
xfs_metadump: cannot read superblock for ag 3
Metadata CRC error detected at xfs_agfl block 0x2cb68e13/0x200
xfs_metadump: Filesystem log is dirty; image will contain unobfuscated
metadata in log.
cache_purge: shake on cache 0x55accee162b0 left 3 nodes!?

> In the case where metadump works out fine, you should then try to have
> a look at the FS structure using the dump (to avoid wrecking it more
> than it already is):
>
>   xfs_db -c "sb 0" /some/file.dmp

$ sudo xfs_db -c "sb 0" /tmp/fs
xfs_db: /tmp/fs is not a valid XFS filesystem (unexpected SB magic number 0x5846534d)
Use -F to force a read attempt.
$ sudo xfs_db -F -c "sb 0" /tmp/fs
xfs_db: /tmp/fs is not a valid XFS filesystem (unexpected SB magic number 0x5846534d)
xfs_db: V1 inodes unsupported. Please try an older xfsprogs.

^ permalink raw reply	[flat|nested] 32+ messages in thread
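The rejected magic number is itself diagnostic: decoded as ASCII, 0x5846534d reads "XFSM", the xfs_metadump image format magic, rather than the on-disk superblock magic "XFSB" (0x58465342). In other words, /tmp/fs here is a metadump image, not a filesystem, which is why xfs_db refuses it. A minimal sketch (my own illustration, not from the thread):

```python
# Decode the magic numbers from the xfs_db error message above.
def magic_ascii(magic):
    """Render a 32-bit on-disk magic number as its ASCII string."""
    return magic.to_bytes(4, "big").decode("ascii")

print(magic_ascii(0x5846534D))  # XFSM: an xfs_metadump image
print(magic_ascii(0x58465342))  # XFSB: a real XFS superblock
```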
* Re: filesystem dead, xfs_repair won't help
  2017-04-11 11:40 ` Avi Kivity
@ 2017-04-11 12:00   ` Emmanuel Florac
  2017-04-11 12:03     ` Avi Kivity
  0 siblings, 1 reply; 32+ messages in thread
From: Emmanuel Florac @ 2017-04-11 12:00 UTC (permalink / raw)
To: Avi Kivity; +Cc: Brian Foster, linux-xfs

[-- Attachment #1: Type: text/plain, Size: 977 bytes --]

Le Tue, 11 Apr 2017 14:40:15 +0300
Avi Kivity <avi@scylladb.com> wrote:

> $ sudo xfs_db -c "sb 0" /tmp/fs
> xfs_db: /tmp/fs is not a valid XFS filesystem (unexpected SB magic
> number 0x5846534d)
> Use -F to force a read attempt.
> $ sudo xfs_db -F -c "sb 0" /tmp/fs
> xfs_db: /tmp/fs is not a valid XFS filesystem (unexpected SB magic
> number 0x5846534d)
> xfs_db: V1 inodes unsupported. Please try an older xfsprogs.

Oops, I forgot one important part, sorry: you must restore the metadump
to a file first:

  xfs_mdrestore /tmp/fs /tmp/fsimage

then run xfs_db on /tmp/fsimage:

  xfs_db -c 'sb 0' -c 'p' /tmp/fsimage

-- 
------------------------------------------------------------------------
Emmanuel Florac     |   Direction technique
                    |   Intellique
                    |   <eflorac@intellique.com>
                    |   +33 1 78 94 84 02
------------------------------------------------------------------------

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: filesystem dead, xfs_repair won't help
  2017-04-11 12:00 ` Emmanuel Florac
@ 2017-04-11 12:03   ` Avi Kivity
  2017-04-11 12:49     ` Emmanuel Florac
  0 siblings, 1 reply; 32+ messages in thread
From: Avi Kivity @ 2017-04-11 12:03 UTC (permalink / raw)
To: Emmanuel Florac; +Cc: Brian Foster, linux-xfs

On 04/11/2017 03:00 PM, Emmanuel Florac wrote:
> Le Tue, 11 Apr 2017 14:40:15 +0300
> Avi Kivity <avi@scylladb.com> wrote:
>
>> $ sudo xfs_db -c "sb 0" /tmp/fs
>> xfs_db: /tmp/fs is not a valid XFS filesystem (unexpected SB magic
>> number 0x5846534d)
>> Use -F to force a read attempt.
>> $ sudo xfs_db -F -c "sb 0" /tmp/fs
>> xfs_db: /tmp/fs is not a valid XFS filesystem (unexpected SB magic
>> number 0x5846534d)
>> xfs_db: V1 inodes unsupported. Please try an older xfsprogs.
> Oops, I forgot one important part, sorry: you must restore the metadump
> to a file first:
>
>   xfs_mdrestore /tmp/fs /tmp/fsimage
>
> then run xfs_db on /tmp/fsimage:
>
>   xfs_db -c 'sb 0' -c 'p' /tmp/fsimage

magicnum = 0x58465342
blocksize = 4096
dblocks = 125026902
rblocks = 0
rextents = 0
uuid = 50b25ad8-3eb9-4273-b7f2-d0a435b3a08f
logstart = 67108869
rootino = 96
rbmino = 97
rsumino = 98
rextsize = 1
agblocks = 31256726
agcount = 4
rbmblocks = 0
logblocks = 61048
versionnum = 0xb4b5
sectsize = 512
inodesize = 512
inopblock = 8
fname = "\000\000\000\000\000\000\000\000\000\000\000\000"
blocklog = 12
sectlog = 9
inodelog = 9
inopblog = 3
agblklog = 25
rextslog = 0
inprogress = 0
imax_pct = 25
icount = 1959744
ifree = 89
fdblocks = 91586587
frextents = 0
uquotino = null
gquotino = null
qflags = 0
flags = 0
shared_vn = 0
inoalignmt = 4
unit = 0
width = 0
dirblklog = 0
logsectlog = 0
logsectsize = 0
logsunit = 1
features2 = 0x18a
bad_features2 = 0x18a
features_compat = 0
features_ro_compat = 0x1
features_incompat = 0x1
features_log_incompat = 0
crc = 0x3ebf41de (correct)
spino_align = 0
pquotino = null
lsn = 0x70002828b
meta_uuid = 00000000-0000-0000-0000-000000000000

^ permalink raw reply	[flat|nested] 32+ messages in thread
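As a cross-check (my own arithmetic, not part of the thread), the superblock geometry printed above is internally consistent with the rest of the evidence: dblocks times blocksize equals the NVMe namespace capacity smartctl reported, and both CRC-failing buffers from the kernel logs and metadump sit at 512-byte-sector offset 3 of an allocation group, which is where the AGFL lives on a 512-byte-sector filesystem.

```python
# Cross-check the superblock geometry against the drive size and the
# corrupted block addresses from the earlier logs.
blocksize = 4096        # sb blocksize (bytes)
dblocks   = 125026902   # sb dblocks
agblocks  = 31256726    # sb agblocks (fs blocks per AG)

# Total filesystem size matches the NVMe namespace capacity reported
# by smartctl: 512,110,190,592 bytes.
print(blocksize * dblocks)  # 512110190592

# Kernel block numbers here are daddrs, i.e. 512-byte sectors.  Both
# CRC-failing buffers land on sector offset 3 from an AG boundary, in
# AGs 2 and 3 respectively.
sectors_per_ag = agblocks * (blocksize // 512)
for daddr in (0x1DCF0963, 0x2CB68E13):
    ag, offset = divmod(daddr, sectors_per_ag)
    print(ag, offset)  # prints "2 3" then "3 3"
```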
* Re: filesystem dead, xfs_repair won't help 2017-04-11 12:03 ` Avi Kivity @ 2017-04-11 12:49 ` Emmanuel Florac 2017-04-11 13:07 ` Avi Kivity 0 siblings, 1 reply; 32+ messages in thread From: Emmanuel Florac @ 2017-04-11 12:49 UTC (permalink / raw) To: Avi Kivity; +Cc: Brian Foster, linux-xfs [-- Attachment #1: Type: text/plain, Size: 2556 bytes --] Le Tue, 11 Apr 2017 15:03:12 +0300 Avi Kivity <avi@scylladb.com> écrivait: > On 04/11/2017 03:00 PM, Emmanuel Florac wrote: > > Le Tue, 11 Apr 2017 14:40:15 +0300 > > Avi Kivity <avi@scylladb.com> écrivait: > > > >> $ sudo xfs_db -c "sb 0" /tmp/fs > >> xfs_db: /tmp/fs is not a valid XFS filesystem (unexpected SB magic > >> number 0x5846534d) > >> Use -F to force a read attempt. > >> $ sudo xfs_db -F -c "sb 0" /tmp/fs > >> xfs_db: /tmp/fs is not a valid XFS filesystem (unexpected SB magic > >> number 0x5846534d) > >> xfs_db: V1 inodes unsupported. Please try an older xfsprogs. > > Oops, I forgot one important part, sorry, you must restore the > > meta_dump to a file first: > > > > xfs_mdrestore /tmp/fs /tmp/fsimage > > > > then run xfs_db on the /tmp/fsimage: > > > > xfs_db -c 'sb 0' -c 'p' /tmp/fsimage > > > > magicnum = 0x58465342 > blocksize = 4096 > dblocks = 125026902 > rblocks = 0 > rextents = 0 > uuid = 50b25ad8-3eb9-4273-b7f2-d0a435b3a08f > logstart = 67108869 > rootino = 96 > rbmino = 97 > rsumino = 98 > rextsize = 1 > agblocks = 31256726 > agcount = 4 > rbmblocks = 0 > logblocks = 61048 > versionnum = 0xb4b5 > sectsize = 512 > inodesize = 512 > inopblock = 8 > fname = "\000\000\000\000\000\000\000\000\000\000\000\000" > blocklog = 12 > sectlog = 9 > inodelog = 9 > inopblog = 3 > agblklog = 25 > rextslog = 0 > inprogress = 0 > imax_pct = 25 > icount = 1959744 > ifree = 89 > fdblocks = 91586587 > frextents = 0 > uquotino = null > gquotino = null > qflags = 0 > flags = 0 > shared_vn = 0 > inoalignmt = 4 > unit = 0 > width = 0 > dirblklog = 0 > logsectlog = 0 > logsectsize = 0 > logsunit = 1 > features2 = 0x18a > 
bad_features2 = 0x18a > features_compat = 0 > features_ro_compat = 0x1 > features_incompat = 0x1 > features_log_incompat = 0 > crc = 0x3ebf41de (correct) > spino_align = 0 > pquotino = null > lsn = 0x70002828b > meta_uuid = 00000000-0000-0000-0000-000000000000 > > That looks reasonable enough... Heck, what's happening? You could try to run an integrity check from xfs_db (still using the dump) to locate the error: xfs_db -c 'sb 0' -c 'check' /tmp/fsimage What does it report? -- ------------------------------------------------------------------------ Emmanuel Florac | Direction technique | Intellique | <eflorac@intellique.com> | +33 1 78 94 84 02 ------------------------------------------------------------------------ [-- Attachment #2: Signature digitale OpenPGP --] [-- Type: application/pgp-signature, Size: 181 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
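[Editorial aside: the two magic numbers in the exchange above both decode to four ASCII bytes, which explains xfs_db's complaint. 0x58465342 is "XFSB", a real XFS superblock, while the "unexpected SB magic number 0x5846534d" is "XFSM" — the header of an xfs_metadump image, which xfs_db cannot read directly until it has been restored with xfs_mdrestore. A minimal sketch (the helper name is mine):]

```python
def magic_to_ascii(magic: int) -> str:
    """Decode a 32-bit on-disk magic number into its ASCII tag."""
    return magic.to_bytes(4, "big").decode("ascii")

# The magic xfs_db expects at the start of a filesystem:
print(magic_to_ascii(0x58465342))  # XFSB
# The magic it actually found, because /tmp/fs was a raw metadump image:
print(magic_to_ascii(0x5846534D))  # XFSM
```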
* Re: filesystem dead, xfs_repair won't help 2017-04-11 12:49 ` Emmanuel Florac @ 2017-04-11 13:07 ` Avi Kivity 2017-04-11 16:13 ` Emmanuel Florac 0 siblings, 1 reply; 32+ messages in thread From: Avi Kivity @ 2017-04-11 13:07 UTC (permalink / raw) To: Emmanuel Florac; +Cc: Brian Foster, linux-xfs On 04/11/2017 03:49 PM, Emmanuel Florac wrote: > Le Tue, 11 Apr 2017 15:03:12 +0300 > Avi Kivity <avi@scylladb.com> écrivait: > >> On 04/11/2017 03:00 PM, Emmanuel Florac wrote: >>> Le Tue, 11 Apr 2017 14:40:15 +0300 >>> Avi Kivity <avi@scylladb.com> écrivait: >>> >>>> $ sudo xfs_db -c "sb 0" /tmp/fs >>>> xfs_db: /tmp/fs is not a valid XFS filesystem (unexpected SB magic >>>> number 0x5846534d) >>>> Use -F to force a read attempt. >>>> $ sudo xfs_db -F -c "sb 0" /tmp/fs >>>> xfs_db: /tmp/fs is not a valid XFS filesystem (unexpected SB magic >>>> number 0x5846534d) >>>> xfs_db: V1 inodes unsupported. Please try an older xfsprogs. >>> Oops, I forgot one important part, sorry, you must restore the >>> meta_dump to a file first: >>> >>> xfs_mdrestore /tmp/fs /tmp/fsimage >>> >>> then run xfs_db on the /tmp/fsimage: >>> >>> xfs_db -c 'sb 0' -c 'p' /tmp/fsimage >>> >> magicnum = 0x58465342 >> blocksize = 4096 >> dblocks = 125026902 >> rblocks = 0 >> rextents = 0 >> uuid = 50b25ad8-3eb9-4273-b7f2-d0a435b3a08f >> logstart = 67108869 >> rootino = 96 >> rbmino = 97 >> rsumino = 98 >> rextsize = 1 >> agblocks = 31256726 >> agcount = 4 >> rbmblocks = 0 >> logblocks = 61048 >> versionnum = 0xb4b5 >> sectsize = 512 >> inodesize = 512 >> inopblock = 8 >> fname = "\000\000\000\000\000\000\000\000\000\000\000\000" >> blocklog = 12 >> sectlog = 9 >> inodelog = 9 >> inopblog = 3 >> agblklog = 25 >> rextslog = 0 >> inprogress = 0 >> imax_pct = 25 >> icount = 1959744 >> ifree = 89 >> fdblocks = 91586587 >> frextents = 0 >> uquotino = null >> gquotino = null >> qflags = 0 >> flags = 0 >> shared_vn = 0 >> inoalignmt = 4 >> unit = 0 >> width = 0 >> dirblklog = 0 >> logsectlog = 0 >> logsectsize 
= 0 >> logsunit = 1 >> features2 = 0x18a >> bad_features2 = 0x18a >> features_compat = 0 >> features_ro_compat = 0x1 >> features_incompat = 0x1 >> features_log_incompat = 0 >> crc = 0x3ebf41de (correct) >> spino_align = 0 >> pquotino = null >> lsn = 0x70002828b >> meta_uuid = 00000000-0000-0000-0000-000000000000 >> >> > Tha looks reasonable enough... Heck, what's happening? You could try > to run an integrity check from xfs_db (still using the dump) to locate > the error: > > xfs_db -c 'sb 0' -c 'check' /tmp/fsimage > > What does it report? > $ sudo xfs_db -c 'sb 0' -c 'check' /tmp/fsimage ERROR: The filesystem has valuable metadata changes in a log which needs to be replayed. Mount the filesystem to replay the log, and unmount it before re-running xfs_db. If you are unable to mount the filesystem, then use the xfs_repair -L option to destroy the log and attempt a repair. Note that destroying the log may cause corruption -- please attempt a mount of the filesystem before doing this. xfs_repair did not recognize the superblock, and started hunting for the second one, emitting dots in the process. I stopped it, since it failed on the live disk. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: filesystem dead, xfs_repair won't help 2017-04-11 13:07 ` Avi Kivity @ 2017-04-11 16:13 ` Emmanuel Florac 2017-04-11 16:44 ` Avi Kivity 0 siblings, 1 reply; 32+ messages in thread From: Emmanuel Florac @ 2017-04-11 16:13 UTC (permalink / raw) To: Avi Kivity; +Cc: Brian Foster, linux-xfs [-- Attachment #1: Type: text/plain, Size: 1148 bytes --] Le Tue, 11 Apr 2017 16:07:56 +0300 Avi Kivity <avi@scylladb.com> écrivait: > $ sudo xfs_db -c 'sb 0' -c 'check' /tmp/fsimage > ERROR: The filesystem has valuable metadata changes in a log which > needs to be replayed. Mount the filesystem to replay the log, and > unmount it before re-running xfs_db. If you are unable to mount the > filesystem, then use the xfs_repair -L option to destroy the log and > attempt a repair. Note that destroying the log may cause corruption > -- please attempt a mount of the filesystem before doing this. > > > xfs_repair did not recognize the superblock, and started hunting for > the second one, emitting dots in the process. I stopped it, since it > failed on the live disk. > Can you mount the image, or does it fail immediately because of the CRC error? -- ------------------------------------------------------------------------ Emmanuel Florac | Direction technique | Intellique | <eflorac@intellique.com> | +33 1 78 94 84 02 ------------------------------------------------------------------------ [-- Attachment #2: Signature digitale OpenPGP --] [-- Type: application/pgp-signature, Size: 181 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: filesystem dead, xfs_repair won't help 2017-04-11 16:13 ` Emmanuel Florac @ 2017-04-11 16:44 ` Avi Kivity 2017-04-11 16:48 ` Eric Sandeen 0 siblings, 1 reply; 32+ messages in thread From: Avi Kivity @ 2017-04-11 16:44 UTC (permalink / raw) To: Emmanuel Florac; +Cc: Brian Foster, linux-xfs On 04/11/2017 07:13 PM, Emmanuel Florac wrote: > Le Tue, 11 Apr 2017 16:07:56 +0300 > Avi Kivity <avi@scylladb.com> écrivait: > >> $ sudo xfs_db -c 'sb 0' -c 'check' /tmp/fsimage >> ERROR: The filesystem has valuable metadata changes in a log which >> needs to be replayed. Mount the filesystem to replay the log, and >> unmount it before re-running xfs_db. If you are unable to mount the >> filesystem, then use the xfs_repair -L option to destroy the log and >> attempt a repair. Note that destroying the log may cause corruption >> -- please attempt a mount of the filesystem before doing this. >> >> >> xfs_repair did not recognize the superblock, and started hunting for >> the second one, emitting dots in the process. I stopped it, since it >> failed on the live disk. >> > Can you mount the image, or does it fail immmediately because of the > CRC error? > Fails immediately. I'll probably format it with ext4, wait for the firmware update, and then reformat it with xfs. Since the firmware bug was acknowledged, I don't know what we can gain from it. My disk is mostly a git and imap mirror anyway, + a large ccache repository, + a throwaway database. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: filesystem dead, xfs_repair won't help 2017-04-11 16:44 ` Avi Kivity @ 2017-04-11 16:48 ` Eric Sandeen 2017-04-12 15:15 ` Christoph Hellwig 0 siblings, 1 reply; 32+ messages in thread From: Eric Sandeen @ 2017-04-11 16:48 UTC (permalink / raw) To: Avi Kivity, Emmanuel Florac; +Cc: Brian Foster, linux-xfs On 4/11/17 11:44 AM, Avi Kivity wrote: > > > On 04/11/2017 07:13 PM, Emmanuel Florac wrote: >> Le Tue, 11 Apr 2017 16:07:56 +0300 >> Avi Kivity <avi@scylladb.com> écrivait: >> >>> $ sudo xfs_db -c 'sb 0' -c 'check' /tmp/fsimage >>> ERROR: The filesystem has valuable metadata changes in a log which >>> needs to be replayed. Mount the filesystem to replay the log, and >>> unmount it before re-running xfs_db. If you are unable to mount the >>> filesystem, then use the xfs_repair -L option to destroy the log and >>> attempt a repair. Note that destroying the log may cause corruption >>> -- please attempt a mount of the filesystem before doing this. >>> >>> >>> xfs_repair did not recognize the superblock, and started hunting for >>> the second one, emitting dots in the process. I stopped it, since it >>> failed on the live disk. >>> >> Can you mount the image, or does it fail immediately because of the >> CRC error? >> > > Fails immediately. > > I'll probably format it with ext4, wait for the firmware update, and > then reformat it with xfs. Since the firmware bug was acknowledged, I > don't know what we can gain from it. My disk is mostly a git and imap > mirror anyway, + a large ccache repository, + a throwaway database. Honestly, I'd be a little leery of ext4 too - I don't know what the underlying problem is, but it must be related to some IO pattern that is more common on xfs, but it's a leap to say that it's never present on any other fs... IOWS: a drive firmware bug that corrupts data probably can't be trusted with any filesystem. 
As an experiment, though, if you want to play, it might be interesting to mkfs.xfs it with a 4096-byte sector size, and see if that makes it happier. By default, xfs is doing metadata IO in 512-byte chunks, something other filesystems won't do by default. I guess you don't know how to provoke the corruption, though, to be able to run that test reliably... -Eric ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: filesystem dead, xfs_repair won't help 2017-04-11 16:48 ` Eric Sandeen @ 2017-04-12 15:15 ` Christoph Hellwig 2017-04-12 15:34 ` Eric Sandeen 0 siblings, 1 reply; 32+ messages in thread From: Christoph Hellwig @ 2017-04-12 15:15 UTC (permalink / raw) To: Eric Sandeen; +Cc: Avi Kivity, Emmanuel Florac, Brian Foster, linux-xfs On Tue, Apr 11, 2017 at 11:48:19AM -0500, Eric Sandeen wrote: > As an experiment, though, if you want to play, it might be interesting to > mkfs.xfs it with a 4096 byte sector size, and see if that makes it happier. > By default, xfs is doing metadata IO in 512 chunks, something other filesystems > won't do by default. > > I guess you don't know how to provoke the corruption, though, to be able > to run that test reliably... Btw, it might be a good idea to move to 4k sector size as the default, on pretty much any modern hardware sector sizes are 4k or larger internally, and 512 byte writes will always involve read-modify-write cycles. And unlike SATA or SCSI disks NVMe doesn't have a physical block size attribute, so we can't even look at that. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: filesystem dead, xfs_repair won't help 2017-04-12 15:15 ` Christoph Hellwig @ 2017-04-12 15:34 ` Eric Sandeen 2017-04-12 15:45 ` Christoph Hellwig 0 siblings, 1 reply; 32+ messages in thread From: Eric Sandeen @ 2017-04-12 15:34 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Avi Kivity, Emmanuel Florac, Brian Foster, linux-xfs On 4/12/17 10:15 AM, Christoph Hellwig wrote: > On Tue, Apr 11, 2017 at 11:48:19AM -0500, Eric Sandeen wrote: >> As an experiment, though, if you want to play, it might be interesting to >> mkfs.xfs it with a 4096 byte sector size, and see if that makes it happier. >> By default, xfs is doing metadata IO in 512 chunks, something other filesystems >> won't do by default. >> >> I guess you don't know how to provoke the corruption, though, to be able >> to run that test reliably... > > Btw, it might be a good idea to move to 4k sector size as the default, > on pretty much any modern hardware sector sizes are 4k or larger > internally, and 512 byte writes will always involve read-modify-write > cycles. And unlike SATA or SCSI disks NVMe doesn't have a physical > block size attribute, so we can't even look at that. Is it safe to do that on a device that /actually/ has only 512 sectors? I /think/ Brian's tear detection helps, but </handwave> is it legit to do metadata IO larger than the fundamental IO size of the storage? -Eric ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: filesystem dead, xfs_repair won't help 2017-04-12 15:34 ` Eric Sandeen @ 2017-04-12 15:45 ` Christoph Hellwig 2017-04-12 16:15 ` Avi Kivity 0 siblings, 1 reply; 32+ messages in thread From: Christoph Hellwig @ 2017-04-12 15:45 UTC (permalink / raw) To: Eric Sandeen Cc: Christoph Hellwig, Avi Kivity, Emmanuel Florac, Brian Foster, linux-xfs On Wed, Apr 12, 2017 at 10:34:47AM -0500, Eric Sandeen wrote: > Is it safe to do that on a device that /actually/ has only 512 sectors? Except for NVMe none of the storage standards actually guarantees sector atomicy, although the whole storage stack traditionally relies on it.. Maybe we should claim a 4k physical block size for NVMe devices that hab 512 byte LBAs and a "Atomic Write Unit Power Fail" setting of at least 8 so that the mkfs sector size logic triggers.. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: filesystem dead, xfs_repair won't help 2017-04-12 15:45 ` Christoph Hellwig @ 2017-04-12 16:15 ` Avi Kivity 2017-04-12 16:20 ` Christoph Hellwig 0 siblings, 1 reply; 32+ messages in thread From: Avi Kivity @ 2017-04-12 16:15 UTC (permalink / raw) To: Christoph Hellwig, Eric Sandeen; +Cc: Emmanuel Florac, Brian Foster, linux-xfs On 04/12/2017 06:45 PM, Christoph Hellwig wrote: > On Wed, Apr 12, 2017 at 10:34:47AM -0500, Eric Sandeen wrote: >> Is it safe to do that on a device that /actually/ has only 512 sectors? > Except for NVMe none of the storage standards actually guarantees > sector atomicy, although the whole storage stack traditionally relies on > it.. > > Maybe we should claim a 4k physical block size for NVMe devices that > hab 512 byte LBAs and a "Atomic Write Unit Power Fail" setting of at > least 8 so that the mkfs sector size logic triggers.. This preserves the ability to do O_DIRECT reads at 512 byte granularity, yes? We make use of that (it's probably less important on NVMe; still why waste bandwidth needlessly). ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: filesystem dead, xfs_repair won't help 2017-04-12 16:15 ` Avi Kivity @ 2017-04-12 16:20 ` Christoph Hellwig 2017-04-12 16:22 ` Eric Sandeen 2017-04-12 16:22 ` Avi Kivity 0 siblings, 2 replies; 32+ messages in thread From: Christoph Hellwig @ 2017-04-12 16:20 UTC (permalink / raw) To: Avi Kivity Cc: Christoph Hellwig, Eric Sandeen, Emmanuel Florac, Brian Foster, linux-xfs On Wed, Apr 12, 2017 at 07:15:40PM +0300, Avi Kivity wrote: > This preserves the ability to do O_DIRECT reads at 512 byte granularity, > yes? No. > We make use of that (it's probably less important on NVMe; still why > waste bandwidth needlessly). In that case it would be the wrong thing for you. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: filesystem dead, xfs_repair won't help 2017-04-12 16:20 ` Christoph Hellwig @ 2017-04-12 16:22 ` Eric Sandeen 2017-04-12 16:24 ` Avi Kivity 2017-04-12 16:22 ` Avi Kivity 1 sibling, 1 reply; 32+ messages in thread From: Eric Sandeen @ 2017-04-12 16:22 UTC (permalink / raw) To: Christoph Hellwig, Avi Kivity; +Cc: Emmanuel Florac, Brian Foster, linux-xfs On 4/12/17 11:20 AM, Christoph Hellwig wrote: > On Wed, Apr 12, 2017 at 07:15:40PM +0300, Avi Kivity wrote: >> This preserves the ability to do O_DIRECT reads at 512 byte granularity, >> yes? > > No. > >> We make use of that (it's probably less important on NVMe; still why >> waste bandwidth needlessly). > > In that case it would be the wrong thing for you. And it would be really interesting to see if the 512-byte DIOs you issue under ext4 might trigger the same problem in the firmware. (This is all pure speculation, but that's all we've got) ;) -Eric ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: filesystem dead, xfs_repair won't help 2017-04-12 16:22 ` Eric Sandeen @ 2017-04-12 16:24 ` Avi Kivity 0 siblings, 0 replies; 32+ messages in thread From: Avi Kivity @ 2017-04-12 16:24 UTC (permalink / raw) To: Eric Sandeen, Christoph Hellwig; +Cc: Emmanuel Florac, Brian Foster, linux-xfs On 04/12/2017 07:22 PM, Eric Sandeen wrote: > On 4/12/17 11:20 AM, Christoph Hellwig wrote: >> On Wed, Apr 12, 2017 at 07:15:40PM +0300, Avi Kivity wrote: >>> This preserves the ability to do O_DIRECT reads at 512 byte granularity, >>> yes? >> No. >> >>> We make use of that (it's probably less important on NVMe; still why >>> waste bandwidth needlessly). >> In that case it would be the wrong thing for you. > And it would be really interesting to see if the 512-byte DIOs you > issue under ext4 might trigger the same problem in the firmware. We only issue 512-byte reads, writes are always 4096-byte aligned (and usually 128k). The disk that crashed was my /home; it did see some database loads, but not much. > > (This is all pure speculation, but that's all we've got) ;) > > -Eric ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: filesystem dead, xfs_repair won't help 2017-04-12 16:20 ` Christoph Hellwig 2017-04-12 16:22 ` Eric Sandeen @ 2017-04-12 16:22 ` Avi Kivity 2017-04-12 17:41 ` Christoph Hellwig 1 sibling, 1 reply; 32+ messages in thread From: Avi Kivity @ 2017-04-12 16:22 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Eric Sandeen, Emmanuel Florac, Brian Foster, linux-xfs On 04/12/2017 07:20 PM, Christoph Hellwig wrote: > On Wed, Apr 12, 2017 at 07:15:40PM +0300, Avi Kivity wrote: >> This preserves the ability to do O_DIRECT reads at 512 byte granularity, >> yes? > No. :( > >> We make use of that (it's probably less important on NVMe; still why >> waste bandwidth needlessly). > In that case it would be the wrong thing for you. Would it be under our control? ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: filesystem dead, xfs_repair won't help 2017-04-12 16:22 ` Avi Kivity @ 2017-04-12 17:41 ` Christoph Hellwig 0 siblings, 0 replies; 32+ messages in thread From: Christoph Hellwig @ 2017-04-12 17:41 UTC (permalink / raw) To: Avi Kivity Cc: Christoph Hellwig, Eric Sandeen, Emmanuel Florac, Brian Foster, linux-xfs On Wed, Apr 12, 2017 at 07:22:20PM +0300, Avi Kivity wrote: > > > We make use of that (it's probably less important on NVMe; still why > > > waste bandwidth needlessly). > > In that case it would be the wrong thing for you. > > Would it be under our control? Yes, even if we switch a mkfs default you could still manually override it as long as the device supports a smaller logical block size, similar to how we treat 512 logical / 4k physical SAS and SATA drives today. ^ permalink raw reply [flat|nested] 32+ messages in thread
* allow mounting w/crc-checking disabled? (was Re: filesystem dead, xfs_repair won't help) 2017-04-10 9:23 filesystem dead, xfs_repair won't help Avi Kivity 2017-04-10 9:42 ` Avi Kivity @ 2017-04-10 9:43 ` L A Walsh 2017-04-10 16:01 ` Eric Sandeen 2017-04-10 15:49 ` filesystem dead, xfs_repair won't help Eric Sandeen 2 siblings, 1 reply; 32+ messages in thread From: L A Walsh @ 2017-04-10 9:43 UTC (permalink / raw) To: linux-xfs Avi Kivity wrote: > Today my kernel complained that in memory metadata is corrupt and > asked that I run xfs_repair. But xfs_repair doesn't like the > superblock and isn't able to find a secondary superblock. > Why doesn't xfs have an option to mount with metadata checksumming disabled so people can recover their data? Seems like it should be easy to provide, no? Or rather, if a disk is created with the crc option, is it possible to later switch it off or mount it with checking disabled? Yes, I know the mantra is that they should have had backups, but in practice it seems not to be the case in a majority of uses outside of enterprise usage. It sure seems that disabling a particular file or directory (if necessary) affected by a bad crc would be preferable to losing the whole disk. That said, how many crc errors would be likely to make things unreadable or inaccessible? Given that the default before crc-checking was that the disks were still usable (often with no error being flagged or noticed), I'd suspect that the crc-checking is causing many errors to be flagged that before wouldn't have even been noticed. Overall I'm wondering if the crc option won't cause more disk-losses than would occur without the option. Or, in other words, it seems that since crc-checking seems to cause the disk to be lost, turning on crc checking is almost guaranteed to cause a higher incidence of data loss if it can't be disabled. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: allow mounting w/crc-checking disabled? (was Re: filesystem dead, xfs_repair won't help) 2017-04-10 9:43 ` allow mounting w/crc-checking disabled? (was Re: filesystem dead, xfs_repair won't help) L A Walsh @ 2017-04-10 16:01 ` Eric Sandeen 2017-04-10 18:05 ` L A Walsh 0 siblings, 1 reply; 32+ messages in thread From: Eric Sandeen @ 2017-04-10 16:01 UTC (permalink / raw) To: L A Walsh, linux-xfs On 4/10/17 4:43 AM, L A Walsh wrote: > Avi Kivity wrote: >> Today my kernel complained that in memory metadata is corrupt and >> asked that I run xfs_repair. But xfs_repair doesn't like the >> superblock and isn't able to find a secondary superblock. >> > Why doesn't xfs have an option to mount with metadata checksumming > disabled so people can recover their data? Because if checksums are bad, your metadata is almost certainly bad, and with bad metadata, you're not going to be recovering data either. (and FWIW, CRCs are only the first line of defense: structure verifiers come after that. The chance of a CRC being bad and everything else checking out is extremely small.) > Seems like it should be easy to provide, no? > > Or rather, if a disk is created with the crc option, is it possible > to later switch it off or mount it with checking disabled? It is not possible. > Yes, I know the mantra is that they should have had backups, but > in practice it seems not to be the case in a majority of uses outside > of enterprise usage. It sure seems that disabling a particular file > or directory (if necessary) affected by a bad crc would be > preferable to losing the whole disk. That said, how many crc > errors would be likely to make things unreadable or inaccessible? How long is a piece of string? ;) Totally depends on the details. > Given that the default before crc-checking was that the disks > were still usable (often with no error being flagged or noticed), Before, we had a lot of ad-hoc checks (or not.) 
Many of those checks, and/or IO errors when trying to read garbage metadata, would also shut down the filesystem. Proceeding with mutilated metadata is almost never a good thing. You'll wander off into garbage and shut down the fs at best, and OOPS at worst. (Losing a filesystem is preferable to losing a system!) > I'd suspect that the crc-checking is causing many errors to be > be flagged that before wouldn't have even been noticed. Yes, that's precisely the point of CRCs. :) > Overall I'm wondering if the crc option won't cause more disk-losses > than would occur without the option. Or, in other words, it seems > that since crc-checking seems to cause the disk to be lost, turning > on crc checking is almost guaranteed to cause a higher incidence of > data loss if it can't be disable. When CRCs detect metadata corruption, the next step is to run xfs_repair to salvage what can be salvaged, and retrieve what's left of your data after that. Disabling CRCs and proceeding in kernelspace with known metadata corruption would be a dangerous total crapshoot. -Eric ^ permalink raw reply [flat|nested] 32+ messages in thread
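[Editorial aside: Eric's point — that CRCs exist precisely to surface corruption that previously went unnoticed — is easy to demonstrate: flipping a single bit anywhere in a checksummed block is guaranteed to change the checksum, because CRCs are linear codes. A toy sketch (zlib's CRC-32 is used here for convenience; XFS v5 metadata actually uses CRC32c, and the sector contents below are made up):]

```python
import zlib

# A stand-in for one 512-byte metadata sector, checksummed when written.
sector = bytearray(512)
sector[0:4] = b"FAKE"              # illustrative contents, not a real XFS magic
stored_crc = zlib.crc32(bytes(sector))

# Bit rot: one bit flips in an otherwise-unused byte of the sector.
sector[100] ^= 0x01
reread_crc = zlib.crc32(bytes(sector))

# On the next read, the verifier compares checksums and flags the mismatch
# instead of silently using the damaged structure.
print(stored_crc != reread_crc)    # True
```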
* Re: allow mounting w/crc-checking disabled? (was Re: filesystem dead, xfs_repair won't help) 2017-04-10 16:01 ` Eric Sandeen @ 2017-04-10 18:05 ` L A Walsh 2017-04-11 12:57 ` Emmanuel Florac 0 siblings, 1 reply; 32+ messages in thread From: L A Walsh @ 2017-04-10 18:05 UTC (permalink / raw) To: Eric Sandeen; +Cc: linux-xfs Eric Sandeen wrote: > On 4/10/17 4:43 AM, L A Walsh wrote: >> Avi Kivity wrote: >>> Today my kernel complained that in memory metadata is corrupt and >>> asked that I run xfs_repair. But xfs_repair doesn't like the >>> superblock and isn't able to find a secondary superblock. >>> >> Why doesn't xfs have an option to mount with metadata checksumming >> disabled so people can recover their data? > > Because if checksums are bad, your metadata is almost certainly bad, > and with bad metadata, you're not going to be recovering data either. ---- Sorry, but I really don't buy that a 1-bit error in metadata will automatically cause problems with data recovery. If the date on the file shows it was created at some time with nanoseconds=1 and that gets bumped to 3 (or virtually any number < an equivalent of 1 second) it will trigger a crc error. But I don't care. > > (and FWIW, CRCs are only the first line of defense: structure > verifiers come after that. The chance of a CRC being bad and > everything else checking out is extremely small.) ---- If the crc error has caught bit rot, that wouldn't be true. Only if the crc error catches a bug in the XFS SW would that be likely. Since I was told that it was protecting me against bit rot and not a lower stability or quality of XFS overall, it's more likely that data could be recovered. Though, again, this is one of those things like allowing use of the free-space extent that you could *allow* users to use at their own risk -- but something, likely, that you won't. This is another case where your logic is flawed. 
Permitting mounting w/o enforcement is not a guarantee of data recovery, BUT the decision of whether or not they can recover anything useful should be up to the owner of the computer. Yet it seems clear you aren't using sound engineering practice to justify your actions. Any bit-rot metadata corruption is unlikely to wipe 10 terabytes of data. Understand your position. You are claiming the crc option is detecting errors that were previously undetected. People have operated huge filesystems (I'm certain that my 10TB partition is tiny compared to enterprise usage) for years without experiencing noticeable problems. Yet when crc is turned on, suddenly they are expected to buy into crc detecting corruption so severe that nothing can be recovered (when such has not been the case since XFS's inception). >> Seems like it should be easy to provide, no? >> >> Or rather, if a disk is created with the crc option, is it possible >> to later switch it off or mount it with checking disabled? > > It is not possible. ----- Not possible, eh? In the SW world? The only way it would not be possible is if it were *administratively prohibited*. Working around detected bugs or flaws isn't known to be "not possible" by a long shot. Take ZFS, which, I'm told, can not only recover corrupted data from other sectors, but doesn't require shutting down the file system due to the problem detection. That certainly doesn't sound like "impossible". If the crc option is only a canary, and not a cipher, then recovery of most data should be possible. Are you saying that the crc option doesn't simply do an integrity check but is converting what was "plaintext" into some encoded form? That isn't what it is documented to do. >> Yes, I know the mantra is that they should have had backups, but >> in practice it seems not to be the case in a majority of uses outside >> of enterprise usage. 
It sure seems that disabling a particular file >> or directory (if necessary) affected by a bad crc would be >> preferable to losing the whole disk. That said, how many crc >> errors would be likely to make things unreadable or inaccessible? > > How long is a piece of string? ;) Totally depends on the details. -- That depends on whether or not it is a software error caused by a typo or by 1 or more bit-flips in a given sector. > >> Given that the default before crc-checking was that the disks >> were still usable (often with no error being flagged or noticed), > > Before, we had a lot of ad-hoc checks (or not.) Many of those checks, > and/or IO errors when trying to read garbage metadata, would also > shut down the filesystem. --- But those checks were rarely triggered. It was often the case (you claim) that they went undiscovered for some time -- thus a "need"[sic] for crc to detect a 1-bit rot-flip in a 100TB file system and mark the entire file system as bad. Sorry, that's bull. You need to compartmentalize damage or it's worthless. Noticing an error in 1 sector shouldn't shut down or prevent 100TB of other data from being accessed (or usable). > Proceeding with mutilated metadata is almost never a good thing. > You'll wander off into garbage and shut down the fs at best, and OOPS > at worst. (Losing a filesystem is preferable to losing a system!) ---- > >> I'd suspect that the crc-checking is causing many errors to be >> flagged that before wouldn't have even been noticed. > > Yes, that's precisely the point of CRCs. :) ---- If they wouldn't have been noticed -- then they wouldn't have caused problems. crc is creating problems where before they didn't -- by definition -- because they catch "many errors... that before, WOULDN'T HAVE BEEN NOTICED". That's my point. >> Overall I'm wondering if the crc option won't cause more >> disk-losses than would occur without the option. 
Or, in other >> words, it seems that since crc-checking seems to cause the disk >> to be lost, turning on crc checking is almost guaranteed to cause >> a higher incidence of data loss if it can't be disabled. > > When CRCs detect metadata corruption, the next step is to run > xfs_repair to salvage what can be salvaged, and retrieve what's > left of your data after that. Disabling CRCs and proceeding in > kernelspace with known metadata corruption would be a dangerous > total crapshoot. --- Right... xfs_repair -- like the base-note poster tried and had fail. The crc errors I've seen complaints about are ones where xfs_repair doesn't work. At that point, disabling the volume is not helpful. I'm sure it wouldn't be trivial, but creating a separate file system, "XFS2", from the original XFS sources that responded to data or metadata corruption by returning empty data where it was impossible to return anything useful, instead of flagging the disk as "bad", would be a way to allow data recovery to the extent that it made sense (assuming the original sources couldn't do the same by toggling off a config-flag). I'm sure you can out-type me and come up with various reasons as to why XFS or crc can't auto-correct. Maybe instead of a crc, you should be using a well-established check that allows recovery from multiple data-bit failures. Supposedly the 4K block size had more error-resistance and *recovery* than the 512-byte format. Certainly, with crc's on all the metadata, a more robust algorithm could automatically recover from such errors. If it is that fragile, then perhaps you should consider enabling the independent use of the free-inode, which would certainly raise performance on mature filesystems. I did get that it's been tested on virgin and fresh file systems and showed no benefit with such, but it would be nice if such tests were done on 7-10+ year-old filesystems that "often" exceeded 75% disk space usage -- even going over 80-90% usage at times for a short period. 
It may not be a normal state, but it does happen. Certainly it would be something worthy of testing with real-life data. :) *cheers* -linda ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: allow mounting w/crc-checking disabled? (was Re: filesystem dead, xfs_repair won't help) 2017-04-10 18:05 ` L A Walsh @ 2017-04-11 12:57 ` Emmanuel Florac 2017-04-11 13:34 ` Eric Sandeen 0 siblings, 1 reply; 32+ messages in thread From: Emmanuel Florac @ 2017-04-11 12:57 UTC (permalink / raw) To: L A Walsh; +Cc: Eric Sandeen, linux-xfs [-- Attachment #1: Type: text/plain, Size: 1278 bytes --] Le Mon, 10 Apr 2017 11:05:38 -0700 L A Walsh <xfs@tlinx.org> écrivait: > I'm sure it wouldn't be trivial, but creating a separate > file system, "XFS2" from the original XFS sources that responded > to data or metadata corruption by returning empty data where > it was impossible to return anything useful instead of flagging > the disk as "bad", would be a way to allow data recovery to > the extent that it made sense (assuming the original sources > couldn't do the same toggling off a config-flag). It would probably be much easier to add an option to mount the filesystem without crc, similar to "norecovery", that doesn't replay the journal. It would of course be read-only, but in a similar case it would be much easier and more practical for everyone. So far I believed that metadata CRCs were a promise of safer filesystems; now that I've set up several multi-hundred-terabyte volumes with CRC enabled, I'm getting nervous... -- ------------------------------------------------------------------------ Emmanuel Florac | Direction technique | Intellique | <eflorac@intellique.com> | +33 1 78 94 84 02 ------------------------------------------------------------------------ [-- Attachment #2: Signature digitale OpenPGP --] [-- Type: application/pgp-signature, Size: 181 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: allow mounting w/crc-checking disabled? (was Re: filesystem dead, xfs_repair won't help) 2017-04-11 12:57 ` Emmanuel Florac @ 2017-04-11 13:34 ` Eric Sandeen 2017-04-11 16:18 ` Emmanuel Florac 0 siblings, 1 reply; 32+ messages in thread From: Eric Sandeen @ 2017-04-11 13:34 UTC (permalink / raw) To: Emmanuel Florac, L A Walsh; +Cc: linux-xfs [-- Attachment #1.1: Type: text/plain, Size: 2366 bytes --] On 4/11/17 7:57 AM, Emmanuel Florac wrote: > Le Mon, 10 Apr 2017 11:05:38 -0700 > L A Walsh <xfs@tlinx.org> écrivait: > >> I'm sure it wouldn't be trivial, but creating a separate >> file system, "XFS2" from the original XFS sources that responded >> to data or metadata corruption by returning empty data where >> it was impossible to return anything useful instead of flagging >> the disk as "bad", would be a way to allow data recovery to >> the extent that it made sense (assuming the original sources >> couldn't do the same toggling off a config-flag). > > It would probably much easier to add an option to mount the filesystem > without crc, similar to "norecovery", that doesn't replay the journal. > It would be of course read-only, but in a similar case it would be much > easier and practical for everyone. Yes, I actually whipped up a patch to do just that, because I was curious. Although I don't think it would fly, I may send it just to have a record out on the list. > So far I believed that metadata CRCs were a promise of safer > filesystems; now that I've setup several multi-hundred terabytes > volumes with CRC enabled, I'm getting nervous... Why? So far there's a lot of fear & speculation from some quarters, but no reports of any actual real-world significant downside to CRC integrity checking. A few amendments to my possibly too-quick reply yesterday, though... One, not every CRC error will shut down your filesystem - far from it. As a quick test of Linda's first scenario, you can corrupt a timestamp without changing the CRC, using xfs_db's expert mode. 
That inode will be inaccessible until it's fixed with xfs_repair, but the filesystem continues on happily. Two, after talking with Darrick I realized that I misrepresented things a bit; we checksum the entire sector of metadata, so yes, even a bitflip in an unused portion of that location could cause a crc mismatch and therefore a metadata read error. But again, this would render that data structure inaccessible until repair, but it would not take the entire filesystem offline. Three, none of this has anything to do with the email that started this thread. Bad firmware turned Avi's SSD into a vat of goo, and CRCs are not in any way related to his inability to recover his filesystem. Thanks, -Eric [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 867 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
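Eric's second point — the checksum covers the whole metadata sector, so even a bitflip in an unused portion produces a metadata read error — can be sketched as follows. Note the hedges: XFS actually uses CRC32c, not the stdlib CRC-32 used here, and the 4-byte header plus padding layout is an invented mock, not the real on-disk format:

```python
import zlib

SECTOR = 512

# Mock metadata sector: a short "header" followed by unused padding.
sector = bytearray(b"XAFL" + bytes(SECTOR - 4))
stored_crc = zlib.crc32(bytes(sector))   # what would be written to disk

# A single bitflip deep inside the unused padding...
sector[400] ^= 0x01

# ...still fails verification, because the checksum covers every byte.
print(zlib.crc32(bytes(sector)) == stored_crc)   # False
```

A CRC guarantees detection of any single-bit change anywhere in the covered region, which is exactly why the flip's location — used field or unused padding — makes no difference to the verifier.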
* Re: allow mounting w/crc-checking disabled? (was Re: filesystem dead, xfs_repair won't help) 2017-04-11 12:57 ` Emmanuel Florac @ 2017-04-11 13:34 ` Eric Sandeen 2017-04-11 16:18 ` Emmanuel Florac 0 siblings, 1 reply; 32+ messages in thread From: Emmanuel Florac @ 2017-04-11 16:18 UTC (permalink / raw) To: Eric Sandeen; +Cc: L A Walsh, linux-xfs [-- Attachment #1: Type: text/plain, Size: 813 bytes --] Le Tue, 11 Apr 2017 08:34:44 -0500 Eric Sandeen <sandeen@sandeen.net> écrivait: > Three, none of this has anything to do with the email that started > this thread. Bad firmware turned Avi's SSD into a vat of goo, and > CRCs are not in any way related to his inability to recover his > filesystem. OK, but xfs_db finds and reads the sb OK, and it looks fine at first glance; why does xfs_repair fail completely? I'm not actually certain that Avi's SSD is a "vat of goo"... -- ------------------------------------------------------------------------ Emmanuel Florac | Direction technique | Intellique | <eflorac@intellique.com> | +33 1 78 94 84 02 ------------------------------------------------------------------------ [-- Attachment #2: Signature digitale OpenPGP --] [-- Type: application/pgp-signature, Size: 181 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: allow mounting w/crc-checking disabled? (was Re: filesystem dead, xfs_repair won't help) 2017-04-11 16:18 ` Emmanuel Florac @ 2017-04-11 16:34 ` Eric Sandeen 0 siblings, 0 replies; 32+ messages in thread From: Eric Sandeen @ 2017-04-11 16:34 UTC (permalink / raw) To: Emmanuel Florac; +Cc: L A Walsh, linux-xfs [-- Attachment #1.1: Type: text/plain, Size: 2410 bytes --] On 4/11/17 11:18 AM, Emmanuel Florac wrote: > Le Tue, 11 Apr 2017 08:34:44 -0500 > Eric Sandeen <sandeen@sandeen.net> écrivait: > >> Three, none of this has anything to do with the email that started >> this thread. Bad firmware turned Avi's SSD into a vat of goo, and >> CRCs are not in any way related to his inability to recover his >> filesystem. > > OK, but xfs_db finds and reads the sb OK, and it looks fine at first > look; why does xfs_repair fail completely? I'm not actually certain > that Avi's SSD is a "vat of goo"... Well, the AGFL printed out by the kernel on the mount attempt was nowhere close to a valid AGFL structure. Ok, "vat of goo" may have been too strong, but there is at least one core filesystem structure which is completely scrambled. That's just the one that was obvious from the mount attempt; I was assuming there were likely more areas of extreme damage, but that was an assumption on my part. Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): Metadata CRC error detected at xfs_agfl_read_verify+0xcd/0x100 [xfs], xfs_agfl block 0x2cb68e13 Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): Unmount and run xfs_repair Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): First 64 bytes of corrupted metadata buffer: Apr 10 11:41:20 avi.cloudius-systems.com kernel: ffff9004a5b75400: 23 40 8f 28 5b 50 3a b4 f8 54 1e 31 97 f4 fe ed #@.([P:..T.1.... Apr 10 11:41:20 avi.cloudius-systems.com kernel: ffff9004a5b75410: 62 87 57 51 ee 9d 31 02 ec 2c 10 46 6c 93 db 09 b.WQ..1..,.Fl... 
Apr 10 11:41:20 avi.cloudius-systems.com kernel: ffff9004a5b75420: ae 7a ea b3 91 49 7e d3 99 a4 25 49 11 c5 8b be .z...I~...%I.... Apr 10 11:41:20 avi.cloudius-systems.com kernel: ffff9004a5b75430: e4 2e 14 d4 8a f8 5f 98 66 d8 67 72 ec c9 1a d5 ......_.f.gr.... Again, don't fixate on the "CRC" error. The above is /not/ an AGFL for this filesystem. typedef struct xfs_agfl { __be32 agfl_magicnum; __be32 agfl_seqno; uuid_t agfl_uuid; __be64 agfl_lsn; __be32 agfl_crc; __be32 agfl_bno[]; /* actually XFS_AGFL_SIZE(mp) */ } __attribute__((packed)) xfs_agfl_t; The magicnum is wrong. The seqno is invalid. The UUID data in agfl_uuid does not match this filesystem. etc... -Eric [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 867 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
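Eric's magic-number check can be sketched directly against the dump above. The kernel's xfs_format.h defines XFS_AGFL_MAGIC as 0x5841464c (ASCII "XAFL"), so the verifier can reject this buffer before the CRC even comes into play. A minimal sketch — the real kernel read verifier checks far more (sequence number, UUID, LSN) than this toy does:

```python
import struct

XFS_AGFL_MAGIC = 0x5841464C   # big-endian ASCII "XAFL", from xfs_format.h

def agfl_magic_ok(buf):
    """Check only the first on-disk AGFL field: the 32-bit magic number."""
    (magic,) = struct.unpack_from(">I", buf, 0)
    return magic == XFS_AGFL_MAGIC

# First bytes of the corrupted buffer from the kernel log above.
corrupt = bytes.fromhex("23408f285b503ab4f8541e31")
print(agfl_magic_ok(corrupt))              # False: not an AGFL at all
print(agfl_magic_ok(b"XAFL" + bytes(8)))   # True
```

This is why "don't fixate on the CRC error" is the right reading: the buffer fails every structural sanity check, not just the checksum, so disabling CRC verification would not have made the data recoverable.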
* Re: filesystem dead, xfs_repair won't help 2017-04-10 9:23 filesystem dead, xfs_repair won't help Avi Kivity 2017-04-10 9:42 ` Avi Kivity 2017-04-10 9:43 ` allow mounting w/crc-checking disabled? (was Re: filesystem dead, xfs_repair won't help) L A Walsh @ 2017-04-10 15:49 ` Eric Sandeen 2017-04-10 16:23 ` Christoph Hellwig 2017-04-11 7:48 ` Avi Kivity 2 siblings, 2 replies; 32+ messages in thread From: Eric Sandeen @ 2017-04-10 15:49 UTC (permalink / raw) To: Avi Kivity, linux-xfs There is a known firmware bug on Intel 600P drives which causes corruption with XFS, FWIW. https://bugzilla.redhat.com/show_bug.cgi?id=1402533 On 4/10/17 4:23 AM, Avi Kivity wrote: > Today my kernel complained that in memory metadata is corrupt and > asked that I run xfs_repair. But xfs_repair doesn't like the > superblock and isn't able to find a secondary superblock. > > Latest Fedora 25 kernel, new Intel NVMe drive (worked for a few weeks > without issue). > > Anything I can do to recover the data? > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: filesystem dead, xfs_repair won't help 2017-04-10 15:49 ` filesystem dead, xfs_repair won't help Eric Sandeen @ 2017-04-10 16:23 ` Christoph Hellwig 2017-04-11 7:48 ` Avi Kivity 1 sibling, 0 replies; 32+ messages in thread From: Christoph Hellwig @ 2017-04-10 16:23 UTC (permalink / raw) To: Eric Sandeen; +Cc: Avi Kivity, linux-xfs On Mon, Apr 10, 2017 at 10:49:55AM -0500, Eric Sandeen wrote: > There is a known firmware bug on Intel 600P drives which > causes corruption with XFS, FWIW. Which apparently affects the whole range of controllers, e.g. also the Pro 6000p and E6000p at least, no idea if there are any more. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: filesystem dead, xfs_repair won't help 2017-04-10 15:49 ` filesystem dead, xfs_repair won't help Eric Sandeen 2017-04-10 16:23 ` Christoph Hellwig @ 2017-04-11 7:48 ` Avi Kivity 1 sibling, 0 replies; 32+ messages in thread From: Avi Kivity @ 2017-04-11 7:48 UTC (permalink / raw) To: Eric Sandeen, linux-xfs That seems to be it. On 04/10/2017 06:49 PM, Eric Sandeen wrote: > There is a known firmware bug on Intel 600P drives which > causes corruption with XFS, FWIW. > > https://bugzilla.redhat.com/show_bug.cgi?id=1402533 > > On 4/10/17 4:23 AM, Avi Kivity wrote: >> Today my kernel complained that in memory metadata is corrupt and >> asked that I run xfs_repair. But xfs_repair doesn't like the >> superblock and isn't able to find a secondary superblock. >> >> Latest Fedora 25 kernel, new Intel NVMe drive (worked for a few weeks >> without issue). >> >> Anything I can do to recover the data? >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> ^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2017-04-12 17:41 UTC | newest] Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-04-10 9:23 filesystem dead, xfs_repair won't help Avi Kivity 2017-04-10 9:42 ` Avi Kivity 2017-04-10 15:35 ` Brian Foster 2017-04-11 7:46 ` Avi Kivity 2017-04-11 11:30 ` Emmanuel Florac 2017-04-11 11:40 ` Avi Kivity 2017-04-11 12:00 ` Emmanuel Florac 2017-04-11 12:03 ` Avi Kivity 2017-04-11 12:49 ` Emmanuel Florac 2017-04-11 13:07 ` Avi Kivity 2017-04-11 16:13 ` Emmanuel Florac 2017-04-11 16:44 ` Avi Kivity 2017-04-11 16:48 ` Eric Sandeen 2017-04-12 15:15 ` Christoph Hellwig 2017-04-12 15:34 ` Eric Sandeen 2017-04-12 15:45 ` Christoph Hellwig 2017-04-12 16:15 ` Avi Kivity 2017-04-12 16:20 ` Christoph Hellwig 2017-04-12 16:22 ` Eric Sandeen 2017-04-12 16:24 ` Avi Kivity 2017-04-12 16:22 ` Avi Kivity 2017-04-12 17:41 ` Christoph Hellwig 2017-04-10 9:43 ` allow mounting w/crc-checking disabled? (was Re: filesystem dead, xfs_repair won't help) L A Walsh 2017-04-10 16:01 ` Eric Sandeen 2017-04-10 18:05 ` L A Walsh 2017-04-11 12:57 ` Emmanuel Florac 2017-04-11 13:34 ` Eric Sandeen 2017-04-11 16:18 ` Emmanuel Florac 2017-04-11 16:34 ` Eric Sandeen 2017-04-10 15:49 ` filesystem dead, xfs_repair won't help Eric Sandeen 2017-04-10 16:23 ` Christoph Hellwig 2017-04-11 7:48 ` Avi Kivity