linux-ext4.vger.kernel.org archive mirror
* [Bug 203943] New: ext4 corruption after RAID6 degraded; e2fsck skips block checks and fails
From: bugzilla-daemon @ 2019-06-21  6:51 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=203943

            Bug ID: 203943
           Summary: ext4 corruption after RAID6 degraded; e2fsck skips
                    block checks and fails
           Product: File System
           Version: 2.5
    Kernel Version: 4.19.52-gentoo
          Hardware: Intel
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: ext4
          Assignee: fs_ext4@kernel-bugs.osdl.org
          Reporter: yann@ormanns.net
        Regression: No

I use a 24TB SW-RAID6 with ten 3TB HDDs. This array contains a dmcrypt
container, which in turn contains an ext4 FS [1].
One of the disks had errors and got kicked out of the array. Before I was
able to replace it, the ext4 FS began to throw errors like these:

EXT4-fs error (device dm-1): ext4_find_dest_de:1802: inode #3833864: block
61343924: comm nfsd: bad entry in directory: rec_len % 4 != 0 - offset=1000,
inode=2620025549, rec_len=30675, name_len=223, size=4096
EXT4-fs error (device dm-1): ext4_lookup:1577: inode #172824586: comm
tvh:tasklet: iget: bad extra_isize 13022 (inode size 256)
EXT4-fs error (device dm-1): htree_dirblock_to_tree:1010: inode #7372807: block
117967811: comm tar: bad entry in directory: rec_len % 4 != 0 - offset=104440,
inode=1855122647, rec_len=12017, name_len=209, size=4096 

I then used e2fsck to check the FS for errors, but it only produced dozens of
output lines like the following:
German original: "Block %$b von Inode %$i steht in Konflikt mit kritischen
Metadaten, Blockprüfungen werden übersprungen."
Translated: "Inode %$i block %$b conflicts with critical metadata, skipping
block checks." 
It also filled up my RAM, so I added ~100G of swap space (sketched below).
After scanning for a few days, the e2fsck process died.
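
The swap space was added roughly like this (a sketch; the swap file path is
illustrative, not the one I actually used):

#dd if=/dev/zero of=/mnt/scratch/swapfile bs=1M count=102400
#chmod 600 /mnt/scratch/swapfile
#mkswap /mnt/scratch/swapfile
#swapon /mnt/scratch/swapfile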

Afterwards I was able to replace the disk, re-scan the FS with e2fsck and after
2 hours, the ext4 FS was clean again.

To understand the problem, I marked one of the disks as faulty, and the ext4
errors promptly reappeared (the commands are sketched below). I added a spare
disk and resynced the array, but e2fsck is unable to fix the errors and only
throws "Inode %$i block %$b conflicts with critical metadata, skipping block
checks." while using more and more RAM.
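
The fault simulation corresponds roughly to the following mdadm manage-mode
commands (a sketch; /dev/sdX stands for the member disk I failed, /dev/sdY for
the spare I added):

#mdadm /dev/md2 --fail /dev/sdX
#mdadm /dev/md2 --remove /dev/sdX
#mdadm /dev/md2 --add /dev/sdY

/proc/mdstat then shows the resync progress.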

Used software versions:
Kernel 4.19.52-gentoo
e2fsprogs-1.44.5
mdadm-4.1

Please let me know if I should provide additional information.

Kind regards,
Yann

########################################## additional information

#e2fsck -c /dev/mapper/share 
e2fsck 1.44.5 (15-Dec-2018)
badblocks: ungültige letzter Block - 5860269055
  [translation: invalid last block - 5860269055]
/dev/mapper/share: Updating bad block inode.
Durchgang 1: Inodes, Blöcke und Größen werden geprüft
  [translation: Pass 1: Checking inodes, blocks, and sizes]
Block %$b von Inode %$i steht in Konflikt mit kritischen Metadaten,
Blockprüfungen werden übersprungen.
[...]
Block %$b von Inode %$i steht in Konflikt mit kritischen Metadaten,
Blockprüfungen werden übersprungen.
Signal (6) SIGABRT si_code=SI_TKILL 
e2fsck(+0x33469)[0x55ec4942a469]
/lib64/libc.so.6(+0x39770)[0x7f562449f770]
/lib64/libc.so.6(gsignal+0x10b)[0x7f562449f6ab]
/lib64/libc.so.6(abort+0x123)[0x7f5624488539]
/lib64/libext2fs.so.2(+0x18af5)[0x7f5624ac3af5]
e2fsck(+0x18c30)[0x55ec4940fc30]
/lib64/libext2fs.so.2(+0x1a1ec)[0x7f5624ac51ec]
/lib64/libext2fs.so.2(+0x1a589)[0x7f5624ac5589]
/lib64/libext2fs.so.2(+0x1b04b)[0x7f5624ac604b]
e2fsck(+0x1978c)[0x55ec4941078c]
e2fsck(+0x1b0f6)[0x55ec494120f6]
e2fsck(+0x1b1dc)[0x55ec494121dc]
/lib64/libext2fs.so.2(ext2fs_get_next_inode_full+0x8e)[0x7f5624adc53e]
e2fsck(e2fsck_pass1+0xa12)[0x55ec49412d22]
e2fsck(e2fsck_run+0x6a)[0x55ec4940b54a]
e2fsck(main+0xefd)[0x55ec4940703d]
/lib64/libc.so.6(__libc_start_main+0xeb)[0x7f5624489e6b]
e2fsck(_start+0x2a)[0x55ec494092ea]

---------------------------------------------------------------------

#tune2fs -l /dev/mapper/share
tune2fs 1.44.5 (15-Dec-2018)
Filesystem volume name:   <none>
Last mounted on:          /home/share
Filesystem UUID:          c5f0559d-e3bd-473f-abc0-7c42b3115897
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      ext_attr dir_index filetype extent 64bit flex_bg
sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash 
Default mount options:    user_xattr acl
Filesystem state:         clean with errors
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              366268416
Block count:              5860269056
Reserved block count:     0
Free blocks:              755383351
Free inodes:              363816793
First block:              0
Block size:               4096
Fragment size:            4096
Group descriptor size:    64
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         2048
Inode blocks per group:   128
RAID stride:              128
RAID stripe width:        1024
Flex block group size:    16
Filesystem created:       Sat Mar 17 14:36:16 2018
Last mount time:          Fri Jun 21 05:25:34 2019
Last write time:          Fri Jun 21 05:30:27 2019
Mount count:              3
Maximum mount count:      -1
Last checked:             Thu Jun 20 08:55:17 2019
Check interval:           0 (<none>)
Lifetime writes:          139 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     28
Desired extra isize:      28
Default directory hash:   half_md4
Directory Hash Seed:      4c37872e-3207-4ff4-8939-a428feaeb49f
Journal backup:           inode blocks
FS Error count:           20776
First error time:         Thu Jun 20 14:18:47 2019
First error function:     ext4_lookup
First error line #:       1577
First error inode #:      172824586
First error block #:      0
Last error time:          Fri Jun 21 05:53:24 2019
Last error function:      ext4_lookup
Last error line #:        1577
Last error inode #:       172824586
Last error block #:       0

---------------------------------------------------------------------

#cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
[faulty] 
md2 : active raid6 sde[0] sdc[9] sdb[8] sdj[7] sdi[6] sda[11] sdd[10] sdh[3]
sdg[2] sdf[1]
      23441080320 blocks super 1.2 level 6, 512k chunk, algorithm 2 [10/10]
[UUUUUUUUUU]
      bitmap: 0/22 pages [0KB], 65536KB chunk

md1 : active raid1 sdk2[0] sdl2[2]
      52396032 blocks super 1.2 [2/2] [UU]

md0 : active raid1 sdk1[0] sdl1[2]
      511680 blocks super 1.2 [2/2] [UU]

unused devices: <none>


* [Bug 203943] ext4 corruption after RAID6 degraded; e2fsck skips block checks and fails
From: bugzilla-daemon @ 2019-06-21  7:15 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=203943

Andreas Dilger (adilger.kernelbugzilla@dilger.ca) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |adilger.kernelbugzilla@dilger.ca

--- Comment #1 from Andreas Dilger (adilger.kernelbugzilla@dilger.ca) ---
This seems like a RAID problem and not an ext4 problem. The RAID array
shouldn't be returning random garbage if one of the drives is unavailable.
Maybe it is not doing data parity verification on reads, so that it is blindly
returning bad blocks from the failed drive rather than reconstructing valid
data from parity if the drive does not fail completely?


* [Bug 203943] ext4 corruption after RAID6 degraded; e2fsck skips block checks and fails
From: bugzilla-daemon @ 2019-06-21 12:40 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=203943

Theodore Tso (tytso@mit.edu) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tytso@mit.edu

--- Comment #2 from Theodore Tso (tytso@mit.edu) ---
Did you resync the disks *before* you ran e2fsck?   Or only afterwards?


* [Bug 203943] ext4 corruption after RAID6 degraded; e2fsck skips block checks and fails
From: bugzilla-daemon @ 2019-06-21 14:52 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=203943

--- Comment #3 from Yann Ormanns (yann@ormanns.net) ---
Andreas & Ted, thank you for your replies.

(In reply to Andreas Dilger from comment #1)
> This seems like a RAID problem and not an ext4 problem. The RAID array
> shouldn't be returning random garbage if one of the drives is unavailable.
> Maybe it is not doing data parity verification on reads, so that it is
> blindly returning bad blocks from the failed drive rather than
> reconstructing valid data from parity if the drive does not fail completely?

How can I check that? At least running "checkarray" did not find anything new
or helpful.
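
(checkarray essentially triggers the following md sysfs check - a sketch,
assuming the array is md2 as in /proc/mdstat above:)

#echo check > /sys/block/md2/md/sync_action
#cat /proc/mdstat
#cat /sys/block/md2/md/mismatch_cnt

A non-zero mismatch_cnt after the check finishes would indicate parity
inconsistencies.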

(In reply to Theodore Tso from comment #2)
> Did you resync the disks *before* you ran e2fsck?   Or only afterwards?

1. my RAID6 got degraded and ext4 errors showed up
2. I ran e2fsck; it consumed all available memory and showed only "Inode %$i
block %$b conflicts with critical metadata, skipping block checks."
3. I replaced the faulty disk and resynced the RAID6
4. e2fsck was able to clean the filesystem
5. I simulated a drive fault (so my RAID6 had n+1 working disks left)
6. the ext4 FS got corrupted again
7. although the RAID is clean again, e2fsck is not able to clean the FS (like
in step 2)


* [Bug 203943] ext4 corruption after RAID6 degraded; e2fsck skips block checks and fails
From: bugzilla-daemon @ 2019-06-21 18:29 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=203943

--- Comment #4 from Theodore Tso (tytso@mit.edu) ---
That sounds *very* clearly like a RAID bug.   If RAID6 is returning garbage to
the file system in degraded mode, there is nothing the file system can do.

What worries me is this: if the RAID6 system was returning garbage when
*reading*, who knows how it was trashing the file system image when the ext4
kernel code was *writing* to it?

In any case, there's very little we as ext4 developers can do here to help,
except give you some advice for how to recover your file system.

What I'd suggest you do is use the debugfs tool to sanity-check the inode.  If
the inode number reported by e2fsck was 123456, you can look at it using the
debugfs command "stat <123456>".    If the timestamps, user id and group id
numbers, etc., look insane, you can speed up the recovery by using the command
"clri <123456>", which zeroes out the inode.
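
A minimal sketch of that sequence (the inode number 123456 is just a
placeholder, as above; -w opens the filesystem read-write so that clri can
modify it):

#debugfs -w /dev/mapper/share
debugfs:  stat <123456>
debugfs:  clri <123456>
debugfs:  quit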


* [Bug 203943] ext4 corruption after RAID6 degraded; e2fsck skips block checks and fails
From: bugzilla-daemon @ 2019-06-21 19:34 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=203943

--- Comment #5 from Yann Ormanns (yann@ormanns.net) ---
Thank you for your support, Ted. About one week before the RAID got degraded,
I created a full file-based backup, so at worst I expect minimal data loss - I
would just like to save the time it would take to copy ~22TB back :-)

Before filing a possible bug against the RAID subsystem, I'd like to understand
the steps you described.
In fact, e2fsck does not report inode numbers, only placeholders ("%$i", and
likewise "%$b" for blocks).
Using the total inode count in debugfs leads to an error:
share ~ # tune2fs -l /dev/mapper/share | grep "Inode count"
Inode count:              366268416
share ~ # debugfs /dev/mapper/share 
debugfs 1.44.5 (15-Dec-2018)
debugfs:  stat 366268416
366268416: File not found by ext2_lookup

Or did I get you wrong?


* [Bug 203943] ext4 corruption after RAID6 degraded; e2fsck skips block checks and fails
From: bugzilla-daemon @ 2019-06-21 20:47 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=203943

--- Comment #6 from Theodore Tso (tytso@mit.edu) ---
That's because the German translation is busted.   I've complained to the
German maintainer of the e2fsprogs messages file at the Translation Project,
but it hasn't been fixed yet.   See [1] for more details.

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=892173#10

It *should* have printed the actual inode number, and it does if you just use
the built-in English text.  In general, I *really* wish people would disable
translations when reporting bugs, because the translations very often add noise
that makes life harder for developers, and in this case it was losing
information for you, the system administrator.
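
The simplest way to do that is to force the untranslated messages for a single
run, for example (a sketch):

#LC_ALL=C e2fsck -f /dev/mapper/share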

See the debugfs man page for more details, but to specify inode numbers to
debugfs, you need to surround the inode number with angle brackets.  Here are
some examples:

% debugfs /tmp/test.img
debugfs 1.45.2 (27-May-2019)
debugfs:  ls a
 12  (12) .    2  (12) ..    13  (988) motd   
debugfs:  stat /a/motd
Inode: 13   Type: regular    Mode:  0644   Flags: 0x80000
Generation: 0    Version: 0x00000000
User:     0   Group:     0   Size: 286
File ACL: 0
Links: 1   Blockcount: 2
Fragment:  Address: 0    Number: 0    Size: 0
ctime: 0x5d0d416e -- Fri Jun 21 16:43:26 2019
atime: 0x5d0d416e -- Fri Jun 21 16:43:26 2019
mtime: 0x5d0d416e -- Fri Jun 21 16:43:26 2019
Inode checksum: 0x0000857d
EXTENTS:
(0):20
debugfs:  stat <13>
Inode: 13   Type: regular    Mode:  0644   Flags: 0x80000
Generation: 0    Version: 0x00000000
User:     0   Group:     0   Size: 286
File ACL: 0
Links: 1   Blockcount: 2
Fragment:  Address: 0    Number: 0    Size: 0
ctime: 0x5d0d416e -- Fri Jun 21 16:43:26 2019
atime: 0x5d0d416e -- Fri Jun 21 16:43:26 2019
mtime: 0x5d0d416e -- Fri Jun 21 16:43:26 2019
Inode checksum: 0x0000857d
EXTENTS:
(0):20
debugfs:  ncheck 13
Inode   Pathname
13      /a/motd 
debugfs:  stat a
Inode: 12   Type: directory    Mode:  0755   Flags: 0x80000
Generation: 0    Version: 0x00000000
User:     0   Group:     0   Size: 1024
File ACL: 0
Links: 2   Blockcount: 2
Fragment:  Address: 0    Number: 0    Size: 0
ctime: 0x5d0d416b -- Fri Jun 21 16:43:23 2019
atime: 0x5d0d416b -- Fri Jun 21 16:43:23 2019
mtime: 0x5d0d416b -- Fri Jun 21 16:43:23 2019
Inode checksum: 0x000042d4
EXTENTS:
(0):18
debugfs:  block_dump 18
0000  0c00 0000 0c00 0102 2e00 0000 0200 0000  ................
0020  0c00 0202 2e2e 0000 0d00 0000 dc03 0401  ................
0040  6d6f 7464 0000 0000 0000 0000 0000 0000  motd............
0060  0000 0000 0000 0000 0000 0000 0000 0000  ................
*
1760  0000 0000 0000 0000 0c00 00de be4e 16e9  .............N..

debugfs:


* [Bug 203943] ext4 corruption after RAID6 degraded; e2fsck skips block checks and fails
From: bugzilla-daemon @ 2019-06-22  7:48 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=203943

--- Comment #7 from Yann Ormanns (yann@ormanns.net) ---
I changed the locale to en_GB and now got the actual 5 or 6 inode numbers that
were causing e2fsck to be extremely slow. I zeroed them with debugfs, and
afterwards e2fsck was able to fix many, many errors.
lost+found contains 934G of data (53190 entries). Over the next days, I will
try to examine them.
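
I will probably start with something like this (a sketch; /home/share is the
mount point, as in the tune2fs output above):

#du -sh /home/share/lost+found
#ls /home/share/lost+found | wc -l
#find /home/share/lost+found -maxdepth 1 -type f -exec file {} + | less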

While copying files back from my backup, I re-tested a disk failure and
removed the disk from the array. After a while I re-added it, and now the array
is resyncing. I will watch the logs for possible FS errors - for now,
everything seems clean.

Thanks a lot for your support!


* [Bug 203943] ext4 corruption after RAID6 degraded; e2fsck skips block checks and fails
From: bugzilla-daemon @ 2019-06-22  8:06 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=203943

--- Comment #8 from Yann Ormanns (yann@ormanns.net) ---
It seems as if the data corruption only affects the directories and files that
the system wrote to while the ext4 FS was not clean.
A quick rsync from my backup fileserver copied none of the static data back to
the production system - so I assume these files should be okay (but of course I
will at least spot-check that).

To sum it up: the data loss affects my local backup files on this system and my
TV recordings - 934G of total data is just about right for that.

