* e2fsck bug? - Sparse files corrupt after "e2fsck -E bmap2extent".
@ 2017-05-22 23:20 Marc Thomas
2017-05-23 15:41 ` Theodore Ts'o
0 siblings, 1 reply; 5+ messages in thread
From: Marc Thomas @ 2017-05-22 23:20 UTC
To: linux-ext4
Hi All,
I hope this is the correct place for ext4 / e2fsprogs bug reports. I
think this is a bug.
As per the title, I've discovered sparse files appear corrupt after
running "e2fsck -E bmap2extent" on a filesystem I'm migrating.
Tested with kernel.org 4.10.15 and 4.11.1, using e2fsprogs-1.43.4 on an
x86_64 system.
In short, I'm carrying out a data migration from a snapshot of an ext3
filesystem to new storage, then expanding the filesystem, converting (in
place) to ext4 (as per
https://ext4.wiki.kernel.org/index.php/UpgradeToExt4), before
de-fragmenting the new filesystem with e4defrag.
As part of the testing, I created md5sums of all the source files. After
a "dry run" of the migration, a minority of files (approx. 30 out of
1095578 inodes) were found to be corrupt (i.e. the md5sum had changed).
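A minimal sketch of this kind of checksum check, assuming GNU coreutils and findutils. The helper names make_manifest and verify_manifest are hypothetical and are not the commands used in the original migration:

```shell
#!/bin/bash
# Hypothetical helpers: record checksums once, re-check after each migration step.
# Keep the manifest file outside the tree being checksummed.

# Write an md5sum manifest for every regular file under directory $1 to file $2.
make_manifest() {
    local root=$1 manifest=$2
    ( cd "$root" && find . -type f -print0 | sort -z | xargs -0 -r md5sum -b ) > "$manifest"
}

# Re-check directory $1 against manifest $2. Only failures are printed,
# and the exit status is non-zero if any file no longer matches.
verify_manifest() {
    local root=$1 manifest=$2
    ( cd "$root" && md5sum -c --quiet ) < "$manifest"
}
```

Running verify_manifest against the mounted filesystem after each step (copy, resize, tune2fs, e2fsck, e4defrag) narrows down which step changed the file contents.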
I repeated the process a second time, checking the md5sums after each
step of the migration. It appears the corruption occurs after running
"e2fsck -E bmap2extent -fy" on the newly converted ext4 filesystem.
The one thing I can find in common is that all the affected files are
sparse files.
Here are a few example files:
linux-2.6.30/arch/x86/kernel/acpi/realmode/wakeup.bin: FAILED
linux-2.6.30/arch/x86/boot/compressed/vmlinux.bin: FAILED
marc.old/.mozilla/firefox/ecyfs7l9.default/Cache/_CACHE_003_: FAILED
Linux4.x/linux-4.10.12/arch/x86/realmode/rm/realmode.bin: FAILED
Linux4.x/linux-4.10.1/arch/x86/realmode/rm/realmode.bin: FAILED
They are all definitely sparse files, e.g.:
$ ls -l Linux4.x/linux-4.10.1/arch/x86/realmode/rm/realmode.bin
-rwxr-xr-x 1 marc users 21080 Mar 8 12:03 Linux4.x/linux-4.10.1/arch/x86/realmode/rm/realmode.bin
$ du -k Linux4.x/linux-4.10.1/arch/x86/realmode/rm/realmode.bin
16 Linux4.x/linux-4.10.1/arch/x86/realmode/rm/realmode.bin
$ du --apparent -k Linux4.x/linux-4.10.1/arch/x86/realmode/rm/realmode.bin
21 Linux4.x/linux-4.10.1/arch/x86/realmode/rm/realmode.bin
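The same du comparison can be applied to a whole tree. The following is a sketch only, assuming GNU find (its %S format prints allocated size divided by apparent size); the function name list_sparse is hypothetical:

```shell
#!/bin/bash
# List non-empty regular files under directory $1 whose allocated size is
# smaller than their apparent size, i.e. likely sparse files.
# Heuristic: filesystems with compression or inline data can also report
# a ratio below 1. File names containing newlines are not handled.
list_sparse() {
    find "$1" -type f -size +0 -printf '%S\t%p\n' |
        awk -F'\t' '$1 < 1 { print $2 }'
}
```

For example, "list_sparse Linux4.x" would print candidate files to compare against the list of FAILED checksums.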
Would you agree this is a bug? As I understand it, reading from an
"unpopulated" region of a sparse file should return all zeros - so the
md5sum should be the same before and after migration.
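A small sketch of that expectation, assuming GNU coreutils (truncate, cp --sparse=never). The file names are made up for the example. It builds a file with a hole, makes a fully allocated copy, and compares the checksums:

```shell
#!/bin/bash
# Build a file with a hole, then a copy with the hole filled in, and compare.
work=$(mktemp -d)
printf 'head' > "$work/sparse"
truncate -s 1M "$work/sparse"                    # extend to 1 MiB, leaving a hole
printf 'tail' >> "$work/sparse"                  # data after the hole
cp --sparse=never "$work/sparse" "$work/dense"   # same bytes, zeros written out

# Read via stdin so the file name does not appear in the md5sum output.
sum_sparse=$(md5sum -b < "$work/sparse")
sum_dense=$(md5sum -b < "$work/dense")
if [ "$sum_sparse" = "$sum_dense" ]; then
    echo "checksums match"
else
    echo "checksums differ"
fi
```

Since a hole reads back as zeros, this prints "checksums match". Any conversion that keeps the logical layout intact should preserve the checksum in the same way.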
I'm happy to provide more details and do testing if required. I have an
offline copy of the source filesystem so can run the migration again.
Let me know if there's anything you need.
Thanks & Kind Regards,
Marc
* Re: e2fsck bug? - Sparse files corrupt after "e2fsck -E bmap2extent".
From: Theodore Ts'o @ 2017-05-23 15:41 UTC
To: Marc Thomas; +Cc: linux-ext4
On Tue, May 23, 2017 at 12:20:59AM +0100, Marc Thomas wrote:
> Hi All,
>
> Would you agree this is a bug? As I understand it, reading from an
> "unpopulated" region of a sparse file should return all zeros - so the
> md5sum should be the same before and after migration.
I agree that sparse files should be properly handled after bmap2extent
conversion. This code hasn't received that much use or testing, but
I've tried to replicate the problem and I haven't succeeded. This
script works for me:
----- cut here -----
#!/bin/bash
rm -f /tmp/foo.img
mke2fs -t ext3 -Fq /tmp/foo.img 65536
mount -o loop /tmp/foo.img /mnt
dd if=/etc/motd of=/mnt/test bs=4k >& /dev/null
dd if=/etc/motd of=/mnt/test conv=notrunc seek=32 bs=4k >& /dev/null
dd if=/etc/motd of=/mnt/test2 bs=4k >& /dev/null
dd if=/etc/motd of=/mnt/test2 conv=notrunc seek=1024 bs=4k >& /dev/null
lsattr /mnt/test*
md5sum /mnt/test* > /mnt/MD5SUMS
umount /mnt
e2fsck -fy -E bmap2extent /tmp/foo.img
mount -o loop /tmp/foo.img /mnt
md5sum -c /mnt/MD5SUMS
lsattr /mnt/test*
umount /mnt
----- cut here -----
Maybe you can come up with a simple repro case that fails for you?
Thanks,
- Ted
* Re: e2fsck bug? - Sparse files corrupt after "e2fsck -E bmap2extent".
From: Darrick J. Wong @ 2017-05-23 16:25 UTC
To: Theodore Ts'o; +Cc: Marc Thomas, linux-ext4
On Tue, May 23, 2017 at 11:41:52AM -0400, Theodore Ts'o wrote:
> On Tue, May 23, 2017 at 12:20:59AM +0100, Marc Thomas wrote:
> > Hi All,
> >
> > Would you agree this is a bug? As I understand it, reading from an
> > "unpopulated" region of a sparse file should return all zeros - so the
> > md5sum should be the same before and after migration.
>
> I agree that sparse files should be properly handled after bmap2extent
> conversion. This code hasn't received that much use or testing, but
> I've tried to replicate the problem and I haven't succeeded. This
> script works for me:
>
> ----- cut here -----
> #!/bin/bash
>
> rm -f /tmp/foo.img
> mke2fs -t ext3 -Fq /tmp/foo.img 65536
> mount -o loop /tmp/foo.img /mnt
> dd if=/etc/motd of=/mnt/test bs=4k >& /dev/null
> dd if=/etc/motd of=/mnt/test conv=notrunc seek=32 bs=4k >& /dev/null
> dd if=/etc/motd of=/mnt/test2 bs=4k >& /dev/null
> dd if=/etc/motd of=/mnt/test2 conv=notrunc seek=1024 bs=4k >& /dev/null
> lsattr /mnt/test*
> md5sum /mnt/test* > /mnt/MD5SUMS
> umount /mnt
> e2fsck -fy -E bmap2extent /tmp/foo.img
> mount -o loop /tmp/foo.img /mnt
> md5sum -c /mnt/MD5SUMS
> lsattr /mnt/test*
> umount /mnt
> ----- cut here -----
>
> Maybe you can come up with a simple repro case that fails for you?
Or take an e2image of one of the affected filesystems so that the
developers can directly reproduce your error case. :)
e2image -Q /dev/sda1 /some/qcow/dump/file
(Preferably xz the dumpfile before uploading...)
--D
* Re: e2fsck bug? - Sparse files corrupt after "e2fsck -E bmap2extent".
From: Marc Thomas @ 2017-05-23 17:41 UTC
To: linux-ext4; +Cc: Theodore Ts'o
Hi Ted,
Firstly, thank you for taking the time to look into this.
I have modified your script into a test case which fails for me every
time. I noticed that a Linux kernel build creates a file which exhibits
the problem (arch/x86/realmode/rm/realmode.bin). I think any kernel
version 3.10.x on x86_64 ought to work. I've also included a few extra
steps which I'm doing as part of the ext3 to ext4 conversion, and added
the "-b" (binary) flag to md5sum.
Here's the script. I'm not sure if it counts as a "simple" repro! :)
----- cut here -----
#!/bin/bash
rm -f /tmp/foo.img
mke2fs -t ext3 -Fq /tmp/foo.img 5G
mount -o loop /tmp/foo.img /mnt
dd if=/etc/motd of=/mnt/test bs=4k >& /dev/null
dd if=/etc/motd of=/mnt/test conv=notrunc seek=32 bs=4k >& /dev/null
dd if=/etc/motd of=/mnt/test2 bs=4k >& /dev/null
dd if=/etc/motd of=/mnt/test2 conv=notrunc seek=1024 bs=4k >& /dev/null
md5sum -b /mnt/test* > /mnt/MD5SUMS
lsattr /mnt/test*
( cd /mnt
xz -dc ~marc/LinuxStuff/Linux4.x/linux-4.11.1.tar.xz | tar xvf -
cd linux-4.11.1
make defconfig
time make -j 15 ) > /dev/null 2>&1
md5sum -b /mnt/linux-4.11.1/arch/x86/realmode/rm/realmode.bin >> /mnt/MD5SUMS
lsattr /mnt/linux-4.11.1/arch/x86/realmode/rm/realmode.bin
umount /mnt
tune2fs -O extents,uninit_bg,dir_index /tmp/foo.img
e2fsck -fy /tmp/foo.img
e2fsck -fy -E bmap2extent /tmp/foo.img
mount -o loop /tmp/foo.img /mnt
md5sum -c /mnt/MD5SUMS
lsattr /mnt/test*
lsattr /mnt/linux-4.11.1/arch/x86/realmode/rm/realmode.bin
umount /mnt
----- cut here -----
The output of the script is:
time ./marc2.sh
------------------- /mnt/test
------------------- /mnt/test2
------------------- /mnt/linux-4.11.1/arch/x86/realmode/rm/realmode.bin
tune2fs 1.43.4 (31-Jan-2017)
e2fsck 1.43.4 (31-Jan-2017)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/tmp/foo.img: 71184/327680 files (6.4% non-contiguous), 338237/1310720 blocks
e2fsck 1.43.4 (31-Jan-2017)
Pass 1: Checking inodes, blocks, and sizes
Pass 1E: Optimizing extent trees
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/tmp/foo.img: ***** FILE SYSTEM WAS MODIFIED *****
/tmp/foo.img: 71184/327680 files (6.4% non-contiguous), 334994/1310720 blocks
/mnt/test: OK
/mnt/test2: OK
/mnt/linux-4.11.1/arch/x86/realmode/rm/realmode.bin: FAILED
md5sum: WARNING: 1 computed checksum did NOT match
--------------e---- /mnt/test
--------------e---- /mnt/test2
--------------e---- /mnt/linux-4.11.1/arch/x86/realmode/rm/realmode.bin
real 1m46.829s
user 17m41.611s
sys 1m39.268s
Thanks & Kind Regards,
Marc
* Re: e2fsck bug? - Sparse files corrupt after "e2fsck -E bmap2extent".
From: Darrick J. Wong @ 2017-05-23 18:42 UTC
To: Marc Thomas; +Cc: linux-ext4, Theodore Ts'o
On Tue, May 23, 2017 at 06:41:48PM +0100, Marc Thomas wrote:
> Hi Ted,
>
> Firstly, thank-you for taking the time to look into this.
>
> I have modified your script with a "test-case" which fails for me every
> time, as I noticed there is a file created by a Linux kernel build which
> exhibits the problem (arch/x86/realmode/rm/realmode.bin) I think any
> kernel version 3.10.x on x86_64 ought to work. I've also included a few
> extra steps which I'm doing as part of the ext3 to ext4 conversion, and
> added the "-b" (binary) flag to md5sum.
>
> Here's the script. I'm not sure if it counts as a "simple" repro! :)
Aha! There is a bug in e2fsck's bmap2extent converter that merges bmap
records that are physically but not logically contiguous. I will
produce a patch + testcase shortly.
--D
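To illustrate the kind of check involved, here is a sketch only. It is not the e2fsprogs code, and the record format and the function name merge_extents are made up for this example. Two mapped runs may be merged into one extent only when the second run starts right after the first both logically and physically. A sparse file can have physically adjacent blocks whose logical offsets are not adjacent, and merging those would make the hole disappear:

```shell
#!/bin/bash
# Input (stdin): lines of "logical_start physical_start length", sorted by
# logical_start. Output: merged extents in the same format.
merge_extents() {
    local l p n cur_l cur_p cur_n have=0
    while read -r l p n; do
        if [ -z "$l" ]; then
            continue                      # skip blank lines
        fi
        # Merge only if the new run follows the current extent in BOTH the
        # logical and the physical block space.
        if [ "$have" -eq 1 ] &&
           [ "$l" -eq $((cur_l + cur_n)) ] &&
           [ "$p" -eq $((cur_p + cur_n)) ]; then
            cur_n=$((cur_n + n))
        else
            if [ "$have" -eq 1 ]; then
                echo "$cur_l $cur_p $cur_n"
            fi
            cur_l=$l cur_p=$p cur_n=$n have=1
        fi
    done
    if [ "$have" -eq 1 ]; then
        echo "$cur_l $cur_p $cur_n"
    fi
}
```

Feeding it "0 100 4" followed by "5 104 1" (logical block 4 is a hole, but the physical blocks are adjacent) prints both records unchanged. A merge that tested only the physical condition would emit "0 100 5" instead, so a read of logical block 4 would return the data of block 5 rather than zeros.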