linux-kernel.vger.kernel.org archive mirror
* BTRFS/EXT4 Data Corruption
@ 2020-06-28 18:55 Sebastian Hyrwall
  2020-06-30  9:22 ` David Sterba
  2020-07-02 21:16 ` Pavel Machek
  0 siblings, 2 replies; 3+ messages in thread
From: Sebastian Hyrwall @ 2020-06-28 18:55 UTC (permalink / raw)
  To: linux-kernel

Hi

Sorry if this is not the right place for this email, but I can't think of
another place (maybe linux-fsdevel).
Someone here ought to be an expert in this.

It all started with file corruption inside VMs, which led to a lot of
testing that produced reproducible results on the backend NAS.

Tests were done by generating 100 1GB files from /dev/urandom on
"volume1" (both BTRFS and EXT4 tested), MD5 hashing the files and then
copying them to "volume2". 2-4% of the files would fail the hash match
every time the test was run.
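
A minimal sketch of the test described above, in Python. The function
names, the file naming scheme and the use of shutil.copyfile are
assumptions made for illustration; the original test used cp, and
shutil.copyfile does not reproduce cp's sparse-file handling.

```python
import hashlib
import os
import shutil


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def run_copy_test(src_dir, dst_dir, count=100, size=1 << 30, chunk_size=1 << 20):
    """Write `count` random files to src_dir, copy each to dst_dir and
    return the names of the copies whose MD5 does not match the source."""
    failed = []
    for i in range(count):
        name = f"test_{i:03d}.bin"  # hypothetical naming scheme
        src = os.path.join(src_dir, name)
        dst = os.path.join(dst_dir, name)
        with open(src, "wb") as f:
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n))  # same kind of data as /dev/urandom
                remaining -= n
        expected = md5_of(src)
        shutil.copyfile(src, dst)  # stand-in for cp, not equivalent to it
        if md5_of(dst) != expected:
            failed.append(name)
    return failed
```

On a healthy system the returned list should be empty; on the box
described here, 2-4 names out of 100 would be expected in it.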

After a lot of fiddling around it turned out that the problem goes away
when copying the files with "cp --sparse=never". To me this would
exclude any hardware errors and feels more like something deeper inside
the kernel.

The box runs kernel 3.10.105. Versions >4 seem unaffected (not 100%
confirmed, too few test boxes).

Here is a diff between a hexdump of a failed file,

43861581c43861581
< 29d464c0: aca0 d68f 0ff4 0bad fa4M-5 1339 8148 30e8 .........E.9.H0.
---
 > 29d464c0: aca0 d68f 0ff4 0bad fa45 1339 8148 30e8 .........E.9.H0.
55989446c55989446
< 35654c50: 31f4 f7b5 40be 2188 c539 043b 35b4 abb5 1...@.!..9.;5...
---
 > 35654c50: 3174 f7b5 40be 2188 c539 043b 35b4 abb5 1t..@.!..9.;5...

As you can see it's a single flipped bit (31f4, 3174). I'm not sure 
about "fa4M-5". Never seen "M-" before.
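
The single-bit claim can be checked with a few lines of Python by XORing
the two 16-bit words from the diff. The second check rests on an
assumption that the thread does not confirm: that "M-" is the notation
some tools (for example cat -v) use for a byte with the high bit set, so
that "M-5" would stand for the byte 0xb5 where the text should contain
the character "5" (0x35).

```python
# The two words shown in the second hunk of the diff.
word_a = 0x31F4
word_b = 0x3174

diff = word_a ^ word_b
bits_flipped = bin(diff).count("1")
print(f"xor = {diff:#06x}, bits flipped = {bits_flipped}")

# Assumption: "M-5" means the byte for "5" with the high bit set (0xb5).
text_diff = ord("5") ^ 0xB5
print(f"xor = {text_diff:#04x}, bits flipped = {bin(text_diff).count('1')}")
```

Under that assumption both differences are the same bit (0x80) within a
byte.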


Details,

Linux 3.10.105,
Intel(R) Xeon(R) CPU E3-1230 V2 @ 3.30GHz,
Volume on top of LVM and md-raid,
md2 : active raid5 sda3[0] sdj3[5] sdi3[4] sdf3[3] sde3[2] sdb3[1]
       39046022720 blocks super 1.2 level 5, 64k chunk, algorithm 2 [6/6] [UUUUUU],
cp (GNU coreutils) 8.24

BTRFS and EXT4 default mount options.



// Sebastian H



* Re: BTRFS/EXT4 Data Corruption
  2020-06-28 18:55 BTRFS/EXT4 Data Corruption Sebastian Hyrwall
@ 2020-06-30  9:22 ` David Sterba
  2020-07-02 21:16 ` Pavel Machek
  1 sibling, 0 replies; 3+ messages in thread
From: David Sterba @ 2020-06-30  9:22 UTC (permalink / raw)
  To: Sebastian Hyrwall; +Cc: linux-kernel

On Mon, Jun 29, 2020 at 01:55:40AM +0700, Sebastian Hyrwall wrote:
> Sorry if this is not the right place for this email, but I can't think of
> another place (maybe linux-fsdevel).

You can always CC the mailing lists of the filesystems.

> Someone here ought to be an expert in this.
> 
> It all started with file corruption inside VMs, which led to a lot of
> testing that produced reproducible results on the backend NAS.
> 
> Tests were done by generating 100 1GB files from /dev/urandom on
> "volume1" (both BTRFS and EXT4 tested), MD5 hashing the files and then
> copying them to "volume2". 2-4% of the files would fail the hash match
> every time the test was run.
> 
> After a lot of fiddling around it turned out that the problem goes away
> when copying the files with "cp --sparse=never". To me this would
> exclude any hardware errors and feels more like something deeper inside
> the kernel.

That the problem goes away when you use a completely different way to
write data may just be hiding the fact that the hardware is faulty.

Generating 100G of data will have a different memory usage pattern and
will likely span way more pages than the reflink approach, which is a
metadata-only operation (adding the extent references).

> The box runs kernel 3.10.105. Versions >4 seem unaffected (not 100%
> confirmed, too few test boxes).
> 
> Here is a diff between a hexdump of a failed file,
> 
> 43861581c43861581
> < 29d464c0: aca0 d68f 0ff4 0bad fa4M-5 1339 8148 30e8 .........E.9.H0.
> ---
>  > 29d464c0: aca0 d68f 0ff4 0bad fa45 1339 8148 30e8 .........E.9.H0.
> 55989446c55989446
> < 35654c50: 31f4 f7b5 40be 2188 c539 043b 35b4 abb5 1...@.!..9.;5...
> ---
>  > 35654c50: 3174 f7b5 40be 2188 c539 043b 35b4 abb5 1t..@.!..9.;5...
> 
> As you can see it's a single flipped bit (31f4, 3174). I'm not sure 
> about "fa4M-5". Never seen "M-" before.

If it's a bitflip, then it's faulty RAM. All other explanations, like
random memory overwrites, typically corrupt whole bytes or byte sequences.
The reason for bad RAM could be a faulty module, but I've also seen
transient bitflips on a box without enough PSU power when the system was
under load. That also makes it hard to be sure memtest will catch the
errors, as happened in my case, because the disks were not active during
the memtest run.
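
To tell the two cases apart on real files, one could group the differing
bytes of a good and a bad copy into contiguous runs and count the bits
that changed in each run. This is only a sketch; the function name and
the return shape are made up for illustration. A run of length 1 with one
changed bit looks like a bitflip, a longer run with many changed bits
looks like an overwrite.

```python
def diff_runs(good: bytes, bad: bytes):
    """Return (start_offset, length, bits_changed) for every contiguous
    run of differing bytes in the common prefix of the two buffers."""
    runs = []
    start = None
    bits = 0
    for offset, (a, b) in enumerate(zip(good, bad)):
        if a != b:
            if start is None:
                start, bits = offset, 0
            bits += bin(a ^ b).count("1")
        elif start is not None:
            runs.append((start, offset - start, bits))
            start = None
    if start is not None:
        # The last run reaches the end of the shorter buffer.
        runs.append((start, min(len(good), len(bad)) - start, bits))
    return runs


# The words from the second hunk of the diff in the original report.
good = bytes.fromhex("3174f7b540be2188")
bad = bytes.fromhex("31f4f7b540be2188")
print(diff_runs(good, bad))
```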

I'd recommend that you stop using the machine for anything other than
testing.


* Re: BTRFS/EXT4 Data Corruption
  2020-06-28 18:55 BTRFS/EXT4 Data Corruption Sebastian Hyrwall
  2020-06-30  9:22 ` David Sterba
@ 2020-07-02 21:16 ` Pavel Machek
  1 sibling, 0 replies; 3+ messages in thread
From: Pavel Machek @ 2020-07-02 21:16 UTC (permalink / raw)
  To: Sebastian Hyrwall; +Cc: linux-kernel


Hi!

> After a lot of fiddling around it turned out that the problem goes away
> when copying the files with "cp --sparse=never". To me this would
> exclude any hardware errors and feels more like something deeper inside
> the kernel.

If files contain random data, they are never sparse. It is strange
that sparse=never would make any difference.
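
A quick way to check this for a given file is to count the aligned
all-zero blocks in it, since those are the only regions a copy tool
could turn into holes. The sketch below assumes a 4096-byte block size;
what cp actually treats as a hole depends on its implementation and
version, so this is an approximation.

```python
import os


def count_zero_blocks(data: bytes, block_size: int = 4096) -> int:
    """Count aligned blocks of `block_size` bytes that are entirely zero."""
    zero = bytes(block_size)
    return sum(
        1
        for i in range(0, len(data) - block_size + 1, block_size)
        if data[i:i + block_size] == zero
    )


# 1 MiB of random data is expected to contain no all-zero block.
print(count_zero_blocks(os.urandom(1 << 20)))
```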

> The box runs kernel 3.10.105. Versions >4 seem unaffected (not 100%
> confirmed, too few test boxes).

I'm afraid the relevant developers will not be willing to debug a 3.10
kernel.

Best regards,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

