* BTRFS checksum mismatch - false positives
@ 2019-09-23 18:19 hoegge
  2019-09-23 19:11 ` Chris Murphy
  0 siblings, 1 reply; 10+ messages in thread
From: hoegge @ 2019-09-23 18:19 UTC (permalink / raw)
  To: linux-btrfs

Dear BTRFS mailing list,

I'm running BTRFS on my Synology Diskstation and they have referred me to
the BTRFS developers.

For a while I have had quite a few (tens - not hundreds) checksum mismatch
errors on my device (around 6 TB data). It runs BTRFS on SHR (Synology
Hybrid Raid). Most of these checksum mismatches, though, do not seem "real".
Most of the files are identical to the original files (checked by binary
comparison and by recalculated MD5 hashes). 

What can explain that problem? I thought BTRFS only reported checksum
mismatch errors when it cannot self-heal the files?

Best
Hoegge




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: BTRFS checksum mismatch - false positives
  2019-09-23 18:19 BTRFS checksum mismatch - false positives hoegge
@ 2019-09-23 19:11 ` Chris Murphy
  2019-09-23 20:24   ` hoegge
  0 siblings, 1 reply; 10+ messages in thread
From: Chris Murphy @ 2019-09-23 19:11 UTC (permalink / raw)
  To: hoegge; +Cc: Btrfs BTRFS

On Mon, Sep 23, 2019 at 12:19 PM <hoegge@gmail.com> wrote:
>
> Dear BTRFS mailing list,
>
> I'm running BTRFS on my Synology Diskstation and they have referred me to
> the BTRFS developers.

If it's a generic question that's fine, but all the development
happening on this list is very recent kernels and btrfs-progs, which
is not typically the case with distribution specific products.

>
> For a while I have had quite a few (tens - not hundreds) checksum mismatch
> errors on my device (around 6 TB data). It runs BTRFS on SHR (Synology
> Hybrid Raid). Most of these checksum mismatches, though, do not seem "real".
> Most of the files are identical to the original files (checked by binary
> comparison and by recalculated MD5 hashes).
>
> What can explain that problem? I thought BTRFS only reported checksum
> mismatch errors when it cannot self-heal the files?

It'll report them in any case, and also report if they're fixed. There
are different checksum errors depending on whether metadata or data
are affected (both are checksummed). Btrfs can only self-heal with
redundant copies available. By default metadata is duplicated and
should be able to self-heal, but data is not redundant by default. So
it'd depend on how the storage stack layout is created.
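
A quick way to see that (assuming the volume is mounted at something
like /volume1 - adjust the path) is the allocation profile listing:

# btrfs fi df /volume1

If the Data line says "single", Btrfs has no second copy of file data
to heal from; DUP or one of the raid profiles would mean it does.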

We need logs though. It's better to have more than less; if you can go
back ~5 minutes before the first checksum mismatch report, that's
better than trimming the log too aggressively. Also possibly useful:

# uname -a
# btrfs --version


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: BTRFS checksum mismatch - false positives
  2019-09-23 19:11 ` Chris Murphy
@ 2019-09-23 20:24   ` hoegge
  2019-09-23 20:59     ` Chris Murphy
  0 siblings, 1 reply; 10+ messages in thread
From: hoegge @ 2019-09-23 20:24 UTC (permalink / raw)
  To: 'Chris Murphy'; +Cc: 'Btrfs BTRFS'

Hi Chris

uname:
Linux MHPNAS 3.10.105 #24922 SMP Wed Jul 3 16:37:24 CST 2019 x86_64 GNU/Linux synology_avoton_1815+

btrfs --version                                                                                            
btrfs-progs v4.0

ash-4.3# btrfs device stats .                                                                                                                                                                                                               
[/dev/mapper/vg1-volume_1].write_io_errs   0
[/dev/mapper/vg1-volume_1].read_io_errs    0
[/dev/mapper/vg1-volume_1].flush_io_errs   0
[/dev/mapper/vg1-volume_1].corruption_errs 1014
[/dev/mapper/vg1-volume_1].generation_errs 0

Concerning self-healing: Synology runs BTRFS on top of their SHR, which is where the redundancy (like RAID5 / RAID6) is. I don't think they use any BTRFS RAID (likely due to the RAID5/6 issues with BTRFS). Does that then mean there is no redundancy / self-healing available for data? 

How would you like the log files - in a private mail? I assume it is the kern.log. To make them useful, I suppose I should also pinpoint which files seem to be intact?

I gather it is the "BTRFS: (null) at logical ..." lines that indicate the mismatch errors? Not sure why it says "(null)". Like:

2019-09-22T16:52:09+02:00 MHPNAS kernel: [1208505.999676] BTRFS: (null) at logical 1123177283584 on dev /dev/vg1/volume_1, sector 2246150816, root 259, inode 305979, offset 1316306944, length 4096, links 1 (path: Backup/Virtual Machines/Kan slettes/Smaller Clone of Windows 7 x64 for win 10 upgrade.vmwarevm/Windows 7 x64-cl1.vmdk)

Best
Hoegge


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: BTRFS checksum mismatch - false positives
  2019-09-23 20:24   ` hoegge
@ 2019-09-23 20:59     ` Chris Murphy
  2019-09-24  9:29       ` hoegge
  2019-09-24 13:42       ` hoegge
  0 siblings, 2 replies; 10+ messages in thread
From: Chris Murphy @ 2019-09-23 20:59 UTC (permalink / raw)
  To: hoegge; +Cc: Chris Murphy, Btrfs BTRFS

On Mon, Sep 23, 2019 at 2:24 PM <hoegge@gmail.com> wrote:
>
> Hi Chris
>
> uname:
> Linux MHPNAS 3.10.105 #24922 SMP Wed Jul 3 16:37:24 CST 2019 x86_64 GNU/Linux synology_avoton_1815+
>
> btrfs --version
> btrfs-progs v4.0
>
> ash-4.3# btrfs device stats .
> [/dev/mapper/vg1-volume_1].write_io_errs   0
> [/dev/mapper/vg1-volume_1].read_io_errs    0
> [/dev/mapper/vg1-volume_1].flush_io_errs   0
> [/dev/mapper/vg1-volume_1].corruption_errs 1014
> [/dev/mapper/vg1-volume_1].generation_errs 0

I'm pretty sure these values are per 4KiB block on x86. If that's
correct, this is ~4MiB of corruption.


> Concerning self-healing: Synology runs BTRFS on top of their SHR, which is where the redundancy (like RAID5 / RAID6) is. I don't think they use any BTRFS RAID (likely due to the RAID5/6 issues with BTRFS). Does that then mean there is no redundancy / self-healing available for data?

That's correct. What do you get for

# btrfs fi show
# btrfs fi df <mountpoint>

mountpoint is for the btrfs volume - any location it's mounted on will do



> How would you like the log files - in a private mail? I assume it is the kern.log. To make them useful, I suppose I should also pinpoint which files seem to be intact?

You could do a Firefox Send, which will encrypt it locally and allow
you to put a limit on the number of times it can be downloaded, if you
want to keep bots from seeing it. *shrug*

>
> I gather it is the "BTRFS: (null) at logical ..." lines that indicate the mismatch errors? Not sure why it says "(null)". Like:
>
> 2019-09-22T16:52:09+02:00 MHPNAS kernel: [1208505.999676] BTRFS: (null) at logical 1123177283584 on dev /dev/vg1/volume_1, sector 2246150816, root 259, inode 305979, offset 1316306944, length 4096, links 1 (path: Backup/Virtual Machines/Kan slettes/Smaller Clone of Windows 7 x64 for win 10 upgrade.vmwarevm/Windows 7 x64-cl1.vmdk)

If they're all like this one, this is strictly a data corruption
issue. You can resolve it by replacing the file with a known good
copy. Or you can unmount the Btrfs file system and use 'btrfs restore'
to scrape out the "bad" copy. Whenever there's a checksum error like
this on Btrfs, it returns EIO to user space; it will not let you copy
out the file if it thinks it's corrupt, whereas 'btrfs restore' will
let you do it. With the particular version you have, I'm not sure if
it'll complain, but if so, there's a flag to make it ignore errors so
you can still get the file out. Then remount, and copy that file right
on top of itself. Of course this isn't fixing the corruption if it's
real; it just makes the checksum warnings go away.
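
A rough sketch of that sequence (the device node and the scratch
destination are assumptions, and with btrfs-progs v4.0 the exact
options may differ - see 'btrfs restore --help'):

# umount /volume1
# btrfs restore -i -v /dev/mapper/vg1-volume_1 /some/scratch/space/

-i ignores errors so the file still comes out; --path-regex can limit
the restore to a subtree, though its regex syntax is picky. Then
remount and copy the recovered file back over the original.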

I'm gonna guess Synology has a way to do a scrub and check the results
but I don't know how it works, whether it does a Btrfs only scrub or
also an md scrub. You'd need to ask them or infer it from how this
whole stack is assembled and what processes get used. But you can do
an md scrub on your own. From 'man 4 md'

 "      md arrays can be scrubbed by writing either check or repair to
the file md/sync_action in the sysfs directory for the device."

You'd probably want to do a check. If you write repair, then md
assumes data chunks are good, and merely rewrites all new parity
chunks. The check will compare data chunks to parity chunks and report
any mismatch in

"       A count of mismatches is recorded in the sysfs file
md/mismatch_cnt.  This is set to zero when a scrub starts and is
incremented whenever a  sector "

That should be 0.
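
For example, something like this (assuming the array is md2 - check
/proc/mdstat for the right name):

# echo check > /sys/block/md2/md/sync_action
# cat /proc/mdstat                      # watch the check progress
# cat /sys/block/md2/md/mismatch_cnt    # read once the check finishes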

If that is not a 0 then there's a chance there's been some form of
silent data corruption since that file was originally copied to the
NAS. But offhand I can't account for why they trigger checksum
mismatches on Btrfs and yet md5 matches the original files elsewhere.

Are you sharing the vmdk over the network to a VM? Or is it static and
totally unused while on the NAS?



Chris Murphy

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: BTRFS checksum mismatch - false positives
  2019-09-23 20:59     ` Chris Murphy
@ 2019-09-24  9:29       ` hoegge
  2019-09-24 21:46         ` Chris Murphy
  2019-09-24 13:42       ` hoegge
  1 sibling, 1 reply; 10+ messages in thread
From: hoegge @ 2019-09-24  9:29 UTC (permalink / raw)
  To: 'Chris Murphy'; +Cc: 'Btrfs BTRFS'

# btrfs fi show
gives no result - not even when adding a path

# btrfs fi df /volume1
Data, single: total=4.38TiB, used=4.30TiB
System, DUP: total=8.00MiB, used=96.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, DUP: total=89.50GiB, used=6.63GiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=512.00MiB, used=0.00B

Here is the log:
https://send.firefox.com/download/5a19aee66a42c04e/#PTt0UkT53Wrxe9EjCQfrWA (password in separate e-mail)
I have removed a few MAC addresses and things before a certain date (that contained all other kinds of info). Let me know if it is too little.

Concerning restoring files - I should have all originals backed up, so I assume I can just delete the bad ones and restore the originals. That would also take care of all the checksums, right? But BTRFS does not do anything to prevent the bad blocks from being used again, right?
I'll ask Synology about their stack.

I can't find sysfs on the system - should it be mounted under /sys? This is what I have:
morten@MHPNAS:/$ cd sys
morten@MHPNAS:/sys$ ls
block  bus  class  dev  devices  firmware  fs  kernel  module  power
morten@MHPNAS:/sys$ cd fs
morten@MHPNAS:/sys/fs$ ls
btrfs  cgroup  ecryptfs  ext4  fuse  pstore
morten@MHPNAS:/sys/fs$


With respect to the vmdk, I only store it on the NAS for backup. 

Thanks a lot

Best 
Hoegge


^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: BTRFS checksum mismatch - false positives
  2019-09-23 20:59     ` Chris Murphy
  2019-09-24  9:29       ` hoegge
@ 2019-09-24 13:42       ` hoegge
  2019-09-24 22:05         ` Chris Murphy
  2019-09-26 20:27         ` Chris Murphy
  1 sibling, 2 replies; 10+ messages in thread
From: hoegge @ 2019-09-24 13:42 UTC (permalink / raw)
  To: 'Chris Murphy'; +Cc: 'Btrfs BTRFS'

Sorry, I forgot to run the commands as root:

ash-4.3# btrfs fi show                                                                                                      
Label: '2016.05.06-09:13:52 v7321'  uuid: 63121c18-2bed-4c81-a514-77be2fba7ab8
Total devices 1 FS bytes used 4.31TiB
devid    1 size 9.97TiB used 4.55TiB path /dev/mapper/vg1-volume_1

ash-4.3# btrfs device stats /volume1/                                                                                       
[/dev/mapper/vg1-volume_1].write_io_errs   0
[/dev/mapper/vg1-volume_1].read_io_errs    0
[/dev/mapper/vg1-volume_1].flush_io_errs   0
[/dev/mapper/vg1-volume_1].corruption_errs 1014
[/dev/mapper/vg1-volume_1].generation_errs 0

ash-4.3# btrfs fi df /volume1/                                                                                              
Data, single: total=4.38TiB, used=4.30TiB
System, DUP: total=8.00MiB, used=96.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, DUP: total=89.50GiB, used=6.63GiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=512.00MiB, used=0.00B

Synology indicates that BTRFS can do self-healing of data using RAID information. Is that really the case if it is not "BTRFS RAID" but an MD or SHR RAID?






^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: BTRFS checksum mismatch - false positives
  2019-09-24  9:29       ` hoegge
@ 2019-09-24 21:46         ` Chris Murphy
  0 siblings, 0 replies; 10+ messages in thread
From: Chris Murphy @ 2019-09-24 21:46 UTC (permalink / raw)
  To: hoegge; +Cc: Chris Murphy, Btrfs BTRFS

On Tue, Sep 24, 2019 at 3:29 AM <hoegge@gmail.com> wrote:
>
> # btrfs fi show
> gives no result - not even when adding a path
>
> # btrfs fi df /volume1
> Data, single: total=4.38TiB, used=4.30TiB
> System, DUP: total=8.00MiB, used=96.00KiB
> System, single: total=4.00MiB, used=0.00B
> Metadata, DUP: total=89.50GiB, used=6.63GiB
> Metadata, single: total=8.00MiB, used=0.00B
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> Here is the log:
> https://send.firefox.com/download/5a19aee66a42c04e/#PTt0UkT53Wrxe9EjCQfrWA (password in separate e-mail)
> I have removed a few MAC addresses and things before a certain date (that contained all other kinds of info). Let me know if it is too little.

I think they were having problems; I kept getting 502 errors, and then
I bet I hit the download retry limit.



>
> Concerning restoring files - I should have all originals backed up, so I assume I can just delete the bad ones and restore the originals. That would also take care of all the checksums, right? But BTRFS does not do anything to prevent the bad blocks from being used again, right?

We don't know yet whether they're bad sectors or not. That would be
shown by libata as a read or write error. md handles a read error by
reconstructing the data from parity and writing it back to the drive.
At that point the drive firmware determines whether the write
succeeds, and if not, it internally marks that physical sector as bad
and remaps the LBA for that sector to a reserve physical sector.
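
If you want to see whether that remapping has already been happening,
the SMART attributes give a hint (the device name is just an example,
and attribute names vary a bit by vendor):

# smartctl -A /dev/sda | grep -Ei 'realloc|pending|uncorrect'

Non-zero Reallocated_Sector_Ct or Current_Pending_Sector would point
at a drive that is developing bad sectors.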


> I'll ask Synology about their stack.
>
> I can't find sysfs on the system - should it be mounted under /sys? This is what I have:

# echo check > /sys/block/mdX/md/sync_action

replace the X with the raid you're checking - this is a bit different
if they're using LVM raid.
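
A quick way to tell which case applies (just a sketch; exact output
depends on the DSM/LVM version):

# cat /proc/mdstat              # lists any mdX arrays and their members
# lvs -a -o +devices,segtype    # is vg1/volume_1 on mdX devices or on LVM raid segments?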



-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: BTRFS checksum mismatch - false positives
  2019-09-24 13:42       ` hoegge
@ 2019-09-24 22:05         ` Chris Murphy
  2019-09-26 20:27         ` Chris Murphy
  1 sibling, 0 replies; 10+ messages in thread
From: Chris Murphy @ 2019-09-24 22:05 UTC (permalink / raw)
  To: hoegge; +Cc: Chris Murphy, Btrfs BTRFS

On Tue, Sep 24, 2019 at 7:42 AM <hoegge@gmail.com> wrote:
>
> Sorry, I forgot to run the commands as root:
>
> ash-4.3# btrfs fi show
> Label: '2016.05.06-09:13:52 v7321'  uuid: 63121c18-2bed-4c81-a514-77be2fba7ab8
> Total devices 1 FS bytes used 4.31TiB
> devid    1 size 9.97TiB used 4.55TiB path /dev/mapper/vg1-volume_1

OK so you can do
# pvs

And that should show what makes up that logical volume. And you can
also double check with

# cat /proc/mdstat


> Data, single: total=4.38TiB, used=4.30TiB
> System, DUP: total=8.00MiB, used=96.00KiB
> System, single: total=4.00MiB, used=0.00B
> Metadata, DUP: total=89.50GiB, used=6.63GiB
> Metadata, single: total=8.00MiB, used=0.00B
> GlobalReserve, single: total=512.00MiB, used=0.00B

Yeah there's a couple of issues there that aren't problems per se. But
with the older kernel, it's probably a good idea to reduce the large
number of unused metadata block groups:

# btrfs balance start -mconvert=dup,soft /mountpoint/
(no idea where the mount point is for your btrfs volume)

that command will get rid of those empty single profile system and
metadata block groups. It should complete almost instantly.

# btrfs balance start -musage=25 /mountpoint

That will find block groups with 25% or less usage, move and
consolidate their extents into new metadata block groups and then
delete the old ones. 25% is pretty conservative. There's ~89GiB
allocated to metadata, but only ~7GiB is used. So this command will
find the tiny bits of metadata strewn out over those 89GiB and
consolidate them, and basically it'll free up a bunch of space.

It's not really necessary to do this; you've got a ton of free space
left - only about half the pool is used:

9.97TiB used 4.55TiB

>
> Synology indicates that BTRFS can do self-healing of data using RAID information. Is that really the case if it is not "BTRFS RAID" but an MD or SHR RAID?

Btrfs will only self-heal the metadata in this file system, because
there are two copies of the metadata. It can't self-heal data. That'd
be up to whatever lower layer is providing the RAID capability, and
whether it's md or lvm based, it depends on the drive itself spitting
out some kind of discrete read or write error in order for md/lvm to
know what to do. There are no checksums available to that layer, so it
has no idea if the data is corrupt. It only knows that it needs to
attempt reconstruction when a drive complains. If that reconstruction
produces corrupt data, Btrfs still detects it and will report it, but
it can't fix it.
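
If you want to exercise that metadata self-heal explicitly, a scrub
reads and verifies every copy and rewrites whatever it can fix
(assuming /volume1 is the mount point and the old kernel/progs on the
NAS allow it):

# btrfs scrub start -Bd /volume1
# btrfs scrub status -d /volume1    # if run without -B, check results here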



-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: BTRFS checksum mismatch - false positives
  2019-09-24 13:42       ` hoegge
  2019-09-24 22:05         ` Chris Murphy
@ 2019-09-26 20:27         ` Chris Murphy
  2019-09-30  9:31           ` hoegge
  1 sibling, 1 reply; 10+ messages in thread
From: Chris Murphy @ 2019-09-26 20:27 UTC (permalink / raw)
  To: hoegge; +Cc: Chris Murphy, Btrfs BTRFS

From the log sent off-list:

2019-09-08T17:27:02+02:00 MHPNAS kernel: [   22.396165] md: invalid
raid superblock magic on sda5
2019-09-08T17:27:02+02:00 MHPNAS kernel: [   22.401816] md: sda5 does
not have a valid v0.90 superblock, not importing!

That doesn't sound good. It's not a Btrfs problem but an md/mdadm
problem. You'll have to get support for this from Synology; only they
understand the design of the storage stack layout, whether these
error messages are important or not, and how to fix them. Anyone else
speculating could end up damaging the NAS and losing data.

--------
2019-09-08T17:27:02+02:00 MHPNAS kernel: [   22.913298] md: sda2 has
different UUID to sda1

There are several messages like this. I can't tell if they're just
informational and benign or a problem. Also not related to Btrfs.

--------
2019-09-08T22:09:33+02:00 MHPNAS kernel: [16997.419199] BTRFS warning
(device dm-1): BTRFS: dm-1 checksum verify failed on 375259512832
wanted EA1A10E3 found 3080B64F level 0
2019-09-08T22:09:33+02:00 MHPNAS kernel: [16997.419199]
2019-09-08T22:09:33+02:00 MHPNAS kernel: [16997.458453] BTRFS warning
(device dm-1): BTRFS: dm-1 checksum verify failed on 375259512832
wanted EA1A10E3 found 3080B64F level 0
2019-09-08T22:09:33+02:00 MHPNAS kernel: [16997.458453]
2019-09-08T22:09:33+02:00 MHPNAS kernel: [16997.528385] BTRFS: read
error corrected: ino 1 off 375259512832 (dev /dev/vg1/volume_1 sector
751819488)
2019-09-08T22:09:33+02:00 MHPNAS kernel: [16997.539631] BTRFS: read
error corrected: ino 1 off 375259516928 (dev /dev/vg1/volume_1 sector
751819496)
2019-09-08T22:09:33+02:00 MHPNAS kernel: [16997.550785] BTRFS: read
error corrected: ino 1 off 375259521024 (dev /dev/vg1/volume_1 sector
751819504)
2019-09-08T22:09:33+02:00 MHPNAS kernel: [16997.561990] BTRFS: read
error corrected: ino 1 off 375259525120 (dev /dev/vg1/volume_1 sector
751819512)

There are a bunch of messages like this. Btrfs is finding metadata
checksum errors - some kind of corruption has happened to one of the
copies, and it's been fixed up. But why are things getting corrupted
in the first place? Ordinary bad sectors maybe? There are a lot of
these - like really a lot. Hundreds of affected sectors. There are too
many for me to read through and see if all of them were corrected by
DUP metadata.
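
A rough way to get the totals without reading every line (assuming
the messages all match the strings in the excerpts above, and that
kern.log is the file you shared):

# grep -c 'read error corrected' kern.log    # fixed-up copies
# grep -c 'failed to repair' kern.log        # corruption that was not fixed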

--------

2019-09-22T21:24:27+02:00 MHPNAS kernel: [1224856.764098] md2:
syno_self_heal_is_valid_md_stat(496): md's current state is not
suitable for data correction

What does that mean? Also not a Btrfs problem. There are quite a few of these.

--------

2019-09-23T11:49:20+02:00 MHPNAS kernel: [1276791.652946] BTRFS error
(device dm-1): BTRFS: dm-1 failed to repair btree csum error on
1353162506240, mirror = 1

OK, and a few of these also. This means that some metadata could not
be repaired, likely because both copies are corrupt.

My recommendation is to freshen your backups now while you still can,
and prepare to rebuild the NAS. I.e. these are not likely repairable
problems. Once both copies of Btrfs metadata are bad, it's usually not
fixable; you just have to recreate the file system from scratch.

You'll have to move everything off the NAS - and anything that's
really important you will want at least two independent copies of, of
course, and then you're going to obliterate the array and start from
scratch. While you're at it, you might as well make sure you've got
the latest supported version of the software for this product. Start
with that. Then follow the Synology procedure to wipe the NAS totally
and set it up again. You'll want to make sure the procedure you use
writes out all new metadata for everything: mdadm, lvm, Btrfs. Nothing
stale or old reused. And then you'll copy your data back over to the
NAS.

There's nothing in the provided log that helps me understand why this
is happening. I suspect hardware problems of some sort - maybe one of
the drives is starting to slowly die, by spitting out bad sectors. To
know more about that we'd need to see 'smartctl -x /dev/' for each
drive in the NAS and see if SMART gives a clue. It's somewhere around
a 50/50 shot that SMART will predict a drive failure in advance. So my
suggestion again, without delay, is to make sure the NAS is backed up,
and keep those backups fresh. You can recreate the NAS when you have
free time - but these problems likely will get worse.
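
For example, to collect it all in one file (the /dev/sd[a-h] range is
a guess for an 8-bay unit):

# for d in /dev/sd[a-h]; do echo "=== $d ==="; smartctl -x "$d"; done > smart.txt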



---
Chris Murphy

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: BTRFS checksum mismatch - false positives
  2019-09-26 20:27         ` Chris Murphy
@ 2019-09-30  9:31           ` hoegge
  0 siblings, 0 replies; 10+ messages in thread
From: hoegge @ 2019-09-30  9:31 UTC (permalink / raw)
  To: 'Chris Murphy'; +Cc: 'Btrfs BTRFS'

Hi Chris,

Thank you very much for your help and insight. It has helped me a lot in understanding the setup with BTRFS on top of SHR. I will rebuild the whole thing from scratch, as you recommend, since it only seems to get worse. The file mapping tool from Synology (synodataverifier) used on the files with checksum errors indicates that the corrupted blocks are on two different discs (out of 8), one quite new (a 3 TB drive) and one older (a 500 GB drive). Having two partially defective drives in a RAID 5-ish system will of course create a lot of trouble. I only wonder why neither of them shows any S.M.A.R.T. errors at all (also on extended tests) - and why Synology did not report any read errors.

I'm halfway through my second full backup of all files in addition to my cloud backup, so then I'll toss the two suspicious drives and build a new system and RAID (RAID 6 / SHR 2) with two parity disks instead of just one. 

BTW, do you have any idea when the fixed RAID 5/6 BTRFS functionality might become mainstream, so we can abandon MD and simplify the stack to pure BTRFS?

Thanks again

Hoegge




^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-09-30  9:31 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-23 18:19 BTRFS checksum mismatch - false positives hoegge
2019-09-23 19:11 ` Chris Murphy
2019-09-23 20:24   ` hoegge
2019-09-23 20:59     ` Chris Murphy
2019-09-24  9:29       ` hoegge
2019-09-24 21:46         ` Chris Murphy
2019-09-24 13:42       ` hoegge
2019-09-24 22:05         ` Chris Murphy
2019-09-26 20:27         ` Chris Murphy
2019-09-30  9:31           ` hoegge
