* Integrity of files restored with btrfs restore
From: Alexander Veit @ 2019-12-29 22:58 UTC
  To: linux-btrfs

Hi,

btrfs restore has recovered files from a crashed partition.  The command
used was

btrfs restore -m -v /dev/sdX /dst/path/

without further options like -i etc.

Are the recovered files consistent in the sense that if the file was
committed to disk and was not open during the crash, then the content of
the file would be the same as before the crash, and that damage to files
during the crash (e.g. by random writes) would result in the file not
being recovered by btrfs restore?

I could not find a clear statement about this in the man page or the
btrfs wiki.

# uname -a
Linux healer 5.3.0-3-amd64 #1 SMP Debian 5.3.15-1 (2019-12-07) x86_64 GNU/Linux

# btrfs --version
btrfs-progs v5.4

The btrfs file system had been created in a system with a Linux 4.19.72
kernel.

-- 
Thanks in advance,
Alex


* Re: Integrity of files restored with btrfs restore
From: Chris Murphy @ 2019-12-29 23:20 UTC
  To: Alexander Veit; +Cc: Btrfs BTRFS

On Sun, Dec 29, 2019 at 4:05 PM Alexander Veit <list@nezwerg.de> wrote:
>
> Hi,
>
> btrfs restore has recovered files from a crashed partition.  The command
> used was
>
> btrfs restore -m -v /dev/sdX /dst/path/
>
> without further options like -i etc.
>
> Are the recovered files consistent in the sense that if the file was
> committed to disk and was not open during the crash, then the content of
> the file would be the same as before the crash, and that damage to files
> during the crash (e.g. by random writes) would result in the file not
> being recovered by btrfs restore?

In theory they're fine, but in practice it depends on how the
application was updating those files. The updates may have been
part of separate transactions, so some files may have been updated
and others not, depending on when the crash happened. But since
there are no overwrites in Btrfs (as long as the files don't have
chattr +C set), and provided the hardware honors barriers, a
restore after a crash should recover the most recent fully
committed version of each file.

For example, in a directory of 50 files that relate to each other,
where some of them were being updated in the minutes before the
crash, it's possible that some of those files have committed updates
and others don't. Buffered writes may not reach stable media for
several transactions, depending on the application's fsync strategy
and how well tested it is.

See 'man 5 btrfs' and read about the flushoncommit mount option for more info.
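
For instance, a sketch only, where /path/to/file, /dev/sdX and /mnt
are placeholders:

 # lsattr /path/to/file
 # mount -t btrfs -o flushoncommit /dev/sdX /mnt

A 'C' in the lsattr flags column means chattr +C (nodatacow) is set
on that file, and flushoncommit makes every transaction commit flush
the dirty data along with the metadata.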

-- 
Chris Murphy


* Re: Integrity of files restored with btrfs restore
From: Alexander Veit @ 2019-12-30  0:23 UTC
  To: Chris Murphy; +Cc: Btrfs BTRFS

On 2019-12-30 00:20, Chris Murphy wrote:
> In theory they're fine, but in practice it depends on how the
> application was updating those files. The updates may have been
> part of separate transactions, so some files may have been updated
> and others not, depending on when the crash happened. But since
> there are no overwrites in Btrfs (as long as the files don't have
> chattr +C set), and provided the hardware honors barriers, a
> restore after a crash should recover the most recent fully
> committed version of each file.
> ...

This sounds good. All files on the 6 TB partition should have been
written long before the crash (it's a backup disk that had been
transferred to an external enclosure, which failed while reading).

5 files and 5 directories could not be recovered, but this is tolerable
for me.

-- 
Thank you very much,
Alex


* Re: Integrity of files restored with btrfs restore
From: Qu Wenruo @ 2019-12-30  1:38 UTC
  To: Alexander Veit, linux-btrfs


On 2019/12/30 6:58 AM, Alexander Veit wrote:
> Hi,
> 
> btrfs restore has recovered files from a crashed partition.  The command
> used was
> 
> btrfs restore -m -v /dev/sdX /dst/path/
> 
> without further options like -i etc.
> 
> Are the recovered files consistent in the sense that if the file was
> committed to disk and was not open during the crash, then the content of
> the file would be the same as before the crash,

Normally a crash shouldn't corrupt btrfs; if it does, it's either a
btrfs bug or something else causing the corruption.

> and that damage to files
> during the crash (e.g. by random writes) would result in the file not
> being recovered by btrfs restore?

btrfs restore doesn't check data csums, and by default it reads the
first copy of the data.

If the read succeeds, btrfs-restore just calls it a day, so there is
no data csum verification or anything else.

So it's not as good as you would expect.
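
If another copy of the data exists, checksums can be compared after
the restore to catch silently corrupted files. A sketch only, with
both paths being placeholders:

 # cd /dst/path && find . -type f -exec sha256sum {} + > /tmp/restored.sha256
 # cd /known/good/copy && sha256sum -c --quiet /tmp/restored.sha256

The second command prints only the files whose content differs from
the known-good copy.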

Anyway, btrfs-restore is the last-resort method; an RO mount and the
various rescue mount options should be tried before it. The kernel
will always verify data csums.

Thanks,
Qu

> 
> I could not find a clear statement about this in the man page or the
> btrfs wiki.
> 
> # uname -a
> Linux healer 5.3.0-3-amd64 #1 SMP Debian 5.3.15-1 (2019-12-07) x86_64 GNU/Linux
> 
> # btrfs --version
> btrfs-progs v5.4
> 
> The btrfs file system had been created in a system with a Linux 4.19.72
> kernel.
> 




* Re: Integrity of files restored with btrfs restore
From: Alexander Veit @ 2019-12-30 14:05 UTC
  To: Qu Wenruo, linux-btrfs

On 2019-12-30 02:38, Qu Wenruo wrote:
> 
> Normally a crash shouldn't corrupt btrfs; if it does, it's either a
> btrfs bug or something else causing the corruption.

The file system became corrupted after the hard disk (WD Gold 6 TB
WD6002FRYZ) had been transferred from a PC to an external enclosure,
which then failed during read operations. The disk could not be
mounted anymore. According to SMART, the drive itself has no errors.

# dmesg
 BTRFS info (device sdc1): disk space caching is enabled
 BTRFS info (device sdc1): has skinny extents
 BTRFS error (device sdc1): parent transid verify failed on 97288192 wanted 248243 found 248241
 BTRFS error (device sdc1): parent transid verify failed on 97288192 wanted 248243 found 248241
 BTRFS error (device sdc1): failed to read block groups: -5
 BTRFS error (device sdc1): open_ctree failed

Mounting with

 mount -t btrfs -o rootflags=recovery,nospace_cache
 mount -t btrfs -o ro,rescue=skip_bg
 mount -t btrfs -o ro,usebackuproot

did not work either.

Similar errors with btrfs-find-root:
# btrfs-find-root /dev/sdc1
parent transid verify failed on 97288192 wanted 248243 found 248241
parent transid verify failed on 97288192 wanted 248243 found 248241
parent transid verify failed on 97288192 wanted 248243 found 248241
parent transid verify failed on 97288192 wanted 248243 found 248241
Ignoring transid failure
ERROR: child eb corrupted: parent bytenr=71925760 item=38 parent level=2 child level=0
Superblock thinks the generation is 248274
Superblock thinks the level is 1
Found tree root at 71794688 gen 248274 level 1
Well block 43712512(gen: 248228 level: 1) seems good, but generation/level doesn't match, want gen: 248274 level: 1
Well block 34603008(gen: 247269 level: 0) seems good, but generation/level doesn't match, want gen: 248274 level: 1
Well block 34078720(gen: 247269 level: 0) seems good, but generation/level doesn't match, want gen: 248274 level: 1
Well block 33931264(gen: 247269 level: 0) seems good, but generation/level doesn't match, want gen: 248274 level: 1
Well block 33325056(gen: 247269 level: 0) seems good, but generation/level doesn't match, want gen: 248274 level: 1
Well block 30375936(gen: 247269 level: 0) seems good, but generation/level doesn't match, want gen: 248274 level: 1
Well block 30425088(gen: 247268 level: 0) seems good, but generation/level doesn't match, want gen: 248274 level: 1
Well block 30408704(gen: 247268 level: 0) seems good, but generation/level doesn't match, want gen: 248274 level: 1
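
For reference, btrfs restore can also be pointed at one of these older
tree roots via -t, which takes a tree root bytenr to read instead of
the one referenced by the superblock. A sketch only, with the bytenr
taken from the output above:

 # btrfs restore -t 43712512 -m -v /dev/sdc1 /dst/path/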

btrfs rescue zero-log, run on a file copy of the partition, also gave errors.
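
Other read-only diagnostics can be tried at this stage; a sketch only
(btrfs check without --repair does not modify the device, and
super-recover asks before writing anything):

 # btrfs check --readonly /dev/sdc1
 # btrfs rescue super-recover -v /dev/sdc1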


I've also tried to apply the patch you provided at
 https://patchwork.kernel.org/project/linux-btrfs/list/?series=130637
but unfortunately it does not apply to the kernel sources I've used
(5.3.13).


During btrfs restore I encountered errors of the form

offset is ...
We seem to be looping a lot on /path/to/file.dat, do you want to keep going on ? (y/N/a): y
...
offset is ...
ERROR: cannot map block logical 1643391221760 length 3221225472: -2
Error copying data for /path/to/file.dat
Error searching /path/to/file.dat
Error searching /path/to/file.dat

This led me to the conclusion that the file system is corrupted.


> btrfs restore doesn't check data csums, and by default it reads the
> first copy of the data.
> 
> If the read succeeds, btrfs-restore just calls it a day, so there is
> no data csum verification or anything else.
> 
> So it's not as good as you would expect.

This sounds bad. What is the rationale behind btrfs restore not
verifying checksums?

I would expect a file with errors not to be restored unless one
explicitly opts in to restoring it.

And even worse, there are cases where a "restored" file could be
synchronized to other storage locations and spread the corruption
that way.
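
As a precaution, restored data could be compared against another
location with a checksum-based dry run before any synchronization;
a sketch only, with both paths being placeholders:

 # rsync -a -c -n -v /dst/path/ /other/location/

Here -c compares file contents by checksum instead of by size and
mtime, and -n makes rsync merely list the files that differ without
transferring anything.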


> Anyway, btrfs-restore is the last-resort method; an RO mount and the
> various rescue mount options should be tried before it. The kernel
> will always verify data csums.

Are there any options left to get back all of the files that have not
been corrupted?


-- 
Thank you very much,
Alex

