* Need help recovering broken RAID5 array (parent transid verify failed)
@ 2020-05-15  6:03 Emil Heimpel
From: Emil Heimpel @ 2020-05-15  6:03 UTC
  To: linux-btrfs


Hi,

I hope this is the right place to ask for help. I am unable to mount my BTRFS array and would like to know whether it is possible to recover (some) data from it.

I have a RAID1-metadata/RAID5-data array consisting of 6 drives: 2x8TB, 5TB, 4TB and 2x3TB. It had been running fine for the last 3 months. Because I expanded it drive by drive, I wanted to do a full balance the other day. At around 40% completion (after ca. 1.5 days), I noticed that one drive was missing from the array (if I remember correctly, it was the 5TB one). I tried to cancel the balance, but even after a few hours it hadn't cancelled, so I tried to reboot. That didn't work either, so I did a hard reset. Probably not the best idea, I know...

My array looks like this:

[bluemond@BlueQ btrfslogs]$ sudo btrfs fi show
Label: none  uuid: 19b4f289-a87f-4ed8-8882-b0d03e014104
Total devices 6 FS bytes used 15.47TiB
devid    1 size 7.28TiB used 5.83TiB path /dev/sdc1
devid    2 size 4.55TiB used 4.39TiB path /dev/sdg1
devid    3 size 3.64TiB used 3.63TiB path /dev/sdf1
devid    4 size 7.28TiB used 3.03TiB path /dev/sda1
devid    5 size 2.73TiB used 2.22TiB path /dev/sde1
devid    6 size 2.73TiB used 2.22TiB path /dev/sdd1
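
To make the allocation picture above easier to read, here is a minimal Python sketch that parses the `devid` lines and computes the unallocated space per device. The function name `parse_devices` and the regular expression are illustrative assumptions, and the sketch only handles sizes printed in TiB, as in this output.

```python
import re

# Assumed line shape, taken from the output above:
#   devid    1 size 7.28TiB used 5.83TiB path /dev/sdc1
DEVID_RE = re.compile(
    r"devid\s+(\d+)\s+size\s+([\d.]+)TiB\s+used\s+([\d.]+)TiB\s+path\s+(\S+)"
)

FI_SHOW = """\
devid    1 size 7.28TiB used 5.83TiB path /dev/sdc1
devid    2 size 4.55TiB used 4.39TiB path /dev/sdg1
devid    3 size 3.64TiB used 3.63TiB path /dev/sdf1
devid    4 size 7.28TiB used 3.03TiB path /dev/sda1
devid    5 size 2.73TiB used 2.22TiB path /dev/sde1
devid    6 size 2.73TiB used 2.22TiB path /dev/sdd1
"""


def parse_devices(fi_show_text):
    """Return one dict per device line found in `btrfs fi show` output."""
    devices = []
    for match in DEVID_RE.finditer(fi_show_text):
        devices.append(
            {
                "devid": int(match.group(1)),
                "size_tib": float(match.group(2)),
                "used_tib": float(match.group(3)),
                "path": match.group(4),
            }
        )
    return devices


if __name__ == "__main__":
    devices = parse_devices(FI_SHOW)
    for dev in devices:
        # "used" here is space allocated to chunks on that device.
        free = dev["size_tib"] - dev["used_tib"]
        print(f"devid {dev['devid']} {dev['path']}: {free:.2f} TiB unallocated")
    total_size = sum(d["size_tib"] for d in devices)
    total_used = sum(d["used_tib"] for d in devices)
    print(f"total: {total_size:.2f} TiB raw, {total_used:.2f} TiB allocated")
```

On the figures above this gives roughly 28.21 TiB of raw capacity with 21.32 TiB allocated, and it shows that devid 3 is almost completely allocated while devid 4 still has several TiB unallocated.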

After the reboot all drives appeared again, but now I can't mount the array anymore. The mount attempt gives me the following errors in dmesg:

[  858.554594] BTRFS info (device sdc1): disk space caching is enabled
[  858.554596] BTRFS info (device sdc1): has skinny extents
[  858.556165] BTRFS error (device sdc1): parent transid verify failed on 23219912048640 wanted 116443 found 116484
[  858.556516] BTRFS error (device sdc1): parent transid verify failed on 23219912048640 wanted 116443  found 116484
[  858.556527] BTRFS error (device sdc1): failed to read chunk root
[  858.588332] BTRFS error (device sdc1): open_ctree failed

Mounting with the usebackuproot option doesn't work either:

[  793.730875] BTRFS info (device sdc1): trying to use backup root at mount time
[  793.730879] BTRFS info (device sdc1): disk space caching is enabled
[  793.730880] BTRFS info (device sdc1): has skinny extents
[  793.732479] BTRFS error (device sdc1): parent transid verify failed on 23219912048640 wanted 116443 found 116484
[  793.732775] BTRFS error (device sdc1): parent transid verify failed on 23219912048640 wanted 116443 found 116484
[  793.732785] BTRFS error (device sdc1): failed to read chunk root
[  793.756693] BTRFS error (device sdc1): open_ctree failed

Btrfs restore isn't finding any data either:

[bluemond@BlueQ ~]$ sudo btrfs restore -xmSivD /dev/sda1 /btrfs/
parent transid verify failed on 23219912048640 wanted 116443 found 116484
parent transid verify failed on 23219912048640 wanted 116443 found 116484
parent transid verify failed on 23219912048640 wanted 116443 found 116484
Ignoring transid failure
parent transid verify failed on 30122559078400 wanted 116443 found 116492
parent transid verify failed on 30122559078400 wanted 116443 found 116492
parent transid verify failed on 30122559078400 wanted 116443 found 116492
Ignoring transid failure
parent transid verify failed on 30122559127552 wanted 116443 found 116492
parent transid verify failed on 30122559127552 wanted 116443 found 116492
parent transid verify failed on 30122559127552 wanted 116443 found 116492
Ignoring transid failure
parent transid verify failed on 30122471063552 wanted 116437 found 116492
parent transid verify failed on 30122471063552 wanted 116437 found 116492
parent transid verify failed on 30122471063552 wanted 116437 found 116492
Ignoring transid failure
This is a dry-run, no files are going to be restored
Done searching
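
The repeated messages above boil down to four distinct blocks. As a rough aid for reading them, here is a minimal Python sketch that collapses the duplicates and computes, for each block, how far the generation found on disk is from the generation the parent expected. The function name `summarize_transid_failures` is an illustrative assumption. The regular expression is written to match both the dmesg form and the btrfs-progs form of the message as they appear in this post.

```python
import re

# Matches "parent transid verify failed on <bytenr> wanted <gen> found <gen>",
# tolerating extra whitespace between the fields.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+)\s+wanted (\d+)\s+found (\d+)"
)

SAMPLE = """\
parent transid verify failed on 23219912048640 wanted 116443 found 116484
parent transid verify failed on 23219912048640 wanted 116443 found 116484
Ignoring transid failure
parent transid verify failed on 30122559078400 wanted 116443 found 116492
Ignoring transid failure
parent transid verify failed on 30122559127552 wanted 116443 found 116492
Ignoring transid failure
parent transid verify failed on 30122471063552 wanted 116437 found 116492
Ignoring transid failure
"""


def summarize_transid_failures(log_text):
    """Map each distinct block address to (wanted, found, found - wanted)."""
    summary = {}
    for line in log_text.splitlines():
        match = TRANSID_RE.search(line)
        if match is None:
            continue  # not a transid failure line
        bytenr, wanted, found = (int(group) for group in match.groups())
        summary[bytenr] = (wanted, found, found - wanted)
    return summary


if __name__ == "__main__":
    for bytenr, (wanted, found, gap) in summarize_transid_failures(SAMPLE).items():
        print(f"block {bytenr}: wanted {wanted}, found {found}, gap {gap}")
```

In every case here the found generation is higher than the wanted one (by 41 to 55 transactions), which suggests the blocks at those addresses were written by later transactions than the ones the parent pointers refer to.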

Btrfs checks of each drive produce the following output:

[bluemond@BlueQ btrfslogs]$ sudo btrfs check /dev/sda1
parent transid verify failed on 23219912048640 wanted 116443 found 116484
parent transid verify failed on 23219912048640 wanted 116443 found 116484
parent transid verify failed on 23219912048640 wanted 116443 found 116484
Ignoring transid failure
parent transid verify failed on 30122559078400 wanted 116443 found 116492
parent transid verify failed on 30122559078400 wanted 116443 found 116492
parent transid verify failed on 30122559078400 wanted 116443 found 116492
Ignoring transid failure
parent transid verify failed on 30122559127552 wanted 116443 found 116492
parent transid verify failed on 30122559127552 wanted 116443 found 116492
parent transid verify failed on 30122559127552 wanted 116443 found 116492
Ignoring transid failure
parent transid verify failed on 30122471063552 wanted 116437 found 116492
parent transid verify failed on 30122471063552 wanted 116437 found 116492
parent transid verify failed on 30122471063552 wanted 116437 found 116492
Ignoring transid failure
[1/7] checking root items
parent transid verify failed on 30122546839552 wanted 116438 found 116458
parent transid verify failed on 30122546839552 wanted 116438 found 116458
parent transid verify failed on 30122546839552 wanted 116438 found 116458
Ignoring transid failure
leaf parent key incorrect 30122546839552
ERROR: failed to repair root items: Operation not permitted

[bluemond@BlueQ btrfslogs]$ sudo btrfs check /dev/sdc1
parent transid verify failed on 23219912048640 wanted 116443 found 116484
parent transid verify failed on 23219912048640 wanted 116443 found 116484
parent transid verify failed on 23219912048640 wanted 116443 found 116484
Ignoring transid failure
parent transid verify failed on 30122559078400 wanted 116443 found 116492
parent transid verify failed on 30122559078400 wanted 116443 found 116492
parent transid verify failed on 30122559078400 wanted 116443 found 116492
Ignoring transid failure
parent transid verify failed on 30122559127552 wanted 116443 found 116492
parent transid verify failed on 30122559127552 wanted 116443 found 116492
parent transid verify failed on 30122559127552 wanted 116443 found 116492
Ignoring transid failure
parent transid verify failed on 30122471063552 wanted 116437 found 116492
parent transid verify failed on 30122471063552 wanted 116437 found 116492
parent transid verify failed on 30122471063552 wanted 116437 found 116492
Ignoring transid failure
[1/7] checking root items
parent transid verify failed on 30122546839552 wanted 116438 found 116458
parent transid verify failed on 30122546839552 wanted 116438 found 116458
parent transid verify failed on 30122546839552 wanted 116438 found 116458
Ignoring transid failure
leaf parent key incorrect 30122546839552
ERROR: failed to repair root items: Operation not permitted

[bluemond@BlueQ btrfslogs]$ sudo btrfs check /dev/sdd1
parent transid verify failed on 23219912048640 wanted 116443 found 116484
parent transid verify failed on 23219912048640 wanted 116443 found 116484
parent transid verify failed on 23219912048640 wanted 116443 found 116484
Ignoring transid failure
parent transid verify failed on 30122559078400 wanted 116443 found 116492
parent transid verify failed on 30122559078400 wanted 116443 found 116492
parent transid verify failed on 30122559078400 wanted 116443 found 116492
Ignoring transid failure
parent transid verify failed on 30122559127552 wanted 116443 found 116492
parent transid verify failed on 30122559127552 wanted 116443 found 116492
parent transid verify failed on 30122559127552 wanted 116443 found 116492
Ignoring transid failure
parent transid verify failed on 30122471063552 wanted 116437 found 116492
parent transid verify failed on 30122471063552 wanted 116437 found 116492
parent transid verify failed on 30122471063552 wanted 116437 found 116492
Ignoring transid failure
[1/7] checking root items
parent transid verify failed on 30122546839552 wanted 116438 found 116458
parent transid verify failed on 30122546839552 wanted 116438 found 116458
parent transid verify failed on 30122546839552 wanted 116438 found 116458
Ignoring transid failure
leaf parent key incorrect 30122546839552
ERROR: failed to repair root items: Operation not permitted

[bluemond@BlueQ btrfslogs]$ sudo btrfs check /dev/sde1
parent transid verify failed on 23219912048640 wanted 116443 found 116484
parent transid verify failed on 23219912048640 wanted 116443 found 116484
parent transid verify failed on 23219912048640 wanted 116443 found 116484
Ignoring transid failure
parent transid verify failed on 30122559078400 wanted 116443 found 116492
parent transid verify failed on 30122559078400 wanted 116443 found 116492
parent transid verify failed on 30122559078400 wanted 116443 found 116492
Ignoring transid failure
parent transid verify failed on 30122559127552 wanted 116443 found 116492
parent transid verify failed on 30122559127552 wanted 116443 found 116492
parent transid verify failed on 30122559127552 wanted 116443 found 116492
Ignoring transid failure
parent transid verify failed on 30122471063552 wanted 116437 found 116492
parent transid verify failed on 30122471063552 wanted 116437 found 116492
parent transid verify failed on 30122471063552 wanted 116437 found 116492
Ignoring transid failure
[1/7] checking root items
parent transid verify failed on 30122546839552 wanted 116438 found 116458
parent transid verify failed on 30122546839552 wanted 116438 found 116458
parent transid verify failed on 30122546839552 wanted 116438 found 116458
Ignoring transid failure
leaf parent key incorrect 30122546839552
ERROR: failed to repair root items: Operation not permitted

[bluemond@BlueQ btrfslogs]$ sudo btrfs check /dev/sdf1
parent transid verify failed on 23219912048640 wanted 116443 found 116484
parent transid verify failed on 23219912048640 wanted 116443 found 116484
parent transid verify failed on 23219912048640 wanted 116443 found 116484
Ignoring transid failure
parent transid verify failed on 30122559078400 wanted 116443 found 116492
parent transid verify failed on 30122559078400 wanted 116443 found 116492
parent transid verify failed on 30122559078400 wanted 116443 found 116492
Ignoring transid failure
parent transid verify failed on 30122559127552 wanted 116443 found 116492
parent transid verify failed on 30122559127552 wanted 116443 found 116492
parent transid verify failed on 30122559127552 wanted 116443 found 116492
Ignoring transid failure
parent transid verify failed on 30122471063552 wanted 116437 found 116492
parent transid verify failed on 30122471063552 wanted 116437 found 116492
parent transid verify failed on 30122471063552 wanted 116437 found 116492
Ignoring transid failure
[1/7] checking root items
parent transid verify failed on 30122546839552 wanted 116438 found 116458
parent transid verify failed on 30122546839552 wanted 116438 found 116458
parent transid verify failed on 30122546839552 wanted 116438 found 116458
Ignoring transid failure
leaf parent key incorrect 30122546839552
ERROR: failed to repair root items: Operation not permitted

[bluemond@BlueQ btrfslogs]$ sudo btrfs check /dev/sdg1
parent transid verify failed on 23219912048640 wanted 116443 found 116484
parent transid verify failed on 23219912048640 wanted 116443 found 116484
parent transid verify failed on 23219912048640 wanted 116443 found 116484
Ignoring transid failure
parent transid verify failed on 30122559078400 wanted 116443 found 116492
parent transid verify failed on 30122559078400 wanted 116443 found 116492
parent transid verify failed on 30122559078400 wanted 116443 found 116492
Ignoring transid failure
parent transid verify failed on 30122559127552 wanted 116443 found 116492
parent transid verify failed on 30122559127552 wanted 116443 found 116492
parent transid verify failed on 30122559127552 wanted 116443 found 116492
Ignoring transid failure
parent transid verify failed on 30122471063552 wanted 116437 found 116492
parent transid verify failed on 30122471063552 wanted 116437 found 116492
parent transid verify failed on 30122471063552 wanted 116437 found 116492
Ignoring transid failure
[1/7] checking root items
parent transid verify failed on 30122546839552 wanted 116438 found 116458
parent transid verify failed on 30122546839552 wanted 116438 found 116458
parent transid verify failed on 30122546839552 wanted 116438 found 116458
Ignoring transid failure
leaf parent key incorrect 30122546839552
ERROR: failed to repair root items: Operation not permitted

I tried to read up on the issue, but I only found it mentioned on the Gotchas page of the wiki, in Marc's blog, and in a thread about the stability of RAID56 on this mailing list, where it was stated that recoverability ranges from 0 to 100%. Nowhere was it mentioned what to do when you encounter this problem. Is there anything I can do to recover at least some of my data from the array?

And how can I prevent it from happening again? Would using the new multi-copy raid1 profiles (raid1c3/raid1c4) for metadata help?

Some info on my system:
I'm running Arch Linux on an SSD.
[bluemond@BlueQ btrfslogs]$ uname -a
Linux BlueQ 5.6.12-arch1-1 #1 SMP PREEMPT Sun, 10 May 2020 10:43:42 +0000 x86_64 GNU/Linux

[bluemond@BlueQ btrfslogs]$ btrfs --version
btrfs-progs v5.6

I'm not very familiar with mailing lists, so pardon me if I have done anything wrong!
Hope someone can give me more information about what to do now.

Thanks,
Emil

