* HELP unmountable partition after btrfs balance to RAID0
From: Thomas Mohr @ 2018-12-06 11:31 UTC
  To: linux-btrfs

Dear developers of BTRFS,

we have a problem. We wanted to convert a file system to RAID0 across
two partitions. Unfortunately we had to reboot the server during the
balance operation before it could complete.

Now the following happens:

A mount attempt of the array fails with the errors shown in the kernel
log below.

btrfs restore yields roughly 1.6 TB out of 4 TB.

To recover the rest we have tried the following.

mount:

[18192.357444] BTRFS info (device sdb1): disk space caching is enabled
[18192.357447] BTRFS info (device sdb1): has skinny extents
[18192.370664] BTRFS error (device sdb1): parent transid verify failed on 30523392 wanted 7432 found 7445
[18192.370810] BTRFS error (device sdb1): parent transid verify failed on 30523392 wanted 7432 found 7445
[18192.394745] BTRFS error (device sdb1): open_ctree failed

Mounting with the options ro, degraded, clear_cache etc. yields the same errors.
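
For reference, the mount attempts would look roughly like this (a sketch,
not an exact transcript; /mnt/data is a placeholder mount point, and the
usebackuproot line is an additional standard btrfs option, not something
already listed above):

# plain read-only mount
mount -o ro /dev/sdb1 /mnt/data

# degraded mount, in case a member device is not usable
mount -o degraded /dev/sdb1 /mnt/data

# clear and rebuild the free-space cache on mount
mount -o clear_cache /dev/sdb1 /mnt/data

# ask the kernel to fall back to an older tree root
# (usebackuproot is the current name of the old "recovery" option)
mount -o ro,usebackuproot /dev/sdb1 /mnt/data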


btrfs rescue zero-log: the operation completes, but the errors persist
and the array remains unmountable:

parent transid verify failed on 59768832 wanted 7422 found 7187
parent transid verify failed on 59768832 wanted 7422 found 7187
parent transid verify failed on 59768832 wanted 7422 found 7187
parent transid verify failed on 59768832 wanted 7422 found 7187
Ignoring transid failure
parent transid verify failed on 30408704 wanted 7430 found 7443
parent transid verify failed on 30408704 wanted 7430 found 7443
parent transid verify failed on 30408704 wanted 7430 found 7443
parent transid verify failed on 30408704 wanted 7430 found 7443
Ignoring transid failure
Clearing log on /dev/sdb1, previous log_root 0, level 0
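
If it helps, a read-only look at the superblock generations behind the
transid numbers above would be something like this (a sketch; dump-super
ships with btrfs-progs and only reads the superblocks):

# show which devices btrfs considers part of this filesystem
btrfs filesystem show /dev/sdb1

# print all superblock copies in full, including the backup root slots
btrfs inspect-internal dump-super -f -a /dev/sdb1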

btrfs rescue chunk-recover fails with the following error message:

btrfs check results in:

Opening filesystem to check...
parent transid verify failed on 59768832 wanted 7422 found 7187
parent transid verify failed on 59768832 wanted 7422 found 7187
parent transid verify failed on 59768832 wanted 7422 found 7187
parent transid verify failed on 59768832 wanted 7422 found 7187
Ignoring transid failure
parent transid verify failed on 30408704 wanted 7430 found 7443
parent transid verify failed on 30408704 wanted 7430 found 7443
parent transid verify failed on 30408704 wanted 7430 found 7443
parent transid verify failed on 30408704 wanted 7430 found 7443
Ignoring transid failure
Checking filesystem on /dev/sdb1
UUID: 6c9ed4e1-d63f-46f0-b1e9-608b8fa43bb8
[1/7] checking root items
parent transid verify failed on 30523392 wanted 7432 found 7443
parent transid verify failed on 30523392 wanted 7432 found 7443
parent transid verify failed on 30523392 wanted 7432 found 7443
parent transid verify failed on 30523392 wanted 7432 found 7443
Ignoring transid failure
leaf parent key incorrect 30523392
ERROR: failed to repair root items: Operation not permitted
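
One avenue that might salvage more than btrfs restore managed so far is
to point restore at an older tree root (a sketch, untested on this array;
/mnt/salvage is a placeholder target with enough free space and 123456789
is a placeholder bytenr taken from the find-root output):

# list candidate tree roots (bytenr and generation) still present on the device
btrfs-find-root /dev/sdb1

# dry-run a salvage from one of the listed block numbers
btrfs restore -D -t 123456789 /dev/sdb1 /mnt/salvage

# if the dry run looks reasonable, copy the files out for real,
# including symlinks and extended attributes
btrfs restore -x -S -t 123456789 /dev/sdb1 /mnt/salvage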

Any ideas what is going on or how to recover the file system? I would
greatly appreciate your help!

best,

Thomas


uname -a:

Linux server2 4.19.5-1-default #1 SMP PREEMPT Tue Nov 27 19:56:09 UTC 
2018 (6210279) x86_64 x86_64 x86_64 GNU/Linux

btrfs-progs version 4.19


-- 
ScienceConsult - DI Thomas Mohr KG
DI Thomas Mohr
Enzianweg 10a
2353 Guntramsdorf
Austria
+43 2236 56793
+43 660 461 1966
http://www.mohrkeg.co.at



* Re: HELP unmountable partition after btrfs balance to RAID0
From: Duncan @ 2018-12-07  9:19 UTC
  To: linux-btrfs

Thomas Mohr posted on Thu, 06 Dec 2018 12:31:15 +0100 as excerpted:

> We wanted to convert a file system to RAID0 across two partitions.
> Unfortunately we had to reboot the server during the balance operation
> before it could complete.
> 
> Now the following happens:
> 
> A mount attempt of the array fails with the errors shown in the kernel
> log below.
> 
> btrfs restore yields roughly 1.6 TB out of 4 TB.

[Just another btrfs user and list regular, not a dev.  A dev may reply to 
your specific case, but meanwhile, for next time...]

That shouldn't be a problem.  Because with raid0 a failure of any of the 
components will take down the entire raid, making it less reliable than a 
single device, raid0 (in general, not just btrfs) is considered only 
useful for data of low enough value that its loss is no big deal, either 
because it's truly of little value (internet cache being a good example), 
or because backups are kept available and updated for whenever the raid0 
array fails.  Because with raid0, it's always a question of when it'll 
fail, not if.

So loss of a filesystem being converted to raid0 isn't a problem, because 
the data on it, by virtue of being in the process of conversion to raid0, 
is defined as of throw-away value in any case.  If it's of higher value 
than that, it's not going to be raid0 (or in the process of conversion to 
it) in the first place.

Of course that's simply an extension of the more general first sysadmin's 
rule of backups, that the true value of data is defined not by arbitrary 
claims, but by the number of backups of that data it's worth having.  
Because "things happen", whether it's fat-fingering, bad hardware, buggy 
software, or simply someone tripping over the power cable or running into 
the power pole outside at the wrong time.

So having no backup simply defines the data as worth less than the time/
trouble/resources necessary to make that backup.

Note that you ALWAYS save what was of most value to you: either the time/
trouble/resources needed to make the backup, if your actions defined those 
as more valuable than the data, or the data itself, if you did make that 
backup, thereby defining the data as worth backing up.

Similarly, failure of the only backup isn't a problem because by virtue 
of there being only that one backup, the data is defined as not worth 
having more than one, and likewise, having an outdated backup isn't a 
problem, because that's simply the special case of defining the data in 
the delta between the backup time and the present as not (yet) worth the 
time/hassle/resources to make/refresh that backup.

(And FWIW, the second sysadmin's rule of backups is that it's not a 
backup until you've successfully tested it recoverable in the same sort 
of conditions you're likely to need to recover it in.  Because so many 
people have /thought/ they had backups, that turned out not to be, 
because they never tested that they could actually recover the data from 
them.  For instance, if the backup tools you'll need to recover the 
backup are on the backup itself, how do you get to them?  Can you create 
a filesystem for the new copy of the data and recover it from the backup 
with just the tools and documentation available from your emergency boot 
media?  Untested backup == no backup, or at best, backup still in 
process!)
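
For what it's worth, a minimal recovery test can be as simple as the
sketch below; the paths are placeholders, and tar/rsync merely stand in
for whatever backup tooling is actually in use:

# restore the backup onto scratch space, using only the tools
# available from the rescue/boot media
mkdir -p /mnt/scratch/restore-test
tar -C /mnt/scratch/restore-test -xpf /mnt/backup/data.tar

# compare the restored copy against the live data:
# -n = dry run, -c = checksum comparison, -i = itemize any differences
rsync -anci --delete /srv/data/ /mnt/scratch/restore-test/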

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

