linux-btrfs.vger.kernel.org archive mirror
From: "Hérikz Nawarro" <herikz.nawarro@gmail.com>
To: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Recover data from damaged disk in "array"
Date: Sun, 24 Jan 2021 22:41:38 -0300
Message-ID: <CAD6NJSwMQFx1Mf=w+Vsj=RNNEUKfMHFFscDQ+Ty9Lu-wH6hB2Q@mail.gmail.com>
In-Reply-To: <20210123172743.GP31381@hungrycats.org>
In-Reply-To: <20210123172743.GP31381@hungrycats.org>

> OK, that's weird.  Multiple disks should always have metadata in a raid1*
> profile (raid1, raid10, raid1c3, or raid1c4).  dup metadata on multiple
> disks, especially spinners, is going to be slow and brittle with no
> upside.
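>
> A quick way to confirm what the filesystem currently uses (a sketch,
> assuming it is mounted at /array as in the examples below):
>
>        # Show the data and metadata block group profiles in use:
>        btrfs filesystem df /array
>
>        # Show how those chunks are spread across the member devices:
>        btrfs device usage /array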

I didn't know about this.

> There are other ways to do this, but they take longer, in some cases
> orders of magnitude longer (and therefore higher risk):
>
> 1.  convert the metadata to raid1, starting with the faulty drive
> (in these examples I'm just going to call it device 3; use the correct
> device ID for your array, which the sketches after this list show how
> to find):
>
>        # Remove metadata from broken device first
>        btrfs balance start -mdevid=3,convert=raid1,soft /array
>
>        # Continue converting all other metadata in the array:
>        btrfs balance start -mconvert=raid1,soft /array
>
> After metadata is converted to raid1, an intermittent drive connection is
> a much more recoverable problem, and you can replace the broken disk at
> your leisure.  You'll get csum and IO errors when the drive disconnects,
> but these errors will not be fatal to the filesystem as a whole because
> the metadata will be safely written on other devices.
>
> 2.  convert the metadata to raid1 as in option 1, then delete the
> broken device.  This is by far the slowest option, and it only works
> if you have sufficient space on the other drives for the relocated
> data.
>
> 3.  convert the metadata to raid1 as in option 1, add more disks so that
> there is enough space for the device delete in option 2, then proceed
> with the device delete in option 2.  This is probably worse than option
> 2 in terms of potential failure modes, but I put it here for completeness.
>
> 4.  when the replacement disk arrives, run 'btrfs replace' from the broken
> disk to the new disk, then convert the metadata to raid1 as in option 1
> so you're not using dup metadata any more.  This is as fast as the 'dd'
> solution, but there is a slightly higher risk as the broken disk might
> disconnect during a write and abort the replace operation.
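>
> Concrete command sketches for the options above (device 3 and /array
> as in option 1; /dev/sdX stands in for the real replacement disk,
> substitute your own device names):
>
>        # Find the devid of the broken disk and check its error counters:
>        btrfs filesystem show /array
>        btrfs device stats /array
>
>        # Watch the progress of the conversion from option 1:
>        btrfs balance status /array
>
>        # Option 2: relocate everything off device 3, then drop it:
>        btrfs device delete 3 /array
>
>        # Option 4: copy device 3 onto the new disk once it arrives:
>        btrfs replace start 3 /dev/sdX /array
>        btrfs replace status /array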

Thanks for the options, I'll try them soon.


On Sat, Jan 23, 2021 at 14:27, Zygo Blaxell
<ce3g8jdj@umail.furryterror.org> wrote:
>
> On Mon, Jan 18, 2021 at 09:00:58PM -0300, Hérikz Nawarro wrote:
> > Hello everyone,
> >
> > I have an array of 4 disks with btrfs, configured with single data
> > and dup metadata.
>
> OK, that's weird.  Multiple disks should always have metadata in a raid1*
> profile (raid1, raid10, raid1c3, or raid1c4).  dup metadata on multiple
> disks, especially spinners, is going to be slow and brittle with no
> upside.
>
> > One disk in this array was connected with a bad SATA cable that
> > broke the plastic part of the data port (the pins are still intact).
> > I can still read the disk with an adapter, but is there a way to
> > "isolate" this disk, recover all its data, and later replace the
> > faulty disk in the array with a new one?
>
> There's no redundancy in this array, so you will have to keep the broken
> disk online (or the filesystem unmounted) until a solution is implemented.
>
> I wouldn't advise running with a broken connector at all, especially
> without raid1 metadata.
>
> Ideally, boot from rescue media, copy the broken device to a replacement
> disk with dd, then remove the broken disk and mount the filesystem with
> 4 healthy disks.
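>
> Something like this from the rescue environment (a sketch; /dev/sdX is
> the broken disk and /dev/sdY the blank replacement, double-check the
> names before running dd):
>
>         # Copy the broken device onto the replacement, block for block:
>         dd if=/dev/sdX of=/dev/sdY bs=1M status=progress
>
>         # Detach the broken disk, rescan, and mount with 4 healthy disks:
>         btrfs device scan
>         mount /dev/sdY /array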
>
> If you try to operate with a broken connector, you could get disconnects
> and lost writes.  With dup metadata there is no redundancy across
> drives, so a lost metadata write on a single disk is a fatal error.
> That will be a stress-test for btrfs's lost write detection; even if
> the detection works, it will force the filesystem read-only whenever a
> metadata write is lost.  In the worst case, the disconnection resets the
> drive and prevents its write cache from working properly, so a write is
> lost in metadata, and the filesystem is unrecoverably damaged.
>
> There are other ways to do this, but they take longer, in some cases
> orders of magnitude longer (and therefore higher risk):
>
> 1.  convert the metadata to raid1, starting with the faulty drive
> (in these examples I'm just going to call it device 3; use the
> correct device ID for your array):
>
>         # Remove metadata from broken device first
>         btrfs balance start -mdevid=3,convert=raid1,soft /array
>
>         # Continue converting all other metadata in the array:
>         btrfs balance start -mconvert=raid1,soft /array
>
> After metadata is converted to raid1, an intermittent drive connection is
> a much more recoverable problem, and you can replace the broken disk at
> your leisure.  You'll get csum and IO errors when the drive disconnects,
> but these errors will not be fatal to the filesystem as a whole because
> the metadata will be safely written on other devices.
>
> 2.  convert the metadata to raid1 as in option 1, then delete the
> broken device.  This is by far the slowest option, and it only works
> if you have sufficient space on the other drives for the relocated
> data.
>
> 3.  convert the metadata to raid1 as in option 1, add more disks so that
> there is enough space for the device delete in option 2, then proceed
> with the device delete in option 2.  This is probably worse than option
> 2 in terms of potential failure modes, but I put it here for completeness.
>
> 4.  when the replacement disk arrives, run 'btrfs replace' from the broken
> disk to the new disk, then convert the metadata to raid1 as in option 1
> so you're not using dup metadata any more.  This is as fast as the 'dd'
> solution, but there is a slightly higher risk as the broken disk might
> disconnect during a write and abort the replace operation.
>
> > Cheers,

