From: Alexandre Poux <pums974@gmail.com>
To: Chris Murphy <lists@colorremedies.com>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: multi-device btrfs with single data mode and disk failure
Date: Tue, 20 Sep 2016 23:05:29 +0200
Message-ID: <c05574ee-7d47-dea6-af71-b60d14aeb129@gmail.com>
In-Reply-To: <0b29471c-363a-1e2f-d352-1d422c07df64@gmail.com>



On 20/09/2016 at 22:18, Alexandre Poux wrote:
>
> On 20/09/2016 at 21:46, Chris Murphy wrote:
>> On Tue, Sep 20, 2016 at 1:31 PM, Alexandre Poux <pums974@gmail.com> wrote:
>>> On 20/09/2016 at 21:11, Chris Murphy wrote:
>>>> And no backup? Umm, I'd resolve that sooner than anything else.
>>> Yeah you are absolutely right, this was a temporary solution which came
>>> to be not that temporary.
>>> And I regret it already...
>> Well on the bright side, if this were LVM or mdadm linear/concat
>> array, the whole thing would be toast because any other file system
>> would have lost too much fs metadata on the missing device.
>>
>>>> It
>>>> should be true that it'll tolerate a read only mount indefinitely, but
>>>> read write? Not sure. This sort of edge case isn't well tested at all
>>>> seeing as it required changing the kernel to reduce safeguards. So
>>>> all bets are off: the whole thing could become unmountable, not even
>>>> read only, and then it's a scraping job.
>>> I'm not that crazy, I tried the patch inside a virtual machine on
>>> virtual drives...
>>> And since it's only virtual, it may not work on the real partition...
>> Are you sure the virtual setup lacked a CHUNK_ITEM on the missing
>> device? That might be what pinned it in that case.
> In fact, in my virtual setup there were more chunks missing (1 metadata,
> 1 system and 1 data).
> I will try to do a setup closer to my real one.
Good news: I ran a test in which my virtual setup was missing no chunks
at all.
In that case, the missing device was removed without any problem!
What I did:
- make an array with 6 disks (data single, metadata raid1)
- dd if=/dev/zero of=/mnt/somefile bs=64M count=16 # make a 1G file
- use btrfs-debug-tree to identify which device was not used
- shut down the VM, remove this virtual device, and restart the VM
- mount the array degraded but read-write, thanks to the patched
kernel
- btrfs device delete missing /mnt
- and voilà!
I will try with something other than /dev/zero, but this is very encouraging.
Do you think that my test is too trivial?
Should I try something else before trying on the real partition with the
overlay?
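
For anyone who wants to repeat the steps above without a VM, here is a
rough sketch using loop devices. The loop device names, the image
directory and the choice of /dev/loop6 as the "unused" device are
assumptions for illustration; pick the device that btrfs-debug-tree
actually shows as unused. Detaching a loop device may not behave exactly
like pulling a virtual disk, and the read-write degraded mount only
worked for me with the patched kernel. By default the script only prints
the commands (DRY_RUN=1).

```shell
#!/bin/sh
# Sketch only: replays the test with loop devices instead of VM disks.
# With DRY_RUN=1 (the default) every command is printed, not executed.

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce_test() {
    mnt=${1:-/mnt}                  # mount point (assumption)
    imgdir=${2:-/tmp/btrfs-test}    # where the backing files live (assumption)

    run mkdir -p "$imgdir"
    devs=""
    for i in 1 2 3 4 5 6; do
        run truncate -s 2G "$imgdir/disk$i.img"
        run losetup "/dev/loop$i" "$imgdir/disk$i.img"
        devs="$devs /dev/loop$i"
    done

    # 6 devices, data single, metadata raid1
    run mkfs.btrfs -f -d single -m raid1 $devs
    run mount /dev/loop1 "$mnt"
    run dd if=/dev/zero of="$mnt/somefile" bs=64M count=16   # 1G file
    run sync

    # Dump the chunk tree to see which devid holds no chunk
    run btrfs-debug-tree -t 3 /dev/loop1

    # Simulate the failure of the unused device (loop6 is an assumption)
    run umount "$mnt"
    run losetup -d /dev/loop6

    # Read-write degraded mount: needs the patched kernel in this scenario
    run mount -o degraded /dev/loop1 "$mnt"
    run btrfs device delete missing "$mnt"
}
```

Calling reproduce_test with no arguments prints the command list for
review; setting DRY_RUN=0 (as root) would really execute it.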

>> You could try some sort of overlay for your remaining drives.
>> Something like this:
>> https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID#Making_the_harddisks_read-only_using_an_overlay_file
>>
>> Make sure you understand the gotcha about cloning which applies here:
>> https://btrfs.wiki.kernel.org/index.php/Gotchas
>>
>> I think it's safe to use blockdev --setro on every real device you're
>> trying to protect from changes. And when mounting you'll at least need
>> to use device= mount option to explicitly mount each of the overlay
>> devices. Based on the wiki, I'm wincing, I don't really know for sure
>> if device mount option is enough to compel Btrfs to only use those
>> devices and not go off the rails and still use one of the real
>> devices, but at least if they're setro it won't matter (the mount will
>> just fail somehow due to write failures).
>>
>> So now you can try removing the missing device... and see what
>> happens. You could inspect the overlay files and see what changes were
>> made.
> Wow, that looks nice.
> So, if it works, and if we find a way to fix the filesystem inside the VM,
> I can use this over the real partition to check if it works before trying
> the fix for real.
> Nice idea.
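
A minimal sketch of the overlay idea discussed above, modelled on the
dm snapshot approach from the RAID wiki page. The helper names, the
OVERLAY_DIR and OVERLAY_SIZE variables and the example device names are
hypothetical. I have not verified this on the real array; make_overlay
must be run as root and is only defined here, not called.

```shell
# Build the device-mapper table line for a snapshot overlay:
#   <start> <length in 512-byte sectors> snapshot <origin> <COW device> P <chunk size>
overlay_table() {
    size_sectors=$1
    origin=$2
    cow=$3
    echo "0 $size_sectors snapshot $origin $cow P 8"
}

# Protect one real device and expose a writable overlay on top of it.
# Prints the path of the overlay device.
make_overlay() {
    dev=$1                                       # e.g. /dev/sdb1 (hypothetical)
    name=overlay-$(basename "$dev")
    cow_file=${OVERLAY_DIR:-/tmp}/$name.cow      # sparse file receiving all writes
    blockdev --setro "$dev"                      # refuse writes to the real device
    truncate -s "${OVERLAY_SIZE:-4G}" "$cow_file"
    loop=$(losetup -f --show "$cow_file")
    dmsetup create "$name" \
        --table "$(overlay_table "$(blockdev --getsz "$dev")" "$dev" "$loop")"
    echo "/dev/mapper/$name"
}
```

The overlay devices would then be passed explicitly when mounting, for
example mount -o degraded,device=/dev/mapper/overlay-sdc1 /dev/mapper/overlay-sdb1 /mnt,
with one device= option per additional overlay.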
>>>> What do you get for btrfs-debug-tree -t 3 <dev>
>>>>
>>>> That should show the chunk tree, and what I'm wondering is if the
>>>> chunk tree has any references to chunks on the missing device. Even if
>>>> there are no extents on that device, if there are chunks, that might
>>>> be one of the safeguards.
>>>>
>>> You'll find it attached.
>>> The missing device is the devid 8 (since it's the only one missing in
>>> btrfs fi show)
>>> I found it only once, on line 63
>> Yeah bummer. Not used for system, data, or metadata chunks at all.
>>
>>
>
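
The chunk-tree check discussed above (does any chunk still have a stripe
on the missing devid?) can be scripted roughly like this. It assumes the
dump prints each stripe as "stripe N devid M offset O", which is what I
see in my output but may differ between btrfs-progs versions; the
function name is my own.

```shell
# Count chunk stripes located on a given devid.
# Reads a chunk-tree dump (btrfs-debug-tree -t 3 <dev>) on stdin.
count_stripes_on_devid() {
    devid=$1
    # grep -c prints 0 and returns 1 when nothing matches; keep the status at 0
    grep -Ec "stripe [0-9]+ devid ${devid} offset " || true
}
```

Usage would be something like
btrfs-debug-tree -t 3 /dev/sdX | count_stripes_on_devid 8
where a result of 0 means no system, data or metadata chunk references
that device. The DEV_ITEM entry for the device itself is deliberately
not counted.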



