linux-btrfs.vger.kernel.org archive mirror
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Zygo Blaxell <zblaxell@furryterror.org>
Cc: dsterba@suse.cz, Qu Wenruo <wqu@suse.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 0/3] btrfs: More intelligent degraded chunk allocator
Date: Mon, 2 Dec 2019 12:41:53 +0800
Message-ID: <78acb42f-071f-8d78-c335-71c2af5da841@gmx.com>
In-Reply-To: <20191202032259.GN22121@hungrycats.org>





On 2019/12/2 11:22 AM, Zygo Blaxell wrote:
> On Tue, Nov 19, 2019 at 07:32:26AM +0800, Qu Wenruo wrote:
>>
>>
>> On 2019/11/19 4:18 AM, David Sterba wrote:
>>> On Thu, Nov 07, 2019 at 02:27:07PM +0800, Qu Wenruo wrote:
>>>> This patchset will make btrfs degraded mount more intelligent and
>>>> provide more consistent profile keeping function.
>>>>
>>>> One of the most problematic aspects of degraded mount is that btrfs may
>>>> create unwanted profiles.
>>>>
>>>>  # mkfs.btrfs -f /dev/test/scratch[12] -m raid1 -d raid1
>>>>  # wipefs -fa /dev/test/scratch2
>>>>  # mount -o degraded /dev/test/scratch1 /mnt/btrfs
>>>>  # fallocate -l 1G /mnt/btrfs/foobar
>>>>  # btrfs ins dump-tree -t chunk /dev/test/scratch1
>>>>         item 7 key (FIRST_CHUNK_TREE CHUNK_ITEM 1674575872) itemoff 15511 itemsize 80
>>>>                 length 536870912 owner 2 stripe_len 65536 type DATA
>>>>  New data chunks will fall back to SINGLE or DUP.
>>>>
>>>>
>>>> The cause is pretty simple, when mounted degraded, missing devices can't
>>>> be used for chunk allocation.
>>>> Thus btrfs has to fall back to SINGLE profile.
>>>>
>>>> This patchset will make btrfs consider missing devices as a last resort if
>>>> the current rw devices can't fulfil the profile request.
>>>>
>>>> This should provide a good balance between treating all missing
>>>> devices as RW and completely ruling out missing devices (current mainline
>>>> behavior).
>>>
>>> Thanks. This is going to change the behaviour with a missing device, so
>>> the question is if we should make this configurable first and then
>>> switch the default.
>>
>> Configurable then switch makes sense for most cases, but for this
>> degraded chunk case, IIRC the new behavior is superior in all cases.
>>
>> For a 2-device RAID1 with one missing device (the main concern), the old
>> behavior will create SINGLE/DUP chunks, which have no tolerance for extra
>> missing devices.
>>
>> The new behavior will create degraded RAID1, which still lacks tolerance
>> for extra missing devices.
>>
>> The difference is, for degraded chunk, if we have the device back, and
>> do proper scrub, then we're completely back to proper RAID1.
>> No need to do extra balance/convert, only scrub is needed.
> 
> I think you meant to say "replace" instead of "scrub" above.

"scrub" for missing-then-back case.

At the time of writing, I didn't even take the replace case into
consideration...

> 
>> So the new behavior is kind of a superset of the old behavior; using the
>> new behavior by default should not cause extra concern.
> 
> It sounds OK to me, provided that the missing device is going away
> permanently, and a new device replaces it.
> 
> If the missing device comes back, we end up relying on scrub and 32-bit
> CRCs to figure out which disk has correct data, and it will be wrong
> 1/2^32 of the time.  For nodatasum files there are no CRCs so the data
> will be wrong much more often.  This patch doesn't change that, but
> maybe another patch should.

Yep, the patchset won't change it.
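
To put a rough number on the 1/2^32 figure quoted above, here is a
back-of-the-envelope sketch. It assumes a simple model in which every stale
block on the returning device independently passes the 32-bit checksum with
probability 2^-32. The 4 KiB block size and the function name are
assumptions made for illustration, not something taken from the patchset.

```python
from fractions import Fraction

CSUM_BITS = 32      # 32-bit CRC, as mentioned in the quoted text
BLOCK_SIZE = 4096   # assumed block size in bytes (illustrative)


def expected_undetected(stale_bytes, block_size=BLOCK_SIZE, csum_bits=CSUM_BITS):
    """Expected number of stale blocks that would still pass the checksum.

    Model assumption: each stale block matches the recorded checksum
    independently with probability 2**-csum_bits.
    """
    blocks = stale_bytes // block_size
    return Fraction(blocks, 2 ** csum_bits)


# 1 TiB of stale data is 2**28 blocks of 4 KiB, so 2**28 / 2**32 = 1/16.
print(float(expected_undetected(1 << 40)))  # prints 0.0625
```

Under this model, the expectation grows linearly with the amount of stale
data. For nodatasum files there is no checksum to compare against, so the
model does not apply to them at all.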

But this also reminds me: so far we have all been talking about the
"degraded" mount option.
In most cases, users only use "degraded" when they fully understand that a
device is missing; they don't use it as an everyday option.

So that shouldn't be a big problem so far.
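
For readers following along, the "missing devices as a last resort" idea from
the cover letter quoted above can be sketched roughly as below. This is only
an illustration, not the kernel code: the Device class, the MIN_DEVICES table
and pick_devices() are hypothetical names, and a fixed device count per
profile is a simplification.

```python
from dataclasses import dataclass

# Hypothetical per-profile device counts, for illustration only.
MIN_DEVICES = {"single": 1, "dup": 1, "raid1": 2}


@dataclass
class Device:
    name: str
    free_bytes: int
    missing: bool = False


def pick_devices(devices, profile):
    """Pick devices for a new chunk, or return None if the profile can't be met.

    rw devices are always preferred. Missing devices are appended to the
    candidate list only when the rw devices alone cannot satisfy the profile.
    """
    need = MIN_DEVICES[profile]

    def usable(want_missing):
        # Devices with free space, most free space first.
        found = [d for d in devices
                 if d.missing == want_missing and d.free_bytes > 0]
        return sorted(found, key=lambda d: d.free_bytes, reverse=True)

    rw = usable(want_missing=False)
    if len(rw) >= need:
        return rw[:need]

    # Last resort: fill the remaining slots with missing devices.
    candidates = rw + usable(want_missing=True)
    if len(candidates) >= need:
        return candidates[:need]
    return None


devs = [Device("scratch1", 10 << 30),
        Device("scratch2", 10 << 30, missing=True)]
print([d.name for d in pick_devices(devs, "raid1")])
# prints ['scratch1', 'scratch2']
```

In the example, the only rw device cannot satisfy RAID1 by itself, so the
missing device fills the second slot and the chunk keeps its RAID1 profile
rather than falling back to SINGLE. With enough rw devices present, the
missing device is never picked.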

Thanks,
Qu

> 
>>> How does this work with scrub? Eg. if there are 2 devices in RAID1, one
>>> goes missing and then scrub is started. It makes no sense to try to
>>> repair the missing blocks, but given the logic in the patches all the
>>> data will be rewritten, right?
>>
>> Scrub is not changed at all.
>>
>> A missing device will not go through scrub at all: scrub is per-device,
>> so a missing device is ruled out at the very beginning of scrub.
>>
>> Thanks,
>> Qu
>>>
>>
> 
> 
> 

