From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>,
	Chris Murphy <lists@colorremedies.com>,
	Goffredo Baroncelli <kreijack@inwind.it>
Cc: Qu Wenruo <wqu@suse.com>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: RAID56 discussion related to RST. (Was "Re: [RFC ONLY 0/8] btrfs: introduce raid-stripe-tree")
Date: Mon, 18 Jul 2022 16:03:38 +0800
Message-ID: <defbf99f-efc6-0497-2efd-04b9d9134d0c@gmx.com>
In-Reply-To: <PH0PR04MB741662E24861B573FE93D0DD9B8C9@PH0PR04MB7416.namprd04.prod.outlook.com>



On 2022/7/18 15:33, Johannes Thumshirn wrote:
> On 15.07.22 22:15, Chris Murphy wrote:
>> On Fri, Jul 15, 2022 at 1:55 PM Goffredo Baroncelli <kreijack@libero.it> wrote:
>>>
>>> On 14/07/2022 09.46, Johannes Thumshirn wrote:
>>>> On 14.07.22 09:32, Qu Wenruo wrote:
>>>>> [...]
>>>>
>>>> Again if you're doing sub-stripe size writes, you're asking stupid things and
>>>> then there's no reason to not give the user stupid answers.
>>>>
>>>
>>> Qu is right: if we consider only full-stripe writes, the "raid hole" problem
>>> disappears, because if a "full stripe" is not fully written, it is not
>>> referenced either.
>>>
>>>
>>> Personally, I think that the ZFS variable stripe size may be interesting
>>> to evaluate. Moreover, because the BTRFS disk format is quite flexible,
>>> we can store different BGs with different numbers of disks. Let me give an
>>> example: if we have 10 disks, we could allocate:
>>> 1 BG RAID1
>>> 1 BG RAID5, spread over 4 disks only
>>> 1 BG RAID5, spread over 8 disks only
>>> 1 BG RAID5, spread over 10 disks
>>>
>>> So if we have short writes, we could put the extents in the RAID1 BG; for longer
>>> writes we could use a RAID5 BG with 4, 8 or 10 disks, depending on the length
>>> of the data.
>>>
>>> Yes, this would require a sort of garbage collector to move the data to the biggest
>>> RAID5 BG, but this would avoid (or reduce) the fragmentation which affects the
>>> variable stripe size.
>>>
>>> Doing so we don't need any disk format change and it would be backward compatible.
>>
>> My 2 cents...
>>
>> Regarding the current raid56 support, in order of preference:
>>
>> a. Fix the current bugs, without changing format. Zygo has an extensive list.
>> b. Mostly fix the write hole, also without changing the format, by
>> only doing COW with full stripe writes. Yes you could somehow get
>> corrupt parity still and not know it until degraded operation produces
>> a bad reconstruction of data - but checksum will still catch that.
>> This kind of "unreplicated corruption" is not quite the same thing as
>> the write hole, because it isn't pernicious like the write hole.
>> c. A new de-clustered parity raid56 implementation that is not
>> backwards compatible.
>
> c) is what I'm leaning towards/working on, simply because it is
> the only solution (that I can think of, at least) to make raid56 work
> on zoned drives. And given that zoned drives tend to have a higher
> capacity than regular drives, they are appealing for raid arrays.


That's something I totally agree with.

RST is not optional; it is essential for supporting RAID profiles
for data.

Thus I'm not against RST on zoned devices at all, whether it's
RAID56 or not.

Thanks,
Qu

>
>> Ergo, I think it's best to not break the format twice. Even if a new
>> raid implementation is years off.
>
> Agreed.
>
>> Metadata-centric workloads suck on parity raid anyway. If Btrfs always
>> does full-stripe COW, it won't matter even if the performance is worse,
>> because no one should use parity raid for this workload anyway.
>>
>
> Yup.
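
As a rough illustration of the block-group selection idea Goffredo describes
in the quoted text above (short writes go to a RAID1 BG, longer writes to a
RAID5 BG spread over 4, 8 or 10 disks), here is a minimal Python sketch. The
64 KiB stripe element size, the two-device RAID1 BG, the names BlockGroup and
pick_block_group, and the rule "pick the widest RAID5 BG whose full stripe
the write can fill" are all assumptions made for illustration; none of this
is btrfs code or an existing API.

```python
from dataclasses import dataclass

# Assumed per-device stripe element size (hypothetical, for illustration).
STRIPE_ELEMENT = 64 * 1024


@dataclass(frozen=True)
class BlockGroup:
    profile: str    # "RAID1" or "RAID5"
    num_disks: int  # number of devices the block group is spread over

    def full_stripe_data_bytes(self) -> int:
        # RAID5 spends one device's element per stripe on parity.
        if self.profile == "RAID5":
            return (self.num_disks - 1) * STRIPE_ELEMENT
        # RAID1 has no parity stripe, so there is no minimum write length.
        return 0


# The example layout from the quoted text: 10 disks, four block groups.
# The RAID1 device count of 2 is an assumption.
BLOCK_GROUPS = [
    BlockGroup("RAID1", 2),
    BlockGroup("RAID5", 4),
    BlockGroup("RAID5", 8),
    BlockGroup("RAID5", 10),
]


def pick_block_group(write_len: int) -> BlockGroup:
    """Return the widest RAID5 BG whose full stripe the write can fill,
    falling back to the RAID1 BG for writes shorter than any full stripe."""
    candidates = [
        bg for bg in BLOCK_GROUPS
        if bg.profile == "RAID5" and write_len >= bg.full_stripe_data_bytes()
    ]
    if not candidates:
        return next(bg for bg in BLOCK_GROUPS if bg.profile == "RAID1")
    return max(candidates, key=lambda bg: bg.num_disks)
```

With these assumed sizes, a 4 KiB write lands in the RAID1 BG, while a write
of at least 9 * 64 KiB can fill a full stripe of the 10-disk RAID5 BG. The
sketch ignores any tail that does not fill a whole stripe, and it does not
model the garbage collector that would later move data to the widest BG.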
