From: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>,
	Nikolay Borisov <nborisov@suse.com>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: [RFC ONLY 0/8] btrfs: introduce raid-stripe-tree
Date: Tue, 17 May 2022 07:41:32 +0000
Message-ID: <PH0PR04MB7416D75883472A9B4E33BF1F9BCE9@PH0PR04MB7416.namprd04.prod.outlook.com>
In-Reply-To: 27870f8d-0b6d-f2a4-2b69-c2d001dd0855@gmx.com

On 17/05/2022 09:32, Qu Wenruo wrote:
> 
> 
> On 2022/5/17 15:23, Nikolay Borisov wrote:
>>
>>
>> On 16.05.22 17:31, Johannes Thumshirn wrote:
>>> Introduce a raid-stripe-tree to record writes in a RAID environment.
>>>
>>> In essence this adds another address translation layer between the logical
>>> and the physical addresses in btrfs and is designed to close two gaps. The
>>> first is the ominous RAID-write-hole we suffer from with RAID5/6 and the
>>> second one is the inability of doing RAID with zoned block devices due to the
>>> constraints we have with REQ_OP_ZONE_APPEND writes.
>>>
>>> This is an RFC/PoC only which just shows what the code will look like for a
>>> zoned RAID1. Its sole purpose is to facilitate design reviews and it is not
>>> intended to be merged yet, or, if merged, to be used on an actual
>>> file-system.
>>>
>>> Johannes Thumshirn (8):
>>>    btrfs: add raid stripe tree definitions
>>>    btrfs: move btrfs_io_context to volumes.h
>>>    btrfs: read raid-stripe-tree from disk
>>>    btrfs: add boilerplate code to insert raid extent
>>>    btrfs: add code to delete raid extent
>>>    btrfs: add code to read raid extent
>>>    btrfs: zoned: allow zoned RAID1
>>>    btrfs: add raid stripe tree pretty printer
>>>
>>>   fs/btrfs/Makefile               |   2 +-
>>>   fs/btrfs/ctree.c                |   1 +
>>>   fs/btrfs/ctree.h                |  29 ++++
>>>   fs/btrfs/disk-io.c              |  12 ++
>>>   fs/btrfs/extent-tree.c          |   9 ++
>>>   fs/btrfs/file.c                 |   1 -
>>>   fs/btrfs/print-tree.c           |  21 +++
>>>   fs/btrfs/raid-stripe-tree.c     | 251 ++++++++++++++++++++++++++++++++
>>>   fs/btrfs/raid-stripe-tree.h     |  39 +++++
>>>   fs/btrfs/volumes.c              |  44 +++++-
>>>   fs/btrfs/volumes.h              |  93 ++++++------
>>>   fs/btrfs/zoned.c                |  39 +++++
>>>   include/uapi/linux/btrfs.h      |   1 +
>>>   include/uapi/linux/btrfs_tree.h |  17 +++
>>>   14 files changed, 509 insertions(+), 50 deletions(-)
>>>   create mode 100644 fs/btrfs/raid-stripe-tree.c
>>>   create mode 100644 fs/btrfs/raid-stripe-tree.h
>>>
>>
>>
>> So if we choose to go with raid stripe tree this means we won't need the
>> raid56j code that Qu is working on? So it's important that these two
>> work streams are synced so we don't duplicate effort, right?
> 
> I believe the stripe tree is going to change the definition of RAID56.
> 
> It's no longer strict RAID56, as it doesn't contain the fixed device
> rotation, thus it's kinda between RAID4 and RAID5.

Well, I think it can still contain the device rotation. The stripe tree only
records the on-disk location of each sub-stripe after it has been written.
The data placement itself doesn't get changed at all. But for this to work,
there's still a lot to do. There are also other plans I have. IIUC btrfs raid56
uses all available drives in a raid set, while raid1,10,0 etc. permute the
drives the data is placed on. That is a way better solution IMHO, as it reduces
rebuild stress in case we need to do a rebuild. Given we have two-digit TB
drives these days, rebuilds do a lot of IO, which can cause more drives to fail
while rebuilding.
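
To illustrate the first point (the stripe tree only records where each
sub-stripe ended up after the write), here is a minimal sketch in Python.
It is a toy model, not the code or on-disk format from the patch series:
the names Stride, ZonedDevice, StripeTree, raid1_write and lookup are all
hypothetical, and the zone append is simulated by a per-device write
pointer that decides the physical offset at write time.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Stride:
    """One copy of the data: which device, and where on it (hypothetical)."""
    devid: int
    physical: int


class ZonedDevice:
    """Toy zoned device: the device, not the caller, picks the write offset."""

    def __init__(self, devid, write_pointer=0):
        self.devid = devid
        self.write_pointer = write_pointer

    def zone_append(self, length):
        # The physical address is only known once the append has completed.
        physical = self.write_pointer
        self.write_pointer += length
        return physical


@dataclass
class StripeTree:
    """In-memory stand-in for a logical -> physical translation layer."""
    extents: dict = field(default_factory=dict)

    def record_write(self, logical, length, strides):
        # Called after the write: store where each copy actually landed.
        self.extents[logical] = (length, tuple(strides))

    def lookup(self, logical):
        # Linear scan is enough for a sketch; a real tree would be indexed.
        for start, (length, strides) in self.extents.items():
            if start <= logical < start + length:
                offset = logical - start
                return [Stride(s.devid, s.physical + offset) for s in strides]
        raise KeyError(logical)


def raid1_write(tree, devices, logical, length):
    """Mirror one extent to every device, then record the resulting offsets."""
    strides = [Stride(d.devid, d.zone_append(length)) for d in devices]
    tree.record_write(logical, length, strides)
    return strides


dev1 = ZonedDevice(1, write_pointer=4096)
dev2 = ZonedDevice(2, write_pointer=65536)
tree = StripeTree()
raid1_write(tree, [dev1, dev2], logical=1048576, length=8192)
print(tree.lookup(1048576 + 4096))
```

In this model the two mirrors land at different physical offsets because
each device has its own write pointer, and the lookup still resolves one
logical address to both copies. Which devices receive the data is decided
before the write and is not changed by the recording step.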

> Personally speaking, I think both features can co-exist, especially the
> raid56 stripe tree may need extra development and review, since the
> extra translation layer is a completely different monster when it comes to
> RAID56.
> 
> Don't get me wrong, I like stripe-tree too, the only problem is it's
> just too new, thus we may want a backup plan.
> 

Exactly, as I already wrote to Nikolay, raid56j is for sure the simpler
solution and some users might even prefer it for this reason.

Byte,
	Johannes

Thread overview: 88+ messages
2022-05-16 14:31 [RFC ONLY 0/8] btrfs: introduce raid-stripe-tree Johannes Thumshirn
2022-05-16 14:31 ` [RFC ONLY 1/8] btrfs: add raid stripe tree definitions Johannes Thumshirn
2022-05-17  7:39   ` Qu Wenruo
2022-05-17  7:45     ` Johannes Thumshirn
2022-05-17  7:56       ` Qu Wenruo
2022-05-16 14:31 ` [RFC ONLY 2/8] btrfs: move btrfs_io_context to volumes.h Johannes Thumshirn
2022-05-17  7:42   ` Qu Wenruo
2022-05-17  7:51     ` Johannes Thumshirn
2022-05-17  7:58       ` Qu Wenruo
2022-05-17  8:01         ` Johannes Thumshirn
2022-05-16 14:31 ` [RFC ONLY 3/8] btrfs: read raid-stripe-tree from disk Johannes Thumshirn
2022-05-17  8:09   ` Qu Wenruo
2022-05-17  8:13     ` Johannes Thumshirn
2022-05-17  8:28       ` Qu Wenruo
2022-05-18 11:29         ` Johannes Thumshirn
2022-05-19  8:36           ` Qu Wenruo
2022-05-19  8:39             ` Johannes Thumshirn
2022-05-19 10:37               ` Qu Wenruo
2022-05-19 11:44                 ` Johannes Thumshirn
2022-05-19 11:48                   ` Qu Wenruo
2022-05-19 11:53                     ` Johannes Thumshirn
2022-05-19 13:26                       ` Qu Wenruo
2022-05-19 13:49                         ` Johannes Thumshirn
2022-05-19 22:56                           ` Qu Wenruo
2022-05-20  8:27                             ` Johannes Thumshirn
2022-05-16 14:31 ` [RFC ONLY 4/8] btrfs: add boilerplate code to insert raid extent Johannes Thumshirn
2022-05-17  7:53   ` Qu Wenruo
2022-05-17  8:00   ` Qu Wenruo
2022-05-17  8:05     ` Johannes Thumshirn
2022-05-17  8:09       ` Qu Wenruo
2022-05-16 14:31 ` [RFC ONLY 5/8] btrfs: add code to delete " Johannes Thumshirn
2022-05-17  8:06   ` Qu Wenruo
2022-05-17  8:10     ` Johannes Thumshirn
2022-05-17  8:14       ` Qu Wenruo
2022-05-17  8:20         ` Johannes Thumshirn
2022-05-17  8:31           ` Qu Wenruo
2022-05-16 14:31 ` [RFC ONLY 6/8] btrfs: add code to read " Johannes Thumshirn
2022-05-16 14:55   ` Josef Bacik
2022-05-16 14:31 ` [RFC ONLY 7/8] btrfs: zoned: allow zoned RAID1 Johannes Thumshirn
2022-05-16 14:31 ` [RFC ONLY 8/8] btrfs: add raid stripe tree pretty printer Johannes Thumshirn
2022-05-16 14:58 ` [RFC ONLY 0/8] btrfs: introduce raid-stripe-tree Josef Bacik
2022-05-16 15:04   ` Johannes Thumshirn
2022-05-16 15:10     ` Josef Bacik
2022-05-16 15:47       ` Johannes Thumshirn
2022-05-17  7:23 ` Nikolay Borisov
2022-05-17  7:31   ` Qu Wenruo
2022-05-17  7:41     ` Johannes Thumshirn [this message]
2022-05-17  7:32   ` Johannes Thumshirn
2022-07-13 10:54 ` RAID56 discussion related to RST. (Was "Re: [RFC ONLY 0/8] btrfs: introduce raid-stripe-tree") Qu Wenruo
2022-07-13 11:43   ` Johannes Thumshirn
2022-07-13 12:01     ` Qu Wenruo
2022-07-13 12:42       ` Johannes Thumshirn
2022-07-13 13:47         ` Qu Wenruo
2022-07-13 14:01           ` Johannes Thumshirn
2022-07-13 15:24             ` Lukas Straub
2022-07-13 15:28               ` Johannes Thumshirn
2022-07-14  1:08             ` Qu Wenruo
2022-07-14  7:08               ` Johannes Thumshirn
2022-07-14  7:32                 ` Qu Wenruo
2022-07-14  7:46                   ` Johannes Thumshirn
2022-07-14  7:53                     ` Qu Wenruo
2022-07-15 17:54                     ` Goffredo Baroncelli
2022-07-15 19:08                       ` Thiago Ramon
2022-07-16  0:34                         ` Qu Wenruo
2022-07-16 11:11                           ` Qu Wenruo
2022-07-16 13:52                             ` Thiago Ramon
2022-07-16 14:26                               ` Goffredo Baroncelli
2022-07-17 17:58                                 ` Goffredo Baroncelli
2022-07-17  0:30                               ` Qu Wenruo
2022-07-17 15:18                                 ` Thiago Ramon
2022-07-17 22:01                                   ` Qu Wenruo
2022-07-17 23:00                           ` Zygo Blaxell
2022-07-18  1:04                             ` Qu Wenruo
2022-07-15 20:14                       ` Chris Murphy
2022-07-18  7:33                         ` Johannes Thumshirn
2022-07-18  8:03                           ` Qu Wenruo
2022-07-18 21:49                         ` Forza
2022-07-19  1:19                           ` Qu Wenruo
2022-07-21 14:51                             ` Forza
2022-07-24 11:27                               ` Qu Wenruo
2022-07-25  0:00                             ` Zygo Blaxell
2022-07-25  0:25                               ` Qu Wenruo
2022-07-25  5:41                                 ` Zygo Blaxell
2022-07-25  7:49                                   ` Qu Wenruo
2022-07-25 19:58                               ` Goffredo Baroncelli
2022-07-25 21:29                                 ` Qu Wenruo
2022-07-18  7:30                       ` Johannes Thumshirn
2022-07-19 18:58                         ` Goffredo Baroncelli
