linux-btrfs.vger.kernel.org archive mirror
From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: Eric Wong <e@80x24.org>
Cc: kreijack@inwind.it, linux-btrfs@vger.kernel.org
Subject: Re: adding new devices to degraded raid1
Date: Sat, 29 Aug 2020 14:46:10 -0400
Message-ID: <20200829184610.GW5890@hungrycats.org>
In-Reply-To: <20200829004240.GA32462@dcvr>

On Sat, Aug 29, 2020 at 12:42:40AM +0000, Eric Wong wrote:
> Zygo Blaxell <ce3g8jdj@umail.furryterror.org> wrote:
> > Remove makes a copy of every extent, updates every reference to the
> > extent, then deletes the original extents.  Very seek-heavy--including
> > seeks between reads and writes on the same drive--and the work is roughly
> > proportional to the number of reflinks, so dedupe and snapshots push
> > the cost up.  About the only advantage of remove (and balance) is that
> > it consists of 95% existing btrfs read and write code, and it can handle
> > any relocation that does not require changing the size or content of an
> > extent (including all possible conversions).
> 
> Does that mean remove speed would be closer to replace on good SSDs?

It will be better, but there is still a cost for reading and writing
non-contiguously.  "Good SSD" depends on what the SSD is good at.
An SSD rated for NAS or caching use would be OK, but a high-performance
desktop SSD could hit big write-multiplication penalties.  A couple of
brand names starting with "S" have 5-second IO stalls when their internal
caches get full.  The ratio between the best and worst IO latency in
these SSD models is as bad as it is on SMR drives.  There are also CPU
and IO latency costs for 'remove' on the host that don't go away no
matter how good the disks are.

> > Arguably this isn't necessary.  Remove could copy a complete block group,
> > the same way replace does but to a different offset on each drive, and
> > simply update the chunk tree with the new location of the block group
> > at the end.  Trouble is, nobody's implemented this approach in btrfs yet.
> > It would be a whole new code path with its very own new bugs to fix.
> 
> Ah, it seems like a ton of work for a use case that mainly
> affects hobbyists.  I won't hold my breath for it.

Well, by that argument, mdadm and lvm shouldn't be able to do it either,
and yet they have supported this style of reshape for years.
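
To make the difference between the two approaches above concrete, here
is a toy cost model.  It is only a sketch under loud assumptions: the
names (Extent, remove_style_cost, block_group_copy_cost) are invented
for illustration, nothing here is btrfs code, and it counts abstract
operations rather than real seeks or bytes.  It assumes extent-by-extent
relocation costs one copy per extent plus one metadata update per
reference, and that a whole-block-group copy (which btrfs does not
implement for remove) would cost one sequential copy plus one chunk tree
update per block group.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Extent:
    size: int  # bytes of data in the extent
    refs: int  # references to it (reflinks, snapshots, dedupe)


# A block group is modelled as a plain list of extents (an assumption).
BlockGroup = List[Extent]


def remove_style_cost(block_groups: List[BlockGroup]) -> Dict[str, int]:
    """Extent-by-extent relocation: copy each extent, then update
    every reference that points at it."""
    copy_ops = sum(len(bg) for bg in block_groups)
    metadata_updates = sum(e.refs for bg in block_groups for e in bg)
    return {"copy_ops": copy_ops, "metadata_updates": metadata_updates}


def block_group_copy_cost(block_groups: List[BlockGroup]) -> Dict[str, int]:
    """Hypothetical whole-block-group copy: one sequential copy and one
    chunk tree update per block group, independent of reference counts."""
    n = len(block_groups)
    return {"copy_ops": n, "metadata_updates": n}


# Two block groups; one extent is heavily shared by snapshots or dedupe.
example = [
    [Extent(size=4096, refs=1), Extent(size=128 * 1024, refs=50)],
    [Extent(size=1 << 20, refs=3)],
]

print("remove-style:    ", remove_style_cost(example))
print("block-group copy:", block_group_copy_cost(example))
```

In this model the remove-style metadata cost grows with the number of
references, while the block-group copy cost depends only on the number
of block groups.  The bytes moved are the same either way; the model
says nothing about seek patterns or SSD behaviour.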

