From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: Filipe Manana <fdmanana@kernel.org>
Cc: dsterba@suse.cz, linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH 3/4] Btrfs: check if destination root is read-only for deduplication
Date: Thu, 21 Feb 2019 11:54:50 -0500
Message-ID: <20190221165326.GH9995@hungrycats.org>
In-Reply-To: <CAL3q7H455MCpj_-5U=zUzD3Of7aLtG6R_tHdGhZ4Kj-393sWAg@mail.gmail.com>

On Wed, Feb 20, 2019 at 04:54:09PM +0000, Filipe Manana wrote:
> On Wed, Feb 20, 2019 at 4:42 PM Zygo Blaxell
> <ce3g8jdj@umail.furryterror.org> wrote:
> >
> > On Thu, Jan 31, 2019 at 04:39:22PM +0000, Filipe Manana wrote:
> > > On Thu, Dec 13, 2018 at 4:08 PM David Sterba <dsterba@suse.cz> wrote:
> > > >
> > > > On Wed, Dec 12, 2018 at 06:05:58PM +0000, fdmanana@kernel.org wrote:
> > > > > From: Filipe Manana <fdmanana@suse.com>
> > > > >
> > > > > Checking if the destination root is read-only was being performed only for
> > > > > clone operations. Make deduplication check it as well, as it does not make
> > > > > sense to not do it, even if it is an operation that does not change the
> > > > > file contents (such as defrag for example, which checks first if the root
> > > > > is read-only).
> > > >
> > > > And this is also a change in user-visible behaviour of dedupe, so this
> > > > needs to be verified to make sure it's not breaking existing tools.
> > >
> > > Have you had the chance to do such verification?
> > >
> > > This actually conflicts with send. Send does not expect a root/tree to
> > > change, and with dedupe on read-only roots happening
> > > in parallel with send is going to cause all sorts of unexpected and
> > > undesired problems...
> >
> > This is a problem bees ran into.  There is a workaround in bees (called
> > --workaround-btrfs-send) that avoids RO subvols as dedupe targets.
> > As the name of the option implies, it works around problems in btrfs send.
> >
> > This kernel change makes the workaround mandatory now, as the default
> > case (without workaround) will fail on every RO subvol even if that
> > behavior is desired by the user.  That breaks an important use case on
> > the receiving side of sends--to dedupe the received subvols together
> > while also protecting them against modification on the target system
> > with the RO flag--and preserving that use case is why the send workaround
> > was optional (and not default) in bees.
> >
> > bees also won't handle the RO/RW/RO transition correctly, as it didn't
> > seem like a sane thing to support at the time.  That is arguably something
> > to be fixed in bees.
> >
> > > This is a problem introduced by dedupe ioctl when it landed, since
> > > send existed for a longer time (when nothing else was
> > > allowed to change read-only roots, including defrag).
> >
> > Is there a reason why incremental send can't simply be fixed?
> 
> This is a problem that affects both incremental and non-incremental (full) send.
> 
> > As far
> > as I can tell, send is failing because of a runtime check that seems to
> > be too strict; however, I haven't tried removing that check to see if
> > it fixes the problem in send, or just hides the next problem.
> 
> The problem is send was designed with the idea that read-only roots
> don't ever change.

Wait...don't ever change, or don't change while send is running?  If the
only thing we have to do is prevent concurrent send and dedupe, then that
might be much easier--and much less intrusive than breaking dedupe of
all RO subvols.

How does concurrent send and balance work now?  Maybe dedupe can use the
same approach.

> The failures that can happen are many and unpredictable, from
> occasionally failing with some error, to invalid memory accesses, use
> after free problems, etc.
> Essentially all caused by races when the nodes/leafs from the
> read-only tree change while send is running.
> 
> I don't know what runtime check you are mentioning that is too strict.
> You can definitely do dedupe on a read-only root and after it finishes
> do a send (either full or incremental), and it will work.

OK, after rereading https://github.com/Zygo/bees/issues/79 it's not
clear to me whether the problem reported was "send fails while bees is
running" or "send fails if bees is run between two incremental sends."
I had concluded it was the latter, which would definitely be a send bug,
but I can't find any statement to that effect from the bug reporter.

If it's the former, then all we need is some way to prevent dedupe while
sending on the same subvol.  Maybe send could set a flag (rwlock?) on the
subvol data structure, and dedupe can check the flag, with this behavior:

	- if dedupe is running while send starts, block the send until
	the dedupe is finished

	- if send is running when dedupe starts, reject the dedupe
	with...EBUSY?

Userspace dedupe could look for EBUSY and reschedule dedupe of the RO
subvol later.
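
A minimal userspace sketch of that rescheduling, for illustration only.
FIDEDUPERANGE reports a per-destination status (FILE_DEDUPE_RANGE_SAME,
FILE_DEDUPE_RANGE_DIFFERS, or a negative errno).  The -EBUSY case assumes
the behavior proposed above, and the enum and function names are made up
for the example:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <linux/fs.h>	/* FILE_DEDUPE_RANGE_SAME, FILE_DEDUPE_RANGE_DIFFERS */

/* Hypothetical: what a userspace deduper does with one destination's result. */
enum dedupe_action {
	DEDUPE_DONE,		/* extents are now shared */
	DEDUPE_SKIP,		/* data differs, nothing to dedupe */
	DEDUPE_RETRY_LATER,	/* ASSUMED: send is running on the target subvol */
	DEDUPE_FAILED		/* any other error */
};

/* Map the per-destination status from FIDEDUPERANGE to an action. */
static enum dedupe_action classify_dedupe_status(int status)
{
	if (status == FILE_DEDUPE_RANGE_SAME)
		return DEDUPE_DONE;
	if (status == FILE_DEDUPE_RANGE_DIFFERS)
		return DEDUPE_SKIP;
	if (status == -EBUSY)
		return DEDUPE_RETRY_LATER;
	return DEDUPE_FAILED;
}

/* Count how many destinations should be put back on the work queue. */
static size_t count_retries(const int *statuses, size_t n)
{
	size_t i, retries = 0;

	assert(statuses != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (classify_dedupe_status(statuses[i]) == DEDUPE_RETRY_LATER)
			retries++;
	return retries;
}
```

A deduper would re-queue the DEDUPE_RETRY_LATER destinations and move on
to other work instead of treating them as permanent failures.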

Dedupe could block until the send is finished (similar to the way
that subvol deletes block until balance is finished), but I'd prefer to
have dedupe return immediately with an error so userspace can schedule
something different to dedupe (similar to the way that multiple concurrent
balances exclude each other).
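
To make the intended semantics concrete, here is a rough userspace model
of the exclusion using pthreads.  It is not kernel code: struct
subvol_excl and all function names are hypothetical, and a real
implementation would hang this state off the subvol's in-kernel data
structure.  One assumption beyond what is described above: send marks
itself as started before it waits, so new dedupes are rejected while the
in-flight ones drain.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical per-subvol exclusion state (userspace model only). */
struct subvol_excl {
	pthread_mutex_t lock;
	pthread_cond_t dedupe_idle;	/* signalled when dedupes reach zero */
	int dedupes_in_flight;
	int sends_in_progress;
};

#define SUBVOL_EXCL_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0 }

/* Dedupe never waits: it is rejected with -EBUSY if a send is active. */
static int dedupe_begin(struct subvol_excl *s)
{
	int ret = 0;

	pthread_mutex_lock(&s->lock);
	if (s->sends_in_progress > 0)
		ret = -EBUSY;
	else
		s->dedupes_in_flight++;
	pthread_mutex_unlock(&s->lock);
	return ret;
}

static void dedupe_end(struct subvol_excl *s)
{
	pthread_mutex_lock(&s->lock);
	assert(s->dedupes_in_flight > 0);
	if (--s->dedupes_in_flight == 0)
		pthread_cond_broadcast(&s->dedupe_idle);
	pthread_mutex_unlock(&s->lock);
}

/* Send blocks until every dedupe that started before it has finished. */
static void send_begin(struct subvol_excl *s)
{
	pthread_mutex_lock(&s->lock);
	s->sends_in_progress++;		/* new dedupes now get -EBUSY */
	while (s->dedupes_in_flight > 0)
		pthread_cond_wait(&s->dedupe_idle, &s->lock);
	pthread_mutex_unlock(&s->lock);
}

static void send_end(struct subvol_excl *s)
{
	pthread_mutex_lock(&s->lock);
	assert(s->sends_in_progress > 0);
	s->sends_in_progress--;
	pthread_mutex_unlock(&s->lock);
}
```

A caller would bracket the dedupe with dedupe_begin()/dedupe_end() and
the send with send_begin()/send_end().  Only send ever sleeps, which
matches the preference for having dedupe fail fast.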


> >
> > More details at:
> >
> >         https://github.com/Zygo/bees/issues/79#issuecomment-429039036
> >
> > > I understand it can break some applications, but adding other solution
> > > such as preventing send and dedupe from running in parallel
> > > (erroring out or block and wait for each other, etc) is going to be
> > > really ugly. There's always the workaround for apps to set the
> > > subvolume
> > > to RW mode, do the dedupe, then switch it back to RO mode.
> > >
> > > Thanks.
