Linux-BTRFS Archive on lore.kernel.org
From: David Sterba <dsterba@suse.cz>
To: Filipe Manana <fdmanana@kernel.org>
Cc: dsterba@suse.cz, linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH v2] Btrfs: fix file corruption after snapshotting due to mix of buffered/DIO writes
Date: Tue, 12 Mar 2019 18:13:43 +0100
Message-ID: <20190312171342.GZ31119@twin.jikos.cz>
In-Reply-To: <CAL3q7H4eJr5ymDgYFa0DTxPif=7yrOhqMFhvsByijbKjM+=UUg@mail.gmail.com>

On Wed, Feb 27, 2019 at 06:56:10PM +0000, Filipe Manana wrote:
> > > What do you expect by falling back to writeback_inodes_sb()?
> > > It all ends up going through the same btrfs writeback path.
> > > And as I left it, if an error happens for one root, it still tries to
> > > flush writeback for all the remaining roots as well, so I don't get
> > > why you fall back to writeback_inodes_sb().
> >
> > As I read the changelog, you say that the corruption does not happen
> > with FLUSHONCOMMIT, which does writeback_inodes_sb. Using that would be
> > too heavy, thus you only start the delalloc on snapshotted roots.
> 
> It does not happen with flushoncommit because delalloc is flushed for all roots.
> If flushoncommit (writeback_inodes_sb()) flushed only the roots
> that are about to be snapshotted, the corruption wouldn't happen
> either.
> 
> >
> > So the idea is to use the same logic of flushoncommit in the unlikely
> > case when btrfs_start_delalloc_snapshot would fail. Only at some
> > performance cost, unless I'm missing something.
> 
> Ok, so I hope the first paragraph above explains why there's no need
> to fall back to the "flush all delalloc for all roots" logic
> (writeback_inodes_sb()).

Yeah, I think I get it now.

> > As for the v2 as you implement it without any error handling, doesn't
> > this allow the corruption to happen? If start_delalloc_inodes has a lot
> > of inodes for which it needs to allocate delalloc_work, the failure is
> > possible. That the list_for_each continues does not affect that
> > particular root.
> 
> Yes, it allows for the corruption to happen, that's why I had the
> error returned in v1.
> 
> writeback_inodes_sb() isn't special in any way - flushing some
> delalloc range can fail the same way, it ends up calling the same
> btrfs writepages() callback. Yes, it doesn't return any error, because
> it's relying on callers either not caring about it, or if they do,
> checking the inode's mapping for an error, which btrfs sets through
> its writepages() callback if any error happens (by calling
> mapping_set_error()), or any other fs specific way of
> storing/checking for writeback errors.

Ok, so the error handling needs to stay and there's no simple way
around it.

> If we get an error when flushing the delalloc range from the example in
> the changelog, then the corruption happens, regardless of writeback
> being triggered by writeback_inodes_sb() or
> btrfs_start_delalloc_snapshot().

Thanks for the time discussing that. I'll use code from v1 and the
subject from v2 and add the patch to the queue.
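
The error-handling point settled above can be sketched in user-space C.
This is a minimal model, not the kernel code and not the patch: every
model_* name is hypothetical, and returning as soon as one root fails is
an assumption about how an error return from the snapshot flush would be
used. It shows that both flush strategies go through one writeback path,
that a void "flush everything" helper leaves the error only in the
per-inode mapping state, and that a helper which returns the error lets
the caller bail out before the snapshot is created.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode and its mapping error state. */
struct model_inode {
    int fail_writeback;  /* knob for the model: nonzero makes writeback fail */
    int mapping_error;   /* recorded on failure, in the spirit of mapping_set_error() */
};

struct model_root {
    struct model_inode *inodes;
    size_t nr_inodes;
    int pending_snapshot; /* nonzero if this root is about to be snapshotted */
};

/* Shared writeback path: both strategies below end up here. */
int model_writepages(struct model_inode *inode)
{
    if (inode->fail_writeback) {
        inode->mapping_error = -EIO;
        return -EIO;
    }
    return 0;
}

/* Flush one root and report the first error to the caller. */
int model_start_delalloc_snapshot(struct model_root *root)
{
    int ret = 0;

    for (size_t i = 0; i < root->nr_inodes; i++) {
        int err = model_writepages(&root->inodes[i]);

        if (err && !ret)
            ret = err;
    }
    return ret;
}

/* Flush only roots with a pending snapshot; stop at the first failure. */
int model_flush_pending_snapshots(struct model_root *roots, size_t nr_roots)
{
    assert(roots != NULL || nr_roots == 0);

    for (size_t i = 0; i < nr_roots; i++) {
        int ret;

        if (!roots[i].pending_snapshot)
            continue;
        ret = model_start_delalloc_snapshot(&roots[i]);
        if (ret)
            return ret; /* assumed: caller aborts snapshot creation */
    }
    return 0;
}

/* Flush every root and return nothing; errors stay in mapping_error. */
void model_writeback_all(struct model_root *roots, size_t nr_roots)
{
    for (size_t i = 0; i < nr_roots; i++)
        for (size_t j = 0; j < roots[i].nr_inodes; j++)
            (void)model_writepages(&roots[i].inodes[j]);
}
```

In this model a caller of model_writeback_all() only learns about a
failure by inspecting mapping_error afterwards, while
model_flush_pending_snapshots() hands the error back directly.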

Thread overview: 12+ messages
2019-02-04 14:28 [PATCH] Btrfs: fix file corruption after snapshotting fdmanana
2019-02-18 17:11 ` David Sterba
2019-02-18 17:27   ` Filipe Manana
2019-02-27 12:47     ` David Sterba
2019-02-27 13:42       ` Filipe Manana
2019-02-27 17:19         ` David Sterba
2019-02-27 13:42 ` [PATCH v2] Btrfs: fix file corruption after snapshotting due to mix of buffered/DIO writes fdmanana
2019-02-27 17:26   ` David Sterba
2019-02-27 17:32     ` Filipe Manana
2019-02-27 18:31       ` David Sterba
2019-02-27 18:56         ` Filipe Manana
2019-03-12 17:13           ` David Sterba [this message]
