All of lore.kernel.org
 help / color / mirror / Atom feed
From: Boris Burkov <boris@bur.io>
To: Christoph Hellwig <hch@infradead.org>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH v5 3/5] btrfs: return ordered_extent splits from bio extraction
Date: Thu, 23 Mar 2023 10:00:06 -0700	[thread overview]
Message-ID: <20230323170006.GA28317@zen> (raw)
In-Reply-To: <20230323161529.GA8070@zen>

On Thu, Mar 23, 2023 at 09:15:29AM -0700, Boris Burkov wrote:
> On Thu, Mar 23, 2023 at 01:47:28AM -0700, Christoph Hellwig wrote:
> > This is a bit of a mess.  And the root cause of that is that
> > btrfs_extract_ordered_extent the way it is used right now does
> > the wrong thing in terms of splitting the ordered_extent.  What
> > we want is to allocate a new one for the beginning of the range,
> > and leave the rest alone.
> > 
> > I did run into this a while ago during my (nt yet submitted) work
> > to keep an ordered_extent pointer in the btrfs_bio, and I have some
> > patches to sort it out.
> > 
> > I've rebased your fix on top of those, can you check if this tree
> > makes sense to you;
> > 
> >    http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/btrfs-dio-fix-hch
> > 
> > it passes basic testing so far.
> 
> Nice, this is great!
> 
> I actually also made the same changes as in your branch while working on
> my fix, but didn't know enough about the zoned use case to realize that
> the simpler "extract from beginning" constraint also applied to the
> zoned case. So what happened in my branch was I implemented the three
> way split as two "split at starts" which ultimately felt too messy and I
> opted for returning the new split objects from the the existing model.
> 
> If it's true that we can always do a "split from front" then I'm all
> aboard and think this is the way forward. Given that I found what I
> think is a serious bug in the case where pre>0, I suspect you are right,
> and we aren't hitting that case.
> 
> I will check that this passes my testing for the various dio cases (I
> have one modified xfstests case I haven't sent yet for the meanest
> version of the deadlock I have come up with so far) and the other tests
> that I saw races/flakiness on, but from a quick look, your branch looks
> correct to me. I believe the most non-obvious property my fix relies on
> is dio_data->ordered having the leftovers from the partial after
> submission so that it can be cancelled, which your branch looks to
> maintain.

Your branch as-is does not pass the existing tests, It's missing a fix
from my V5. We need to avoid splitting partial OEs when doing NOCOW dio
writes, because iomap_begin() does not create a fresh pinned em in that
case, since it reuses the existing extent.

e.g.,

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 8cb61f4daec0..bbc89a0872e7 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7719,7 +7719,7 @@ static void btrfs_dio_submit_io(const struct iomap_iter *iter, struct bio *bio,
         * cancelled in iomap_end to avoid a deadlock wherein faulting the
         * remaining pages is blocked on the outstanding ordered extent.
         */
-       if (iter->flags & IOMAP_WRITE) {
+       if (iter->flags & IOMAP_WRITE && !test_bit(BTRFS_ORDERED_NOCOW, &dio_data->ordered->flags)) {
                int err;

                err = btrfs_extract_ordered_extent(bbio, dio_data->ordered);

With that patch, I pass 10x of btrfs/250, so running the full suite next.

> 
> Assuming the tests pass, I do want to get this in sooner than later,
> since downstream is still waiting on a fix. Would you be willing to send
> your stack soon for my fix to land atop? I don't mind if you just send a
> patch series with my patches mixed in, either. If, OTOH, your patches
> are still a while out, or depend on something else that's underway,
> maybe we could land mine, then gut them for your improvements. I'm fine
> with it either way.
> 
> Thanks,
> Boris

  reply	other threads:[~2023-03-23 17:00 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-22 19:11 [PATCH v5 0/5] btrfs: fix corruption caused by partial dio writes Boris Burkov
2023-03-22 19:11 ` [PATCH v5 1/5] btrfs: add function to create and return an ordered extent Boris Burkov
2023-03-22 19:11 ` [PATCH v5 2/5] btrfs: stash ordered extent in dio_data during iomap dio Boris Burkov
2023-03-22 19:11 ` [PATCH v5 3/5] btrfs: return ordered_extent splits from bio extraction Boris Burkov
2023-03-23  8:47   ` Christoph Hellwig
2023-03-23 16:15     ` Boris Burkov
2023-03-23 17:00       ` Boris Burkov [this message]
2023-03-23 17:45         ` Boris Burkov
2023-03-23 21:29         ` Christoph Hellwig
2023-03-23 22:43           ` Boris Burkov
2023-03-24  0:24             ` Christoph Hellwig
2023-03-22 19:11 ` [PATCH v5 4/5] btrfs: fix crash with non-zero pre in btrfs_split_ordered_extent Boris Burkov
2023-03-23  8:36   ` Naohiro Aota
2023-03-23 16:22     ` Boris Burkov
2023-03-22 19:11 ` [PATCH v5 5/5] btrfs: split partial dio bios before submit Boris Burkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230323170006.GA28317@zen \
    --to=boris@bur.io \
    --cc=hch@infradead.org \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.