From: Kent Overstreet <kent.overstreet@gmail.com>
To: Eric Wheeler <bcachefs@lists.ewheeler.net>
Cc: Demi Marie Obenour <demi@invisiblethingslab.com>,
	linux-bcachefs@vger.kernel.org
Subject: Re: bcachefs loop devs (was: Comparison to ZFS and BTRFS)
Date: Mon, 18 Apr 2022 21:41:40 -0400
Message-ID: <20220419014140.5jz4hahhkfksulce@moria.home.lan>
In-Reply-To: <1f3290c6-535a-a15f-c02f-325099ecc4e0@ewheeler.net>

On Mon, Apr 18, 2022 at 06:16:09PM -0700, Eric Wheeler wrote:
> On Fri, 15 Apr 2022, Kent Overstreet wrote:
> > On Wed, Apr 06, 2022 at 02:55:04AM -0400, Demi Marie Obenour wrote:
> > > - How does an O_DIRECT loop device on bcachefs compare to a zvol on ZFS?
> > 
> > I'd have to benchmark/profile it. It appears there are some bugs in the way the
> > loop driver in O_DIRECT mode interacts with bcachefs according to xfstests, and
> > the loopback driver is implemented in a more heavyweight way than it needs to be
> > - there's room for improvement.
> 
> Hi Kent, regarding loop devs:
> 
> We wrote this up before realizing that REQ_OP_FLUSH does not order writes 
> like REQ_FLUSH once did, so my premise for the email linked below was 
> incorrect---but perhaps the concept is relevant.
> 
> I wonder if something is going on between (1) filesystem above loop.c 
> (bcachefs in this case), (2) the block layer re-ordering, and (3) the 
> kiocb ki_complete callback in loop.c that could create out-of-order 
> journal commits in the filesystem above the loop device (eg, xfs from #1):
> 
> 	https://www.spinics.net/lists/linux-block/msg82730.html
> 
>   From loop.c in lo_rw_aio():
> 	[...]
> 	cmd->iocb.ki_pos = pos;
> 	cmd->iocb.ki_filp = file;
> 	cmd->iocb.ki_complete = lo_rw_aio_complete; 
> 	cmd->iocb.ki_flags = IOCB_DIRECT;
> 	cmd->iocb.ki_ioprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_NONE, 0);
> 
>   A more detailed loop.c call tree summary is here:
> 	https://lore.kernel.org/all/59a58637-837-fc28-6cb9-d584aa21d60@ewheeler.net/T/ 
> 
> If bcachefs immediately calls .ki_complete() after queueing the IO within 
> bcachefs but before it commits to bcachefs's disk, then loop.c will mark 
> the IO as complete (blk_mq_complete_request via lo_rw_aio_complete) too 
> soon after .write_iter is called, thus breaking the expected ordering in 
> the filesystem (eg, xfs) atop the loop device.

We don't call .ki_complete (in DIO mode) until the write has completed,
including the btree update - this is necessary for read-after-write consistency.
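
To make that ordering concrete, here is a minimal userspace sketch. It is an
illustration only, not bcachefs or loop.c code: every name in it (model_write,
model_complete, model_dio_write, model_run and the stage enum) is hypothetical,
and the three stages are a simplification assumed for the example. The only
point it shows is that the completion callback fires after the index (btree)
update, never before.

```c
#include <assert.h>

/* Hypothetical model; none of these names exist in bcachefs or loop.c. */
enum stage { QUEUED, DATA_WRITTEN, INDEX_UPDATED, COMPLETED };

struct model_write {
	enum stage stage;
	/* Stands in for the kiocb's ki_complete callback. */
	void (*complete)(struct model_write *w);
	/* Records which stage the write had reached when complete() fired. */
	enum stage completed_at;
};

/* Completion callback: only legal once the index update is done. */
static void model_complete(struct model_write *w)
{
	assert(w->stage == INDEX_UPDATED);
	w->completed_at = w->stage;
	w->stage = COMPLETED;
}

/* Models a direct-IO write: data first, then index, then completion. */
static void model_dio_write(struct model_write *w)
{
	w->stage = QUEUED;
	w->stage = DATA_WRITTEN;   /* data has reached the device */
	w->stage = INDEX_UPDATED;  /* index (btree) now points at the data */
	w->complete(w);            /* only now is the caller told "done" */
}

/* Runs one modelled write and reports the stage at which it completed. */
static enum stage model_run(void)
{
	struct model_write w = { .stage = QUEUED, .complete = model_complete };

	model_dio_write(&w);
	return w.completed_at;
}
```

Calling model_run() returns INDEX_UPDATED. A caller that is told the write is
done (loop.c, in this thread) can therefore read the data back right away,
which is the read-after-write property mentioned above.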

If your description of the loopback code is correct, that does sound suspicious
though - queuing every IO to a work item shouldn't hurt anything from a
correctness POV, but it definitely shouldn't be needed or wanted from a
performance POV.

What are you seeing?
