linux-fsdevel.vger.kernel.org archive mirror
From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>,
	linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	avi@scylladb.com
Subject: Re: [PATCH 09/10] iomap: add a IOMAP_DIO_NOALLOC flag
Date: Wed, 13 Jan 2021 10:32:15 -0500
Message-ID: <20210113153215.GA1284163@bfoster>
In-Reply-To: <20210112232923.GD331610@dread.disaster.area>

On Wed, Jan 13, 2021 at 10:29:23AM +1100, Dave Chinner wrote:
> On Tue, Jan 12, 2021 at 05:26:15PM +0100, Christoph Hellwig wrote:
> > Add a flag to request that the iomap instances do not allocate blocks
> > by translating it to another new IOMAP_NOALLOC flag.
> 
> Except that "no allocation" is not what XFS needs for concurrent
> sub-block DIO.
> 
> We are trying to avoid external sub-block IO outside the range of
> the user data IO (COW, sub-block zeroing, etc) so that we don't
> trash adjacent sub-block IO in flight. This means we can't do
> sub-block zeroing and that then means we can't map unwritten extents
> or allocate new extents for the sub-block IO.  It also means the IO
> range cannot span EOF because that triggers unconditional sub-block
> zeroing in iomap_dio_rw_actor().
> 
> And because we may have to map multiple extents to fully span an IO
> range, we have to guarantee that subsequent extents for the IO are
> also written otherwise we have a partial write abort case. Hence we
> have single extent limitations as well.
> 
> So "no allocation" really doesn't describe what we want this flag to
> do at all.
> 
> If we're going to use a flag for this specific functionality, let's
> call it what it is: IOMAP_DIO_UNALIGNED/IOMAP_UNALIGNED and do two
> things with it.
> 
> 	1. Make unaligned IO a formal part of the iomap_dio_rw()
> 	behaviour so it can do the common checks for things that
> 	need exclusive serialisation for unaligned IO (i.e. avoid IO
> 	spanning EOF, abort if there are cached pages over the
> 	range, etc).
> 
> 	2. Require the filesystem mapping callback to only allow
> 	unaligned IO into ranges that are contiguous and don't
> 	require mapping state changes or sub-block zeroing to be
> 	performed during the sub-block IO.
> 
> 

Something I hadn't thought about before is whether applications might
depend on current unaligned dio serialization for coherency and thus
break if the kernel suddenly allows concurrent unaligned dio to pass
through. Should this be something that is explicitly requested by
userspace?

That aside, I agree that the DIO_UNALIGNED approach seems a bit more
clear than NOALLOC, but TBH the more I look at this the more Christoph's
first approach seems cleanest to me. It is a bit unfortunate to
duplicate the mapping lookups and have the extra ILOCK cycle, but the
lock is shared and only taken when I/O is unaligned. I don't really see
why that is a show stopper yet it's acceptable to fall back to exclusive
dio if the target range happens to be discontiguous (but otherwise
mapped/written).
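
To make the shape of that trade-off concrete, here is a rough toy model
of the decision, assuming my reading of the first approach is right. All
of the names (pick_lock_mode(), range_lookup_fn, struct lookup_stub) are
made up for illustration and do not exist in the kernel; the lookup
callback stands in for the extra mapping lookup done under the shared
lock, and the stub just counts how often it is invoked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum dio_lock_mode { DIO_LOCK_SHARED, DIO_LOCK_EXCL };

/* Hypothetical callback: is [pos, pos + count) contiguous and written? */
typedef bool (*range_lookup_fn)(uint64_t pos, uint64_t count, void *ctx);

/* Illustrative stand-in for the mapping lookup; counts its invocations. */
struct lookup_stub {
	bool result;
	unsigned int calls;
};

static bool stub_lookup(uint64_t pos, uint64_t count, void *ctx)
{
	struct lookup_stub *stub = ctx;

	(void)pos;
	(void)count;
	stub->calls++;
	return stub->result;
}

/* True if both offset and length are multiples of the block size. */
static bool is_block_aligned(uint64_t pos, uint64_t count,
			     uint32_t block_size)
{
	/* Block size is assumed to be a non-zero power of two. */
	assert(block_size != 0 && (block_size & (block_size - 1)) == 0);
	return ((pos | count) & (uint64_t)(block_size - 1)) == 0;
}

/*
 * Aligned I/O never pays for the lookup. Unaligned I/O pays for one
 * lookup and falls back to exclusive mode only if the range is not
 * suitable for concurrent sub-block writes.
 */
static enum dio_lock_mode pick_lock_mode(uint64_t pos, uint64_t count,
					 uint32_t block_size,
					 range_lookup_fn lookup, void *ctx)
{
	if (is_block_aligned(pos, count, block_size))
		return DIO_LOCK_SHARED;
	if (lookup(pos, count, ctx))
		return DIO_LOCK_SHARED;
	return DIO_LOCK_EXCL;
}
```

The point of the sketch is only that the extra cost sits on the
unaligned path: an aligned 4096-byte write never calls the lookup, while
a 512-byte write at offset 512 calls it exactly once.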

So I dunno... to me, I would start with that approach and then as the
implementation soaks, perhaps see if we can find a way to optimize away
the extra cycle and lookup. In the meantime, performance should still be
improved significantly and the behavior fairly predictable. Anyways, I
suspect Dave disagrees so that's just my .02. ;) I'll let you guys find
some common ground and make a pass at whatever falls out...

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 


Thread overview: 25+ messages
2021-01-12 16:26 [RFC] another attempt to reduce sub-block DIO serialisation Christoph Hellwig
2021-01-12 16:26 ` [PATCH 01/10] xfs: factor out a xfs_ilock_iocb helper Christoph Hellwig
2021-01-12 22:41   ` Dave Chinner
2021-01-12 16:26 ` [PATCH 02/10] xfs: make xfs_file_aio_write_checks IOCB_NOWAIT-aware Christoph Hellwig
2021-01-12 22:42   ` Dave Chinner
2021-01-12 16:26 ` [PATCH 03/10] xfs: cleanup the read/write helper naming Christoph Hellwig
2021-01-12 22:43   ` Dave Chinner
2021-01-12 16:26 ` [PATCH 04/10] xfs: remove the buffered I/O fallback assert Christoph Hellwig
2021-01-12 22:44   ` Dave Chinner
2021-01-12 16:26 ` [PATCH 05/10] xfs: simplify the read/write tracepoints Christoph Hellwig
2021-01-12 22:54   ` Dave Chinner
2021-01-12 16:26 ` [PATCH 06/10] xfs: improve the reflink_bounce_dio_write tracepoint Christoph Hellwig
2021-01-12 22:56   ` Dave Chinner
2021-01-12 16:26 ` [PATCH 07/10] xfs: split unaligned DIO write code out Christoph Hellwig
2021-01-12 23:00   ` Dave Chinner
2021-01-12 16:26 ` [PATCH 08/10] iomap: pass a flags argument to iomap_dio_rw Christoph Hellwig
2021-01-12 16:26 ` [PATCH 09/10] iomap: add a IOMAP_DIO_NOALLOC flag Christoph Hellwig
2021-01-12 23:29   ` Dave Chinner
2021-01-13 15:32     ` Brian Foster [this message]
2021-01-13 22:49       ` Dave Chinner
2021-01-14 10:23         ` Brian Foster
2021-01-14 10:43           ` Avi Kivity
2021-01-14 17:29       ` Christoph Hellwig
2021-01-14 17:26     ` Christoph Hellwig
2021-01-12 16:26 ` [PATCH 10/10] xfs: reduce exclusive locking on unaligned dio Christoph Hellwig
