From: Dave Chinner <david@fromorbit.com>
To: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, avi@scylladb.com, andres@anarazel.de
Subject: [RFC] xfs: reduce sub-block DIO serialisation
Date: Tue, 12 Jan 2021 12:07:40 +1100
Message-ID: <20210112010746.1154363-1-david@fromorbit.com>

Hi folks,

This is the XFS implementation of the sub-block DIO optimisations
for written extents that I've mentioned on #xfs and a couple of
times now on the XFS mailing list.

It takes the approach of using the IOMAP_NOWAIT non-blocking
IO submission infrastructure to optimistically dispatch sub-block
DIO without exclusive locking. If the extent mapping callback
decides that it can't do the unaligned IO without extent
manipulation, sub-block zeroing, blocking or splitting the IO into
multiple parts, it aborts the IO with -EAGAIN. This allows the high
level filesystem code to then take exclusive locks and resubmit the
IO once it has guaranteed no other IO is in progress on the inode
(which is what the current implementation does for all sub-block DIO).
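
A minimal userspace sketch of the optimistic-submit-then-fallback
pattern described above. It is not the patch code: the sketch_* names,
the extent model and the went_exclusive flag are all hypothetical and
exist only to illustrate the control flow, and the real locking is
reduced to comments.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical extent states for this sketch; not XFS definitions. */
enum sketch_ext_state { SK_HOLE, SK_UNWRITTEN, SK_WRITTEN };

struct sketch_extent {
	long long start;	/* byte offset of the extent */
	long long len;		/* length in bytes */
	enum sketch_ext_state state;
};

/*
 * Model of the mapping callback in non-blocking mode: return -EAGAIN
 * whenever the IO could not be done without extent manipulation,
 * sub-block zeroing or splitting.
 */
static int sketch_map_nowait(const struct sketch_extent *ext,
			     long long pos, long long count)
{
	if (ext->state != SK_WRITTEN)
		return -EAGAIN;	/* would need zeroing or conversion */
	if (pos < ext->start || pos + count > ext->start + ext->len)
		return -EAGAIN;	/* would span more than one mapping */
	return 0;
}

/*
 * Optimistic submission: try the non-blocking mapping first and fall
 * back to the (modelled) exclusive path only on -EAGAIN.
 */
static int sketch_submit_unaligned(const struct sketch_extent *ext,
				   long long pos, long long count,
				   bool *went_exclusive)
{
	int ret = sketch_map_nowait(ext, pos, count);

	*went_exclusive = false;
	if (ret != -EAGAIN)
		return ret;

	/* Slow path: caller would take exclusive locks and drain IO here. */
	*went_exclusive = true;
	return 0;
}
```

In this model a 512 byte write inside a written extent stays on the
fast path, while the same write into an unwritten extent, or one that
crosses the end of the extent, is retried on the exclusive path.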

This requires moving the IOMAP_NOWAIT setup decisions up into the
filesystem, adding yet another parameter to iomap_dio_rw(). So first
I convert iomap_dio_rw() to take an args structure so that we don't
have to modify the API every time we want to add another setup
parameter to the DIO submission code.
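
To illustrate why an args structure helps, here is a small standalone
sketch. The struct and function names are hypothetical and the fields
are assumptions chosen for the example, not the layout used in the
patch. The point is only that callers using designated initialisers do
not need to change when a new option is added, because omitted fields
default to zero.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical argument bundle; not the structure from the patch. */
struct sketch_dio_args {
	const char *buf;		/* data to transfer */
	size_t len;			/* transfer length in bytes */
	long long pos;			/* file offset */
	bool wait_for_completion;	/* existing option */
	bool nonblocking;		/* newly added option, defaults to false */
};

/*
 * Stand-in for a submission function: it only reports which options
 * were requested, as a bitmask (bit 0 = wait, bit 1 = nonblocking).
 */
static int sketch_dio_rw(const struct sketch_dio_args *args)
{
	int flags = 0;

	if (args->wait_for_completion)
		flags |= 1;
	if (args->nonblocking)
		flags |= 2;
	return flags;
}
```

A caller written before the nonblocking field existed, for example
one that sets only .len and .wait_for_completion, still compiles and
behaves as before; only callers that want the new behaviour set the
new field.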

I then include Christoph's IOCB_NOWAIT fixes and cleanups to the XFS
code, because they needed to be done regardless of the unaligned DIO
issues and they make the changes simpler. Then I split the unaligned
DIO path out from the aligned path, because all the extra complexity
to support better unaligned DIO submission concurrency is not
necessary for the block aligned path. Finally, I modify the
unaligned IO path to first submit the unaligned IO using
non-blocking semantics and provide a fallback to run the IO
exclusively if that fails.

This means that we consider sub-block DIO into written extents a fast
path that should almost always succeed with minimal overhead and we put
all the overhead of failure into the slow path where exclusive
locking is required. Unlike Christoph's proposed patch, this means
we don't require an extra ILOCK cycle in the sub-block DIO setup
fast path, so it should perform almost identically to the block
aligned fast path.

Tested using fio with AIO+DIO randrw to a written file. Performance
increases from about 20k IOPS to 150k IOPS, which is the limit of
the setup I was using for testing. Also passed fstests auto group
on both v4 and v5 XFS filesystems.

Thoughts, comments?

-Dave.




Thread overview: 24+ messages
2021-01-12  1:07 Dave Chinner [this message]
2021-01-12  1:07 ` [PATCH 1/6] iomap: convert iomap_dio_rw() to an args structure Dave Chinner
2021-01-12  1:22   ` Damien Le Moal
2021-01-12  1:40   ` Darrick J. Wong
2021-01-12  1:53     ` Dave Chinner
2021-01-12 10:31   ` Christoph Hellwig
2021-01-12  1:07 ` [PATCH 2/6] iomap: move DIO NOWAIT setup up into filesystems Dave Chinner
2021-01-12  1:07 ` [PATCH 3/6] xfs: factor out a xfs_ilock_iocb helper Dave Chinner
2021-01-12  1:07 ` [PATCH 4/6] xfs: make xfs_file_aio_write_checks IOCB_NOWAIT-aware Dave Chinner
2021-01-12  1:07 ` [PATCH 5/6] xfs: split unaligned DIO write code out Dave Chinner
2021-01-12 10:37   ` Christoph Hellwig
2021-01-12  1:07 ` [PATCH 6/6] xfs: reduce exclusive locking on unaligned dio Dave Chinner
2021-01-12 10:42   ` Christoph Hellwig
2021-01-12 17:01     ` Brian Foster
2021-01-12 17:10       ` Christoph Hellwig
2021-01-12 22:06       ` Dave Chinner
2021-01-12  8:01 ` [RFC] xfs: reduce sub-block DIO serialisation Avi Kivity
2021-01-12 22:13   ` Dave Chinner
2021-01-13  8:00     ` Avi Kivity
2021-01-13 20:38       ` Dave Chinner
2021-01-14  6:48         ` Avi Kivity
2021-01-17 21:34           ` Dave Chinner
2021-01-18  7:41             ` Avi Kivity
     [not found] ` <CACz=WechdgSnVHQsg0LKjMiG8kHLujBshmc270yrdjxfpffmDQ@mail.gmail.com>
2021-01-17 21:36   ` Dave Chinner
