From: Andreas Gruenbacher <agruenba@redhat.com>
To: cluster-devel@redhat.com, Christoph Hellwig <hch@lst.de>
Cc: linux-fsdevel@vger.kernel.org, Andreas Gruenbacher <agruenba@redhat.com>
Subject: [PATCH v7 03/12] iomap: Complete partial direct I/O writes synchronously
Date: Mon, 4 Jun 2018 14:37:20 +0200
Message-ID: <20180604123729.23414-4-agruenba@redhat.com>
In-Reply-To: <20180604123729.23414-1-agruenba@redhat.com>

According to xfstest generic/240, applications seem to expect direct I/O
writes to either complete as a whole or to fail; short direct I/O writes
are apparently not appreciated. This means that when only part of an
asynchronous direct I/O write succeeds, we can either fail the entire
write, or we can wait for the partial write to complete and retry the
remainder of the write as buffered I/O. The old __blockdev_direct_IO
helper has code for waiting for partial writes to complete; the new
iomap_dio_rw iomap helper does not.

The above-mentioned fallback mode is needed for gfs2, which doesn't
allow block allocations under direct I/O in order to avoid taking
cluster-wide exclusive locks. As a consequence, an asynchronous direct
I/O write to a file range that contains a hole will result in a short
write. In that case, wait for the short write to complete to allow gfs2
to recover. This will make xfstest generic/240 work on gfs2.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
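Illustration only, not part of the patch: a minimal userspace sketch of
the fallback described in the commit message. It assumes a direct write
that stops at the first hole and a caller that retries the remainder as
buffered I/O. All model_* names and the hole_start field are
hypothetical; the need_wait flag merely mimics the role that
dio->wait_for_completion plays in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical model of a file: everything from hole_start on is a hole. */
struct model_file {
	size_t hole_start;
};

/*
 * Stand-in for a direct write that may not allocate blocks. It writes up
 * to the hole and stops. *need_wait starts out as "is this a synchronous
 * request?" and is forced to true when the write comes up short, which
 * models the -ENOTBLK branch setting dio->wait_for_completion.
 */
static ssize_t model_direct_write(const struct model_file *f, size_t pos,
				  size_t len, bool is_sync, bool *need_wait)
{
	size_t done = 0;

	*need_wait = is_sync;
	if (pos < f->hole_start) {
		done = f->hole_start - pos;
		if (done > len)
			done = len;
	}
	if (done < len)
		*need_wait = true;	/* short write: caller must wait */
	return (ssize_t)done;
}

/* Stand-in for buffered I/O, which may allocate and always succeeds here. */
static ssize_t model_buffered_write(size_t pos, size_t len)
{
	(void)pos;
	return (ssize_t)len;
}

/*
 * Caller-side fallback: try direct I/O first, then write whatever is left
 * through the buffered path. *waited reports whether the direct part had
 * to be completed synchronously.
 */
static ssize_t model_write(const struct model_file *f, size_t pos,
			   size_t len, bool is_sync, bool *waited)
{
	ssize_t written, ret;

	assert(f != NULL && waited != NULL);

	written = model_direct_write(f, pos, len, is_sync, waited);
	if (written < 0)
		return written;
	if ((size_t)written < len) {
		ret = model_buffered_write(pos + (size_t)written,
					   len - (size_t)written);
		if (ret < 0)
			return written ? written : ret;
		written += ret;
	}
	return written;
}
```

In this model, an asynchronous 8192-byte write at offset 0 into a file
whose hole starts at 4096 is reported as one complete write, and the
direct half is waited for instead of being left in flight.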
 fs/iomap.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/fs/iomap.c b/fs/iomap.c
index 54b693da3a35..a0d3b7742060 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -696,6 +696,7 @@ struct iomap_dio {
 	atomic_t		ref;
 	unsigned		flags;
 	int			error;
+	bool			wait_for_completion;
 
 	union {
 		/* used during submission and for synchronous completion: */
@@ -797,9 +798,8 @@ static void iomap_dio_bio_end_io(struct bio *bio)
 		iomap_dio_set_error(dio, blk_status_to_errno(bio->bi_status));
 
 	if (atomic_dec_and_test(&dio->ref)) {
-		if (is_sync_kiocb(dio->iocb)) {
+		if (dio->wait_for_completion) {
 			struct task_struct *waiter = dio->submit.waiter;
-
 			WRITE_ONCE(dio->submit.waiter, NULL);
 			wake_up_process(waiter);
 		} else if (dio->flags & IOMAP_DIO_WRITE) {
@@ -990,13 +990,12 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 	dio->end_io = end_io;
 	dio->error = 0;
 	dio->flags = 0;
+	dio->wait_for_completion = is_sync_kiocb(iocb);
 
 	dio->submit.iter = iter;
-	if (is_sync_kiocb(iocb)) {
-		dio->submit.waiter = current;
-		dio->submit.cookie = BLK_QC_T_NONE;
-		dio->submit.last_queue = NULL;
-	}
+	dio->submit.waiter = current;
+	dio->submit.cookie = BLK_QC_T_NONE;
+	dio->submit.last_queue = NULL;
 
 	if (iov_iter_rw(iter) == READ) {
 		if (pos >= dio->i_size)
@@ -1033,7 +1032,7 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 		dio_warn_stale_pagecache(iocb->ki_filp);
 	ret = 0;
 
-	if (iov_iter_rw(iter) == WRITE && !is_sync_kiocb(iocb) &&
+	if (iov_iter_rw(iter) == WRITE && !dio->wait_for_completion &&
 	    !inode->i_sb->s_dio_done_wq) {
 		ret = sb_init_dio_done_wq(inode->i_sb);
 		if (ret < 0)
@@ -1048,8 +1047,10 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 				iomap_dio_actor);
 		if (ret <= 0) {
 			/* magic error code to fall back to buffered I/O */
-			if (ret == -ENOTBLK)
+			if (ret == -ENOTBLK) {
+				dio->wait_for_completion = true;
 				ret = 0;
+			}
 			break;
 		}
 		pos += ret;
@@ -1062,8 +1063,9 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 	if (ret < 0)
 		iomap_dio_set_error(dio, ret);
 
+	smp_mb__before_atomic();
 	if (!atomic_dec_and_test(&dio->ref)) {
-		if (!is_sync_kiocb(iocb))
+		if (!dio->wait_for_completion)
 			return -EIOCBQUEUED;
 
 		for (;;) {
-- 
2.17.0