linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3 V2] iomap: Use FUA for O_DSYNC DIO writes
@ 2018-03-27  7:07 Dave Chinner
  2018-03-27  7:07 ` [PATCH 1/3] xfs: move generic_write_sync calls inwards Dave Chinner
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Dave Chinner @ 2018-03-27  7:07 UTC (permalink / raw)
  To: linux-xfs; +Cc: linux-fsdevel, linux-block, hch, rdorr

Hi folks,

This is a followup on my original patch to enable use of FUA writes
for pure data O_DSYNC writes through the XFS and iomap based direct
IO paths. This version has all of the changes christoph asked for,
and splits it up into simpler patches. The performance improvements
are detailed in the commit description of the third patch.

Changes from the original patch:
- put XFS generic_write_sync call de-factoring into it's own patch
- the change for iomap_dio_rw() to handle all aspects of O_DSYNC
  writes was pushed into it's own patch
- changed the flag and logic for tracking if FUA is in use though
  the iomap DIO paths
- only enable use of FUA is the underlying device supports FUA
  operations.

Thoughts and comments welcome.

Cheers,

Dave.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/3] xfs: move generic_write_sync calls inwards
  2018-03-27  7:07 [PATCH 0/3 V2] iomap: Use FUA for O_DSYNC DIO writes Dave Chinner
@ 2018-03-27  7:07 ` Dave Chinner
  2018-03-28  7:36   ` Christoph Hellwig
  2018-03-27  7:07 ` [PATCH 2/3] iomap: iomap_dio_rw() handles all sync writes Dave Chinner
  2018-03-27  7:07 ` [PATCH 3/3] iomap: Use FUA for pure data O_DSYNC DIO writes Dave Chinner
  2 siblings, 1 reply; 8+ messages in thread
From: Dave Chinner @ 2018-03-27  7:07 UTC (permalink / raw)
  To: linux-xfs; +Cc: linux-fsdevel, linux-block, hch, rdorr

From: Dave Chinner <dchinner@redhat.com>

To prepare for iomap iinfrastructure based DSYNC optimisations.

While moving the code araound, move the XFS write bytes metric
update for direct IO into xfs_dio_write_end_io callback so that we
always capture the amount of data written via AIO+DIO. This fixes
the problem where queued AIO+DIO writes are not accounted to this
metric.

Signed-Off-By: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_file.c | 48 +++++++++++++++++++++++++++++++++---------------
 1 file changed, 33 insertions(+), 15 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index a5b6973bf15a..6c30f410ca0c 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -416,6 +416,12 @@ xfs_dio_write_end_io(
 	if (size <= 0)
 		return size;
 
+	/*
+	 * capture amount written on completion as we can't reliably account
+	 * for it on submission
+	 */
+	XFS_STATS_ADD(ip->i_mount, xs_write_bytes, size);
+
 	if (flags & IOMAP_DIO_COW) {
 		error = xfs_reflink_end_cow(ip, offset, size);
 		if (error)
@@ -564,6 +570,11 @@ xfs_file_dio_aio_write(
 	 * complete fully or fail.
 	 */
 	ASSERT(ret < 0 || ret == count);
+
+	if (ret > 0) {
+		/* Handle various SYNC-type writes */
+		ret = generic_write_sync(iocb, ret);
+	}
 	return ret;
 }
 
@@ -601,7 +612,16 @@ xfs_file_dax_write(
 	}
 out:
 	xfs_iunlock(ip, iolock);
-	return error ? error : ret;
+	if (error)
+		return error;
+
+	if (ret > 0) {
+		XFS_STATS_ADD(ip->i_mount, xs_write_bytes, ret);
+
+		/* Handle various SYNC-type writes */
+		ret = generic_write_sync(iocb, ret);
+	}
+	return ret;
 }
 
 STATIC ssize_t
@@ -671,6 +691,12 @@ xfs_file_buffered_aio_write(
 out:
 	if (iolock)
 		xfs_iunlock(ip, iolock);
+
+	if (ret > 0) {
+		XFS_STATS_ADD(ip->i_mount, xs_write_bytes, ret);
+		/* Handle various SYNC-type writes */
+		ret = generic_write_sync(iocb, ret);
+	}
 	return ret;
 }
 
@@ -695,8 +721,9 @@ xfs_file_write_iter(
 		return -EIO;
 
 	if (IS_DAX(inode))
-		ret = xfs_file_dax_write(iocb, from);
-	else if (iocb->ki_flags & IOCB_DIRECT) {
+		return xfs_file_dax_write(iocb, from);
+
+	if (iocb->ki_flags & IOCB_DIRECT) {
 		/*
 		 * Allow a directio write to fall back to a buffered
 		 * write *only* in the case that we're doing a reflink
@@ -704,20 +731,11 @@ xfs_file_write_iter(
 		 * allow an operation to fall back to buffered mode.
 		 */
 		ret = xfs_file_dio_aio_write(iocb, from);
-		if (ret == -EREMCHG)
-			goto buffered;
-	} else {
-buffered:
-		ret = xfs_file_buffered_aio_write(iocb, from);
+		if (ret != -EREMCHG)
+			return ret;
 	}
 
-	if (ret > 0) {
-		XFS_STATS_ADD(ip->i_mount, xs_write_bytes, ret);
-
-		/* Handle various SYNC-type writes */
-		ret = generic_write_sync(iocb, ret);
-	}
-	return ret;
+	return xfs_file_buffered_aio_write(iocb, from);
 }
 
 #define	XFS_FALLOC_FL_SUPPORTED						\
-- 
2.16.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/3] iomap: iomap_dio_rw() handles all sync writes
  2018-03-27  7:07 [PATCH 0/3 V2] iomap: Use FUA for O_DSYNC DIO writes Dave Chinner
  2018-03-27  7:07 ` [PATCH 1/3] xfs: move generic_write_sync calls inwards Dave Chinner
@ 2018-03-27  7:07 ` Dave Chinner
  2018-03-28  7:38   ` Christoph Hellwig
  2018-03-28  7:44   ` Christoph Hellwig
  2018-03-27  7:07 ` [PATCH 3/3] iomap: Use FUA for pure data O_DSYNC DIO writes Dave Chinner
  2 siblings, 2 replies; 8+ messages in thread
From: Dave Chinner @ 2018-03-27  7:07 UTC (permalink / raw)
  To: linux-xfs; +Cc: linux-fsdevel, linux-block, hch, rdorr

From: Dave Chinner <dchinner@redhat.com>

Currently iomap_dio_rw() only handles (data)sync write completions
for AIO. This means we can't optimised non-AIO IO to minimise device
flushes as we can't tell the caller whether a flush is required or
not.

To solve this problem and enable further optimisations, make
iomap_dio_rw responsible for data sync behaviour for all IO, not
just AIO.

In doing so, the sync operation is now accounted as part of the DIO
IO by inode_dio_end(), hence post-IO data stability updates will no
long race against operations that serialise via inode_dio_wait()
such as truncate or hole punch.

Signed-Off-By: Dave Chinner <dchinner@redhat.com>
---
 fs/iomap.c        | 17 ++++++++++++++---
 fs/xfs/xfs_file.c |  5 -----
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/fs/iomap.c b/fs/iomap.c
index afd163586aa0..f5d4e348bc9b 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -685,6 +685,7 @@ EXPORT_SYMBOL_GPL(iomap_seek_data);
  * Private flags for iomap_dio, must not overlap with the public ones in
  * iomap.h:
  */
+#define IOMAP_DIO_WRITE_SYNC	(1 << 29)
 #define IOMAP_DIO_WRITE		(1 << 30)
 #define IOMAP_DIO_DIRTY		(1 << 31)
 
@@ -759,6 +760,13 @@ static ssize_t iomap_dio_complete(struct iomap_dio *dio)
 			dio_warn_stale_pagecache(iocb->ki_filp);
 	}
 
+	/*
+	 * If this is a DSYNC write, make sure we push it to stable storage now
+	 * that we've written data.
+	 */
+	if (ret > 0 && (dio->flags & IOMAP_DIO_WRITE_SYNC))
+		ret = generic_write_sync(iocb, ret);
+
 	inode_dio_end(file_inode(iocb->ki_filp));
 	kfree(dio);
 
@@ -769,12 +777,9 @@ static void iomap_dio_complete_work(struct work_struct *work)
 {
 	struct iomap_dio *dio = container_of(work, struct iomap_dio, aio.work);
 	struct kiocb *iocb = dio->iocb;
-	bool is_write = (dio->flags & IOMAP_DIO_WRITE);
 	ssize_t ret;
 
 	ret = iomap_dio_complete(dio);
-	if (is_write && ret > 0)
-		ret = generic_write_sync(iocb, ret);
 	iocb->ki_complete(iocb, ret, 0);
 }
 
@@ -961,6 +966,10 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
 	return copied;
 }
 
+/*
+ * iomap_dio_rw() always completes O_[D]SYNC writes regardless of whether the IO
+ * is being issued as AIO or not.
+ */
 ssize_t
 iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 		const struct iomap_ops *ops, iomap_dio_end_io_t end_io)
@@ -1006,6 +1015,8 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 			dio->flags |= IOMAP_DIO_DIRTY;
 	} else {
 		dio->flags |= IOMAP_DIO_WRITE;
+		if (iocb->ki_flags & IOCB_DSYNC)
+			dio->flags |= IOMAP_DIO_WRITE_SYNC;
 		flags |= IOMAP_WRITE;
 	}
 
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 6c30f410ca0c..33b1e8b5dc5c 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -570,11 +570,6 @@ xfs_file_dio_aio_write(
 	 * complete fully or fail.
 	 */
 	ASSERT(ret < 0 || ret == count);
-
-	if (ret > 0) {
-		/* Handle various SYNC-type writes */
-		ret = generic_write_sync(iocb, ret);
-	}
 	return ret;
 }
 
-- 
2.16.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/3] iomap: Use FUA for pure data O_DSYNC DIO writes
  2018-03-27  7:07 [PATCH 0/3 V2] iomap: Use FUA for O_DSYNC DIO writes Dave Chinner
  2018-03-27  7:07 ` [PATCH 1/3] xfs: move generic_write_sync calls inwards Dave Chinner
  2018-03-27  7:07 ` [PATCH 2/3] iomap: iomap_dio_rw() handles all sync writes Dave Chinner
@ 2018-03-27  7:07 ` Dave Chinner
  2018-03-28  7:48   ` Christoph Hellwig
  2 siblings, 1 reply; 8+ messages in thread
From: Dave Chinner @ 2018-03-27  7:07 UTC (permalink / raw)
  To: linux-xfs; +Cc: linux-fsdevel, linux-block, hch, rdorr

From: Dave Chinner <dchinner@redhat.com>

If we are doing direct IO writes with datasync semantics, we often
have to flush metadata changes along with the data write. However,
if we are overwriting existing data, there are no metadata changes
that we need to flush. In this case, optimising the IO by using
FUA write makes sense.

We know from teh IOMAP_F_DIRTY flag as to whether a specific inode
requires a metadata flush - this is currently used by DAX to ensure
extent modification as stable in page fault operations. For direct
IO writes, we can use it to determine if we need to flush metadata
or not once the data is on disk.

Hence if we have been returned a mapped extent that is not new and
the IO mapping is not dirty, then we can use a FUA write to provide
datasync semantics. This allows us to short-cut the
generic_write_sync() call in IO completion and hence avoid
unnecessary operations. This makes pure direct IO data write
behaviour identical to the way block devices use REQ_FUA to provide
datasync semantics.

Now that iomap_dio_rw() is determining if REQ_FUA can be used, we
have to stop issuing generic_write_sync() calls from the XFS code
when REQ_FUA is issued, otherwise it will still throw a cache flush
to the device via xfs_file_fsync(). To do this, we need to make
iomap_dio_rw() always responsible for issuing generic_write_sync()
when necessary, not just for AIO calls. This means the filesystem
doesn't have to guess when cache flushes are necessary now.

On a FUA enabled device, a synchronous direct IO write workload
(sequential 4k overwrites in 32MB file) had the following results:

# xfs_io -fd -c "pwrite -V 1 -D 0 32m" /mnt/scratch/boo

kernel		time	write()s	write iops	Write b/w
------		----	--------	----------	---------
(no dsync)	 4s	2173/s		2173		8.5MB/s
vanilla		22s	 370/s		 750		1.4MB/s
patched		19s	 420/s		 420		1.6MB/s

The patched code clearly doesn't send cache flushes anymore, but
instead uses FUA (confirmed via blktrace), and performance improves
a bit as a result. However, the benefits will be higher on workloads
that mix O_DSYNC overwrites with other write IO as we won't be
flushing the entire device cache on every DSYNC overwrite IO
anymore.

Signed-Off-By: Dave Chinner <dchinner@redhat.com>
---
 fs/iomap.c             | 44 +++++++++++++++++++++++++++++++++++++++++---
 include/linux/blkdev.h |  1 +
 2 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/fs/iomap.c b/fs/iomap.c
index f5d4e348bc9b..92f88ce8e2d4 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -685,6 +685,7 @@ EXPORT_SYMBOL_GPL(iomap_seek_data);
  * Private flags for iomap_dio, must not overlap with the public ones in
  * iomap.h:
  */
+#define IOMAP_DIO_WRITE_FUA	(1 << 28)
 #define IOMAP_DIO_WRITE_SYNC	(1 << 29)
 #define IOMAP_DIO_WRITE		(1 << 30)
 #define IOMAP_DIO_DIRTY		(1 << 31)
@@ -863,6 +864,7 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
 	struct iov_iter iter;
 	struct bio *bio;
 	bool need_zeroout = false;
+	bool use_fua = false;
 	int nr_pages, ret;
 	size_t copied = 0;
 
@@ -888,6 +890,18 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
 			dio->flags |= IOMAP_DIO_COW;
 		if (iomap->flags & IOMAP_F_NEW)
 			need_zeroout = true;
+		else {
+			/*
+			 * Use a FUA write if we need datasync semantics, this
+			 * is a pure data IO that doesn't require any metadata
+			 * updates and the underlying device supports FUA. This
+			 * allows us to avoid cache flushes on IO completion.
+			 */
+			if (!(iomap->flags & (IOMAP_F_SHARED|IOMAP_F_DIRTY)) &&
+			    (dio->flags & IOMAP_DIO_WRITE_FUA) &&
+			    blk_queue_fua(bdev_get_queue(iomap->bdev)))
+				use_fua = true;
+		}
 		break;
 	default:
 		WARN_ON_ONCE(1);
@@ -935,7 +949,13 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
 
 		n = bio->bi_iter.bi_size;
 		if (dio->flags & IOMAP_DIO_WRITE) {
-			bio_set_op_attrs(bio, REQ_OP_WRITE, REQ_SYNC | REQ_IDLE);
+			int op_flags = REQ_SYNC | REQ_IDLE;
+
+			if (use_fua)
+				op_flags |= REQ_FUA;
+			else
+				dio->flags &= ~IOMAP_DIO_WRITE_FUA;
+			bio_set_op_attrs(bio, REQ_OP_WRITE, op_flags);
 			task_io_account_write(n);
 		} else {
 			bio_set_op_attrs(bio, REQ_OP_READ, 0);
@@ -968,7 +988,12 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
 
 /*
  * iomap_dio_rw() always completes O_[D]SYNC writes regardless of whether the IO
- * is being issued as AIO or not.
+ * is being issued as AIO or not.  This allows us to optimise pure data writes
+ * to use REQ_FUA rather than requiring generic_write_sync() to issue a
+ * REQ_FLUSH post write. This is slightly tricky because a single request here
+ * can be mapped into multiple disjoint IOs and only a subset of the IOs issued
+ * may be pure data writes. In that case, we still need to do a full data sync
+ * completion.
  */
 ssize_t
 iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
@@ -1015,8 +1040,14 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 			dio->flags |= IOMAP_DIO_DIRTY;
 	} else {
 		dio->flags |= IOMAP_DIO_WRITE;
-		if (iocb->ki_flags & IOCB_DSYNC)
+		if (iocb->ki_flags & IOCB_DSYNC) {
 			dio->flags |= IOMAP_DIO_WRITE_SYNC;
+			/*
+			 * Any non-FUA write will clear this flag, hence we know
+			 * before completion whether a cache flush is necessary.
+			 */
+			dio->flags |= IOMAP_DIO_WRITE_FUA;
+		}
 		flags |= IOMAP_WRITE;
 	}
 
@@ -1073,6 +1104,13 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 	if (ret < 0)
 		iomap_dio_set_error(dio, ret);
 
+	/*
+	 * If all the writes we issued were FUA, we don't need to flush the
+	 * cache on IO completion. Clear the sync flag for this case.
+	 */
+	if (dio->flags & IOMAP_DIO_WRITE_FUA)
+		dio->flags &= ~IOMAP_DIO_WRITE_SYNC;
+
 	if (!atomic_dec_and_test(&dio->ref)) {
 		if (!is_sync_kiocb(iocb))
 			return -EIOCBQUEUED;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index ed63f3b69c12..e853f1748a4b 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -800,6 +800,7 @@ static inline void queue_flag_clear(unsigned int flag, struct request_queue *q)
 #define blk_queue_quiesced(q)	test_bit(QUEUE_FLAG_QUIESCED, &(q)->queue_flags)
 #define blk_queue_preempt_only(q)				\
 	test_bit(QUEUE_FLAG_PREEMPT_ONLY, &(q)->queue_flags)
+#define blk_queue_fua(q)	test_bit(QUEUE_FLAG_FUA, &(q)->queue_flags)
 
 extern int blk_set_preempt_only(struct request_queue *q);
 extern void blk_clear_preempt_only(struct request_queue *q);
-- 
2.16.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/3] xfs: move generic_write_sync calls inwards
  2018-03-27  7:07 ` [PATCH 1/3] xfs: move generic_write_sync calls inwards Dave Chinner
@ 2018-03-28  7:36   ` Christoph Hellwig
  0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2018-03-28  7:36 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs, linux-fsdevel, linux-block, hch, rdorr

> +	/*
> +	 * capture amount written on completion as we can't reliably account
> +	 * for it on submission

Captialize the first c, and a . at the end of the sentence please.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] iomap: iomap_dio_rw() handles all sync writes
  2018-03-27  7:07 ` [PATCH 2/3] iomap: iomap_dio_rw() handles all sync writes Dave Chinner
@ 2018-03-28  7:38   ` Christoph Hellwig
  2018-03-28  7:44   ` Christoph Hellwig
  1 sibling, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2018-03-28  7:38 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs, linux-fsdevel, linux-block, hch, rdorr

> @@ -769,12 +777,9 @@ static void iomap_dio_complete_work(struct work_struct *work)
>  {
>  	struct iomap_dio *dio = container_of(work, struct iomap_dio, aio.work);
>  	struct kiocb *iocb = dio->iocb;
> -	bool is_write = (dio->flags & IOMAP_DIO_WRITE);
>  	ssize_t ret;
>  
>  	ret = iomap_dio_complete(dio);
> -	if (is_write && ret > 0)
> -		ret = generic_write_sync(iocb, ret);
>  	iocb->ki_complete(iocb, ret, 0);

Could be simplified to:

{
 	struct iomap_dio *dio = container_of(work, struct iomap_dio, aio.work);

 	iocb->ki_complete(dio->iocb, iomap_dio_complete(dio); 0);
}

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] iomap: iomap_dio_rw() handles all sync writes
  2018-03-27  7:07 ` [PATCH 2/3] iomap: iomap_dio_rw() handles all sync writes Dave Chinner
  2018-03-28  7:38   ` Christoph Hellwig
@ 2018-03-28  7:44   ` Christoph Hellwig
  1 sibling, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2018-03-28  7:44 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs, linux-fsdevel, linux-block, hch, rdorr

> +#define IOMAP_DIO_WRITE_SYNC	(1 << 29)

Actually based on the next patch can we rename this to something like:
IOMAP_DIO_NEED_SYNC?  That makes the usage a little more clear.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 3/3] iomap: Use FUA for pure data O_DSYNC DIO writes
  2018-03-27  7:07 ` [PATCH 3/3] iomap: Use FUA for pure data O_DSYNC DIO writes Dave Chinner
@ 2018-03-28  7:48   ` Christoph Hellwig
  0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2018-03-28  7:48 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs, linux-fsdevel, linux-block, hch, rdorr

>  		if (iomap->flags & IOMAP_F_NEW)
>  			need_zeroout = true;
> +		else {

Curly braces on both sides of the else, please.

>  		if (dio->flags & IOMAP_DIO_WRITE) {
> -			bio_set_op_attrs(bio, REQ_OP_WRITE, REQ_SYNC | REQ_IDLE);
> +			int op_flags = REQ_SYNC | REQ_IDLE;
> +
> +			if (use_fua)
> +				op_flags |= REQ_FUA;
> +			else
> +				dio->flags &= ~IOMAP_DIO_WRITE_FUA;
> +			bio_set_op_attrs(bio, REQ_OP_WRITE, op_flags);

Please kill the bio_set_op_attrs flags while you are at it and assign
directly to bio->bi_opf:

			bio->bi_opf = REQ_OP_WRITE | REQ_SYNC | REQ_IDLE;
			if (use_fua)
				bio->bi_opf |= REQ_FUA;
			else
				dio->flags &= ~IOMAP_DIO_WRITE_FUA;

>  		dio->flags |= IOMAP_DIO_WRITE;
> -		if (iocb->ki_flags & IOCB_DSYNC)
> +		if (iocb->ki_flags & IOCB_DSYNC) {
>  			dio->flags |= IOMAP_DIO_WRITE_SYNC;
> +			/*
> +			 * Any non-FUA write will clear this flag, hence we know
> +			 * before completion whether a cache flush is necessary.
> +			 */
> +			dio->flags |= IOMAP_DIO_WRITE_FUA;

I'd mention that we optimistically try fua first in the comment, and then
go on with what you wrote.

> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index ed63f3b69c12..e853f1748a4b 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -800,6 +800,7 @@ static inline void queue_flag_clear(unsigned int flag, struct request_queue *q)
>  #define blk_queue_quiesced(q)	test_bit(QUEUE_FLAG_QUIESCED, &(q)->queue_flags)
>  #define blk_queue_preempt_only(q)				\
>  	test_bit(QUEUE_FLAG_PREEMPT_ONLY, &(q)->queue_flags)
> +#define blk_queue_fua(q)	test_bit(QUEUE_FLAG_FUA, &(q)->queue_flags)

Separate patch, please.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-03-28  7:48 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-27  7:07 [PATCH 0/3 V2] iomap: Use FUA for O_DSYNC DIO writes Dave Chinner
2018-03-27  7:07 ` [PATCH 1/3] xfs: move generic_write_sync calls inwards Dave Chinner
2018-03-28  7:36   ` Christoph Hellwig
2018-03-27  7:07 ` [PATCH 2/3] iomap: iomap_dio_rw() handles all sync writes Dave Chinner
2018-03-28  7:38   ` Christoph Hellwig
2018-03-28  7:44   ` Christoph Hellwig
2018-03-27  7:07 ` [PATCH 3/3] iomap: Use FUA for pure data O_DSYNC DIO writes Dave Chinner
2018-03-28  7:48   ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).