linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] zonefs: fix IOCB_NOWAIT handling
@ 2020-02-21 14:37 Christoph Hellwig
  2020-02-24 13:48 ` Damien Le Moal
  0 siblings, 1 reply; 4+ messages in thread
From: Christoph Hellwig @ 2020-02-21 14:37 UTC (permalink / raw)
  To: damien.lemoal, naohiro.aota; +Cc: linux-fsdevel

IOCB_NOWAIT can't just be ignored as it breaks applications expecting
it not to block.  Just refuse the operation as applications must handle
that (e.g. by falling back to a thread pool).

Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system")
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/zonefs/super.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
index 8bc6ef82d693..69aee3dfb660 100644
--- a/fs/zonefs/super.c
+++ b/fs/zonefs/super.c
@@ -601,13 +601,13 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from)
 	ssize_t ret;
 
 	/*
-	 * For async direct IOs to sequential zone files, ignore IOCB_NOWAIT
+	 * For async direct IOs to sequential zone files, refuse IOCB_NOWAIT
 	 * as this can cause write reordering (e.g. the first aio gets EAGAIN
 	 * on the inode lock but the second goes through but is now unaligned).
 	 */
-	if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && !is_sync_kiocb(iocb)
-	    && (iocb->ki_flags & IOCB_NOWAIT))
-		iocb->ki_flags &= ~IOCB_NOWAIT;
+	if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && !is_sync_kiocb(iocb) &&
+	    (iocb->ki_flags & IOCB_NOWAIT))
+		return -EOPNOTSUPP;
 
 	if (iocb->ki_flags & IOCB_NOWAIT) {
 		if (!inode_trylock(inode))
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] zonefs: fix IOCB_NOWAIT handling
  2020-02-21 14:37 [PATCH] zonefs: fix IOCB_NOWAIT handling Christoph Hellwig
@ 2020-02-24 13:48 ` Damien Le Moal
  2020-02-24 20:18   ` Christoph Hellwig
  0 siblings, 1 reply; 4+ messages in thread
From: Damien Le Moal @ 2020-02-24 13:48 UTC (permalink / raw)
  To: Christoph Hellwig, Naohiro Aota; +Cc: linux-fsdevel

Hi Christoph,

On 2020/02/21 23:37, Christoph Hellwig wrote:
> IOCB_NOWAIT can't just be ignored as it breaks applications expecting
> it not to block.  Just refuse the operation as applications must handle
> that (e.g. by falling back to a thread pool).
> 
> Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system")
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/zonefs/super.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
> index 8bc6ef82d693..69aee3dfb660 100644
> --- a/fs/zonefs/super.c
> +++ b/fs/zonefs/super.c
> @@ -601,13 +601,13 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from)
>  	ssize_t ret;
>  
>  	/*
> -	 * For async direct IOs to sequential zone files, ignore IOCB_NOWAIT
> +	 * For async direct IOs to sequential zone files, refuse IOCB_NOWAIT
>  	 * as this can cause write reordering (e.g. the first aio gets EAGAIN
>  	 * on the inode lock but the second goes through but is now unaligned).
>  	 */
> -	if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && !is_sync_kiocb(iocb)
> -	    && (iocb->ki_flags & IOCB_NOWAIT))
> -		iocb->ki_flags &= ~IOCB_NOWAIT;
> +	if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && !is_sync_kiocb(iocb) &&
> +	    (iocb->ki_flags & IOCB_NOWAIT))
> +		return -EOPNOTSUPP;
>  
>  	if (iocb->ki_flags & IOCB_NOWAIT) {
>  		if (!inode_trylock(inode))
> 

The main problem with allowing IOCB_NOWAIT is that for an application that sends
multiple write AIOs with a single io_submit(), all write AIOs after one that is
failed with -EAGAIN will be failed with -EINVAL (not -EIO !) since they will
have an unaligned write offset. I am wondering if that is really a problem at
all since the application can catch the -EAGAIN and -EINVAL, indicating that the
write stream was stopped. So maybe it is reasonable to simply allow IOCB_NOWAIT ?

Dave did not think it is. I was split on this. Would be happy to hear your
opinion. Another solution I was thinking of is something like the hack below,
which may be cleaner and easier for the application to handle as that reduces
the number of errors for a multi-io io_submit() call with RWF_NOWAIT. To do so,
this introduces the IOCB_WRITE_FAIL_FAST kiocb flag that is set in
zonefs_file_dio_write() for a nowait kiocb to a sequential zone file. If this
function fails (with -EAGAIN or any other error), this flag instruct aio_write()
to return the error for the failed kiocb instead of 0. This error return of
aio_write() then stops the io_submit() loop on the first failed iocb.


diff --git a/fs/aio.c b/fs/aio.c
index 5f3d3d814928..9f01541c8ecc 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1568,6 +1568,8 @@ static int aio_write(struct kiocb *req, const struct iocb
*iocb,
 		return ret;
 	ret = rw_verify_area(WRITE, file, &req->ki_pos, iov_iter_count(&iter));
 	if (!ret) {
+		ssize_t err;
+
 		/*
 		 * Open-code file_start_write here to grab freeze protection,
 		 * which will be released by another thread in
@@ -1580,7 +1582,12 @@ static int aio_write(struct kiocb *req, const struct iocb
*iocb,
 			__sb_writers_release(file_inode(file)->i_sb, SB_FREEZE_WRITE);
 		}
 		req->ki_flags |= IOCB_WRITE;
-		aio_rw_done(req, call_write_iter(file, req, &iter));
+		err = call_write_iter(file, req, &iter);
+		if (err != -EIOCBQUEUED) {
+		    aio_rw_done(req, err);
+		    if ((req->ki_flags & IOCB_WRITE_FAIL_FAST) && err < 0)
+			    ret = err;
+		}
 	}
 	kfree(iovec);
 	return ret;
diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
index 8bc6ef82d693..0fa098354e38 100644
--- a/fs/zonefs/super.c
+++ b/fs/zonefs/super.c
@@ -601,13 +601,14 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb,
struct iov_iter *from)
 	ssize_t ret;

 	/*
-	 * For async direct IOs to sequential zone files, ignore IOCB_NOWAIT
-	 * as this can cause write reordering (e.g. the first aio gets EAGAIN
-	 * on the inode lock but the second goes through but is now unaligned).
+	 * For async direct IOs to sequential zone files, IOCB_NOWAIT can cause
+	 * unaligned writes in case of EAGAIN error or any failure to issue the
+	 * direct IO. So tell vfs to stop io_submit() on the first failure to
+	 * avoid more failed aios than necessary.
 	 */
 	if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && !is_sync_kiocb(iocb)
 	    && (iocb->ki_flags & IOCB_NOWAIT))
-		iocb->ki_flags &= ~IOCB_NOWAIT;
+		iocb->ki_flags |= IOCB_WRITE_FAIL_FAST;

 	if (iocb->ki_flags & IOCB_NOWAIT) {
 		if (!inode_trylock(inode))
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 3cd4fe6b845e..8e7df1cc92e4 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -314,6 +314,7 @@ enum rw_hint {
 #define IOCB_SYNC		(1 << 5)
 #define IOCB_WRITE		(1 << 6)
 #define IOCB_NOWAIT		(1 << 7)
+#define IOCB_WRITE_FAIL_FAST	(1 << 8)

 struct kiocb {
 	struct file		*ki_filp;


Of note is that checking the code path for iomap_direct_io(), I noticed that
there are at least 2 places places where the code can block: dio allocation with
GFP_KERNEL and the BIOs allocations for that dio also with GFP_KERNEL. While the
latter may be necessary to avoid partial failures of large direct IOs that need
to be split into multiple BIOs, shouldn't the dio allocation be done with
GFP_NOWAIT when IOMAP_NOWAIT is set ?

Best regards.


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] zonefs: fix IOCB_NOWAIT handling
  2020-02-24 13:48 ` Damien Le Moal
@ 2020-02-24 20:18   ` Christoph Hellwig
  2020-02-25  9:54     ` Damien Le Moal
  0 siblings, 1 reply; 4+ messages in thread
From: Christoph Hellwig @ 2020-02-24 20:18 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: Christoph Hellwig, Naohiro Aota, linux-fsdevel

On Mon, Feb 24, 2020 at 01:48:48PM +0000, Damien Le Moal wrote:
> 
> The main problem with allowing IOCB_NOWAIT is that for an application that sends
> multiple write AIOs with a single io_submit(), all write AIOs after one that is
> failed with -EAGAIN will be failed with -EINVAL (not -EIO !) since they will
> have an unaligned write offset. I am wondering if that is really a problem at
> all since the application can catch the -EAGAIN and -EINVAL, indicating that the
> write stream was stopped. So maybe it is reasonable to simply allow IOCB_NOWAIT ?

I don't think supporting IOCB_NOWAIT with current zonefs is very useful.
It will be very useful once Zone Append is supported.

But more importantly my patch fixes a bug in the current zonefs
implementation.  We need to fix that before 5.6 is released.  Any
additional functionaity can still come later if we think that it is
useful.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] zonefs: fix IOCB_NOWAIT handling
  2020-02-24 20:18   ` Christoph Hellwig
@ 2020-02-25  9:54     ` Damien Le Moal
  0 siblings, 0 replies; 4+ messages in thread
From: Damien Le Moal @ 2020-02-25  9:54 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Naohiro Aota, linux-fsdevel

On 2020/02/25 5:19, Christoph Hellwig wrote:
> On Mon, Feb 24, 2020 at 01:48:48PM +0000, Damien Le Moal wrote:
>>
>> The main problem with allowing IOCB_NOWAIT is that for an application that sends
>> multiple write AIOs with a single io_submit(), all write AIOs after one that is
>> failed with -EAGAIN will be failed with -EINVAL (not -EIO !) since they will
>> have an unaligned write offset. I am wondering if that is really a problem at
>> all since the application can catch the -EAGAIN and -EINVAL, indicating that the
>> write stream was stopped. So maybe it is reasonable to simply allow IOCB_NOWAIT ?
> 
> I don't think supporting IOCB_NOWAIT with current zonefs is very useful.
> It will be very useful once Zone Append is supported.
> 
> But more importantly my patch fixes a bug in the current zonefs
> implementation.  We need to fix that before 5.6 is released.  Any
> additional functionaity can still come later if we think that it is
> useful.
> 

OK. Fair enough. I will apply your patch.
Thanks !

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-02-25  9:54 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-21 14:37 [PATCH] zonefs: fix IOCB_NOWAIT handling Christoph Hellwig
2020-02-24 13:48 ` Damien Le Moal
2020-02-24 20:18   ` Christoph Hellwig
2020-02-25  9:54     ` Damien Le Moal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).