All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHSET v2 0/3] Improve IOCB_NOWAIT O_DIRECT reads
@ 2021-02-09  2:30 Jens Axboe
  2021-02-09  2:30 ` [PATCH 1/3] mm: provide filemap_range_needs_writeback() helper Jens Axboe
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Jens Axboe @ 2021-02-09  2:30 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: hch, akpm

Hi,

For v1, see:

https://lore.kernel.org/linux-fsdevel/20210208221829.17247-1-axboe@kernel.dk/

tldr; don't -EAGAIN IOCB_NOWAIT dio reads just because we have page cache
entries for the given range. This causes unnecessary work from the callers
side, when the IO could have been issued totally fine without blocking on
writeback when there is none.

 fs/iomap/direct-io.c | 23 ++++++++++++++--------
 include/linux/fs.h   |  2 ++
 mm/filemap.c         | 47 ++++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 62 insertions(+), 10 deletions(-)

Since v1:

- Simplify the filemap_range_needs_writeback() loop (Willy)
- Drop the write side (Chinner)

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/3] mm: provide filemap_range_needs_writeback() helper
  2021-02-09  2:30 [PATCHSET v2 0/3] Improve IOCB_NOWAIT O_DIRECT reads Jens Axboe
@ 2021-02-09  2:30 ` Jens Axboe
  2021-02-09  7:47   ` Christoph Hellwig
  2021-02-09  2:30 ` [PATCH 2/3] mm: use filemap_range_needs_writeback() for O_DIRECT reads Jens Axboe
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Jens Axboe @ 2021-02-09  2:30 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: hch, akpm, Jens Axboe

For O_DIRECT reads/writes, we check if we need to issue a call to
filemap_write_and_wait_range() to issue and/or wait for writeback for any
page in the given range. The existing mechanism just checks for a page in
the range, which is suboptimal for IOCB_NOWAIT as we'll fallback to the
slow path (and needing retry) if there's just a clean page cache page in
the range.

Provide filemap_range_needs_writeback() which tries a little harder to
check if we actually need to issue and/or wait for writeback in the
range.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 include/linux/fs.h |  2 ++
 mm/filemap.c       | 43 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index fd47deea7c17..def89222dfe9 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2633,6 +2633,8 @@ static inline int filemap_fdatawait(struct address_space *mapping)
 
 extern bool filemap_range_has_page(struct address_space *, loff_t lstart,
 				  loff_t lend);
+extern bool filemap_range_needs_writeback(struct address_space *,
+					  loff_t lstart, loff_t lend);
 extern int filemap_write_and_wait_range(struct address_space *mapping,
 				        loff_t lstart, loff_t lend);
 extern int __filemap_fdatawrite_range(struct address_space *mapping,
diff --git a/mm/filemap.c b/mm/filemap.c
index aa0e0fb04670..6a58d50fbd31 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -633,6 +633,49 @@ static bool mapping_needs_writeback(struct address_space *mapping)
 	return mapping->nrpages;
 }
 
+/**
+ * filemap_range_needs_writeback - check if range potentially needs writeback
+ * @mapping:           address space within which to check
+ * @start_byte:        offset in bytes where the range starts
+ * @end_byte:          offset in bytes where the range ends (inclusive)
+ *
+ * Find at least one page in the range supplied, usually used to check if
+ * direct writing in this range will trigger a writeback. Used by O_DIRECT
+ * read/write with IOCB_NOWAIT, to see if the caller needs to do
+ * filemap_write_and_wait_range() before proceeding.
+ *
+ * Return: %true if the caller should do filemap_write_and_wait_range() before
+ * doing O_DIRECT to a page in this range, %false otherwise.
+ */
+bool filemap_range_needs_writeback(struct address_space *mapping,
+				   loff_t start_byte, loff_t end_byte)
+{
+	XA_STATE(xas, &mapping->i_pages, start_byte >> PAGE_SHIFT);
+	pgoff_t max = end_byte >> PAGE_SHIFT;
+	struct page *page;
+
+	if (!mapping_needs_writeback(mapping))
+		return false;
+	if (!mapping_tagged(mapping, PAGECACHE_TAG_DIRTY) &&
+	    !mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
+		return false;
+	if (end_byte < start_byte)
+		return false;
+
+	rcu_read_lock();
+	xas_for_each(&xas, page, max) {
+		if (xas_retry(&xas, page))
+			continue;
+		if (xa_is_value(page))
+			continue;
+		if (PageDirty(page) || PageLocked(page) || PageWriteback(page))
+			break;
+	}
+	rcu_read_unlock();
+	return page != NULL;
+}
+EXPORT_SYMBOL_GPL(filemap_range_needs_writeback);
+
 /**
  * filemap_write_and_wait_range - write out & wait on a file range
  * @mapping:	the address_space for the pages
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/3] mm: use filemap_range_needs_writeback() for O_DIRECT reads
  2021-02-09  2:30 [PATCHSET v2 0/3] Improve IOCB_NOWAIT O_DIRECT reads Jens Axboe
  2021-02-09  2:30 ` [PATCH 1/3] mm: provide filemap_range_needs_writeback() helper Jens Axboe
@ 2021-02-09  2:30 ` Jens Axboe
  2021-02-09  7:48   ` Christoph Hellwig
  2021-02-09  2:30 ` [PATCH 3/3] iomap: " Jens Axboe
  2021-02-09 19:55 ` [PATCHSET v2 0/3] Improve IOCB_NOWAIT " Andrew Morton
  3 siblings, 1 reply; 17+ messages in thread
From: Jens Axboe @ 2021-02-09  2:30 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: hch, akpm, Jens Axboe

For the generic page cache read helper, use the better variant of checking
for the need to call filemap_write_and_wait_range() when doing O_DIRECT
reads. This avoids falling back to the slow path for IOCB_NOWAIT, if there
are no pages to wait for (or write out).

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 mm/filemap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 6a58d50fbd31..c80acb80e8f7 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2643,8 +2643,8 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
 
 		size = i_size_read(inode);
 		if (iocb->ki_flags & IOCB_NOWAIT) {
-			if (filemap_range_has_page(mapping, iocb->ki_pos,
-						   iocb->ki_pos + count - 1))
+			if (filemap_range_needs_writeback(mapping, iocb->ki_pos,
+							  iocb->ki_pos + count - 1))
 				return -EAGAIN;
 		} else {
 			retval = filemap_write_and_wait_range(mapping,
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/3] iomap: use filemap_range_needs_writeback() for O_DIRECT reads
  2021-02-09  2:30 [PATCHSET v2 0/3] Improve IOCB_NOWAIT O_DIRECT reads Jens Axboe
  2021-02-09  2:30 ` [PATCH 1/3] mm: provide filemap_range_needs_writeback() helper Jens Axboe
  2021-02-09  2:30 ` [PATCH 2/3] mm: use filemap_range_needs_writeback() for O_DIRECT reads Jens Axboe
@ 2021-02-09  2:30 ` Jens Axboe
  2021-02-09  7:51   ` Christoph Hellwig
  2021-02-09 19:55 ` [PATCHSET v2 0/3] Improve IOCB_NOWAIT " Andrew Morton
  3 siblings, 1 reply; 17+ messages in thread
From: Jens Axboe @ 2021-02-09  2:30 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: hch, akpm, Jens Axboe

For reads, use the better variant of checking for the need to call
filemap_write_and_wait_range() when doing O_DIRECT. This avoids falling
back to the slow path for IOCB_NOWAIT, if there are no pages to wait for
(or write out).

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/iomap/direct-io.c | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 933f234d5bec..5b111d1635ab 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -458,9 +458,24 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 		if (pos >= dio->i_size)
 			goto out_free_dio;
 
+		if (iocb->ki_flags & IOCB_NOWAIT) {
+			if (filemap_range_needs_writeback(mapping, pos, end)) {
+				ret = -EAGAIN;
+				goto out_free_dio;
+			}
+			flags |= IOMAP_NOWAIT;
+		}
 		if (iter_is_iovec(iter))
 			dio->flags |= IOMAP_DIO_DIRTY;
 	} else {
+		if (iocb->ki_flags & IOCB_NOWAIT) {
+			if (filemap_range_has_page(mapping, pos, end)) {
+				ret = -EAGAIN;
+				goto out_free_dio;
+			}
+			flags |= IOMAP_NOWAIT;
+		}
+
 		flags |= IOMAP_WRITE;
 		dio->flags |= IOMAP_DIO_WRITE;
 
@@ -478,14 +493,6 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 			dio->flags |= IOMAP_DIO_WRITE_FUA;
 	}
 
-	if (iocb->ki_flags & IOCB_NOWAIT) {
-		if (filemap_range_has_page(mapping, pos, end)) {
-			ret = -EAGAIN;
-			goto out_free_dio;
-		}
-		flags |= IOMAP_NOWAIT;
-	}
-
 	ret = filemap_write_and_wait_range(mapping, pos, end);
 	if (ret)
 		goto out_free_dio;
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] mm: provide filemap_range_needs_writeback() helper
  2021-02-09  2:30 ` [PATCH 1/3] mm: provide filemap_range_needs_writeback() helper Jens Axboe
@ 2021-02-09  7:47   ` Christoph Hellwig
  2021-02-09 14:30     ` Jens Axboe
  0 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2021-02-09  7:47 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-fsdevel, linux-mm, hch, akpm

> +extern bool filemap_range_needs_writeback(struct address_space *,
> +					  loff_t lstart, loff_t lend);

no need for the extern.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] mm: use filemap_range_needs_writeback() for O_DIRECT reads
  2021-02-09  2:30 ` [PATCH 2/3] mm: use filemap_range_needs_writeback() for O_DIRECT reads Jens Axboe
@ 2021-02-09  7:48   ` Christoph Hellwig
  2021-02-09 14:27     ` Jens Axboe
  0 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2021-02-09  7:48 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-fsdevel, linux-mm, hch, akpm

On Mon, Feb 08, 2021 at 07:30:07PM -0700, Jens Axboe wrote:
> For the generic page cache read helper, use the better variant of checking
> for the need to call filemap_write_and_wait_range() when doing O_DIRECT
> reads. This avoids falling back to the slow path for IOCB_NOWAIT, if there
> are no pages to wait for (or write out).
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>  mm/filemap.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 6a58d50fbd31..c80acb80e8f7 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2643,8 +2643,8 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
>  
>  		size = i_size_read(inode);
>  		if (iocb->ki_flags & IOCB_NOWAIT) {
> -			if (filemap_range_has_page(mapping, iocb->ki_pos,
> -						   iocb->ki_pos + count - 1))
> +			if (filemap_range_needs_writeback(mapping, iocb->ki_pos,
> +							  iocb->ki_pos + count - 1))

Please avoid the overy long line, which is trivial to do by using the
normal two tab indent for the continuation.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/3] iomap: use filemap_range_needs_writeback() for O_DIRECT reads
  2021-02-09  2:30 ` [PATCH 3/3] iomap: " Jens Axboe
@ 2021-02-09  7:51   ` Christoph Hellwig
  2021-02-09 14:29     ` Jens Axboe
  0 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2021-02-09  7:51 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-fsdevel, linux-mm, hch, akpm

On Mon, Feb 08, 2021 at 07:30:08PM -0700, Jens Axboe wrote:
> +		if (iocb->ki_flags & IOCB_NOWAIT) {
> +			if (filemap_range_needs_writeback(mapping, pos, end)) {
> +				ret = -EAGAIN;
> +				goto out_free_dio;
> +			}
> +			flags |= IOMAP_NOWAIT;
> +		}
>  		if (iter_is_iovec(iter))
>  			dio->flags |= IOMAP_DIO_DIRTY;
>  	} else {
> +		if (iocb->ki_flags & IOCB_NOWAIT) {
> +			if (filemap_range_has_page(mapping, pos, end)) {
> +				ret = -EAGAIN;
> +				goto out_free_dio;
> +			}
> +			flags |= IOMAP_NOWAIT;
> +		}
> +
>  		flags |= IOMAP_WRITE;
>  		dio->flags |= IOMAP_DIO_WRITE;
>  
> @@ -478,14 +493,6 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
>  			dio->flags |= IOMAP_DIO_WRITE_FUA;
>  	}
>  
> -	if (iocb->ki_flags & IOCB_NOWAIT) {
> -		if (filemap_range_has_page(mapping, pos, end)) {
> -			ret = -EAGAIN;
> -			goto out_free_dio;
> -		}
> -		flags |= IOMAP_NOWAIT;
> -	}

looking at this I really hate the scheme with the potential racyness
and duplicated page looksups.

Why can't we pass a nonblock flag to filemap_write_and_wait_range
and invalidate_inode_pages2_range that makes them return -EAGAIN
when they would block to clean this whole mess up?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] mm: use filemap_range_needs_writeback() for O_DIRECT reads
  2021-02-09  7:48   ` Christoph Hellwig
@ 2021-02-09 14:27     ` Jens Axboe
  0 siblings, 0 replies; 17+ messages in thread
From: Jens Axboe @ 2021-02-09 14:27 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-fsdevel, linux-mm, akpm

On 2/9/21 12:48 AM, Christoph Hellwig wrote:
> On Mon, Feb 08, 2021 at 07:30:07PM -0700, Jens Axboe wrote:
>> For the generic page cache read helper, use the better variant of checking
>> for the need to call filemap_write_and_wait_range() when doing O_DIRECT
>> reads. This avoids falling back to the slow path for IOCB_NOWAIT, if there
>> are no pages to wait for (or write out).
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>  mm/filemap.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/filemap.c b/mm/filemap.c
>> index 6a58d50fbd31..c80acb80e8f7 100644
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -2643,8 +2643,8 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
>>  
>>  		size = i_size_read(inode);
>>  		if (iocb->ki_flags & IOCB_NOWAIT) {
>> -			if (filemap_range_has_page(mapping, iocb->ki_pos,
>> -						   iocb->ki_pos + count - 1))
>> +			if (filemap_range_needs_writeback(mapping, iocb->ki_pos,
>> +							  iocb->ki_pos + count - 1))
> 
> Please avoid the overy long line, which is trivial to do by using the
> normal two tab indent for the continuation.

Sure, fixed.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/3] iomap: use filemap_range_needs_writeback() for O_DIRECT reads
  2021-02-09  7:51   ` Christoph Hellwig
@ 2021-02-09 14:29     ` Jens Axboe
  0 siblings, 0 replies; 17+ messages in thread
From: Jens Axboe @ 2021-02-09 14:29 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-fsdevel, linux-mm, akpm

On 2/9/21 12:51 AM, Christoph Hellwig wrote:
> On Mon, Feb 08, 2021 at 07:30:08PM -0700, Jens Axboe wrote:
>> +		if (iocb->ki_flags & IOCB_NOWAIT) {
>> +			if (filemap_range_needs_writeback(mapping, pos, end)) {
>> +				ret = -EAGAIN;
>> +				goto out_free_dio;
>> +			}
>> +			flags |= IOMAP_NOWAIT;
>> +		}
>>  		if (iter_is_iovec(iter))
>>  			dio->flags |= IOMAP_DIO_DIRTY;
>>  	} else {
>> +		if (iocb->ki_flags & IOCB_NOWAIT) {
>> +			if (filemap_range_has_page(mapping, pos, end)) {
>> +				ret = -EAGAIN;
>> +				goto out_free_dio;
>> +			}
>> +			flags |= IOMAP_NOWAIT;
>> +		}
>> +
>>  		flags |= IOMAP_WRITE;
>>  		dio->flags |= IOMAP_DIO_WRITE;
>>  
>> @@ -478,14 +493,6 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
>>  			dio->flags |= IOMAP_DIO_WRITE_FUA;
>>  	}
>>  
>> -	if (iocb->ki_flags & IOCB_NOWAIT) {
>> -		if (filemap_range_has_page(mapping, pos, end)) {
>> -			ret = -EAGAIN;
>> -			goto out_free_dio;
>> -		}
>> -		flags |= IOMAP_NOWAIT;
>> -	}
> 
> looking at this I really hate the scheme with the potential racyness
> and duplicated page looksups.

Me too

> Why can't we pass a nonblock flag to filemap_write_and_wait_range
> and invalidate_inode_pages2_range that makes them return -EAGAIN
> when they would block to clean this whole mess up?

We could, but that's a _lot_ of surgery. I'd rather live with the
slight race for now instead of teaching writepages, page laundering,
etc about IOCB_NOWAIT.

I do think that's a worthy long term goal, but we dio read situation
is bad enough that it warrants a quicker fix.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] mm: provide filemap_range_needs_writeback() helper
  2021-02-09  7:47   ` Christoph Hellwig
@ 2021-02-09 14:30     ` Jens Axboe
  0 siblings, 0 replies; 17+ messages in thread
From: Jens Axboe @ 2021-02-09 14:30 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-fsdevel, linux-mm, akpm

On 2/9/21 12:47 AM, Christoph Hellwig wrote:
>> +extern bool filemap_range_needs_writeback(struct address_space *,
>> +					  loff_t lstart, loff_t lend);
> 
> no need for the extern.

Just following the style in the file.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHSET v2 0/3] Improve IOCB_NOWAIT O_DIRECT reads
  2021-02-09  2:30 [PATCHSET v2 0/3] Improve IOCB_NOWAIT O_DIRECT reads Jens Axboe
                   ` (2 preceding siblings ...)
  2021-02-09  2:30 ` [PATCH 3/3] iomap: " Jens Axboe
@ 2021-02-09 19:55 ` Andrew Morton
  2021-02-09 20:11   ` Jens Axboe
  3 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2021-02-09 19:55 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-fsdevel, linux-mm, hch

On Mon,  8 Feb 2021 19:30:05 -0700 Jens Axboe <axboe@kernel.dk> wrote:

> Hi,
> 
> For v1, see:
> 
> https://lore.kernel.org/linux-fsdevel/20210208221829.17247-1-axboe@kernel.dk/
> 
> tldr; don't -EAGAIN IOCB_NOWAIT dio reads just because we have page cache
> entries for the given range. This causes unnecessary work from the callers
> side, when the IO could have been issued totally fine without blocking on
> writeback when there is none.
> 

Seems a good idea.  Obviously we'll do more work in the case where some
writeback needs doing, but we'll be doing synchronous writeout in that
case anyway so who cares.

Please remind me what prevents pages from becoming dirty during or
immediately after the filemap_range_needs_writeback() check?  Perhaps
filemap_range_needs_writeback() could have a comment explaining what it
is that keeps its return value true after it has returned it!

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHSET v2 0/3] Improve IOCB_NOWAIT O_DIRECT reads
  2021-02-09 19:55 ` [PATCHSET v2 0/3] Improve IOCB_NOWAIT " Andrew Morton
@ 2021-02-09 20:11   ` Jens Axboe
  2021-02-10  8:07       ` Sedat Dilek
  0 siblings, 1 reply; 17+ messages in thread
From: Jens Axboe @ 2021-02-09 20:11 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-fsdevel, linux-mm, hch

On 2/9/21 12:55 PM, Andrew Morton wrote:
> On Mon,  8 Feb 2021 19:30:05 -0700 Jens Axboe <axboe@kernel.dk> wrote:
> 
>> Hi,
>>
>> For v1, see:
>>
>> https://lore.kernel.org/linux-fsdevel/20210208221829.17247-1-axboe@kernel.dk/
>>
>> tldr; don't -EAGAIN IOCB_NOWAIT dio reads just because we have page cache
>> entries for the given range. This causes unnecessary work from the callers
>> side, when the IO could have been issued totally fine without blocking on
>> writeback when there is none.
>>
> 
> Seems a good idea.  Obviously we'll do more work in the case where some
> writeback needs doing, but we'll be doing synchronous writeout in that
> case anyway so who cares.

Right, I think that'll be a round two on top of this, so we can make the
write side happier too. That's a bit more involved...

> Please remind me what prevents pages from becoming dirty during or
> immediately after the filemap_range_needs_writeback() check?  Perhaps
> filemap_range_needs_writeback() could have a comment explaining what it
> is that keeps its return value true after it has returned it!

It's inherently racy, just like it is now. There's really no difference
there, and I don't think there's a way to close that. Even if you
modified filemap_write_and_wait_range() to be non-block friendly,
there's nothing stopping anyone from adding dirty page cache right after
that call.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHSET v2 0/3] Improve IOCB_NOWAIT O_DIRECT reads
  2021-02-09 20:11   ` Jens Axboe
@ 2021-02-10  8:07       ` Sedat Dilek
  0 siblings, 0 replies; 17+ messages in thread
From: Sedat Dilek @ 2021-02-10  8:07 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Andrew Morton, linux-fsdevel, linux-mm, hch

On Tue, Feb 9, 2021 at 10:25 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 2/9/21 12:55 PM, Andrew Morton wrote:
> > On Mon,  8 Feb 2021 19:30:05 -0700 Jens Axboe <axboe@kernel.dk> wrote:
> >
> >> Hi,
> >>
> >> For v1, see:
> >>
> >> https://lore.kernel.org/linux-fsdevel/20210208221829.17247-1-axboe@kernel.dk/
> >>
> >> tldr; don't -EAGAIN IOCB_NOWAIT dio reads just because we have page cache
> >> entries for the given range. This causes unnecessary work from the callers
> >> side, when the IO could have been issued totally fine without blocking on
> >> writeback when there is none.
> >>
> >
> > Seems a good idea.  Obviously we'll do more work in the case where some
> > writeback needs doing, but we'll be doing synchronous writeout in that
> > case anyway so who cares.
>
> Right, I think that'll be a round two on top of this, so we can make the
> write side happier too. That's a bit more involved...
>
> > Please remind me what prevents pages from becoming dirty during or
> > immediately after the filemap_range_needs_writeback() check?  Perhaps
> > filemap_range_needs_writeback() could have a comment explaining what it
> > is that keeps its return value true after it has returned it!
>
> It's inherently racy, just like it is now. There's really no difference
> there, and I don't think there's a way to close that. Even if you
> modified filemap_write_and_wait_range() to be non-block friendly,
> there's nothing stopping anyone from adding dirty page cache right after
> that call.
>

Jens, do you have some numbers before and after your patchset is applied?

And kindly a test "profile" for FIO :-)?

Thanks.

- Sedat -

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHSET v2 0/3] Improve IOCB_NOWAIT O_DIRECT reads
@ 2021-02-10  8:07       ` Sedat Dilek
  0 siblings, 0 replies; 17+ messages in thread
From: Sedat Dilek @ 2021-02-10  8:07 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Andrew Morton, linux-fsdevel, linux-mm, hch

On Tue, Feb 9, 2021 at 10:25 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 2/9/21 12:55 PM, Andrew Morton wrote:
> > On Mon,  8 Feb 2021 19:30:05 -0700 Jens Axboe <axboe@kernel.dk> wrote:
> >
> >> Hi,
> >>
> >> For v1, see:
> >>
> >> https://lore.kernel.org/linux-fsdevel/20210208221829.17247-1-axboe@kernel.dk/
> >>
> >> tldr; don't -EAGAIN IOCB_NOWAIT dio reads just because we have page cache
> >> entries for the given range. This causes unnecessary work from the callers
> >> side, when the IO could have been issued totally fine without blocking on
> >> writeback when there is none.
> >>
> >
> > Seems a good idea.  Obviously we'll do more work in the case where some
> > writeback needs doing, but we'll be doing synchronous writeout in that
> > case anyway so who cares.
>
> Right, I think that'll be a round two on top of this, so we can make the
> write side happier too. That's a bit more involved...
>
> > Please remind me what prevents pages from becoming dirty during or
> > immediately after the filemap_range_needs_writeback() check?  Perhaps
> > filemap_range_needs_writeback() could have a comment explaining what it
> > is that keeps its return value true after it has returned it!
>
> It's inherently racy, just like it is now. There's really no difference
> there, and I don't think there's a way to close that. Even if you
> modified filemap_write_and_wait_range() to be non-block friendly,
> there's nothing stopping anyone from adding dirty page cache right after
> that call.
>

Jens, do you have some numbers before and after your patchset is applied?

And kindly a test "profile" for FIO :-)?

Thanks.

- Sedat -


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHSET v2 0/3] Improve IOCB_NOWAIT O_DIRECT reads
  2021-02-10  8:07       ` Sedat Dilek
  (?)
@ 2021-02-10 14:47       ` Jens Axboe
  -1 siblings, 0 replies; 17+ messages in thread
From: Jens Axboe @ 2021-02-10 14:47 UTC (permalink / raw)
  To: sedat.dilek; +Cc: Andrew Morton, linux-fsdevel, linux-mm, hch

On 2/10/21 1:07 AM, Sedat Dilek wrote:
> On Tue, Feb 9, 2021 at 10:25 PM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 2/9/21 12:55 PM, Andrew Morton wrote:
>>> On Mon,  8 Feb 2021 19:30:05 -0700 Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>>> Hi,
>>>>
>>>> For v1, see:
>>>>
>>>> https://lore.kernel.org/linux-fsdevel/20210208221829.17247-1-axboe@kernel.dk/
>>>>
>>>> tldr; don't -EAGAIN IOCB_NOWAIT dio reads just because we have page cache
>>>> entries for the given range. This causes unnecessary work from the callers
>>>> side, when the IO could have been issued totally fine without blocking on
>>>> writeback when there is none.
>>>>
>>>
>>> Seems a good idea.  Obviously we'll do more work in the case where some
>>> writeback needs doing, but we'll be doing synchronous writeout in that
>>> case anyway so who cares.
>>
>> Right, I think that'll be a round two on top of this, so we can make the
>> write side happier too. That's a bit more involved...
>>
>>> Please remind me what prevents pages from becoming dirty during or
>>> immediately after the filemap_range_needs_writeback() check?  Perhaps
>>> filemap_range_needs_writeback() could have a comment explaining what it
>>> is that keeps its return value true after it has returned it!
>>
>> It's inherently racy, just like it is now. There's really no difference
>> there, and I don't think there's a way to close that. Even if you
>> modified filemap_write_and_wait_range() to be non-block friendly,
>> there's nothing stopping anyone from adding dirty page cache right after
>> that call.
>>
> 
> Jens, do you have some numbers before and after your patchset is applied?

I don't, the load was pretty light for the test case - it was just doing
33-34K of O_DIRECT 4k random reads in a pretty small range of the device.
When you end up having page cache in that range, that means you end up
punting a LOT of requests to the async worker. So it wasn't as much a
performance win for this particular case, but an efficiency win. You get
rid of a worker using 40% CPU, and reduce the latencies.

> And kindly a test "profile" for FIO :-)?

To reproduce this, have a small range dio rand reads and then have
something else that does a few buffered reads from the same range.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/3] iomap: use filemap_range_needs_writeback() for O_DIRECT reads
  2021-02-24 16:44 ` [PATCH 3/3] iomap: use filemap_range_needs_writeback() for " Jens Axboe
@ 2021-02-24 17:25   ` Jan Kara
  0 siblings, 0 replies; 17+ messages in thread
From: Jan Kara @ 2021-02-24 17:25 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-fsdevel, linux-mm, akpm

On Wed 24-02-21 09:44:55, Jens Axboe wrote:
> For reads, use the better variant of checking for the need to call
> filemap_write_and_wait_range() when doing O_DIRECT. This avoids falling
> back to the slow path for IOCB_NOWAIT, if there are no pages to wait for
> (or write out).
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Looks good to me. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/iomap/direct-io.c | 24 ++++++++++++++++--------
>  1 file changed, 16 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index e2c4991833b8..8c35a0041f0f 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -487,12 +487,28 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
>  		if (pos >= dio->i_size)
>  			goto out_free_dio;
>  
> +		if (iocb->ki_flags & IOCB_NOWAIT) {
> +			if (filemap_range_needs_writeback(mapping, pos, end)) {
> +				ret = -EAGAIN;
> +				goto out_free_dio;
> +			}
> +			iomap_flags |= IOMAP_NOWAIT;
> +		}
> +
>  		if (iter_is_iovec(iter))
>  			dio->flags |= IOMAP_DIO_DIRTY;
>  	} else {
>  		iomap_flags |= IOMAP_WRITE;
>  		dio->flags |= IOMAP_DIO_WRITE;
>  
> +		if (iocb->ki_flags & IOCB_NOWAIT) {
> +			if (filemap_range_has_page(mapping, pos, end)) {
> +				ret = -EAGAIN;
> +				goto out_free_dio;
> +			}
> +			iomap_flags |= IOMAP_NOWAIT;
> +		}
> +
>  		/* for data sync or sync, we need sync completion processing */
>  		if (iocb->ki_flags & IOCB_DSYNC)
>  			dio->flags |= IOMAP_DIO_NEED_SYNC;
> @@ -507,14 +523,6 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
>  			dio->flags |= IOMAP_DIO_WRITE_FUA;
>  	}
>  
> -	if (iocb->ki_flags & IOCB_NOWAIT) {
> -		if (filemap_range_has_page(mapping, pos, end)) {
> -			ret = -EAGAIN;
> -			goto out_free_dio;
> -		}
> -		iomap_flags |= IOMAP_NOWAIT;
> -	}
> -
>  	if (dio_flags & IOMAP_DIO_OVERWRITE_ONLY) {
>  		ret = -EAGAIN;
>  		if (pos >= dio->i_size || pos + count > dio->i_size)
> -- 
> 2.30.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 3/3] iomap: use filemap_range_needs_writeback() for O_DIRECT reads
  2021-02-24 16:44 [PATCHSET v3 " Jens Axboe
@ 2021-02-24 16:44 ` Jens Axboe
  2021-02-24 17:25   ` Jan Kara
  0 siblings, 1 reply; 17+ messages in thread
From: Jens Axboe @ 2021-02-24 16:44 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: akpm, Jens Axboe

For reads, use the better variant of checking for the need to call
filemap_write_and_wait_range() when doing O_DIRECT. This avoids falling
back to the slow path for IOCB_NOWAIT, if there are no pages to wait for
(or write out).

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/iomap/direct-io.c | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index e2c4991833b8..8c35a0041f0f 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -487,12 +487,28 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 		if (pos >= dio->i_size)
 			goto out_free_dio;
 
+		if (iocb->ki_flags & IOCB_NOWAIT) {
+			if (filemap_range_needs_writeback(mapping, pos, end)) {
+				ret = -EAGAIN;
+				goto out_free_dio;
+			}
+			iomap_flags |= IOMAP_NOWAIT;
+		}
+
 		if (iter_is_iovec(iter))
 			dio->flags |= IOMAP_DIO_DIRTY;
 	} else {
 		iomap_flags |= IOMAP_WRITE;
 		dio->flags |= IOMAP_DIO_WRITE;
 
+		if (iocb->ki_flags & IOCB_NOWAIT) {
+			if (filemap_range_has_page(mapping, pos, end)) {
+				ret = -EAGAIN;
+				goto out_free_dio;
+			}
+			iomap_flags |= IOMAP_NOWAIT;
+		}
+
 		/* for data sync or sync, we need sync completion processing */
 		if (iocb->ki_flags & IOCB_DSYNC)
 			dio->flags |= IOMAP_DIO_NEED_SYNC;
@@ -507,14 +523,6 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 			dio->flags |= IOMAP_DIO_WRITE_FUA;
 	}
 
-	if (iocb->ki_flags & IOCB_NOWAIT) {
-		if (filemap_range_has_page(mapping, pos, end)) {
-			ret = -EAGAIN;
-			goto out_free_dio;
-		}
-		iomap_flags |= IOMAP_NOWAIT;
-	}
-
 	if (dio_flags & IOMAP_DIO_OVERWRITE_ONLY) {
 		ret = -EAGAIN;
 		if (pos >= dio->i_size || pos + count > dio->i_size)
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-02-24 17:26 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-09  2:30 [PATCHSET v2 0/3] Improve IOCB_NOWAIT O_DIRECT reads Jens Axboe
2021-02-09  2:30 ` [PATCH 1/3] mm: provide filemap_range_needs_writeback() helper Jens Axboe
2021-02-09  7:47   ` Christoph Hellwig
2021-02-09 14:30     ` Jens Axboe
2021-02-09  2:30 ` [PATCH 2/3] mm: use filemap_range_needs_writeback() for O_DIRECT reads Jens Axboe
2021-02-09  7:48   ` Christoph Hellwig
2021-02-09 14:27     ` Jens Axboe
2021-02-09  2:30 ` [PATCH 3/3] iomap: " Jens Axboe
2021-02-09  7:51   ` Christoph Hellwig
2021-02-09 14:29     ` Jens Axboe
2021-02-09 19:55 ` [PATCHSET v2 0/3] Improve IOCB_NOWAIT " Andrew Morton
2021-02-09 20:11   ` Jens Axboe
2021-02-10  8:07     ` Sedat Dilek
2021-02-10  8:07       ` Sedat Dilek
2021-02-10 14:47       ` Jens Axboe
2021-02-24 16:44 [PATCHSET v3 " Jens Axboe
2021-02-24 16:44 ` [PATCH 3/3] iomap: use filemap_range_needs_writeback() for " Jens Axboe
2021-02-24 17:25   ` Jan Kara

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.