* [PATCH 1/3] mm: provide filemap_range_needs_writeback() helper
2021-02-24 16:44 [PATCHSET v3 0/3] Improve IOCB_NOWAIT O_DIRECT reads Jens Axboe
@ 2021-02-24 16:44 ` Jens Axboe
2021-02-24 16:50 ` Matthew Wilcox
2021-02-24 17:22 ` Jan Kara
2021-02-24 16:44 ` [PATCH 2/3] mm: use filemap_range_needs_writeback() for O_DIRECT reads Jens Axboe
2021-02-24 16:44 ` [PATCH 3/3] iomap: " Jens Axboe
2 siblings, 2 replies; 10+ messages in thread
From: Jens Axboe @ 2021-02-24 16:44 UTC (permalink / raw)
To: linux-fsdevel, linux-mm; +Cc: akpm, Jens Axboe
For O_DIRECT reads/writes, we check if we need to issue a call to
filemap_write_and_wait_range() to issue and/or wait for writeback for any
page in the given range. The existing mechanism just checks for a page in
the range, which is suboptimal for IOCB_NOWAIT as we'll fallback to the
slow path (and needing retry) if there's just a clean page cache page in
the range.
Provide filemap_range_needs_writeback() which tries a little harder to
check if we actually need to issue and/or wait for writeback in the
range.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
include/linux/fs.h | 2 ++
mm/filemap.c | 43 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 45 insertions(+)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6d8b1e7337e4..4925275e6365 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2633,6 +2633,8 @@ static inline int filemap_fdatawait(struct address_space *mapping)
extern bool filemap_range_has_page(struct address_space *, loff_t lstart,
loff_t lend);
+extern bool filemap_range_needs_writeback(struct address_space *,
+ loff_t lstart, loff_t lend);
extern int filemap_write_and_wait_range(struct address_space *mapping,
loff_t lstart, loff_t lend);
extern int __filemap_fdatawrite_range(struct address_space *mapping,
diff --git a/mm/filemap.c b/mm/filemap.c
index 6ff2a3fb0dc7..13338f877677 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -635,6 +635,49 @@ static bool mapping_needs_writeback(struct address_space *mapping)
return mapping->nrpages;
}
+/**
+ * filemap_range_needs_writeback - check if range potentially needs writeback
+ * @mapping: address space within which to check
+ * @start_byte: offset in bytes where the range starts
+ * @end_byte: offset in bytes where the range ends (inclusive)
+ *
+ * Find at least one page in the range supplied, usually used to check if
+ * direct writing in this range will trigger a writeback. Used by O_DIRECT
+ * read/write with IOCB_NOWAIT, to see if the caller needs to do
+ * filemap_write_and_wait_range() before proceeding.
+ *
+ * Return: %true if the caller should do filemap_write_and_wait_range() before
+ * doing O_DIRECT to a page in this range, %false otherwise.
+ */
+bool filemap_range_needs_writeback(struct address_space *mapping,
+ loff_t start_byte, loff_t end_byte)
+{
+ XA_STATE(xas, &mapping->i_pages, start_byte >> PAGE_SHIFT);
+ pgoff_t max = end_byte >> PAGE_SHIFT;
+ struct page *page;
+
+ if (!mapping_needs_writeback(mapping))
+ return false;
+ if (!mapping_tagged(mapping, PAGECACHE_TAG_DIRTY) &&
+ !mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
+ return false;
+ if (end_byte < start_byte)
+ return false;
+
+ rcu_read_lock();
+ xas_for_each(&xas, page, max) {
+ if (xas_retry(&xas, page))
+ continue;
+ if (xa_is_value(page))
+ continue;
+ if (PageDirty(page) || PageLocked(page) || PageWriteback(page))
+ break;
+ }
+ rcu_read_unlock();
+ return page != NULL;
+}
+EXPORT_SYMBOL_GPL(filemap_range_needs_writeback);
+
/**
* filemap_write_and_wait_range - write out & wait on a file range
* @mapping: the address_space for the pages
--
2.30.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] mm: provide filemap_range_needs_writeback() helper
2021-02-24 16:44 ` [PATCH 1/3] mm: provide filemap_range_needs_writeback() helper Jens Axboe
@ 2021-02-24 16:50 ` Matthew Wilcox
2021-02-24 16:53 ` Jens Axboe
2021-02-24 17:22 ` Jan Kara
1 sibling, 1 reply; 10+ messages in thread
From: Matthew Wilcox @ 2021-02-24 16:50 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-fsdevel, linux-mm, akpm
On Wed, Feb 24, 2021 at 09:44:53AM -0700, Jens Axboe wrote:
> +++ b/include/linux/fs.h
> @@ -2633,6 +2633,8 @@ static inline int filemap_fdatawait(struct address_space *mapping)
>
> extern bool filemap_range_has_page(struct address_space *, loff_t lstart,
> loff_t lend);
> +extern bool filemap_range_needs_writeback(struct address_space *,
> + loff_t lstart, loff_t lend);
> extern int filemap_write_and_wait_range(struct address_space *mapping,
> loff_t lstart, loff_t lend);
> extern int __filemap_fdatawrite_range(struct address_space *mapping,
These prototypes should all be in pagemap.h, not fs.h. I can do
a followup patch if you don't want to do it as part of this set.
Also we're deprecating the use of 'extern' for prototypes.
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] mm: provide filemap_range_needs_writeback() helper
2021-02-24 16:50 ` Matthew Wilcox
@ 2021-02-24 16:53 ` Jens Axboe
0 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2021-02-24 16:53 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: linux-fsdevel, linux-mm, akpm
On 2/24/21 9:50 AM, Matthew Wilcox wrote:
> On Wed, Feb 24, 2021 at 09:44:53AM -0700, Jens Axboe wrote:
>> +++ b/include/linux/fs.h
>> @@ -2633,6 +2633,8 @@ static inline int filemap_fdatawait(struct address_space *mapping)
>>
>> extern bool filemap_range_has_page(struct address_space *, loff_t lstart,
>> loff_t lend);
>> +extern bool filemap_range_needs_writeback(struct address_space *,
>> + loff_t lstart, loff_t lend);
>> extern int filemap_write_and_wait_range(struct address_space *mapping,
>> loff_t lstart, loff_t lend);
>> extern int __filemap_fdatawrite_range(struct address_space *mapping,
>
> These prototypes should all be in pagemap.h, not fs.h. I can do
> a followup patch if you don't want to do it as part of this set.
> Also we're deprecating the use of 'extern' for prototypes.
Agree on both of those, but that should probably be a separate patch
for both of those. extern is just following the existing style, and
fs.h is the existing location...
> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Thanks!
--
Jens Axboe
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] mm: provide filemap_range_needs_writeback() helper
2021-02-24 16:44 ` [PATCH 1/3] mm: provide filemap_range_needs_writeback() helper Jens Axboe
2021-02-24 16:50 ` Matthew Wilcox
@ 2021-02-24 17:22 ` Jan Kara
1 sibling, 0 replies; 10+ messages in thread
From: Jan Kara @ 2021-02-24 17:22 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-fsdevel, linux-mm, akpm
On Wed 24-02-21 09:44:53, Jens Axboe wrote:
> For O_DIRECT reads/writes, we check if we need to issue a call to
> filemap_write_and_wait_range() to issue and/or wait for writeback for any
> page in the given range. The existing mechanism just checks for a page in
> the range, which is suboptimal for IOCB_NOWAIT as we'll fallback to the
> slow path (and needing retry) if there's just a clean page cache page in
> the range.
>
> Provide filemap_range_needs_writeback() which tries a little harder to
> check if we actually need to issue and/or wait for writeback in the
> range.
>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Looks good to me. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> include/linux/fs.h | 2 ++
> mm/filemap.c | 43 +++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 45 insertions(+)
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 6d8b1e7337e4..4925275e6365 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2633,6 +2633,8 @@ static inline int filemap_fdatawait(struct address_space *mapping)
>
> extern bool filemap_range_has_page(struct address_space *, loff_t lstart,
> loff_t lend);
> +extern bool filemap_range_needs_writeback(struct address_space *,
> + loff_t lstart, loff_t lend);
> extern int filemap_write_and_wait_range(struct address_space *mapping,
> loff_t lstart, loff_t lend);
> extern int __filemap_fdatawrite_range(struct address_space *mapping,
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 6ff2a3fb0dc7..13338f877677 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -635,6 +635,49 @@ static bool mapping_needs_writeback(struct address_space *mapping)
> return mapping->nrpages;
> }
>
> +/**
> + * filemap_range_needs_writeback - check if range potentially needs writeback
> + * @mapping: address space within which to check
> + * @start_byte: offset in bytes where the range starts
> + * @end_byte: offset in bytes where the range ends (inclusive)
> + *
> + * Find at least one page in the range supplied, usually used to check if
> + * direct writing in this range will trigger a writeback. Used by O_DIRECT
> + * read/write with IOCB_NOWAIT, to see if the caller needs to do
> + * filemap_write_and_wait_range() before proceeding.
> + *
> + * Return: %true if the caller should do filemap_write_and_wait_range() before
> + * doing O_DIRECT to a page in this range, %false otherwise.
> + */
> +bool filemap_range_needs_writeback(struct address_space *mapping,
> + loff_t start_byte, loff_t end_byte)
> +{
> + XA_STATE(xas, &mapping->i_pages, start_byte >> PAGE_SHIFT);
> + pgoff_t max = end_byte >> PAGE_SHIFT;
> + struct page *page;
> +
> + if (!mapping_needs_writeback(mapping))
> + return false;
> + if (!mapping_tagged(mapping, PAGECACHE_TAG_DIRTY) &&
> + !mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
> + return false;
> + if (end_byte < start_byte)
> + return false;
> +
> + rcu_read_lock();
> + xas_for_each(&xas, page, max) {
> + if (xas_retry(&xas, page))
> + continue;
> + if (xa_is_value(page))
> + continue;
> + if (PageDirty(page) || PageLocked(page) || PageWriteback(page))
> + break;
> + }
> + rcu_read_unlock();
> + return page != NULL;
> +}
> +EXPORT_SYMBOL_GPL(filemap_range_needs_writeback);
> +
> /**
> * filemap_write_and_wait_range - write out & wait on a file range
> * @mapping: the address_space for the pages
> --
> 2.30.0
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/3] mm: use filemap_range_needs_writeback() for O_DIRECT reads
2021-02-24 16:44 [PATCHSET v3 0/3] Improve IOCB_NOWAIT O_DIRECT reads Jens Axboe
2021-02-24 16:44 ` [PATCH 1/3] mm: provide filemap_range_needs_writeback() helper Jens Axboe
@ 2021-02-24 16:44 ` Jens Axboe
2021-02-24 16:57 ` Matthew Wilcox
2021-02-24 17:23 ` Jan Kara
2021-02-24 16:44 ` [PATCH 3/3] iomap: " Jens Axboe
2 siblings, 2 replies; 10+ messages in thread
From: Jens Axboe @ 2021-02-24 16:44 UTC (permalink / raw)
To: linux-fsdevel, linux-mm; +Cc: akpm, Jens Axboe
For the generic page cache read helper, use the better variant of checking
for the need to call filemap_write_and_wait_range() when doing O_DIRECT
reads. This avoids falling back to the slow path for IOCB_NOWAIT, if there
are no pages to wait for (or write out).
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
mm/filemap.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index 13338f877677..77f1b527541e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2645,8 +2645,8 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
size = i_size_read(inode);
if (iocb->ki_flags & IOCB_NOWAIT) {
- if (filemap_range_has_page(mapping, iocb->ki_pos,
- iocb->ki_pos + count - 1))
+ if (filemap_range_needs_writeback(mapping, iocb->ki_pos,
+ iocb->ki_pos + count - 1))
return -EAGAIN;
} else {
retval = filemap_write_and_wait_range(mapping,
--
2.30.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 2/3] mm: use filemap_range_needs_writeback() for O_DIRECT reads
2021-02-24 16:44 ` [PATCH 2/3] mm: use filemap_range_needs_writeback() for O_DIRECT reads Jens Axboe
@ 2021-02-24 16:57 ` Matthew Wilcox
2021-02-24 17:23 ` Jan Kara
1 sibling, 0 replies; 10+ messages in thread
From: Matthew Wilcox @ 2021-02-24 16:57 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-fsdevel, linux-mm, akpm
On Wed, Feb 24, 2021 at 09:44:54AM -0700, Jens Axboe wrote:
> For the generic page cache read helper, use the better variant of checking
> for the need to call filemap_write_and_wait_range() when doing O_DIRECT
> reads. This avoids falling back to the slow path for IOCB_NOWAIT, if there
> are no pages to wait for (or write out).
>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/3] mm: use filemap_range_needs_writeback() for O_DIRECT reads
2021-02-24 16:44 ` [PATCH 2/3] mm: use filemap_range_needs_writeback() for O_DIRECT reads Jens Axboe
2021-02-24 16:57 ` Matthew Wilcox
@ 2021-02-24 17:23 ` Jan Kara
1 sibling, 0 replies; 10+ messages in thread
From: Jan Kara @ 2021-02-24 17:23 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-fsdevel, linux-mm, akpm
On Wed 24-02-21 09:44:54, Jens Axboe wrote:
> For the generic page cache read helper, use the better variant of checking
> for the need to call filemap_write_and_wait_range() when doing O_DIRECT
> reads. This avoids falling back to the slow path for IOCB_NOWAIT, if there
> are no pages to wait for (or write out).
>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
> mm/filemap.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
Looks good to me. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 13338f877677..77f1b527541e 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2645,8 +2645,8 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
>
> size = i_size_read(inode);
> if (iocb->ki_flags & IOCB_NOWAIT) {
> - if (filemap_range_has_page(mapping, iocb->ki_pos,
> - iocb->ki_pos + count - 1))
> + if (filemap_range_needs_writeback(mapping, iocb->ki_pos,
> + iocb->ki_pos + count - 1))
> return -EAGAIN;
> } else {
> retval = filemap_write_and_wait_range(mapping,
> --
> 2.30.0
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 3/3] iomap: use filemap_range_needs_writeback() for O_DIRECT reads
2021-02-24 16:44 [PATCHSET v3 0/3] Improve IOCB_NOWAIT O_DIRECT reads Jens Axboe
2021-02-24 16:44 ` [PATCH 1/3] mm: provide filemap_range_needs_writeback() helper Jens Axboe
2021-02-24 16:44 ` [PATCH 2/3] mm: use filemap_range_needs_writeback() for O_DIRECT reads Jens Axboe
@ 2021-02-24 16:44 ` Jens Axboe
2021-02-24 17:25 ` Jan Kara
2 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2021-02-24 16:44 UTC (permalink / raw)
To: linux-fsdevel, linux-mm; +Cc: akpm, Jens Axboe
For reads, use the better variant of checking for the need to call
filemap_write_and_wait_range() when doing O_DIRECT. This avoids falling
back to the slow path for IOCB_NOWAIT, if there are no pages to wait for
(or write out).
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
fs/iomap/direct-io.c | 24 ++++++++++++++++--------
1 file changed, 16 insertions(+), 8 deletions(-)
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index e2c4991833b8..8c35a0041f0f 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -487,12 +487,28 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
if (pos >= dio->i_size)
goto out_free_dio;
+ if (iocb->ki_flags & IOCB_NOWAIT) {
+ if (filemap_range_needs_writeback(mapping, pos, end)) {
+ ret = -EAGAIN;
+ goto out_free_dio;
+ }
+ iomap_flags |= IOMAP_NOWAIT;
+ }
+
if (iter_is_iovec(iter))
dio->flags |= IOMAP_DIO_DIRTY;
} else {
iomap_flags |= IOMAP_WRITE;
dio->flags |= IOMAP_DIO_WRITE;
+ if (iocb->ki_flags & IOCB_NOWAIT) {
+ if (filemap_range_has_page(mapping, pos, end)) {
+ ret = -EAGAIN;
+ goto out_free_dio;
+ }
+ iomap_flags |= IOMAP_NOWAIT;
+ }
+
/* for data sync or sync, we need sync completion processing */
if (iocb->ki_flags & IOCB_DSYNC)
dio->flags |= IOMAP_DIO_NEED_SYNC;
@@ -507,14 +523,6 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
dio->flags |= IOMAP_DIO_WRITE_FUA;
}
- if (iocb->ki_flags & IOCB_NOWAIT) {
- if (filemap_range_has_page(mapping, pos, end)) {
- ret = -EAGAIN;
- goto out_free_dio;
- }
- iomap_flags |= IOMAP_NOWAIT;
- }
-
if (dio_flags & IOMAP_DIO_OVERWRITE_ONLY) {
ret = -EAGAIN;
if (pos >= dio->i_size || pos + count > dio->i_size)
--
2.30.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 3/3] iomap: use filemap_range_needs_writeback() for O_DIRECT reads
2021-02-24 16:44 ` [PATCH 3/3] iomap: " Jens Axboe
@ 2021-02-24 17:25 ` Jan Kara
0 siblings, 0 replies; 10+ messages in thread
From: Jan Kara @ 2021-02-24 17:25 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-fsdevel, linux-mm, akpm
On Wed 24-02-21 09:44:55, Jens Axboe wrote:
> For reads, use the better variant of checking for the need to call
> filemap_write_and_wait_range() when doing O_DIRECT. This avoids falling
> back to the slow path for IOCB_NOWAIT, if there are no pages to wait for
> (or write out).
>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Looks good to me. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/iomap/direct-io.c | 24 ++++++++++++++++--------
> 1 file changed, 16 insertions(+), 8 deletions(-)
>
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index e2c4991833b8..8c35a0041f0f 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -487,12 +487,28 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
> if (pos >= dio->i_size)
> goto out_free_dio;
>
> + if (iocb->ki_flags & IOCB_NOWAIT) {
> + if (filemap_range_needs_writeback(mapping, pos, end)) {
> + ret = -EAGAIN;
> + goto out_free_dio;
> + }
> + iomap_flags |= IOMAP_NOWAIT;
> + }
> +
> if (iter_is_iovec(iter))
> dio->flags |= IOMAP_DIO_DIRTY;
> } else {
> iomap_flags |= IOMAP_WRITE;
> dio->flags |= IOMAP_DIO_WRITE;
>
> + if (iocb->ki_flags & IOCB_NOWAIT) {
> + if (filemap_range_has_page(mapping, pos, end)) {
> + ret = -EAGAIN;
> + goto out_free_dio;
> + }
> + iomap_flags |= IOMAP_NOWAIT;
> + }
> +
> /* for data sync or sync, we need sync completion processing */
> if (iocb->ki_flags & IOCB_DSYNC)
> dio->flags |= IOMAP_DIO_NEED_SYNC;
> @@ -507,14 +523,6 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
> dio->flags |= IOMAP_DIO_WRITE_FUA;
> }
>
> - if (iocb->ki_flags & IOCB_NOWAIT) {
> - if (filemap_range_has_page(mapping, pos, end)) {
> - ret = -EAGAIN;
> - goto out_free_dio;
> - }
> - iomap_flags |= IOMAP_NOWAIT;
> - }
> -
> if (dio_flags & IOMAP_DIO_OVERWRITE_ONLY) {
> ret = -EAGAIN;
> if (pos >= dio->i_size || pos + count > dio->i_size)
> --
> 2.30.0
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 10+ messages in thread