* [PATCH v2 01/16] mm: Add AOP_UPDATED_PAGE return value
2020-10-09 14:30 [PATCH v2 00/16] Allow readpage to return a locked page Matthew Wilcox (Oracle)
@ 2020-10-09 14:30 ` Matthew Wilcox (Oracle)
2020-10-15 9:06 ` Christoph Hellwig
2020-10-09 14:30 ` [PATCH v2 02/16] mm: Inline wait_on_page_read into its one caller Matthew Wilcox (Oracle)
` (15 subsequent siblings)
16 siblings, 1 reply; 28+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-10-09 14:30 UTC (permalink / raw)
To: linux-fsdevel
Cc: Matthew Wilcox (Oracle),
linux-mm, v9fs-developer, linux-kernel, linux-afs, ceph-devel,
linux-cifs, ecryptfs, linux-um, linux-mtd, Richard Weinberger,
linux-xfs
Allow synchronous ->readpage implementations to execute more
efficiently by skipping the re-locking of the page.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
Documentation/filesystems/locking.rst | 7 ++++---
Documentation/filesystems/vfs.rst | 21 ++++++++++++++-------
include/linux/fs.h | 5 +++++
mm/filemap.c | 15 +++++++++++++--
4 files changed, 36 insertions(+), 12 deletions(-)
diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index 64f94a18d97e..06a7a8bf2362 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -269,7 +269,7 @@ locking rules:
ops PageLocked(page) i_rwsem
====================== ======================== =========
writepage: yes, unlocks (see below)
-readpage: yes, unlocks
+readpage: yes, may unlock
writepages:
set_page_dirty no
readahead: yes, unlocks
@@ -294,8 +294,9 @@ swap_deactivate: no
->write_begin(), ->write_end() and ->readpage() may be called from
the request handler (/dev/loop).
-->readpage() unlocks the page, either synchronously or via I/O
-completion.
+->readpage() may return AOP_UPDATED_PAGE if the page is now Uptodate
+or 0 if the page will be unlocked asynchronously by I/O completion.
+If it returns -errno, it should unlock the page.
->readahead() unlocks the pages that I/O is attempted on like ->readpage().
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index ca52c82e5bb5..16248c299aaa 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -643,7 +643,7 @@ set_page_dirty to write data into the address_space, and writepage and
writepages to writeback data to storage.
Adding and removing pages to/from an address_space is protected by the
-inode's i_mutex.
+inode's i_rwsem held exclusively.
When data is written to a page, the PG_Dirty flag should be set. It
typically remains set until writepage asks for it to be written. This
@@ -757,12 +757,19 @@ cache in your filesystem. The following members are defined:
``readpage``
called by the VM to read a page from backing store. The page
- will be Locked when readpage is called, and should be unlocked
- and marked uptodate once the read completes. If ->readpage
- discovers that it needs to unlock the page for some reason, it
- can do so, and then return AOP_TRUNCATED_PAGE. In this case,
- the page will be relocated, relocked and if that all succeeds,
- ->readpage will be called again.
+ will be Locked and !Uptodate when readpage is called. Ideally,
+ the filesystem will bring the page Uptodate and return
+ AOP_UPDATED_PAGE. If the filesystem encounters an error, it
+ should unlock the page and return a negative errno without marking
+ the page Uptodate. It does not need to mark the page as Error.
+ If the filesystem returns 0, this means the page will be unlocked
+ asynchronously by I/O completion. The VFS will wait for the
+ page to be unlocked, so there is no advantage to executing this
+ operation asynchronously.
+
+ The filesystem can also return AOP_TRUNCATED_PAGE to indicate
+ that it had to unlock the page to avoid a deadlock. The caller
+ will re-check the page cache and call ->readpage again.
``writepages``
called by the VM to write out pages associated with the
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 7519ae003a08..badf80e133fd 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -273,6 +273,10 @@ struct iattr {
* reference, it should drop it before retrying. Returned
* by readpage().
*
+ * @AOP_UPDATED_PAGE: The readpage method has brought the page Uptodate
+ * without releasing the page lock. This is suitable for synchronous
+ * implementations of readpage.
+ *
* address_space_operation functions return these large constants to indicate
* special semantics to the caller. These are much larger than the bytes in a
* page to allow for functions that return the number of bytes operated on in a
@@ -282,6 +286,7 @@ struct iattr {
enum positive_aop_returns {
AOP_WRITEPAGE_ACTIVATE = 0x80000,
AOP_TRUNCATED_PAGE = 0x80001,
+ AOP_UPDATED_PAGE = 0x80002,
};
#define AOP_FLAG_CONT_EXPAND 0x0001 /* called from cont_expand */
diff --git a/mm/filemap.c b/mm/filemap.c
index 1aaea26556cc..95b68ec1f22c 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2254,8 +2254,13 @@ ssize_t generic_file_buffered_read(struct kiocb *iocb,
* PG_error will be set again if readpage fails.
*/
ClearPageError(page);
- /* Start the actual read. The read will unlock the page. */
+ /* Start the actual read. The read may unlock the page. */
error = mapping->a_ops->readpage(filp, page);
+ if (error == AOP_UPDATED_PAGE) {
+ unlock_page(page);
+ error = 0;
+ goto page_ok;
+ }
if (unlikely(error)) {
if (error == AOP_TRUNCATED_PAGE) {
@@ -2619,7 +2624,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
*/
if (unlikely(!PageUptodate(page)))
goto page_not_uptodate;
-
+page_ok:
/*
* We've made it this far and we had to drop our mmap_lock, now is the
* time to return to the upper layer and have it re-find the vma and
@@ -2654,6 +2659,8 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
ClearPageError(page);
fpin = maybe_unlock_mmap_for_io(vmf, fpin);
error = mapping->a_ops->readpage(file, page);
+ if (error == AOP_UPDATED_PAGE)
+ goto page_ok;
if (!error) {
wait_on_page_locked(page);
if (!PageUptodate(page))
@@ -2867,6 +2874,10 @@ static struct page *do_read_cache_page(struct address_space *mapping,
err = filler(data, page);
else
err = mapping->a_ops->readpage(data, page);
+ if (err == AOP_UPDATED_PAGE) {
+ unlock_page(page);
+ goto out;
+ }
if (err < 0) {
put_page(page);
--
2.28.0
^ permalink raw reply related [flat|nested] 28+ messages in thread
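The three-way return convention this patch introduces (AOP_UPDATED_PAGE for a synchronous success, 0 for an asynchronous read, -errno for failure) can be modelled in userspace. The sketch below is not kernel code; the names mirror the kernel's, and the caller side is loosely modelled on the generic_file_buffered_read() hunk above.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model of the return-value convention; not kernel code. */
enum { AOP_TRUNCATED_PAGE = 0x80001, AOP_UPDATED_PAGE = 0x80002 };

struct page { int locked; int uptodate; };

/* A synchronous ->readpage: fills the page, marks it Uptodate, and
 * returns AOP_UPDATED_PAGE *without* dropping the page lock. */
static int sync_readpage(struct page *page)
{
	page->uptodate = 1;
	return AOP_UPDATED_PAGE;
}

/* Caller side: translate AOP_UPDATED_PAGE into "unlock and proceed",
 * as the new generic_file_buffered_read() hunk does. */
static int read_one_page(int (*readpage)(struct page *), struct page *page)
{
	int error;

	page->locked = 1;
	error = readpage(page);
	if (error == AOP_UPDATED_PAGE) {
		page->locked = 0;	/* unlock_page() */
		return 0;		/* page is Uptodate; use it */
	}
	if (error < 0)			/* implementation already unlocked */
		return error;
	/* error == 0: an async read; wait for I/O completion to unlock
	 * the page (elided in this model). */
	return 0;
}
```

The point of the convention is visible here: the synchronous path never cycles the lock, because the caller that already holds it does the single unlock.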
* Re: [PATCH v2 01/16] mm: Add AOP_UPDATED_PAGE return value
2020-10-09 14:30 ` [PATCH v2 01/16] mm: Add AOP_UPDATED_PAGE return value Matthew Wilcox (Oracle)
@ 2020-10-15 9:06 ` Christoph Hellwig
2020-10-15 14:06 ` Matthew Wilcox
0 siblings, 1 reply; 28+ messages in thread
From: Christoph Hellwig @ 2020-10-15 9:06 UTC (permalink / raw)
To: Matthew Wilcox (Oracle)
Cc: linux-fsdevel, linux-mm, v9fs-developer, linux-kernel, linux-afs,
ceph-devel, linux-cifs, ecryptfs, linux-um, linux-mtd,
Richard Weinberger, linux-xfs
Don't we also need to handle the new return value in a few other places
like cachefiles_read_reissue and swap_readpage? Maybe those don't get
called on the currently converted instances, but just leaving them
without handling AOP_UPDATED_PAGE seems like a time bomb.
* Re: [PATCH v2 01/16] mm: Add AOP_UPDATED_PAGE return value
2020-10-15 9:06 ` Christoph Hellwig
@ 2020-10-15 14:06 ` Matthew Wilcox
0 siblings, 0 replies; 28+ messages in thread
From: Matthew Wilcox @ 2020-10-15 14:06 UTC (permalink / raw)
To: Christoph Hellwig
Cc: linux-fsdevel, linux-mm, v9fs-developer, linux-kernel, linux-afs,
ceph-devel, linux-cifs, ecryptfs, linux-um, linux-mtd,
Richard Weinberger, linux-xfs
On Thu, Oct 15, 2020 at 10:06:51AM +0100, Christoph Hellwig wrote:
> Don't we also need to handle the new return value in a few other places
> like cachefiles_read_reissue and swap_readpage? Maybe those don't get
> called on the currently converted instances, but just leaving them
> without handling AOP_UPDATED_PAGE seems like a time bomb.
Er, right. And nobh_truncate_page(), and read_page(). And then I
noticed the bug in cachefiles_read_reissue(). Sigh.
Updated patch series coming soon.
* [PATCH v2 02/16] mm: Inline wait_on_page_read into its one caller
2020-10-09 14:30 [PATCH v2 00/16] Allow readpage to return a locked page Matthew Wilcox (Oracle)
2020-10-09 14:30 ` [PATCH v2 01/16] mm: Add AOP_UPDATED_PAGE return value Matthew Wilcox (Oracle)
@ 2020-10-09 14:30 ` Matthew Wilcox (Oracle)
2020-10-15 9:08 ` Christoph Hellwig
2020-10-09 14:30 ` [PATCH v2 03/16] 9p: Tell the VFS that readpage was synchronous Matthew Wilcox (Oracle)
` (14 subsequent siblings)
16 siblings, 1 reply; 28+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-10-09 14:30 UTC (permalink / raw)
To: linux-fsdevel
Cc: Matthew Wilcox (Oracle),
linux-mm, v9fs-developer, linux-kernel, linux-afs, ceph-devel,
linux-cifs, ecryptfs, linux-um, linux-mtd, Richard Weinberger,
linux-xfs
Inlining this code into its one caller makes the function easier to read.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
mm/filemap.c | 20 ++++----------------
1 file changed, 4 insertions(+), 16 deletions(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index 95b68ec1f22c..0ef06d515532 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2834,18 +2834,6 @@ EXPORT_SYMBOL(filemap_page_mkwrite);
EXPORT_SYMBOL(generic_file_mmap);
EXPORT_SYMBOL(generic_file_readonly_mmap);
-static struct page *wait_on_page_read(struct page *page)
-{
- if (!IS_ERR(page)) {
- wait_on_page_locked(page);
- if (!PageUptodate(page)) {
- put_page(page);
- page = ERR_PTR(-EIO);
- }
- }
- return page;
-}
-
static struct page *do_read_cache_page(struct address_space *mapping,
pgoff_t index,
int (*filler)(void *, struct page *),
@@ -2876,7 +2864,10 @@ static struct page *do_read_cache_page(struct address_space *mapping,
err = mapping->a_ops->readpage(data, page);
if (err == AOP_UPDATED_PAGE) {
unlock_page(page);
- goto out;
+ } else if (!err) {
+ wait_on_page_locked(page);
+ if (!PageUptodate(page))
+ err = -EIO;
}
if (err < 0) {
@@ -2884,9 +2875,6 @@ static struct page *do_read_cache_page(struct address_space *mapping,
return ERR_PTR(err);
}
- page = wait_on_page_read(page);
- if (IS_ERR(page))
- return page;
goto out;
}
if (PageUptodate(page))
--
2.28.0
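After the inlining, do_read_cache_page() ends up with a single dispatch on the ->readpage return value. A userspace sketch of that resulting control flow (not kernel code; wait_on_page_locked() is reduced to a comment):

```c
#include <assert.h>
#include <errno.h>

enum { AOP_UPDATED_PAGE = 0x80002 };

struct page { int locked; int uptodate; };

/* Model of what do_read_cache_page() does with the ->readpage return
 * value once wait_on_page_read() is folded in. */
static int handle_readpage_return(struct page *page, int err)
{
	if (err == AOP_UPDATED_PAGE) {
		page->locked = 0;	/* unlock_page() */
		err = 0;
	} else if (!err) {
		/* wait_on_page_locked() would block here until the
		 * async I/O completion unlocks the page */
		if (!page->uptodate)
			err = -EIO;
	}
	return err;			/* < 0: caller drops its reference */
}
```

Keeping the success, wait, and error branches side by side is what makes the inlined version easier to follow than a separate wait_on_page_read() helper returning an ERR_PTR.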
* Re: [PATCH v2 02/16] mm: Inline wait_on_page_read into its one caller
2020-10-09 14:30 ` [PATCH v2 02/16] mm: Inline wait_on_page_read into its one caller Matthew Wilcox (Oracle)
@ 2020-10-15 9:08 ` Christoph Hellwig
0 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2020-10-15 9:08 UTC (permalink / raw)
To: Matthew Wilcox (Oracle)
Cc: linux-fsdevel, linux-mm, v9fs-developer, linux-kernel, linux-afs,
ceph-devel, linux-cifs, ecryptfs, linux-um, linux-mtd,
Richard Weinberger, linux-xfs
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
* [PATCH v2 03/16] 9p: Tell the VFS that readpage was synchronous
2020-10-09 14:30 [PATCH v2 00/16] Allow readpage to return a locked page Matthew Wilcox (Oracle)
2020-10-09 14:30 ` [PATCH v2 01/16] mm: Add AOP_UPDATED_PAGE return value Matthew Wilcox (Oracle)
2020-10-09 14:30 ` [PATCH v2 02/16] mm: Inline wait_on_page_read into its one caller Matthew Wilcox (Oracle)
@ 2020-10-09 14:30 ` Matthew Wilcox (Oracle)
2020-10-09 14:30 ` [PATCH v2 04/16] afs: " Matthew Wilcox (Oracle)
` (13 subsequent siblings)
16 siblings, 0 replies; 28+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-10-09 14:30 UTC (permalink / raw)
To: linux-fsdevel
Cc: Matthew Wilcox (Oracle),
linux-mm, v9fs-developer, linux-kernel, linux-afs, ceph-devel,
linux-cifs, ecryptfs, linux-um, linux-mtd, Richard Weinberger,
linux-xfs, Dominique Martinet
The 9p readpage implementation was already synchronous, so use
AOP_UPDATED_PAGE to avoid cycling the page lock.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Dominique Martinet <asmadeus@codewreck.org>
---
fs/9p/vfs_addr.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c
index cce9ace651a2..506ca0ba2ec7 100644
--- a/fs/9p/vfs_addr.c
+++ b/fs/9p/vfs_addr.c
@@ -65,7 +65,7 @@ static int v9fs_fid_readpage(void *data, struct page *page)
SetPageUptodate(page);
v9fs_readpage_to_fscache(inode, page);
- retval = 0;
+ return AOP_UPDATED_PAGE;
done:
unlock_page(page);
@@ -280,6 +280,10 @@ static int v9fs_write_begin(struct file *filp, struct address_space *mapping,
goto out;
retval = v9fs_fid_readpage(v9inode->writeback_fid, page);
+ if (retval == AOP_UPDATED_PAGE) {
+ retval = 0;
+ goto out;
+ }
put_page(page);
if (!retval)
goto start;
--
2.28.0
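The 9p conversion above, and the per-filesystem patches that follow, are all instances of the same mechanical pattern: the success path stops unlocking the page and returns AOP_UPDATED_PAGE instead of 0, and any internal caller (such as ->write_begin) is taught to translate the new value. A hedged userspace sketch of the before/after shape (illustrative only, not the 9p code):

```c
#include <assert.h>

enum { AOP_UPDATED_PAGE = 0x80002 };

struct page { int locked; int uptodate; };

/* Before: the success path unlocked the page and returned 0. */
static int readpage_old(struct page *page)
{
	page->uptodate = 1;
	page->locked = 0;		/* unlock_page() */
	return 0;
}

/* After: keep the lock and report success via AOP_UPDATED_PAGE.
 * Error paths (not shown) still unlock and return -errno. */
static int readpage_new(struct page *page)
{
	page->uptodate = 1;
	return AOP_UPDATED_PAGE;
}

/* Internal callers such as v9fs_write_begin() must now translate the
 * new value back to 0; write_begin hands the page back locked anyway,
 * so no unlock is needed on this path. */
static int write_begin_read(struct page *page)
{
	int ret = readpage_new(page);

	if (ret == AOP_UPDATED_PAGE)
		ret = 0;
	return ret;
}
```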
* [PATCH v2 04/16] afs: Tell the VFS that readpage was synchronous
2020-10-09 14:30 [PATCH v2 00/16] Allow readpage to return a locked page Matthew Wilcox (Oracle)
` (2 preceding siblings ...)
2020-10-09 14:30 ` [PATCH v2 03/16] 9p: Tell the VFS that readpage was synchronous Matthew Wilcox (Oracle)
@ 2020-10-09 14:30 ` Matthew Wilcox (Oracle)
2020-10-09 14:30 ` [PATCH v2 05/16] ceph: " Matthew Wilcox (Oracle)
` (12 subsequent siblings)
16 siblings, 0 replies; 28+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-10-09 14:30 UTC (permalink / raw)
To: linux-fsdevel
Cc: Matthew Wilcox (Oracle),
linux-mm, v9fs-developer, linux-kernel, linux-afs, ceph-devel,
linux-cifs, ecryptfs, linux-um, linux-mtd, Richard Weinberger,
linux-xfs
The afs readpage implementation was already synchronous, so use
AOP_UPDATED_PAGE to avoid cycling the page lock.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/afs/file.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/afs/file.c b/fs/afs/file.c
index 371d1488cc54..4aa2ad3bea6a 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -367,7 +367,8 @@ int afs_page_filler(void *data, struct page *page)
BUG_ON(PageFsCache(page));
}
#endif
- unlock_page(page);
+ _leave(" = AOP_UPDATED_PAGE");
+ return AOP_UPDATED_PAGE;
}
_leave(" = 0");
--
2.28.0
* [PATCH v2 05/16] ceph: Tell the VFS that readpage was synchronous
2020-10-09 14:30 [PATCH v2 00/16] Allow readpage to return a locked page Matthew Wilcox (Oracle)
` (3 preceding siblings ...)
2020-10-09 14:30 ` [PATCH v2 04/16] afs: " Matthew Wilcox (Oracle)
@ 2020-10-09 14:30 ` Matthew Wilcox (Oracle)
2020-10-09 14:30 ` [PATCH v2 06/16] cifs: " Matthew Wilcox (Oracle)
` (11 subsequent siblings)
16 siblings, 0 replies; 28+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-10-09 14:30 UTC (permalink / raw)
To: linux-fsdevel
Cc: Matthew Wilcox (Oracle),
linux-mm, v9fs-developer, linux-kernel, linux-afs, ceph-devel,
linux-cifs, ecryptfs, linux-um, linux-mtd, Richard Weinberger,
linux-xfs, Jeff Layton
The ceph readpage implementation was already synchronous, so use
AOP_UPDATED_PAGE to avoid cycling the page lock.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
---
fs/ceph/addr.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 6ea761c84494..b2bf8bf7a312 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -291,10 +291,11 @@ static int ceph_do_readpage(struct file *filp, struct page *page)
static int ceph_readpage(struct file *filp, struct page *page)
{
int r = ceph_do_readpage(filp, page);
- if (r != -EINPROGRESS)
- unlock_page(page);
- else
- r = 0;
+ if (r == -EINPROGRESS)
+ return 0;
+ if (r == 0)
+ return AOP_UPDATED_PAGE;
+ unlock_page(page);
return r;
}
--
2.28.0
* [PATCH v2 06/16] cifs: Tell the VFS that readpage was synchronous
2020-10-09 14:30 [PATCH v2 00/16] Allow readpage to return a locked page Matthew Wilcox (Oracle)
` (4 preceding siblings ...)
2020-10-09 14:30 ` [PATCH v2 05/16] ceph: " Matthew Wilcox (Oracle)
@ 2020-10-09 14:30 ` Matthew Wilcox (Oracle)
2020-10-09 14:30 ` [PATCH v2 07/16] cramfs: " Matthew Wilcox (Oracle)
` (10 subsequent siblings)
16 siblings, 0 replies; 28+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-10-09 14:30 UTC (permalink / raw)
To: linux-fsdevel
Cc: Matthew Wilcox (Oracle),
linux-mm, v9fs-developer, linux-kernel, linux-afs, ceph-devel,
linux-cifs, ecryptfs, linux-um, linux-mtd, Richard Weinberger,
linux-xfs
The cifs readpage implementation was already synchronous, so use
AOP_UPDATED_PAGE to avoid cycling the page lock.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/cifs/file.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index be46fab4c96d..533b151a9143 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -4537,7 +4537,8 @@ static int cifs_readpage_worker(struct file *file, struct page *page,
/* send this page to the cache */
cifs_readpage_to_fscache(file_inode(file), page);
- rc = 0;
+ kunmap(page);
+ return AOP_UPDATED_PAGE;
io_error:
kunmap(page);
@@ -4677,7 +4678,10 @@ static int cifs_write_begin(struct file *file, struct address_space *mapping,
* an error, we don't need to return it. cifs_write_end will
* do a sync write instead since PG_uptodate isn't set.
*/
- cifs_readpage_worker(file, page, &page_start);
+ int err = cifs_readpage_worker(file, page, &page_start);
+
+ if (err == AOP_UPDATED_PAGE)
+ goto out;
put_page(page);
oncethru = 1;
goto start;
--
2.28.0
* [PATCH v2 07/16] cramfs: Tell the VFS that readpage was synchronous
2020-10-09 14:30 [PATCH v2 00/16] Allow readpage to return a locked page Matthew Wilcox (Oracle)
` (5 preceding siblings ...)
2020-10-09 14:30 ` [PATCH v2 06/16] cifs: " Matthew Wilcox (Oracle)
@ 2020-10-09 14:30 ` Matthew Wilcox (Oracle)
2020-10-09 14:30 ` [PATCH v2 08/16] ecryptfs: " Matthew Wilcox (Oracle)
` (9 subsequent siblings)
16 siblings, 0 replies; 28+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-10-09 14:30 UTC (permalink / raw)
To: linux-fsdevel
Cc: Matthew Wilcox (Oracle),
linux-mm, v9fs-developer, linux-kernel, linux-afs, ceph-devel,
linux-cifs, ecryptfs, linux-um, linux-mtd, Richard Weinberger,
linux-xfs
The cramfs readpage implementation was already synchronous, so use
AOP_UPDATED_PAGE to avoid cycling the page lock.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/cramfs/inode.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index 912308600d39..7a642146c074 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -916,15 +916,14 @@ static int cramfs_readpage(struct file *file, struct page *page)
flush_dcache_page(page);
kunmap(page);
SetPageUptodate(page);
- unlock_page(page);
- return 0;
+ return AOP_UPDATED_PAGE;
err:
kunmap(page);
ClearPageUptodate(page);
SetPageError(page);
unlock_page(page);
- return 0;
+ return -EIO;
}
static const struct address_space_operations cramfs_aops = {
--
2.28.0
* [PATCH v2 08/16] ecryptfs: Tell the VFS that readpage was synchronous
2020-10-09 14:30 [PATCH v2 00/16] Allow readpage to return a locked page Matthew Wilcox (Oracle)
` (6 preceding siblings ...)
2020-10-09 14:30 ` [PATCH v2 07/16] cramfs: " Matthew Wilcox (Oracle)
@ 2020-10-09 14:30 ` Matthew Wilcox (Oracle)
2020-10-09 14:30 ` [PATCH v2 09/16] fuse: " Matthew Wilcox (Oracle)
` (8 subsequent siblings)
16 siblings, 0 replies; 28+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-10-09 14:30 UTC (permalink / raw)
To: linux-fsdevel
Cc: Matthew Wilcox (Oracle),
linux-mm, v9fs-developer, linux-kernel, linux-afs, ceph-devel,
linux-cifs, ecryptfs, linux-um, linux-mtd, Richard Weinberger,
linux-xfs
The ecryptfs readpage implementation was already synchronous, so use
AOP_UPDATED_PAGE to avoid cycling the page lock.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/ecryptfs/mmap.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/fs/ecryptfs/mmap.c b/fs/ecryptfs/mmap.c
index 019572c6b39a..dee35181d789 100644
--- a/fs/ecryptfs/mmap.c
+++ b/fs/ecryptfs/mmap.c
@@ -219,12 +219,13 @@ static int ecryptfs_readpage(struct file *file, struct page *page)
}
}
out:
- if (rc)
- ClearPageUptodate(page);
- else
- SetPageUptodate(page);
- ecryptfs_printk(KERN_DEBUG, "Unlocking page with index = [0x%.16lx]\n",
+ ecryptfs_printk(KERN_DEBUG, "Returning page with index = [0x%.16lx]\n",
page->index);
+ if (!rc) {
+ SetPageUptodate(page);
+ return AOP_UPDATED_PAGE;
+ }
+ ClearPageUptodate(page);
unlock_page(page);
return rc;
}
--
2.28.0
* [PATCH v2 09/16] fuse: Tell the VFS that readpage was synchronous
2020-10-09 14:30 [PATCH v2 00/16] Allow readpage to return a locked page Matthew Wilcox (Oracle)
` (7 preceding siblings ...)
2020-10-09 14:30 ` [PATCH v2 08/16] ecryptfs: " Matthew Wilcox (Oracle)
@ 2020-10-09 14:30 ` Matthew Wilcox (Oracle)
2020-10-09 14:30 ` [PATCH v2 10/16] hostfs: " Matthew Wilcox (Oracle)
` (7 subsequent siblings)
16 siblings, 0 replies; 28+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-10-09 14:30 UTC (permalink / raw)
To: linux-fsdevel
Cc: Matthew Wilcox (Oracle),
linux-mm, v9fs-developer, linux-kernel, linux-afs, ceph-devel,
linux-cifs, ecryptfs, linux-um, linux-mtd, Richard Weinberger,
linux-xfs
The fuse readpage implementation was already synchronous, so use
AOP_UPDATED_PAGE to avoid cycling the page lock.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/fuse/file.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 6611ef3269a8..7aa5626bc582 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -850,6 +850,8 @@ static int fuse_readpage(struct file *file, struct page *page)
err = fuse_do_readpage(file, page);
fuse_invalidate_atime(inode);
+ if (!err)
+ return AOP_UPDATED_PAGE;
out:
unlock_page(page);
return err;
--
2.28.0
* [PATCH v2 10/16] hostfs: Tell the VFS that readpage was synchronous
2020-10-09 14:30 [PATCH v2 00/16] Allow readpage to return a locked page Matthew Wilcox (Oracle)
` (8 preceding siblings ...)
2020-10-09 14:30 ` [PATCH v2 09/16] fuse: " Matthew Wilcox (Oracle)
@ 2020-10-09 14:30 ` Matthew Wilcox (Oracle)
2020-10-09 14:30 ` [PATCH v2 11/16] jffs2: " Matthew Wilcox (Oracle)
` (6 subsequent siblings)
16 siblings, 0 replies; 28+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-10-09 14:30 UTC (permalink / raw)
To: linux-fsdevel
Cc: Matthew Wilcox (Oracle),
linux-mm, v9fs-developer, linux-kernel, linux-afs, ceph-devel,
linux-cifs, ecryptfs, linux-um, linux-mtd, Richard Weinberger,
linux-xfs
The hostfs readpage implementation was already synchronous, so use
AOP_UPDATED_PAGE to avoid cycling the page lock.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Richard Weinberger <richard@nod.at>
---
fs/hostfs/hostfs_kern.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c
index c070c0d8e3e9..c49221c09c4b 100644
--- a/fs/hostfs/hostfs_kern.c
+++ b/fs/hostfs/hostfs_kern.c
@@ -455,6 +455,8 @@ static int hostfs_readpage(struct file *file, struct page *page)
out:
flush_dcache_page(page);
kunmap(page);
+ if (!ret)
+ return AOP_UPDATED_PAGE;
unlock_page(page);
return ret;
}
--
2.28.0
* [PATCH v2 11/16] jffs2: Tell the VFS that readpage was synchronous
2020-10-09 14:30 [PATCH v2 00/16] Allow readpage to return a locked page Matthew Wilcox (Oracle)
` (9 preceding siblings ...)
2020-10-09 14:30 ` [PATCH v2 10/16] hostfs: " Matthew Wilcox (Oracle)
@ 2020-10-09 14:30 ` Matthew Wilcox (Oracle)
2020-10-09 14:31 ` [PATCH v2 12/16] ubifs: " Matthew Wilcox (Oracle)
` (5 subsequent siblings)
16 siblings, 0 replies; 28+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-10-09 14:30 UTC (permalink / raw)
To: linux-fsdevel
Cc: Matthew Wilcox (Oracle),
linux-mm, v9fs-developer, linux-kernel, linux-afs, ceph-devel,
linux-cifs, ecryptfs, linux-um, linux-mtd, Richard Weinberger,
linux-xfs
The jffs2 readpage implementation was already synchronous, so use
AOP_UPDATED_PAGE to avoid cycling the page lock.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Richard Weinberger <richard@nod.at>
---
fs/jffs2/file.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/jffs2/file.c b/fs/jffs2/file.c
index f8fb89b10227..959a74027041 100644
--- a/fs/jffs2/file.c
+++ b/fs/jffs2/file.c
@@ -116,15 +116,17 @@ int jffs2_do_readpage_unlock(void *data, struct page *pg)
return ret;
}
-
static int jffs2_readpage (struct file *filp, struct page *pg)
{
struct jffs2_inode_info *f = JFFS2_INODE_INFO(pg->mapping->host);
int ret;
mutex_lock(&f->sem);
- ret = jffs2_do_readpage_unlock(pg->mapping->host, pg);
+ ret = jffs2_do_readpage_nolock(pg->mapping->host, pg);
mutex_unlock(&f->sem);
+ if (!ret)
+ return AOP_UPDATED_PAGE;
+ unlock_page(pg);
return ret;
}
--
2.28.0
* [PATCH v2 12/16] ubifs: Tell the VFS that readpage was synchronous
2020-10-09 14:30 [PATCH v2 00/16] Allow readpage to return a locked page Matthew Wilcox (Oracle)
` (10 preceding siblings ...)
2020-10-09 14:30 ` [PATCH v2 11/16] jffs2: " Matthew Wilcox (Oracle)
@ 2020-10-09 14:31 ` Matthew Wilcox (Oracle)
2020-10-09 14:31 ` [PATCH v2 13/16] udf: " Matthew Wilcox (Oracle)
` (4 subsequent siblings)
16 siblings, 0 replies; 28+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-10-09 14:31 UTC (permalink / raw)
To: linux-fsdevel
Cc: Matthew Wilcox (Oracle),
linux-mm, v9fs-developer, linux-kernel, linux-afs, ceph-devel,
linux-cifs, ecryptfs, linux-um, linux-mtd, Richard Weinberger,
linux-xfs
The ubifs readpage implementation was already synchronous, so use
AOP_UPDATED_PAGE to avoid cycling the page lock.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Richard Weinberger <richard@nod.at>
---
fs/ubifs/file.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index b77d1637bbbc..82633509c45e 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -772,7 +772,6 @@ static int ubifs_do_bulk_read(struct ubifs_info *c, struct bu_info *bu,
if (err)
goto out_warn;
- unlock_page(page1);
ret = 1;
isize = i_size_read(inode);
@@ -892,11 +891,16 @@ static int ubifs_bulk_read(struct page *page)
static int ubifs_readpage(struct file *file, struct page *page)
{
- if (ubifs_bulk_read(page))
- return 0;
- do_readpage(page);
- unlock_page(page);
- return 0;
+ int err;
+
+ err = ubifs_bulk_read(page);
+ if (err == 0)
+ err = do_readpage(page);
+ if (err < 0) {
+ unlock_page(page);
+ return err;
+ }
+ return AOP_UPDATED_PAGE;
}
static int do_writepage(struct page *page, int len)
--
2.28.0
* [PATCH v2 13/16] udf: Tell the VFS that readpage was synchronous
2020-10-09 14:30 [PATCH v2 00/16] Allow readpage to return a locked page Matthew Wilcox (Oracle)
` (11 preceding siblings ...)
2020-10-09 14:31 ` [PATCH v2 12/16] ubifs: " Matthew Wilcox (Oracle)
@ 2020-10-09 14:31 ` Matthew Wilcox (Oracle)
2020-10-09 14:31 ` [PATCH v2 14/16] vboxsf: " Matthew Wilcox (Oracle)
` (3 subsequent siblings)
16 siblings, 0 replies; 28+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-10-09 14:31 UTC (permalink / raw)
To: linux-fsdevel
Cc: Matthew Wilcox (Oracle),
linux-mm, v9fs-developer, linux-kernel, linux-afs, ceph-devel,
linux-cifs, ecryptfs, linux-um, linux-mtd, Richard Weinberger,
linux-xfs, Jan Kara
The udf inline data readpage implementation was already synchronous,
so use AOP_UPDATED_PAGE to avoid cycling the page lock.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Jan Kara <jack@suse.cz>
---
fs/udf/file.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/udf/file.c b/fs/udf/file.c
index 628941a6b79a..52bbe92d7c43 100644
--- a/fs/udf/file.c
+++ b/fs/udf/file.c
@@ -61,9 +61,8 @@ static int udf_adinicb_readpage(struct file *file, struct page *page)
{
BUG_ON(!PageLocked(page));
__udf_adinicb_readpage(page);
- unlock_page(page);
- return 0;
+ return AOP_UPDATED_PAGE;
}
static int udf_adinicb_writepage(struct page *page,
--
2.28.0
* [PATCH v2 14/16] vboxsf: Tell the VFS that readpage was synchronous
2020-10-09 14:30 [PATCH v2 00/16] Allow readpage to return a locked page Matthew Wilcox (Oracle)
` (12 preceding siblings ...)
2020-10-09 14:31 ` [PATCH v2 13/16] udf: " Matthew Wilcox (Oracle)
@ 2020-10-09 14:31 ` Matthew Wilcox (Oracle)
2020-10-09 14:31 ` [PATCH v2 15/16] iomap: Inline iomap_iop_set_range_uptodate into its one caller Matthew Wilcox (Oracle)
` (2 subsequent siblings)
16 siblings, 0 replies; 28+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-10-09 14:31 UTC (permalink / raw)
To: linux-fsdevel
Cc: Matthew Wilcox (Oracle),
linux-mm, v9fs-developer, linux-kernel, linux-afs, ceph-devel,
linux-cifs, ecryptfs, linux-um, linux-mtd, Richard Weinberger,
linux-xfs
The vboxsf readpage implementation was already synchronous,
so use AOP_UPDATED_PAGE to avoid cycling the page lock.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/vboxsf/file.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/vboxsf/file.c b/fs/vboxsf/file.c
index c4ab5996d97a..c2a144e5cb5a 100644
--- a/fs/vboxsf/file.c
+++ b/fs/vboxsf/file.c
@@ -228,6 +228,8 @@ static int vboxsf_readpage(struct file *file, struct page *page)
}
kunmap(page);
+ if (!err)
+ return AOP_UPDATED_PAGE;
unlock_page(page);
return err;
}
--
2.28.0
* [PATCH v2 15/16] iomap: Inline iomap_iop_set_range_uptodate into its one caller
2020-10-09 14:30 [PATCH v2 00/16] Allow readpage to return a locked page Matthew Wilcox (Oracle)
` (13 preceding siblings ...)
2020-10-09 14:31 ` [PATCH v2 14/16] vboxsf: " Matthew Wilcox (Oracle)
@ 2020-10-09 14:31 ` Matthew Wilcox (Oracle)
2020-10-15 9:08 ` Christoph Hellwig
2020-10-09 14:31 ` [PATCH v2 16/16] iomap: Make readpage synchronous Matthew Wilcox (Oracle)
2020-10-15 9:02 ` [PATCH v2 00/16] Allow readpage to return a locked page Christoph Hellwig
16 siblings, 1 reply; 28+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-10-09 14:31 UTC (permalink / raw)
To: linux-fsdevel
Cc: Matthew Wilcox (Oracle),
linux-mm, v9fs-developer, linux-kernel, linux-afs, ceph-devel,
linux-cifs, ecryptfs, linux-um, linux-mtd, Richard Weinberger,
linux-xfs
iomap_set_range_uptodate() is the only caller of
iomap_iop_set_range_uptodate() and it makes future patches easier to
have it inline.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/iomap/buffered-io.c | 24 ++++++++++--------------
1 file changed, 10 insertions(+), 14 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 8180061b9e16..e60f572e1590 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -141,8 +141,8 @@ iomap_adjust_read_range(struct inode *inode, struct iomap_page *iop,
*lenp = plen;
}
-static void
-iomap_iop_set_range_uptodate(struct page *page, unsigned off, unsigned len)
+static void iomap_set_range_uptodate(struct page *page, unsigned off,
+ unsigned len)
{
struct iomap_page *iop = to_iomap_page(page);
struct inode *inode = page->mapping->host;
@@ -150,6 +150,14 @@ iomap_iop_set_range_uptodate(struct page *page, unsigned off, unsigned len)
unsigned last = (off + len - 1) >> inode->i_blkbits;
unsigned long flags;
+ if (PageError(page))
+ return;
+
+ if (!iop) {
+ SetPageUptodate(page);
+ return;
+ }
+
spin_lock_irqsave(&iop->uptodate_lock, flags);
bitmap_set(iop->uptodate, first, last - first + 1);
if (bitmap_full(iop->uptodate, i_blocks_per_page(inode, page)))
@@ -157,18 +165,6 @@ iomap_iop_set_range_uptodate(struct page *page, unsigned off, unsigned len)
spin_unlock_irqrestore(&iop->uptodate_lock, flags);
}
-static void
-iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
-{
- if (PageError(page))
- return;
-
- if (page_has_private(page))
- iomap_iop_set_range_uptodate(page, off, len);
- else
- SetPageUptodate(page);
-}
-
static void
iomap_read_page_end_io(struct bio_vec *bvec, int error)
{
--
2.28.0
* Re: [PATCH v2 15/16] iomap: Inline iomap_iop_set_range_uptodate into its one caller
2020-10-09 14:31 ` [PATCH v2 15/16] iomap: Inline iomap_iop_set_range_uptodate into its one caller Matthew Wilcox (Oracle)
@ 2020-10-15 9:08 ` Christoph Hellwig
0 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2020-10-15 9:08 UTC (permalink / raw)
To: Matthew Wilcox (Oracle)
Cc: linux-fsdevel, linux-mm, v9fs-developer, linux-kernel, linux-afs,
ceph-devel, linux-cifs, ecryptfs, linux-um, linux-mtd,
Richard Weinberger, linux-xfs
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
* [PATCH v2 16/16] iomap: Make readpage synchronous
2020-10-09 14:30 [PATCH v2 00/16] Allow readpage to return a locked page Matthew Wilcox (Oracle)
` (14 preceding siblings ...)
2020-10-09 14:31 ` [PATCH v2 15/16] iomap: Inline iomap_iop_set_range_uptodate into its one caller Matthew Wilcox (Oracle)
@ 2020-10-09 14:31 ` Matthew Wilcox (Oracle)
2020-10-15 9:42 ` Christoph Hellwig
2020-10-15 9:02 ` [PATCH v2 00/16] Allow readpage to return a locked page Christoph Hellwig
16 siblings, 1 reply; 28+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-10-09 14:31 UTC (permalink / raw)
To: linux-fsdevel
Cc: Matthew Wilcox (Oracle),
linux-mm, v9fs-developer, linux-kernel, linux-afs, ceph-devel,
linux-cifs, ecryptfs, linux-um, linux-mtd, Richard Weinberger,
linux-xfs
A synchronous readpage lets us report the actual errno instead of
ineffectively setting PageError.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/iomap/buffered-io.c | 74 ++++++++++++++++++++++++------------------
1 file changed, 42 insertions(+), 32 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index e60f572e1590..887bf871ca9b 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -150,9 +150,6 @@ static void iomap_set_range_uptodate(struct page *page, unsigned off,
unsigned last = (off + len - 1) >> inode->i_blkbits;
unsigned long flags;
- if (PageError(page))
- return;
-
if (!iop) {
SetPageUptodate(page);
return;
@@ -165,42 +162,50 @@ static void iomap_set_range_uptodate(struct page *page, unsigned off,
spin_unlock_irqrestore(&iop->uptodate_lock, flags);
}
-static void
-iomap_read_page_end_io(struct bio_vec *bvec, int error)
+static void iomap_read_page_end_io(struct bio_vec *bvec,
+ struct completion *done, bool error)
{
struct page *page = bvec->bv_page;
struct iomap_page *iop = to_iomap_page(page);
- if (unlikely(error)) {
- ClearPageUptodate(page);
- SetPageError(page);
- } else {
+ if (!error)
iomap_set_range_uptodate(page, bvec->bv_offset, bvec->bv_len);
- }
- if (!iop || atomic_sub_and_test(bvec->bv_len, &iop->read_bytes_pending))
- unlock_page(page);
+ if (!iop ||
+ atomic_sub_and_test(bvec->bv_len, &iop->read_bytes_pending)) {
+ if (done)
+ complete(done);
+ else
+ unlock_page(page);
+ }
}
+struct iomap_readpage_ctx {
+ struct page *cur_page;
+ bool cur_page_in_bio;
+ blk_status_t status;
+ struct bio *bio;
+ struct readahead_control *rac;
+ struct completion done;
+};
+
static void
iomap_read_end_io(struct bio *bio)
{
- int error = blk_status_to_errno(bio->bi_status);
+ struct iomap_readpage_ctx *ctx = bio->bi_private;
struct bio_vec *bvec;
struct bvec_iter_all iter_all;
+ /* Capture the first error */
+ if (ctx && ctx->status == BLK_STS_OK)
+ ctx->status = bio->bi_status;
+
bio_for_each_segment_all(bvec, bio, iter_all)
- iomap_read_page_end_io(bvec, error);
+ iomap_read_page_end_io(bvec, ctx ? &ctx->done : NULL,
+ bio->bi_status != BLK_STS_OK);
bio_put(bio);
}
-struct iomap_readpage_ctx {
- struct page *cur_page;
- bool cur_page_in_bio;
- struct bio *bio;
- struct readahead_control *rac;
-};
-
static void
iomap_read_inline_data(struct inode *inode, struct page *page,
struct iomap *iomap)
@@ -292,6 +297,8 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
ctx->bio->bi_opf = REQ_OP_READ;
if (ctx->rac)
ctx->bio->bi_opf |= REQ_RAHEAD;
+ else
+ ctx->bio->bi_private = ctx;
ctx->bio->bi_iter.bi_sector = sector;
bio_set_dev(ctx->bio, iomap->bdev);
ctx->bio->bi_end_io = iomap_read_end_io;
@@ -318,15 +325,17 @@ iomap_readpage(struct page *page, const struct iomap_ops *ops)
trace_iomap_readpage(page->mapping->host, 1);
+ ctx.status = BLK_STS_OK;
+ init_completion(&ctx.done);
+
for (poff = 0; poff < PAGE_SIZE; poff += ret) {
ret = iomap_apply(inode, page_offset(page) + poff,
PAGE_SIZE - poff, 0, ops, &ctx,
iomap_readpage_actor);
- if (ret <= 0) {
- WARN_ON_ONCE(ret == 0);
- SetPageError(page);
+ if (WARN_ON_ONCE(ret == 0))
+ ret = -EIO;
+ if (ret < 0)
break;
- }
}
if (ctx.bio) {
@@ -334,15 +343,16 @@ iomap_readpage(struct page *page, const struct iomap_ops *ops)
WARN_ON_ONCE(!ctx.cur_page_in_bio);
} else {
WARN_ON_ONCE(ctx.cur_page_in_bio);
- unlock_page(page);
+ complete(&ctx.done);
}
- /*
- * Just like mpage_readahead and block_read_full_page we always
- * return 0 and just mark the page as PageError on errors. This
- * should be cleaned up all through the stack eventually.
- */
- return 0;
+ wait_for_completion(&ctx.done);
+ if (ret >= 0)
+ ret = blk_status_to_errno(ctx.status);
+ if (ret == 0)
+ return AOP_UPDATED_PAGE;
+ unlock_page(page);
+ return ret;
}
EXPORT_SYMBOL_GPL(iomap_readpage);
--
2.28.0
* Re: [PATCH v2 16/16] iomap: Make readpage synchronous
2020-10-09 14:31 ` [PATCH v2 16/16] iomap: Make readpage synchronous Matthew Wilcox (Oracle)
@ 2020-10-15 9:42 ` Christoph Hellwig
2020-10-15 16:43 ` Matthew Wilcox
0 siblings, 1 reply; 28+ messages in thread
From: Christoph Hellwig @ 2020-10-15 9:42 UTC (permalink / raw)
To: Matthew Wilcox (Oracle)
Cc: linux-fsdevel, linux-mm, v9fs-developer, linux-kernel, linux-afs,
ceph-devel, linux-cifs, ecryptfs, linux-um, linux-mtd,
Richard Weinberger, linux-xfs
> +static void iomap_read_page_end_io(struct bio_vec *bvec,
> + struct completion *done, bool error)
I really don't like the parameters here. Part of the problem is
that ctx is only assigned to bi_private conditionally, which can
easily be fixed. The other part is the strange bool error when
we can just pass on bi_status. See the patch at the end for what
I'd do instead.
> @@ -318,15 +325,17 @@ iomap_readpage(struct page *page, const struct iomap_ops *ops)
>
> trace_iomap_readpage(page->mapping->host, 1);
>
> + ctx.status = BLK_STS_OK;
This should move into the initializer for ctx. Or we could just drop
it given that BLK_STS_OK is and must always be 0.
> } else {
> WARN_ON_ONCE(ctx.cur_page_in_bio);
> - unlock_page(page);
> + complete(&ctx.done);
> }
>
> + wait_for_completion(&ctx.done);
I don't think we need the complete / wait_for_completion dance in
this case.
> + if (ret >= 0)
> + ret = blk_status_to_errno(ctx.status);
> + if (ret == 0)
> + return AOP_UPDATED_PAGE;
> + unlock_page(page);
> + return ret;
Nitpick, but I'd rather have a goto out_unlock for both error cases
and have the AOP_UPDATED_PAGE return for the normal path straight in line.
Here is an untested patch with my suggestions:
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 887bf871ca9bba..81d34725565d7e 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -162,33 +162,34 @@ static void iomap_set_range_uptodate(struct page *page, unsigned off,
spin_unlock_irqrestore(&iop->uptodate_lock, flags);
}
-static void iomap_read_page_end_io(struct bio_vec *bvec,
- struct completion *done, bool error)
+struct iomap_readpage_ctx {
+ struct page *cur_page;
+ bool cur_page_in_bio;
+ blk_status_t status;
+ struct bio *bio;
+ struct readahead_control *rac;
+ struct completion done;
+};
+
+static void
+iomap_read_page_end_io(struct iomap_readpage_ctx *ctx, struct bio_vec *bvec,
+ blk_status_t status)
{
struct page *page = bvec->bv_page;
struct iomap_page *iop = to_iomap_page(page);
- if (!error)
+ if (status == BLK_STS_OK)
iomap_set_range_uptodate(page, bvec->bv_offset, bvec->bv_len);
if (!iop ||
atomic_sub_and_test(bvec->bv_len, &iop->read_bytes_pending)) {
- if (done)
- complete(done);
- else
+ if (ctx->rac)
unlock_page(page);
+ else
+ complete(&ctx->done);
}
}
-struct iomap_readpage_ctx {
- struct page *cur_page;
- bool cur_page_in_bio;
- blk_status_t status;
- struct bio *bio;
- struct readahead_control *rac;
- struct completion done;
-};
-
static void
iomap_read_end_io(struct bio *bio)
{
@@ -197,12 +198,11 @@ iomap_read_end_io(struct bio *bio)
struct bvec_iter_all iter_all;
/* Capture the first error */
- if (ctx && ctx->status == BLK_STS_OK)
+ if (ctx->status == BLK_STS_OK)
ctx->status = bio->bi_status;
bio_for_each_segment_all(bvec, bio, iter_all)
- iomap_read_page_end_io(bvec, ctx ? &ctx->done : NULL,
- bio->bi_status != BLK_STS_OK);
+ iomap_read_page_end_io(ctx, bvec, bio->bi_status);
bio_put(bio);
}
@@ -297,8 +297,7 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
ctx->bio->bi_opf = REQ_OP_READ;
if (ctx->rac)
ctx->bio->bi_opf |= REQ_RAHEAD;
- else
- ctx->bio->bi_private = ctx;
+ ctx->bio->bi_private = ctx;
ctx->bio->bi_iter.bi_sector = sector;
bio_set_dev(ctx->bio, iomap->bdev);
ctx->bio->bi_end_io = iomap_read_end_io;
@@ -318,14 +317,16 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
int
iomap_readpage(struct page *page, const struct iomap_ops *ops)
{
- struct iomap_readpage_ctx ctx = { .cur_page = page };
+ struct iomap_readpage_ctx ctx = {
+ .cur_page = page,
+ .status = BLK_STS_OK,
+ };
struct inode *inode = page->mapping->host;
unsigned poff;
loff_t ret;
trace_iomap_readpage(page->mapping->host, 1);
- ctx.status = BLK_STS_OK;
init_completion(&ctx.done);
for (poff = 0; poff < PAGE_SIZE; poff += ret) {
@@ -340,17 +341,16 @@ iomap_readpage(struct page *page, const struct iomap_ops *ops)
if (ctx.bio) {
submit_bio(ctx.bio);
- WARN_ON_ONCE(!ctx.cur_page_in_bio);
- } else {
- WARN_ON_ONCE(ctx.cur_page_in_bio);
- complete(&ctx.done);
+ wait_for_completion(&ctx.done);
}
- wait_for_completion(&ctx.done);
- if (ret >= 0)
- ret = blk_status_to_errno(ctx.status);
- if (ret == 0)
- return AOP_UPDATED_PAGE;
+ if (ret < 0)
+ goto out_unlock;
+ ret = blk_status_to_errno(ctx.status);
+ if (ret < 0)
+ goto out_unlock;
+ return AOP_UPDATED_PAGE;
+out_unlock:
unlock_page(page);
return ret;
}
* Re: [PATCH v2 16/16] iomap: Make readpage synchronous
2020-10-15 9:42 ` Christoph Hellwig
@ 2020-10-15 16:43 ` Matthew Wilcox
2020-10-15 17:58 ` Christoph Hellwig
0 siblings, 1 reply; 28+ messages in thread
From: Matthew Wilcox @ 2020-10-15 16:43 UTC (permalink / raw)
To: Christoph Hellwig
Cc: linux-fsdevel, linux-mm, v9fs-developer, linux-kernel, linux-afs,
ceph-devel, linux-cifs, ecryptfs, linux-um, linux-mtd,
Richard Weinberger, linux-xfs
On Thu, Oct 15, 2020 at 10:42:03AM +0100, Christoph Hellwig wrote:
> > +static void iomap_read_page_end_io(struct bio_vec *bvec,
> > + struct completion *done, bool error)
>
> I really don't like the parameters here. Part of the problem is
> that ctx is only assigned to bi_private conditionally, which can
> easily be fixed. The other part is the strange bool error when
> we can just pass on bi_stats. See the patch at the end of what
> I'd do intead.
I prefer assigning ctx conditionally to propagating the knowledge
that !rac means synchronous. I've gone with this:
static void iomap_read_page_end_io(struct bio_vec *bvec,
- struct completion *done, bool error)
+ struct iomap_readpage_ctx *ctx, blk_status_t status)
{
struct page *page = bvec->bv_page;
struct iomap_page *iop = to_iomap_page(page);
- if (!error)
+ if (status == BLK_STS_OK) {
iomap_set_range_uptodate(page, bvec->bv_offset, bvec->bv_len);
+ } else if (ctx && ctx->status == BLK_STS_OK) {
+ ctx->status = status;
+ }
if (!iop ||
atomic_sub_and_test(bvec->bv_len, &iop->read_bytes_pending)) {
- if (done)
- complete(done);
+ if (ctx)
+ complete(&ctx->done);
else
unlock_page(page);
}
> > } else {
> > WARN_ON_ONCE(ctx.cur_page_in_bio);
> > - unlock_page(page);
> > + complete(&ctx.done);
> > }
> >
> > + wait_for_completion(&ctx.done);
>
> I don't think we need the complete / wait_for_completion dance in
> this case.
>
> > + if (ret >= 0)
> > + ret = blk_status_to_errno(ctx.status);
> > + if (ret == 0)
> > + return AOP_UPDATED_PAGE;
> > + unlock_page(page);
> > + return ret;
>
> Nipick, but I'd rather have a goto out_unlock for both error case
> and have the AOP_UPDATED_PAGE for the normal path straight in line.
>
> Here is an untested patch with my suggestions:
I think we can go a little further here:
@@ -340,16 +335,12 @@ iomap_readpage(struct page *page, const struct iomap_ops *
ops)
if (ctx.bio) {
submit_bio(ctx.bio);
- WARN_ON_ONCE(!ctx.cur_page_in_bio);
- } else {
- WARN_ON_ONCE(ctx.cur_page_in_bio);
- complete(&ctx.done);
+ wait_for_completion(&ctx.done);
+ if (ret > 0)
+ ret = blk_status_to_errno(ctx.status);
}
- wait_for_completion(&ctx.done);
if (ret >= 0)
- ret = blk_status_to_errno(ctx.status);
- if (ret == 0)
return AOP_UPDATED_PAGE;
unlock_page(page);
return ret;
... there's no need to call blk_status_to_errno if we never submitted a bio.
* Re: [PATCH v2 16/16] iomap: Make readpage synchronous
2020-10-15 16:43 ` Matthew Wilcox
@ 2020-10-15 17:58 ` Christoph Hellwig
2020-10-15 19:03 ` Matthew Wilcox
0 siblings, 1 reply; 28+ messages in thread
From: Christoph Hellwig @ 2020-10-15 17:58 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Christoph Hellwig, linux-fsdevel, linux-mm, v9fs-developer,
linux-kernel, linux-afs, ceph-devel, linux-cifs, ecryptfs,
linux-um, linux-mtd, Richard Weinberger, linux-xfs
On Thu, Oct 15, 2020 at 05:43:33PM +0100, Matthew Wilcox wrote:
> On Thu, Oct 15, 2020 at 10:42:03AM +0100, Christoph Hellwig wrote:
> > > +static void iomap_read_page_end_io(struct bio_vec *bvec,
> > > + struct completion *done, bool error)
> >
> > I really don't like the parameters here. Part of the problem is
> > that ctx is only assigned to bi_private conditionally, which can
> > easily be fixed. The other part is the strange bool error when
> > we can just pass on bi_stats. See the patch at the end of what
> > I'd do intead.
>
> I prefer assigning ctx conditionally to propagating the knowledge
> that !rac means synchronous. I've gone with this:
And I really hate these kinds of conditional assignments. If the
->rac check is too raw, please just add an explicit
bool synchronous : 1;
flag.
> @@ -340,16 +335,12 @@ iomap_readpage(struct page *page, const struct iomap_ops *
> ops)
>
> if (ctx.bio) {
> submit_bio(ctx.bio);
> + if (ret > 0)
> + ret = blk_status_to_errno(ctx.status);
> }
>
> - wait_for_completion(&ctx.done);
> if (ret >= 0)
> - ret = blk_status_to_errno(ctx.status);
> - if (ret == 0)
> return AOP_UPDATED_PAGE;
> unlock_page(page);
> return ret;
>
>
> ... there's no need to call blk_status_to_errno if we never submitted a bio.
True. I'd still prefer the AOP_UPDATED_PAGE as the fallthrough case
and an explicit goto out_unlock, though.
* Re: [PATCH v2 16/16] iomap: Make readpage synchronous
2020-10-15 17:58 ` Christoph Hellwig
@ 2020-10-15 19:03 ` Matthew Wilcox
2020-10-16 6:35 ` Christoph Hellwig
0 siblings, 1 reply; 28+ messages in thread
From: Matthew Wilcox @ 2020-10-15 19:03 UTC (permalink / raw)
To: Christoph Hellwig
Cc: linux-fsdevel, linux-mm, v9fs-developer, linux-kernel, linux-afs,
ceph-devel, linux-cifs, ecryptfs, linux-um, linux-mtd,
Richard Weinberger, linux-xfs
On Thu, Oct 15, 2020 at 06:58:48PM +0100, Christoph Hellwig wrote:
> On Thu, Oct 15, 2020 at 05:43:33PM +0100, Matthew Wilcox wrote:
> > I prefer assigning ctx conditionally to propagating the knowledge
> > that !rac means synchronous. I've gone with this:
>
> And I really hate these kinds of conditional assignments. If the
> ->rac check is too raw please just add an explicit
>
> bool synchronous : 1;
>
> flag.
I honestly don't see the problem. We have to assign the status
conditionally anyway so we don't overwrite an error with a subsequent
success.
> True. I'd still prefer the AOP_UPDATED_PAGE as the fallthrough case
> and an explicit goto out_unlock, though.
So this?
if (ctx.bio) {
submit_bio(ctx.bio);
wait_for_completion(&ctx.done);
if (ret < 0)
goto err;
ret = blk_status_to_errno(ctx.status);
}
if (ret < 0)
goto err;
return AOP_UPDATED_PAGE;
err:
unlock_page(page);
return ret;
* Re: [PATCH v2 16/16] iomap: Make readpage synchronous
2020-10-15 19:03 ` Matthew Wilcox
@ 2020-10-16 6:35 ` Christoph Hellwig
0 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2020-10-16 6:35 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Christoph Hellwig, linux-fsdevel, linux-mm, v9fs-developer,
linux-kernel, linux-afs, ceph-devel, linux-cifs, ecryptfs,
linux-um, linux-mtd, Richard Weinberger, linux-xfs
On Thu, Oct 15, 2020 at 08:03:12PM +0100, Matthew Wilcox wrote:
> I honestly don't see the problem. We have to assign the status
> conditionally anyway so we don't overwrite an error with a subsequent
> success.
Yes, but having a potential NULL pointer to a common structure is just
waiting for trouble.
>
> > True. I'd still prefer the AOP_UPDATED_PAGE as the fallthrough case
> > and an explicit goto out_unlock, though.
>
> So this?
>
> if (ctx.bio) {
> submit_bio(ctx.bio);
> wait_for_completion(&ctx.done);
> if (ret < 0)
> goto err;
> ret = blk_status_to_errno(ctx.status);
> }
>
> if (ret < 0)
> goto err;
> return AOP_UPDATED_PAGE;
> err:
> unlock_page(page);
> return ret;
>
Looks good.
* Re: [PATCH v2 00/16] Allow readpage to return a locked page
2020-10-09 14:30 [PATCH v2 00/16] Allow readpage to return a locked page Matthew Wilcox (Oracle)
` (15 preceding siblings ...)
2020-10-09 14:31 ` [PATCH v2 16/16] iomap: Make readpage synchronous Matthew Wilcox (Oracle)
@ 2020-10-15 9:02 ` Christoph Hellwig
2020-10-15 11:49 ` Matthew Wilcox
16 siblings, 1 reply; 28+ messages in thread
From: Christoph Hellwig @ 2020-10-15 9:02 UTC (permalink / raw)
To: Matthew Wilcox (Oracle)
Cc: linux-fsdevel, linux-mm, v9fs-developer, linux-kernel, linux-afs,
ceph-devel, linux-cifs, ecryptfs, linux-um, linux-mtd,
Richard Weinberger, linux-xfs
On Fri, Oct 09, 2020 at 03:30:48PM +0100, Matthew Wilcox (Oracle) wrote:
> Ideally all filesystems would return from ->readpage with the page
> Uptodate and Locked, but it's a bit painful to convert all the
> asynchronous readpage implementations to synchronous. The first 14
> filesystems converted are already synchronous. The last two patches
> convert iomap to synchronous readpage.
Is it really that bad? It seems like a lot of the remaining file systems
use the generic mpage/buffer/nobh helpers.
But I guess this series is a good first step.
* Re: [PATCH v2 00/16] Allow readpage to return a locked page
2020-10-15 9:02 ` [PATCH v2 00/16] Allow readpage to return a locked page Christoph Hellwig
@ 2020-10-15 11:49 ` Matthew Wilcox
0 siblings, 0 replies; 28+ messages in thread
From: Matthew Wilcox @ 2020-10-15 11:49 UTC (permalink / raw)
To: Christoph Hellwig
Cc: linux-fsdevel, linux-mm, v9fs-developer, linux-kernel, linux-afs,
ceph-devel, linux-cifs, ecryptfs, linux-um, linux-mtd,
Richard Weinberger, linux-xfs
On Thu, Oct 15, 2020 at 10:02:42AM +0100, Christoph Hellwig wrote:
> On Fri, Oct 09, 2020 at 03:30:48PM +0100, Matthew Wilcox (Oracle) wrote:
> > Ideally all filesystems would return from ->readpage with the page
> > Uptodate and Locked, but it's a bit painful to convert all the
> > asynchronous readpage implementations to synchronous. The first 14
> > filesystems converted are already synchronous. The last two patches
> > convert iomap to synchronous readpage.
>
> Is it really that bad? It seems like a lot of the remainig file systems
> use the generic mpage/buffer/nobh helpers.
>
> But I guess this series is a good first step.
I'm just testing a patch to mpage_readpage():
+++ b/fs/mpage.c
@@ -406,11 +406,17 @@ int mpage_readpage(struct page *page, get_block_t get_block)
.nr_pages = 1,
.get_block = get_block,
};
+ int err;
args.bio = do_mpage_readpage(&args);
- if (args.bio)
- mpage_bio_submit(REQ_OP_READ, 0, args.bio);
- return 0;
+ if (!args.bio)
+ return 0;
+ bio_set_op_attrs(args.bio, REQ_OP_READ, 0);
+ guard_bio_eod(args.bio);
+ err = submit_bio_wait(args.bio);
+ if (!err)
+ err = AOP_UPDATED_PAGE;
+ return err;
}
EXPORT_SYMBOL(mpage_readpage);
but I'm not looking forward to block_read_full_page().