Linux-BTRFS Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v5 00/13] Change readahead API
@ 2020-02-11  1:03 Matthew Wilcox
  2020-02-11  1:03 ` [PATCH v5 01/13] mm: Fix the return type of __do_page_cache_readahead Matthew Wilcox
                   ` (12 more replies)
  0 siblings, 13 replies; 31+ messages in thread
From: Matthew Wilcox @ 2020-02-11  1:03 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Matthew Wilcox (Oracle),
	linux-mm, linux-kernel, linux-btrfs, linux-erofs, linux-ext4,
	linux-f2fs-devel, cluster-devel, ocfs2-devel, linux-xfs

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

This series adds a readahead address_space operation to eventually
replace the readpages operation.  The key difference is that
pages are added to the page cache as they are allocated (and
then looked up by the filesystem) instead of passing them on a
list to the readpages operation and having the filesystem add
them to the page cache.  It's a net reduction in code for each
implementation, more efficient than walking a list, and solves
the direct-write vs buffered-read problem reported by yu kuai at
https://lore.kernel.org/linux-fsdevel/20200116063601.39201-1-yukuai3@huawei.com/

v5 switches to passing a readahead_control struct (mirroring the
writepages_control struct passed to writepages).  This has a number of
advantages:
 - It fixes a number of bugs in various implementations, eg forgetting to
   increment 'start', an off-by-one error in 'nr_pages' or treating 'start'
   as a byte offset instead of a page offset.
 - It allows us to change the arguments without changing all the
   implementations of ->readahead which just call mpage_readahead() or
   iomap_readahead()
 - Figuring out which pages haven't been attempted by the implementation
   is more natural this way.
 - There's less code in each implementation.

This version deletes a lot more lines than previous versions of the patch
-- we're net -97 lines instead of -17 with v4.  It'll be even more when
we can finish the conversion and remove all the ->readpages support code,
including read_cache_pages().

Also new in v5 is patch 5 which adds page_cache_readahead_limit() and
converts ext4 and f2fs to use it for their Merkel trees.

Matthew Wilcox (Oracle) (13):
  mm: Fix the return type of __do_page_cache_readahead
  mm: Ignore return value of ->readpages
  mm: Put readahead pages in cache earlier
  mm: Add readahead address space operation
  mm: Add page_cache_readahead_limit
  fs: Convert mpage_readpages to mpage_readahead
  btrfs: Convert from readpages to readahead
  erofs: Convert uncompressed files from readpages to readahead
  erofs: Convert compressed files from readpages to readahead
  ext4: Convert from readpages to readahead
  f2fs: Convert from readpages to readahead
  fuse: Convert from readpages to readahead
  iomap: Convert from readpages to readahead

 Documentation/filesystems/locking.rst |   6 +-
 Documentation/filesystems/vfs.rst     |  13 +++
 drivers/staging/exfat/exfat_super.c   |   7 +-
 fs/block_dev.c                        |   7 +-
 fs/btrfs/extent_io.c                  |  48 +++------
 fs/btrfs/extent_io.h                  |   3 +-
 fs/btrfs/inode.c                      |  16 ++-
 fs/erofs/data.c                       |  39 +++----
 fs/erofs/zdata.c                      |  29 ++----
 fs/ext2/inode.c                       |  10 +-
 fs/ext4/ext4.h                        |   3 +-
 fs/ext4/inode.c                       |  23 ++---
 fs/ext4/readpage.c                    |  22 ++--
 fs/ext4/verity.c                      |  35 +------
 fs/f2fs/data.c                        |  50 ++++-----
 fs/f2fs/f2fs.h                        |   5 +-
 fs/f2fs/verity.c                      |  35 +------
 fs/fat/inode.c                        |   7 +-
 fs/fuse/file.c                        |  46 ++++-----
 fs/gfs2/aops.c                        |  23 ++---
 fs/hpfs/file.c                        |   7 +-
 fs/iomap/buffered-io.c                | 103 ++++++-------------
 fs/iomap/trace.h                      |   2 +-
 fs/isofs/inode.c                      |   7 +-
 fs/jfs/inode.c                        |   7 +-
 fs/mpage.c                            |  38 +++----
 fs/nilfs2/inode.c                     |  15 +--
 fs/ocfs2/aops.c                       |  34 +++---
 fs/omfs/file.c                        |   7 +-
 fs/qnx6/inode.c                       |   7 +-
 fs/reiserfs/inode.c                   |   8 +-
 fs/udf/inode.c                        |   7 +-
 fs/xfs/xfs_aops.c                     |  13 +--
 fs/zonefs/super.c                     |   7 +-
 include/linux/fs.h                    |   2 +
 include/linux/iomap.h                 |   3 +-
 include/linux/mpage.h                 |   4 +-
 include/linux/pagemap.h               |  84 +++++++++++++++
 include/trace/events/erofs.h          |   6 +-
 include/trace/events/f2fs.h           |   6 +-
 mm/internal.h                         |   2 +-
 mm/migrate.c                          |   2 +-
 mm/readahead.c                        | 143 ++++++++++++++++----------
 43 files changed, 422 insertions(+), 519 deletions(-)

-- 
2.25.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v5 01/13] mm: Fix the return type of __do_page_cache_readahead
  2020-02-11  1:03 [PATCH v5 00/13] Change readahead API Matthew Wilcox
@ 2020-02-11  1:03 ` Matthew Wilcox
  2020-02-11  8:19   ` Johannes Thumshirn
                     ` (3 more replies)
  2020-02-11  1:03 ` [PATCH v5 02/13] mm: Ignore return value of ->readpages Matthew Wilcox
                   ` (11 subsequent siblings)
  12 siblings, 4 replies; 31+ messages in thread
From: Matthew Wilcox @ 2020-02-11  1:03 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Matthew Wilcox (Oracle),
	linux-mm, linux-kernel, linux-btrfs, linux-erofs, linux-ext4,
	linux-f2fs-devel, cluster-devel, ocfs2-devel, linux-xfs

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

ra_submit() which is a wrapper around __do_page_cache_readahead() already
returns an unsigned long, and the 'nr_to_read' parameter is an unsigned
long, so fix __do_page_cache_readahead() to return an unsigned long,
even though I'm pretty sure we're not going to readahead more than 2^32
pages ever.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/internal.h  | 2 +-
 mm/readahead.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 3cf20ab3ca01..41b93c4b3ab7 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -49,7 +49,7 @@ void unmap_page_range(struct mmu_gather *tlb,
 			     unsigned long addr, unsigned long end,
 			     struct zap_details *details);
 
-extern unsigned int __do_page_cache_readahead(struct address_space *mapping,
+extern unsigned long __do_page_cache_readahead(struct address_space *mapping,
 		struct file *filp, pgoff_t offset, unsigned long nr_to_read,
 		unsigned long lookahead_size);
 
diff --git a/mm/readahead.c b/mm/readahead.c
index 2fe72cd29b47..6bf73ef33b7e 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -152,7 +152,7 @@ static int read_pages(struct address_space *mapping, struct file *filp,
  *
  * Returns the number of pages requested, or the maximum amount of I/O allowed.
  */
-unsigned int __do_page_cache_readahead(struct address_space *mapping,
+unsigned long __do_page_cache_readahead(struct address_space *mapping,
 		struct file *filp, pgoff_t offset, unsigned long nr_to_read,
 		unsigned long lookahead_size)
 {
@@ -161,7 +161,7 @@ unsigned int __do_page_cache_readahead(struct address_space *mapping,
 	unsigned long end_index;	/* The last page we want to read */
 	LIST_HEAD(page_pool);
 	int page_idx;
-	unsigned int nr_pages = 0;
+	unsigned long nr_pages = 0;
 	loff_t isize = i_size_read(inode);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
 
-- 
2.25.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v5 02/13] mm: Ignore return value of ->readpages
  2020-02-11  1:03 [PATCH v5 00/13] Change readahead API Matthew Wilcox
  2020-02-11  1:03 ` [PATCH v5 01/13] mm: Fix the return type of __do_page_cache_readahead Matthew Wilcox
@ 2020-02-11  1:03 ` Matthew Wilcox
  2020-02-12 18:13   ` Christoph Hellwig
  2020-02-11  1:03 ` [PATCH v5 03/13] mm: Put readahead pages in cache earlier Matthew Wilcox
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 31+ messages in thread
From: Matthew Wilcox @ 2020-02-11  1:03 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Matthew Wilcox (Oracle),
	linux-mm, linux-kernel, linux-btrfs, linux-erofs, linux-ext4,
	linux-f2fs-devel, cluster-devel, ocfs2-devel, linux-xfs

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

We used to assign the return value to a variable, which we then ignored.
Remove the pretence of caring.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/readahead.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 6bf73ef33b7e..fc77d13af556 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -113,17 +113,16 @@ int read_cache_pages(struct address_space *mapping, struct list_head *pages,
 
 EXPORT_SYMBOL(read_cache_pages);
 
-static int read_pages(struct address_space *mapping, struct file *filp,
+static void read_pages(struct address_space *mapping, struct file *filp,
 		struct list_head *pages, unsigned int nr_pages, gfp_t gfp)
 {
 	struct blk_plug plug;
 	unsigned page_idx;
-	int ret;
 
 	blk_start_plug(&plug);
 
 	if (mapping->a_ops->readpages) {
-		ret = mapping->a_ops->readpages(filp, mapping, pages, nr_pages);
+		mapping->a_ops->readpages(filp, mapping, pages, nr_pages);
 		/* Clean up the remaining pages */
 		put_pages_list(pages);
 		goto out;
@@ -136,12 +135,9 @@ static int read_pages(struct address_space *mapping, struct file *filp,
 			mapping->a_ops->readpage(filp, page);
 		put_page(page);
 	}
-	ret = 0;
 
 out:
 	blk_finish_plug(&plug);
-
-	return ret;
 }
 
 /*
-- 
2.25.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v5 03/13] mm: Put readahead pages in cache earlier
  2020-02-11  1:03 [PATCH v5 00/13] Change readahead API Matthew Wilcox
  2020-02-11  1:03 ` [PATCH v5 01/13] mm: Fix the return type of __do_page_cache_readahead Matthew Wilcox
  2020-02-11  1:03 ` [PATCH v5 02/13] mm: Ignore return value of ->readpages Matthew Wilcox
@ 2020-02-11  1:03 ` Matthew Wilcox
  2020-02-14  3:36   ` John Hubbard
  2020-02-11  1:03 ` [PATCH v5 04/13] mm: Add readahead address space operation Matthew Wilcox
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 31+ messages in thread
From: Matthew Wilcox @ 2020-02-11  1:03 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Matthew Wilcox (Oracle),
	linux-mm, linux-kernel, linux-btrfs, linux-erofs, linux-ext4,
	linux-f2fs-devel, cluster-devel, ocfs2-devel, linux-xfs

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

At allocation time, put the pages in the cache unless we're using
->readpages.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/readahead.c | 66 ++++++++++++++++++++++++++++++++------------------
 1 file changed, 42 insertions(+), 24 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index fc77d13af556..96c6ca68a174 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -114,10 +114,10 @@ int read_cache_pages(struct address_space *mapping, struct list_head *pages,
 EXPORT_SYMBOL(read_cache_pages);
 
 static void read_pages(struct address_space *mapping, struct file *filp,
-		struct list_head *pages, unsigned int nr_pages, gfp_t gfp)
+		struct list_head *pages, pgoff_t start,
+		unsigned int nr_pages)
 {
 	struct blk_plug plug;
-	unsigned page_idx;
 
 	blk_start_plug(&plug);
 
@@ -125,18 +125,17 @@ static void read_pages(struct address_space *mapping, struct file *filp,
 		mapping->a_ops->readpages(filp, mapping, pages, nr_pages);
 		/* Clean up the remaining pages */
 		put_pages_list(pages);
-		goto out;
-	}
+	} else {
+		struct page *page;
+		unsigned long index;
 
-	for (page_idx = 0; page_idx < nr_pages; page_idx++) {
-		struct page *page = lru_to_page(pages);
-		list_del(&page->lru);
-		if (!add_to_page_cache_lru(page, mapping, page->index, gfp))
+		xa_for_each_range(&mapping->i_pages, index, page, start,
+				start + nr_pages - 1) {
 			mapping->a_ops->readpage(filp, page);
-		put_page(page);
+			put_page(page);
+		}
 	}
 
-out:
 	blk_finish_plug(&plug);
 }
 
@@ -149,17 +148,18 @@ static void read_pages(struct address_space *mapping, struct file *filp,
  * Returns the number of pages requested, or the maximum amount of I/O allowed.
  */
 unsigned long __do_page_cache_readahead(struct address_space *mapping,
-		struct file *filp, pgoff_t offset, unsigned long nr_to_read,
+		struct file *filp, pgoff_t start, unsigned long nr_to_read,
 		unsigned long lookahead_size)
 {
 	struct inode *inode = mapping->host;
-	struct page *page;
 	unsigned long end_index;	/* The last page we want to read */
 	LIST_HEAD(page_pool);
 	int page_idx;
+	pgoff_t page_offset = start;
 	unsigned long nr_pages = 0;
 	loff_t isize = i_size_read(inode);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
+	bool use_list = mapping->a_ops->readpages;
 
 	if (isize == 0)
 		goto out;
@@ -170,7 +170,7 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
 	 * Preallocate as many pages as we will need.
 	 */
 	for (page_idx = 0; page_idx < nr_to_read; page_idx++) {
-		pgoff_t page_offset = offset + page_idx;
+		struct page *page;
 
 		if (page_offset > end_index)
 			break;
@@ -178,25 +178,43 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
 		page = xa_load(&mapping->i_pages, page_offset);
 		if (page && !xa_is_value(page)) {
 			/*
-			 * Page already present?  Kick off the current batch of
-			 * contiguous pages before continuing with the next
-			 * batch.
+			 * Page already present?  Kick off the current batch
+			 * of contiguous pages before continuing with the
+			 * next batch.
+			 * It's possible this page is the page we should
+			 * be marking with PageReadahead.  However, we
+			 * don't have a stable ref to this page so it might
+			 * be reallocated to another user before we can set
+			 * the bit.  There's probably another page in the
+			 * cache marked with PageReadahead from the other
+			 * process which accessed this file.
 			 */
-			if (nr_pages)
-				read_pages(mapping, filp, &page_pool, nr_pages,
-						gfp_mask);
-			nr_pages = 0;
-			continue;
+			goto skip;
 		}
 
 		page = __page_cache_alloc(gfp_mask);
 		if (!page)
 			break;
-		page->index = page_offset;
-		list_add(&page->lru, &page_pool);
+		if (use_list) {
+			page->index = page_offset;
+			list_add(&page->lru, &page_pool);
+		} else if (add_to_page_cache_lru(page, mapping, page_offset,
+					gfp_mask) < 0) {
+			put_page(page);
+			goto skip;
+		}
+
 		if (page_idx == nr_to_read - lookahead_size)
 			SetPageReadahead(page);
 		nr_pages++;
+		page_offset++;
+		continue;
+skip:
+		if (nr_pages)
+			read_pages(mapping, filp, &page_pool, start, nr_pages);
+		nr_pages = 0;
+		page_offset++;
+		start = page_offset;
 	}
 
 	/*
@@ -205,7 +223,7 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
 	 * will then handle the error.
 	 */
 	if (nr_pages)
-		read_pages(mapping, filp, &page_pool, nr_pages, gfp_mask);
+		read_pages(mapping, filp, &page_pool, start, nr_pages);
 	BUG_ON(!list_empty(&page_pool));
 out:
 	return nr_pages;
-- 
2.25.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v5 04/13] mm: Add readahead address space operation
  2020-02-11  1:03 [PATCH v5 00/13] Change readahead API Matthew Wilcox
                   ` (2 preceding siblings ...)
  2020-02-11  1:03 ` [PATCH v5 03/13] mm: Put readahead pages in cache earlier Matthew Wilcox
@ 2020-02-11  1:03 ` Matthew Wilcox
  2020-02-11  4:52   ` Dave Chinner
                     ` (2 more replies)
  2020-02-11  1:03 ` [PATCH v5 05/13] mm: Add page_cache_readahead_limit Matthew Wilcox
                   ` (8 subsequent siblings)
  12 siblings, 3 replies; 31+ messages in thread
From: Matthew Wilcox @ 2020-02-11  1:03 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Matthew Wilcox (Oracle),
	linux-mm, linux-kernel, linux-btrfs, linux-erofs, linux-ext4,
	linux-f2fs-devel, cluster-devel, ocfs2-devel, linux-xfs

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

This replaces ->readpages with a saner interface:
 - Return void instead of an ignored error code.
 - Pages are already in the page cache when ->readahead is called.
 - Implementation looks up the pages in the page cache instead of
   having them passed in a linked list.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 Documentation/filesystems/locking.rst |  6 ++-
 Documentation/filesystems/vfs.rst     | 13 +++++++
 include/linux/fs.h                    |  2 +
 include/linux/pagemap.h               | 54 +++++++++++++++++++++++++++
 mm/readahead.c                        | 48 ++++++++++++++----------
 5 files changed, 102 insertions(+), 21 deletions(-)

diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index 5057e4d9dcd1..0ebc4491025a 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -239,6 +239,7 @@ prototypes::
 	int (*readpage)(struct file *, struct page *);
 	int (*writepages)(struct address_space *, struct writeback_control *);
 	int (*set_page_dirty)(struct page *page);
+	void (*readahead)(struct readahead_control *);
 	int (*readpages)(struct file *filp, struct address_space *mapping,
 			struct list_head *pages, unsigned nr_pages);
 	int (*write_begin)(struct file *, struct address_space *mapping,
@@ -271,7 +272,8 @@ writepage:		yes, unlocks (see below)
 readpage:		yes, unlocks
 writepages:
 set_page_dirty		no
-readpages:
+readahead:		yes, unlocks
+readpages:		no
 write_begin:		locks the page		 exclusive
 write_end:		yes, unlocks		 exclusive
 bmap:
@@ -295,6 +297,8 @@ the request handler (/dev/loop).
 ->readpage() unlocks the page, either synchronously or via I/O
 completion.
 
+->readahead() unlocks the pages like ->readpage().
+
 ->readpages() populates the pagecache with the passed pages and starts
 I/O against them.  They come unlocked upon I/O completion.
 
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index 7d4d09dd5e6d..cabee16b7406 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -706,6 +706,7 @@ cache in your filesystem.  The following members are defined:
 		int (*readpage)(struct file *, struct page *);
 		int (*writepages)(struct address_space *, struct writeback_control *);
 		int (*set_page_dirty)(struct page *page);
+		void (*readahead)(struct readahead_control *);
 		int (*readpages)(struct file *filp, struct address_space *mapping,
 				 struct list_head *pages, unsigned nr_pages);
 		int (*write_begin)(struct file *, struct address_space *mapping,
@@ -781,12 +782,24 @@ cache in your filesystem.  The following members are defined:
 	If defined, it should set the PageDirty flag, and the
 	PAGECACHE_TAG_DIRTY tag in the radix tree.
 
+``readahead``
+	Called by the VM to read pages associated with the address_space
+	object.  The pages are consecutive in the page cache and are
+	locked.  The implementation should decrement the page refcount
+	after starting I/O on each page.  Usually the page will be
+	unlocked by the I/O completion handler.  If the function does
+	not attempt I/O on some pages, the caller will decrement the page
+	refcount and unlock the pages for you.	Set PageUptodate if the
+	I/O completes successfully.  Setting PageError on any page will
+	be ignored; simply unlock the page if an I/O error occurs.
+
 ``readpages``
 	called by the VM to read pages associated with the address_space
 	object.  This is essentially just a vector version of readpage.
 	Instead of just one page, several pages are requested.
 	readpages is only used for read-ahead, so read errors are
 	ignored.  If anything goes wrong, feel free to give up.
+        This interface is deprecated; implement readahead instead.
 
 ``write_begin``
 	Called by the generic buffered write code to ask the filesystem
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 3cd4fe6b845e..d4e2d2964346 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -292,6 +292,7 @@ enum positive_aop_returns {
 struct page;
 struct address_space;
 struct writeback_control;
+struct readahead_control;
 
 /*
  * Write life time hint values.
@@ -375,6 +376,7 @@ struct address_space_operations {
 	 */
 	int (*readpages)(struct file *filp, struct address_space *mapping,
 			struct list_head *pages, unsigned nr_pages);
+	void (*readahead)(struct readahead_control *);
 
 	int (*write_begin)(struct file *, struct address_space *mapping,
 				loff_t pos, unsigned len, unsigned flags,
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index ccb14b6a16b5..13efafaf7e1f 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -630,6 +630,60 @@ static inline int add_to_page_cache(struct page *page,
 	return error;
 }
 
+/*
+ * Readahead is of a block of consecutive pages.
+ */
+struct readahead_control {
+	struct file *file;
+	struct address_space *mapping;
+/* private: use the readahead_* accessors instead */
+	pgoff_t start;
+	unsigned int nr_pages;
+	unsigned int batch_count;
+};
+
+static inline struct page *readahead_page(struct readahead_control *rac)
+{
+	struct page *page;
+
+	if (!rac->nr_pages)
+		return NULL;
+
+	page = xa_load(&rac->mapping->i_pages, rac->start);
+	VM_BUG_ON_PAGE(!PageLocked(page), page);
+	rac->batch_count = hpage_nr_pages(page);
+	rac->start += rac->batch_count;
+
+	return page;
+}
+
+#define readahead_for_each(rac, page)					\
+	for (; (page = readahead_page(rac)); rac->nr_pages -= rac->batch_count)
+
+/* The byte offset into the file of this readahead block */
+static inline loff_t readahead_offset(struct readahead_control *rac)
+{
+	return (loff_t)rac->start * PAGE_SIZE;
+}
+
+/* The number of bytes in this readahead block */
+static inline loff_t readahead_length(struct readahead_control *rac)
+{
+	return (loff_t)rac->nr_pages * PAGE_SIZE;
+}
+
+/* The index of the first page in this readahead block */
+static inline unsigned int readahead_index(struct readahead_control *rac)
+{
+	return rac->start;
+}
+
+/* The number of pages in this readahead block */
+static inline unsigned int readahead_count(struct readahead_control *rac)
+{
+	return rac->nr_pages;
+}
+
 static inline unsigned long dir_pages(struct inode *inode)
 {
 	return (unsigned long)(inode->i_size + PAGE_SIZE - 1) >>
diff --git a/mm/readahead.c b/mm/readahead.c
index 96c6ca68a174..933b32e0c90a 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -113,25 +113,30 @@ int read_cache_pages(struct address_space *mapping, struct list_head *pages,
 
 EXPORT_SYMBOL(read_cache_pages);
 
-static void read_pages(struct address_space *mapping, struct file *filp,
-		struct list_head *pages, pgoff_t start,
-		unsigned int nr_pages)
+static void read_pages(struct readahead_control *rac, struct list_head *pages)
 {
+	struct page *page;
 	struct blk_plug plug;
+	const struct address_space_operations *aops = rac->mapping->a_ops;
+
+	if (rac->nr_pages == 0)
+		return;
 
 	blk_start_plug(&plug);
 
-	if (mapping->a_ops->readpages) {
-		mapping->a_ops->readpages(filp, mapping, pages, nr_pages);
+	if (aops->readahead) {
+		aops->readahead(rac);
+		readahead_for_each(rac, page) {
+			unlock_page(page);
+			put_page(page);
+		}
+	} else if (aops->readpages) {
+		aops->readpages(rac->file, rac->mapping, pages, rac->nr_pages);
 		/* Clean up the remaining pages */
 		put_pages_list(pages);
 	} else {
-		struct page *page;
-		unsigned long index;
-
-		xa_for_each_range(&mapping->i_pages, index, page, start,
-				start + nr_pages - 1) {
-			mapping->a_ops->readpage(filp, page);
+		readahead_for_each(rac, page) {
+			aops->readpage(rac->file, page);
 			put_page(page);
 		}
 	}
@@ -156,10 +161,15 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
 	LIST_HEAD(page_pool);
 	int page_idx;
 	pgoff_t page_offset = start;
-	unsigned long nr_pages = 0;
 	loff_t isize = i_size_read(inode);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
 	bool use_list = mapping->a_ops->readpages;
+	struct readahead_control rac = {
+		.mapping = mapping,
+		.file = filp,
+		.start = start,
+		.nr_pages = 0,
+	};
 
 	if (isize == 0)
 		goto out;
@@ -206,15 +216,14 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
 
 		if (page_idx == nr_to_read - lookahead_size)
 			SetPageReadahead(page);
-		nr_pages++;
+		rac.nr_pages++;
 		page_offset++;
 		continue;
 skip:
-		if (nr_pages)
-			read_pages(mapping, filp, &page_pool, start, nr_pages);
-		nr_pages = 0;
+		read_pages(&rac, &page_pool);
+		rac.nr_pages = 0;
 		page_offset++;
-		start = page_offset;
+		rac.start = page_offset;
 	}
 
 	/*
@@ -222,11 +231,10 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
 	 * uptodate then the caller will launch readpage again, and
 	 * will then handle the error.
 	 */
-	if (nr_pages)
-		read_pages(mapping, filp, &page_pool, start, nr_pages);
+	read_pages(&rac, &page_pool);
 	BUG_ON(!list_empty(&page_pool));
 out:
-	return nr_pages;
+	return rac.nr_pages;
 }
 
 /*
-- 
2.25.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v5 05/13] mm: Add page_cache_readahead_limit
  2020-02-11  1:03 [PATCH v5 00/13] Change readahead API Matthew Wilcox
                   ` (3 preceding siblings ...)
  2020-02-11  1:03 ` [PATCH v5 04/13] mm: Add readahead address space operation Matthew Wilcox
@ 2020-02-11  1:03 ` Matthew Wilcox
  2020-02-11  1:03 ` [PATCH v5 06/13] fs: Convert mpage_readpages to mpage_readahead Matthew Wilcox
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox @ 2020-02-11  1:03 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Matthew Wilcox (Oracle),
	linux-mm, linux-kernel, linux-btrfs, linux-erofs, linux-ext4,
	linux-f2fs-devel, cluster-devel, ocfs2-devel, linux-xfs

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

ext4 and f2fs have duplicated the guts of the readahead code so
they can read past i_size.  Instead, separate out the guts of the
readahead code so they can call it directly.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/verity.c        | 35 ++--------------------------
 fs/f2fs/verity.c        | 35 ++--------------------------
 include/linux/pagemap.h |  4 ++++
 mm/readahead.c          | 51 ++++++++++++++++++++++++-----------------
 4 files changed, 38 insertions(+), 87 deletions(-)

diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index dc5ec724d889..f6e0bf05933e 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -342,37 +342,6 @@ static int ext4_get_verity_descriptor(struct inode *inode, void *buf,
 	return desc_size;
 }
 
-/*
- * Prefetch some pages from the file's Merkle tree.
- *
- * This is basically a stripped-down version of __do_page_cache_readahead()
- * which works on pages past i_size.
- */
-static void ext4_merkle_tree_readahead(struct address_space *mapping,
-				       pgoff_t start_index, unsigned long count)
-{
-	LIST_HEAD(pages);
-	unsigned int nr_pages = 0;
-	struct page *page;
-	pgoff_t index;
-	struct blk_plug plug;
-
-	for (index = start_index; index < start_index + count; index++) {
-		page = xa_load(&mapping->i_pages, index);
-		if (!page || xa_is_value(page)) {
-			page = __page_cache_alloc(readahead_gfp_mask(mapping));
-			if (!page)
-				break;
-			page->index = index;
-			list_add(&page->lru, &pages);
-			nr_pages++;
-		}
-	}
-	blk_start_plug(&plug);
-	ext4_mpage_readpages(mapping, &pages, NULL, nr_pages, true);
-	blk_finish_plug(&plug);
-}
-
 static struct page *ext4_read_merkle_tree_page(struct inode *inode,
 					       pgoff_t index,
 					       unsigned long num_ra_pages)
@@ -386,8 +355,8 @@ static struct page *ext4_read_merkle_tree_page(struct inode *inode,
 		if (page)
 			put_page(page);
 		else if (num_ra_pages > 1)
-			ext4_merkle_tree_readahead(inode->i_mapping, index,
-						   num_ra_pages);
+			page_cache_readahead_limit(inode->i_mapping, NULL,
+					index, LONG_MAX, num_ra_pages, 0);
 		page = read_mapping_page(inode->i_mapping, index, NULL);
 	}
 	return page;
diff --git a/fs/f2fs/verity.c b/fs/f2fs/verity.c
index d7d430a6f130..71a3e36721fa 100644
--- a/fs/f2fs/verity.c
+++ b/fs/f2fs/verity.c
@@ -222,37 +222,6 @@ static int f2fs_get_verity_descriptor(struct inode *inode, void *buf,
 	return size;
 }
 
-/*
- * Prefetch some pages from the file's Merkle tree.
- *
- * This is basically a stripped-down version of __do_page_cache_readahead()
- * which works on pages past i_size.
- */
-static void f2fs_merkle_tree_readahead(struct address_space *mapping,
-				       pgoff_t start_index, unsigned long count)
-{
-	LIST_HEAD(pages);
-	unsigned int nr_pages = 0;
-	struct page *page;
-	pgoff_t index;
-	struct blk_plug plug;
-
-	for (index = start_index; index < start_index + count; index++) {
-		page = xa_load(&mapping->i_pages, index);
-		if (!page || xa_is_value(page)) {
-			page = __page_cache_alloc(readahead_gfp_mask(mapping));
-			if (!page)
-				break;
-			page->index = index;
-			list_add(&page->lru, &pages);
-			nr_pages++;
-		}
-	}
-	blk_start_plug(&plug);
-	f2fs_mpage_readpages(mapping, &pages, NULL, nr_pages, true);
-	blk_finish_plug(&plug);
-}
-
 static struct page *f2fs_read_merkle_tree_page(struct inode *inode,
 					       pgoff_t index,
 					       unsigned long num_ra_pages)
@@ -266,8 +235,8 @@ static struct page *f2fs_read_merkle_tree_page(struct inode *inode,
 		if (page)
 			put_page(page);
 		else if (num_ra_pages > 1)
-			f2fs_merkle_tree_readahead(inode->i_mapping, index,
-						   num_ra_pages);
+			page_cache_readahead_limit(inode->i_mapping, NULL,
+					index, LONG_MAX, num_ra_pages, 0);
 		page = read_mapping_page(inode->i_mapping, index, NULL);
 	}
 	return page;
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 13efafaf7e1f..ddb2d1b43212 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -389,6 +389,10 @@ extern struct page * read_cache_page_gfp(struct address_space *mapping,
 				pgoff_t index, gfp_t gfp_mask);
 extern int read_cache_pages(struct address_space *mapping,
 		struct list_head *pages, filler_t *filler, void *data);
+unsigned long page_cache_readahead_limit(struct address_space *mapping,
+		struct file *file, pgoff_t start, pgoff_t end_index,
+		unsigned long nr_to_read, unsigned long lookahead_size);
+
 
 static inline struct page *read_mapping_page(struct address_space *mapping,
 				pgoff_t index, void *data)
diff --git a/mm/readahead.c b/mm/readahead.c
index 933b32e0c90a..29ca25c8f01e 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -144,38 +144,22 @@ static void read_pages(struct readahead_control *rac, struct list_head *pages)
 	blk_finish_plug(&plug);
 }
 
-/*
- * __do_page_cache_readahead() actually reads a chunk of disk.  It allocates
- * the pages first, then submits them for I/O. This avoids the very bad
- * behaviour which would occur if page allocations are causing VM writeback.
- * We really don't want to intermingle reads and writes like that.
- *
- * Returns the number of pages requested, or the maximum amount of I/O allowed.
- */
-unsigned long __do_page_cache_readahead(struct address_space *mapping,
-		struct file *filp, pgoff_t start, unsigned long nr_to_read,
-		unsigned long lookahead_size)
+unsigned long page_cache_readahead_limit(struct address_space *mapping,
+		struct file *file, pgoff_t start, pgoff_t end_index,
+		unsigned long nr_to_read, unsigned long lookahead_size)
 {
-	struct inode *inode = mapping->host;
-	unsigned long end_index;	/* The last page we want to read */
 	LIST_HEAD(page_pool);
 	int page_idx;
 	pgoff_t page_offset = start;
-	loff_t isize = i_size_read(inode);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
 	bool use_list = mapping->a_ops->readpages;
 	struct readahead_control rac = {
 		.mapping = mapping,
-		.file = filp,
+		.file = file,
 		.start = start,
 		.nr_pages = 0,
 	};
 
-	if (isize == 0)
-		goto out;
-
-	end_index = ((isize - 1) >> PAGE_SHIFT);
-
 	/*
 	 * Preallocate as many pages as we will need.
 	 */
@@ -233,9 +217,34 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
 	 */
 	read_pages(&rac, &page_pool);
 	BUG_ON(!list_empty(&page_pool));
-out:
 	return rac.nr_pages;
 }
+EXPORT_SYMBOL_GPL(page_cache_readahead_limit);
+
+/*
+ * __do_page_cache_readahead() actually reads a chunk of disk.  It allocates
+ * the pages first, then submits them for I/O. This avoids the very bad
+ * behaviour which would occur if page allocations are causing VM writeback.
+ * We really don't want to intermingle reads and writes like that.
+ *
+ * Returns the number of pages requested, or the maximum amount of I/O allowed.
+ */
+unsigned long __do_page_cache_readahead(struct address_space *mapping,
+		struct file *file, pgoff_t start, unsigned long nr_to_read,
+		unsigned long lookahead_size)
+{
+	struct inode *inode = mapping->host;
+	unsigned long end_index;	/* The last page we want to read */
+	loff_t isize = i_size_read(inode);
+
+	if (isize == 0)
+		return 0;
+
+	end_index = ((isize - 1) >> PAGE_SHIFT);
+
+	return page_cache_readahead_limit(mapping, file, start, end_index,
+			nr_to_read, lookahead_size);
+}
 
 /*
  * Chunk the readahead into 2 megabyte units, so that we don't pin too much
-- 
2.25.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v5 06/13] fs: Convert mpage_readpages to mpage_readahead
  2020-02-11  1:03 [PATCH v5 00/13] Change readahead API Matthew Wilcox
                   ` (4 preceding siblings ...)
  2020-02-11  1:03 ` [PATCH v5 05/13] mm: Add page_cache_readahead_limit Matthew Wilcox
@ 2020-02-11  1:03 ` Matthew Wilcox
  2020-02-13 22:09   ` Junxiao Bi
  2020-02-11  1:03 ` [PATCH v5 07/13] btrfs: Convert from readpages to readahead Matthew Wilcox
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 31+ messages in thread
From: Matthew Wilcox @ 2020-02-11  1:03 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Matthew Wilcox (Oracle),
	linux-mm, linux-kernel, linux-btrfs, linux-erofs, linux-ext4,
	linux-f2fs-devel, cluster-devel, ocfs2-devel, linux-xfs

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Implement the new readahead aop and convert all callers (block_dev,
exfat, ext2, fat, gfs2, hpfs, isofs, jfs, nilfs2, ocfs2, omfs, qnx6,
reiserfs & udf).  The callers are all trivial except for GFS2 & OCFS2.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 drivers/staging/exfat/exfat_super.c |  7 +++---
 fs/block_dev.c                      |  7 +++---
 fs/ext2/inode.c                     | 10 +++-----
 fs/fat/inode.c                      |  7 +++---
 fs/gfs2/aops.c                      | 23 ++++++-----------
 fs/hpfs/file.c                      |  7 +++---
 fs/iomap/buffered-io.c              |  2 +-
 fs/isofs/inode.c                    |  7 +++---
 fs/jfs/inode.c                      |  7 +++---
 fs/mpage.c                          | 38 +++++++++--------------------
 fs/nilfs2/inode.c                   | 15 +++---------
 fs/ocfs2/aops.c                     | 34 ++++++++++----------------
 fs/omfs/file.c                      |  7 +++---
 fs/qnx6/inode.c                     |  7 +++---
 fs/reiserfs/inode.c                 |  8 +++---
 fs/udf/inode.c                      |  7 +++---
 include/linux/mpage.h               |  4 +--
 mm/migrate.c                        |  2 +-
 18 files changed, 74 insertions(+), 125 deletions(-)

diff --git a/drivers/staging/exfat/exfat_super.c b/drivers/staging/exfat/exfat_super.c
index b81d2a87b82e..96aad9b16d31 100644
--- a/drivers/staging/exfat/exfat_super.c
+++ b/drivers/staging/exfat/exfat_super.c
@@ -3002,10 +3002,9 @@ static int exfat_readpage(struct file *file, struct page *page)
 	return  mpage_readpage(page, exfat_get_block);
 }
 
-static int exfat_readpages(struct file *file, struct address_space *mapping,
-			   struct list_head *pages, unsigned int nr_pages)
+static void exfat_readahead(struct readahead_control *rac)
 {
-	return  mpage_readpages(mapping, pages, nr_pages, exfat_get_block);
+	mpage_readahead(rac, exfat_get_block);
 }
 
 static int exfat_writepage(struct page *page, struct writeback_control *wbc)
@@ -3104,7 +3103,7 @@ static sector_t _exfat_bmap(struct address_space *mapping, sector_t block)
 
 static const struct address_space_operations exfat_aops = {
 	.readpage    = exfat_readpage,
-	.readpages   = exfat_readpages,
+	.readahead   = exfat_readahead,
 	.writepage   = exfat_writepage,
 	.writepages  = exfat_writepages,
 	.write_begin = exfat_write_begin,
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 69bf2fb6f7cd..2fd9c7bd61f6 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -614,10 +614,9 @@ static int blkdev_readpage(struct file * file, struct page * page)
 	return block_read_full_page(page, blkdev_get_block);
 }
 
-static int blkdev_readpages(struct file *file, struct address_space *mapping,
-			struct list_head *pages, unsigned nr_pages)
+static void blkdev_readahead(struct readahead_control *rac)
 {
-	return mpage_readpages(mapping, pages, nr_pages, blkdev_get_block);
+	mpage_readahead(rac, blkdev_get_block);
 }
 
 static int blkdev_write_begin(struct file *file, struct address_space *mapping,
@@ -2062,7 +2061,7 @@ static int blkdev_writepages(struct address_space *mapping,
 
 static const struct address_space_operations def_blk_aops = {
 	.readpage	= blkdev_readpage,
-	.readpages	= blkdev_readpages,
+	.readahead	= blkdev_readahead,
 	.writepage	= blkdev_writepage,
 	.write_begin	= blkdev_write_begin,
 	.write_end	= blkdev_write_end,
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 119667e65890..2a65387f9b12 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -877,11 +877,9 @@ static int ext2_readpage(struct file *file, struct page *page)
 	return mpage_readpage(page, ext2_get_block);
 }
 
-static int
-ext2_readpages(struct file *file, struct address_space *mapping,
-		struct list_head *pages, unsigned nr_pages)
+static void ext2_readahead(struct readahead_control *rac)
 {
-	return mpage_readpages(mapping, pages, nr_pages, ext2_get_block);
+	mpage_readahead(rac, ext2_get_block);
 }
 
 static int
@@ -966,7 +964,7 @@ ext2_dax_writepages(struct address_space *mapping, struct writeback_control *wbc
 
 const struct address_space_operations ext2_aops = {
 	.readpage		= ext2_readpage,
-	.readpages		= ext2_readpages,
+	.readahead		= ext2_readahead,
 	.writepage		= ext2_writepage,
 	.write_begin		= ext2_write_begin,
 	.write_end		= ext2_write_end,
@@ -980,7 +978,7 @@ const struct address_space_operations ext2_aops = {
 
 const struct address_space_operations ext2_nobh_aops = {
 	.readpage		= ext2_readpage,
-	.readpages		= ext2_readpages,
+	.readahead		= ext2_readahead,
 	.writepage		= ext2_nobh_writepage,
 	.write_begin		= ext2_nobh_write_begin,
 	.write_end		= nobh_write_end,
diff --git a/fs/fat/inode.c b/fs/fat/inode.c
index 594b05ae16c9..3496f5fc3e6d 100644
--- a/fs/fat/inode.c
+++ b/fs/fat/inode.c
@@ -210,10 +210,9 @@ static int fat_readpage(struct file *file, struct page *page)
 	return mpage_readpage(page, fat_get_block);
 }
 
-static int fat_readpages(struct file *file, struct address_space *mapping,
-			 struct list_head *pages, unsigned nr_pages)
+static void fat_readahead(struct readahead_control *rac)
 {
-	return mpage_readpages(mapping, pages, nr_pages, fat_get_block);
+	mpage_readahead(rac, fat_get_block);
 }
 
 static void fat_write_failed(struct address_space *mapping, loff_t to)
@@ -344,7 +343,7 @@ int fat_block_truncate_page(struct inode *inode, loff_t from)
 
 static const struct address_space_operations fat_aops = {
 	.readpage	= fat_readpage,
-	.readpages	= fat_readpages,
+	.readahead	= fat_readahead,
 	.writepage	= fat_writepage,
 	.writepages	= fat_writepages,
 	.write_begin	= fat_write_begin,
diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index ba83b49ce18c..5e63c13c12c1 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -577,7 +577,7 @@ int gfs2_internal_read(struct gfs2_inode *ip, char *buf, loff_t *pos,
 }
 
 /**
- * gfs2_readpages - Read a bunch of pages at once
+ * gfs2_readahead - Read a bunch of pages at once
  * @file: The file to read from
  * @mapping: Address space info
  * @pages: List of pages to read
@@ -590,31 +590,24 @@ int gfs2_internal_read(struct gfs2_inode *ip, char *buf, loff_t *pos,
  *    obviously not something we'd want to do on too regular a basis.
  *    Any I/O we ignore at this time will be done via readpage later.
  * 2. We don't handle stuffed files here we let readpage do the honours.
- * 3. mpage_readpages() does most of the heavy lifting in the common case.
+ * 3. mpage_readahead() does most of the heavy lifting in the common case.
  * 4. gfs2_block_map() is relied upon to set BH_Boundary in the right places.
  */
 
-static int gfs2_readpages(struct file *file, struct address_space *mapping,
-			  struct list_head *pages, unsigned nr_pages)
+static void gfs2_readahead(struct readahead_control *rac)
 {
-	struct inode *inode = mapping->host;
+	struct inode *inode = rac->mapping->host;
 	struct gfs2_inode *ip = GFS2_I(inode);
-	struct gfs2_sbd *sdp = GFS2_SB(inode);
 	struct gfs2_holder gh;
-	int ret;
 
 	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
-	ret = gfs2_glock_nq(&gh);
-	if (unlikely(ret))
+	if (gfs2_glock_nq(&gh))
 		goto out_uninit;
 	if (!gfs2_is_stuffed(ip))
-		ret = mpage_readpages(mapping, pages, nr_pages, gfs2_block_map);
+		mpage_readahead(rac, gfs2_block_map);
 	gfs2_glock_dq(&gh);
 out_uninit:
 	gfs2_holder_uninit(&gh);
-	if (unlikely(gfs2_withdrawn(sdp)))
-		ret = -EIO;
-	return ret;
 }
 
 /**
@@ -828,7 +821,7 @@ static const struct address_space_operations gfs2_aops = {
 	.writepage = gfs2_writepage,
 	.writepages = gfs2_writepages,
 	.readpage = gfs2_readpage,
-	.readpages = gfs2_readpages,
+	.readahead = gfs2_readahead,
 	.bmap = gfs2_bmap,
 	.invalidatepage = gfs2_invalidatepage,
 	.releasepage = gfs2_releasepage,
@@ -842,7 +835,7 @@ static const struct address_space_operations gfs2_jdata_aops = {
 	.writepage = gfs2_jdata_writepage,
 	.writepages = gfs2_jdata_writepages,
 	.readpage = gfs2_readpage,
-	.readpages = gfs2_readpages,
+	.readahead = gfs2_readahead,
 	.set_page_dirty = jdata_set_page_dirty,
 	.bmap = gfs2_bmap,
 	.invalidatepage = gfs2_invalidatepage,
diff --git a/fs/hpfs/file.c b/fs/hpfs/file.c
index b36abf9cb345..2de0d3492d15 100644
--- a/fs/hpfs/file.c
+++ b/fs/hpfs/file.c
@@ -125,10 +125,9 @@ static int hpfs_writepage(struct page *page, struct writeback_control *wbc)
 	return block_write_full_page(page, hpfs_get_block, wbc);
 }
 
-static int hpfs_readpages(struct file *file, struct address_space *mapping,
-			  struct list_head *pages, unsigned nr_pages)
+static void hpfs_readahead(struct readahead_control *rac)
 {
-	return mpage_readpages(mapping, pages, nr_pages, hpfs_get_block);
+	mpage_readahead(rac, hpfs_get_block);
 }
 
 static int hpfs_writepages(struct address_space *mapping,
@@ -198,7 +197,7 @@ static int hpfs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 const struct address_space_operations hpfs_aops = {
 	.readpage = hpfs_readpage,
 	.writepage = hpfs_writepage,
-	.readpages = hpfs_readpages,
+	.readahead = hpfs_readahead,
 	.writepages = hpfs_writepages,
 	.write_begin = hpfs_write_begin,
 	.write_end = hpfs_write_end,
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 7c84c4c027c4..cb3511eb152a 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -359,7 +359,7 @@ iomap_readpage(struct page *page, const struct iomap_ops *ops)
 	}
 
 	/*
-	 * Just like mpage_readpages and block_read_full_page we always
+	 * Just like mpage_readahead and block_read_full_page we always
 	 * return 0 and just mark the page as PageError on errors.  This
 	 * should be cleaned up all through the stack eventually.
 	 */
diff --git a/fs/isofs/inode.c b/fs/isofs/inode.c
index 62c0462dc89f..95b1f377ad09 100644
--- a/fs/isofs/inode.c
+++ b/fs/isofs/inode.c
@@ -1185,10 +1185,9 @@ static int isofs_readpage(struct file *file, struct page *page)
 	return mpage_readpage(page, isofs_get_block);
 }
 
-static int isofs_readpages(struct file *file, struct address_space *mapping,
-			struct list_head *pages, unsigned nr_pages)
+static void isofs_readahead(struct readahead_control *rac)
 {
-	return mpage_readpages(mapping, pages, nr_pages, isofs_get_block);
+	mpage_readahead(rac, isofs_get_block);
 }
 
 static sector_t _isofs_bmap(struct address_space *mapping, sector_t block)
@@ -1198,7 +1197,7 @@ static sector_t _isofs_bmap(struct address_space *mapping, sector_t block)
 
 static const struct address_space_operations isofs_aops = {
 	.readpage = isofs_readpage,
-	.readpages = isofs_readpages,
+	.readahead = isofs_readahead,
 	.bmap = _isofs_bmap
 };
 
diff --git a/fs/jfs/inode.c b/fs/jfs/inode.c
index 9486afcdac76..6f65bfa9f18d 100644
--- a/fs/jfs/inode.c
+++ b/fs/jfs/inode.c
@@ -296,10 +296,9 @@ static int jfs_readpage(struct file *file, struct page *page)
 	return mpage_readpage(page, jfs_get_block);
 }
 
-static int jfs_readpages(struct file *file, struct address_space *mapping,
-		struct list_head *pages, unsigned nr_pages)
+static void jfs_readahead(struct readahead_control *rac)
 {
-	return mpage_readpages(mapping, pages, nr_pages, jfs_get_block);
+	mpage_readahead(rac, jfs_get_block);
 }
 
 static void jfs_write_failed(struct address_space *mapping, loff_t to)
@@ -358,7 +357,7 @@ static ssize_t jfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 
 const struct address_space_operations jfs_aops = {
 	.readpage	= jfs_readpage,
-	.readpages	= jfs_readpages,
+	.readahead	= jfs_readahead,
 	.writepage	= jfs_writepage,
 	.writepages	= jfs_writepages,
 	.write_begin	= jfs_write_begin,
diff --git a/fs/mpage.c b/fs/mpage.c
index ccba3c4c4479..166a95ec1341 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -91,7 +91,7 @@ mpage_alloc(struct block_device *bdev,
 }
 
 /*
- * support function for mpage_readpages.  The fs supplied get_block might
+ * support function for mpage_readahead.  The fs supplied get_block might
  * return an up to date buffer.  This is used to map that buffer into
  * the page, which allows readpage to avoid triggering a duplicate call
  * to get_block.
@@ -338,13 +338,10 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
 }
 
 /**
- * mpage_readpages - populate an address space with some pages & start reads against them
+ * mpage_readahead - start reads against pages
  * @mapping: the address_space
- * @pages: The address of a list_head which contains the target pages.  These
- *   pages have their ->index populated and are otherwise uninitialised.
- *   The page at @pages->prev has the lowest file offset, and reads should be
- *   issued in @pages->prev to @pages->next order.
- * @nr_pages: The number of pages at *@pages
+ * @start: The number of the first page to read.
+ * @nr_pages: The number of consecutive pages to read.
  * @get_block: The filesystem's block mapper function.
  *
  * This function walks the pages and the blocks within each page, building and
@@ -381,36 +378,25 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
  *
  * This all causes the disk requests to be issued in the correct order.
  */
-int
-mpage_readpages(struct address_space *mapping, struct list_head *pages,
-				unsigned nr_pages, get_block_t get_block)
+void mpage_readahead(struct readahead_control *rac, get_block_t get_block)
 {
+	struct page *page;
 	struct mpage_readpage_args args = {
 		.get_block = get_block,
 		.is_readahead = true,
 	};
-	unsigned page_idx;
-
-	for (page_idx = 0; page_idx < nr_pages; page_idx++) {
-		struct page *page = lru_to_page(pages);
 
+	readahead_for_each(rac, page) {
 		prefetchw(&page->flags);
-		list_del(&page->lru);
-		if (!add_to_page_cache_lru(page, mapping,
-					page->index,
-					readahead_gfp_mask(mapping))) {
-			args.page = page;
-			args.nr_pages = nr_pages - page_idx;
-			args.bio = do_mpage_readpage(&args);
-		}
+		args.page = page;
+		args.nr_pages = readahead_count(rac);
+		args.bio = do_mpage_readpage(&args);
 		put_page(page);
 	}
-	BUG_ON(!list_empty(pages));
 	if (args.bio)
 		mpage_bio_submit(REQ_OP_READ, REQ_RAHEAD, args.bio);
-	return 0;
 }
-EXPORT_SYMBOL(mpage_readpages);
+EXPORT_SYMBOL(mpage_readahead);
 
 /*
  * This isn't called much at all
@@ -563,7 +549,7 @@ static int __mpage_writepage(struct page *page, struct writeback_control *wbc,
 		 * Page has buffers, but they are all unmapped. The page was
 		 * created by pagein or read over a hole which was handled by
 		 * block_read_full_page().  If this address_space is also
-		 * using mpage_readpages then this can rarely happen.
+		 * using mpage_readahead then this can rarely happen.
 		 */
 		goto confused;
 	}
diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c
index 671085512e0f..ceeb3b441844 100644
--- a/fs/nilfs2/inode.c
+++ b/fs/nilfs2/inode.c
@@ -145,18 +145,9 @@ static int nilfs_readpage(struct file *file, struct page *page)
 	return mpage_readpage(page, nilfs_get_block);
 }
 
-/**
- * nilfs_readpages() - implement readpages() method of nilfs_aops {}
- * address_space_operations.
- * @file - file struct of the file to be read
- * @mapping - address_space struct used for reading multiple pages
- * @pages - the pages to be read
- * @nr_pages - number of pages to be read
- */
-static int nilfs_readpages(struct file *file, struct address_space *mapping,
-			   struct list_head *pages, unsigned int nr_pages)
+static void nilfs_readahead(struct readahead_control *rac)
 {
-	return mpage_readpages(mapping, pages, nr_pages, nilfs_get_block);
+	mpage_readahead(rac, nilfs_get_block);
 }
 
 static int nilfs_writepages(struct address_space *mapping,
@@ -308,7 +299,7 @@ const struct address_space_operations nilfs_aops = {
 	.readpage		= nilfs_readpage,
 	.writepages		= nilfs_writepages,
 	.set_page_dirty		= nilfs_set_page_dirty,
-	.readpages		= nilfs_readpages,
+	.readahead		= nilfs_readahead,
 	.write_begin		= nilfs_write_begin,
 	.write_end		= nilfs_write_end,
 	/* .releasepage		= nilfs_releasepage, */
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 3a67a6518ddf..e8137efaafec 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -350,14 +350,11 @@ static int ocfs2_readpage(struct file *file, struct page *page)
  * grow out to a tree. If need be, detecting boundary extents could
  * trivially be added in a future version of ocfs2_get_block().
  */
-static int ocfs2_readpages(struct file *filp, struct address_space *mapping,
-			   struct list_head *pages, unsigned nr_pages)
+static void ocfs2_readahead(struct readahead_control *rac)
 {
-	int ret, err = -EIO;
-	struct inode *inode = mapping->host;
+	int ret;
+	struct inode *inode = rac->mapping->host;
 	struct ocfs2_inode_info *oi = OCFS2_I(inode);
-	loff_t start;
-	struct page *last;
 
 	/*
 	 * Use the nonblocking flag for the dlm code to avoid page
@@ -365,36 +362,31 @@ static int ocfs2_readpages(struct file *filp, struct address_space *mapping,
 	 */
 	ret = ocfs2_inode_lock_full(inode, NULL, 0, OCFS2_LOCK_NONBLOCK);
 	if (ret)
-		return err;
+		return;
 
-	if (down_read_trylock(&oi->ip_alloc_sem) == 0) {
-		ocfs2_inode_unlock(inode, 0);
-		return err;
-	}
+	if (down_read_trylock(&oi->ip_alloc_sem) == 0)
+		goto out_unlock;
 
 	/*
 	 * Don't bother with inline-data. There isn't anything
 	 * to read-ahead in that case anyway...
 	 */
 	if (oi->ip_dyn_features & OCFS2_INLINE_DATA_FL)
-		goto out_unlock;
+		goto out_up;
 
 	/*
 	 * Check whether a remote node truncated this file - we just
 	 * drop out in that case as it's not worth handling here.
 	 */
-	last = lru_to_page(pages);
-	start = (loff_t)last->index << PAGE_SHIFT;
-	if (start >= i_size_read(inode))
-		goto out_unlock;
+	if (readahead_offset(rac) >= i_size_read(inode))
+		goto out_up;
 
-	err = mpage_readpages(mapping, pages, nr_pages, ocfs2_get_block);
+	mpage_readahead(rac, ocfs2_get_block);
 
-out_unlock:
+out_up:
 	up_read(&oi->ip_alloc_sem);
+out_unlock:
 	ocfs2_inode_unlock(inode, 0);
-
-	return err;
 }
 
 /* Note: Because we don't support holes, our allocation has
@@ -2474,7 +2466,7 @@ static ssize_t ocfs2_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 
 const struct address_space_operations ocfs2_aops = {
 	.readpage		= ocfs2_readpage,
-	.readpages		= ocfs2_readpages,
+	.readahead		= ocfs2_readahead,
 	.writepage		= ocfs2_writepage,
 	.write_begin		= ocfs2_write_begin,
 	.write_end		= ocfs2_write_end,
diff --git a/fs/omfs/file.c b/fs/omfs/file.c
index d640b9388238..d7b5f09d298c 100644
--- a/fs/omfs/file.c
+++ b/fs/omfs/file.c
@@ -289,10 +289,9 @@ static int omfs_readpage(struct file *file, struct page *page)
 	return block_read_full_page(page, omfs_get_block);
 }
 
-static int omfs_readpages(struct file *file, struct address_space *mapping,
-		struct list_head *pages, unsigned nr_pages)
+static void omfs_readahead(struct readahead_control *rac)
 {
-	return mpage_readpages(mapping, pages, nr_pages, omfs_get_block);
+	mpage_readahead(rac, omfs_get_block);
 }
 
 static int omfs_writepage(struct page *page, struct writeback_control *wbc)
@@ -373,7 +372,7 @@ const struct inode_operations omfs_file_inops = {
 
 const struct address_space_operations omfs_aops = {
 	.readpage = omfs_readpage,
-	.readpages = omfs_readpages,
+	.readahead = omfs_readahead,
 	.writepage = omfs_writepage,
 	.writepages = omfs_writepages,
 	.write_begin = omfs_write_begin,
diff --git a/fs/qnx6/inode.c b/fs/qnx6/inode.c
index 345db56c98fd..755293c8c71a 100644
--- a/fs/qnx6/inode.c
+++ b/fs/qnx6/inode.c
@@ -99,10 +99,9 @@ static int qnx6_readpage(struct file *file, struct page *page)
 	return mpage_readpage(page, qnx6_get_block);
 }
 
-static int qnx6_readpages(struct file *file, struct address_space *mapping,
-		   struct list_head *pages, unsigned nr_pages)
+static void qnx6_readahead(struct readahead_control *rac)
 {
-	return mpage_readpages(mapping, pages, nr_pages, qnx6_get_block);
+	mpage_readahead(rac, qnx6_get_block);
 }
 
 /*
@@ -499,7 +498,7 @@ static sector_t qnx6_bmap(struct address_space *mapping, sector_t block)
 }
 static const struct address_space_operations qnx6_aops = {
 	.readpage	= qnx6_readpage,
-	.readpages	= qnx6_readpages,
+	.readahead	= qnx6_readahead,
 	.bmap		= qnx6_bmap
 };
 
diff --git a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c
index 6419e6dacc39..0031070b3692 100644
--- a/fs/reiserfs/inode.c
+++ b/fs/reiserfs/inode.c
@@ -1160,11 +1160,9 @@ int reiserfs_get_block(struct inode *inode, sector_t block,
 	return retval;
 }
 
-static int
-reiserfs_readpages(struct file *file, struct address_space *mapping,
-		   struct list_head *pages, unsigned nr_pages)
+static void reiserfs_readahead(struct readahead_control *rac)
 {
-	return mpage_readpages(mapping, pages, nr_pages, reiserfs_get_block);
+	mpage_readahead(rac, reiserfs_get_block);
 }
 
 /*
@@ -3434,7 +3432,7 @@ int reiserfs_setattr(struct dentry *dentry, struct iattr *attr)
 const struct address_space_operations reiserfs_address_space_operations = {
 	.writepage = reiserfs_writepage,
 	.readpage = reiserfs_readpage,
-	.readpages = reiserfs_readpages,
+	.readahead = reiserfs_readahead,
 	.releasepage = reiserfs_releasepage,
 	.invalidatepage = reiserfs_invalidatepage,
 	.write_begin = reiserfs_write_begin,
diff --git a/fs/udf/inode.c b/fs/udf/inode.c
index e875bc5668ee..adaba8e8b326 100644
--- a/fs/udf/inode.c
+++ b/fs/udf/inode.c
@@ -195,10 +195,9 @@ static int udf_readpage(struct file *file, struct page *page)
 	return mpage_readpage(page, udf_get_block);
 }
 
-static int udf_readpages(struct file *file, struct address_space *mapping,
-			struct list_head *pages, unsigned nr_pages)
+static void udf_readahead(struct readahead_control *rac)
 {
-	return mpage_readpages(mapping, pages, nr_pages, udf_get_block);
+	mpage_readahead(rac, udf_get_block);
 }
 
 static int udf_write_begin(struct file *file, struct address_space *mapping,
@@ -234,7 +233,7 @@ static sector_t udf_bmap(struct address_space *mapping, sector_t block)
 
 const struct address_space_operations udf_aops = {
 	.readpage	= udf_readpage,
-	.readpages	= udf_readpages,
+	.readahead	= udf_readahead,
 	.writepage	= udf_writepage,
 	.writepages	= udf_writepages,
 	.write_begin	= udf_write_begin,
diff --git a/include/linux/mpage.h b/include/linux/mpage.h
index 001f1fcf9836..f4f5e90a6844 100644
--- a/include/linux/mpage.h
+++ b/include/linux/mpage.h
@@ -13,9 +13,9 @@
 #ifdef CONFIG_BLOCK
 
 struct writeback_control;
+struct readahead_control;
 
-int mpage_readpages(struct address_space *mapping, struct list_head *pages,
-				unsigned nr_pages, get_block_t get_block);
+void mpage_readahead(struct readahead_control *, get_block_t get_block);
 int mpage_readpage(struct page *page, get_block_t get_block);
 int mpage_writepages(struct address_space *mapping,
 		struct writeback_control *wbc, get_block_t get_block);
diff --git a/mm/migrate.c b/mm/migrate.c
index b1092876e537..a32122095702 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1020,7 +1020,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 		 * to the LRU. Later, when the IO completes the pages are
 		 * marked uptodate and unlocked. However, the queueing
 		 * could be merging multiple pages for one bio (e.g.
-		 * mpage_readpages). If an allocation happens for the
+		 * mpage_readahead). If an allocation happens for the
 		 * second or third page, the process can end up locking
 		 * the same page twice and deadlocking. Rather than
 		 * trying to be clever about what pages can be locked,
-- 
2.25.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v5 07/13] btrfs: Convert from readpages to readahead
  2020-02-11  1:03 [PATCH v5 00/13] Change readahead API Matthew Wilcox
                   ` (5 preceding siblings ...)
  2020-02-11  1:03 ` [PATCH v5 06/13] fs: Convert mpage_readpages to mpage_readahead Matthew Wilcox
@ 2020-02-11  1:03 ` Matthew Wilcox
  2020-02-11  1:03 ` [PATCH v5 08/13] erofs: Convert uncompressed files " Matthew Wilcox
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox @ 2020-02-11  1:03 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Matthew Wilcox (Oracle),
	linux-mm, linux-kernel, linux-btrfs, linux-erofs, linux-ext4,
	linux-f2fs-devel, cluster-devel, ocfs2-devel, linux-xfs

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Use the new readahead operation in btrfs.  Add a
readahead_for_each_batch() iterator to optimise the loop in the XArray.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/btrfs/extent_io.c    | 48 ++++++++++++++---------------------------
 fs/btrfs/extent_io.h    |  3 +--
 fs/btrfs/inode.c        | 16 ++++++--------
 include/linux/pagemap.h | 26 ++++++++++++++++++++++
 4 files changed, 50 insertions(+), 43 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index c0f202741e09..d9f66058e0a7 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4278,52 +4278,36 @@ int extent_writepages(struct address_space *mapping,
 	return ret;
 }
 
-int extent_readpages(struct address_space *mapping, struct list_head *pages,
-		     unsigned nr_pages)
+void extent_readahead(struct readahead_control *rac)
 {
 	struct bio *bio = NULL;
 	unsigned long bio_flags = 0;
 	struct page *pagepool[16];
 	struct extent_map *em_cached = NULL;
-	struct extent_io_tree *tree = &BTRFS_I(mapping->host)->io_tree;
-	int nr = 0;
+	struct extent_io_tree *tree = &BTRFS_I(rac->mapping->host)->io_tree;
 	u64 prev_em_start = (u64)-1;
+	int nr;
 
-	while (!list_empty(pages)) {
-		u64 contig_end = 0;
-
-		for (nr = 0; nr < ARRAY_SIZE(pagepool) && !list_empty(pages);) {
-			struct page *page = lru_to_page(pages);
-
-			prefetchw(&page->flags);
-			list_del(&page->lru);
-			if (add_to_page_cache_lru(page, mapping, page->index,
-						readahead_gfp_mask(mapping))) {
-				put_page(page);
-				break;
-			}
-
-			pagepool[nr++] = page;
-			contig_end = page_offset(page) + PAGE_SIZE - 1;
-		}
-
-		if (nr) {
-			u64 contig_start = page_offset(pagepool[0]);
+	readahead_for_each_batch(rac, pagepool, ARRAY_SIZE(pagepool), nr) {
+		u64 contig_start = page_offset(pagepool[0]);
+		u64 contig_end = page_offset(pagepool[nr - 1]) + PAGE_SIZE - 1;
 
-			ASSERT(contig_start + nr * PAGE_SIZE - 1 == contig_end);
+		ASSERT(contig_start + nr * PAGE_SIZE - 1 == contig_end);
 
-			contiguous_readpages(tree, pagepool, nr, contig_start,
-				     contig_end, &em_cached, &bio, &bio_flags,
-				     &prev_em_start);
-		}
+		contiguous_readpages(tree, pagepool, nr, contig_start,
+				contig_end, &em_cached, &bio, &bio_flags,
+				&prev_em_start);
 	}
 
 	if (em_cached)
 		free_extent_map(em_cached);
 
-	if (bio)
-		return submit_one_bio(bio, 0, bio_flags);
-	return 0;
+	if (bio) {
+		int ret = submit_one_bio(bio, 0, bio_flags);
+		if (ret < 0) {
+			/* XXX: unlock the pages here? */
+		}
+	}
 }
 
 /*
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 5d205bbaafdc..bddac32948c7 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -198,8 +198,7 @@ int extent_writepages(struct address_space *mapping,
 		      struct writeback_control *wbc);
 int btree_write_cache_pages(struct address_space *mapping,
 			    struct writeback_control *wbc);
-int extent_readpages(struct address_space *mapping, struct list_head *pages,
-		     unsigned nr_pages);
+void extent_readahead(struct readahead_control *rac);
 int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		__u64 start, __u64 len);
 void set_page_extent_mapped(struct page *page);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 5b3ec93ff911..d964b2a78ed8 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4794,8 +4794,8 @@ static void evict_inode_truncate_pages(struct inode *inode)
 
 	/*
 	 * Keep looping until we have no more ranges in the io tree.
-	 * We can have ongoing bios started by readpages (called from readahead)
-	 * that have their endio callback (extent_io.c:end_bio_extent_readpage)
+	 * We can have ongoing bios started by readahead that have
+	 * their endio callback (extent_io.c:end_bio_extent_readpage)
 	 * still in progress (unlocked the pages in the bio but did not yet
 	 * unlocked the ranges in the io tree). Therefore this means some
 	 * ranges can still be locked and eviction started because before
@@ -6996,11 +6996,11 @@ static int lock_extent_direct(struct inode *inode, u64 lockstart, u64 lockend,
 			 * for it to complete) and then invalidate the pages for
 			 * this range (through invalidate_inode_pages2_range()),
 			 * but that can lead us to a deadlock with a concurrent
-			 * call to readpages() (a buffered read or a defrag call
+			 * call to readahead (a buffered read or a defrag call
 			 * triggered a readahead) on a page lock due to an
 			 * ordered dio extent we created before but did not have
 			 * yet a corresponding bio submitted (whence it can not
-			 * complete), which makes readpages() wait for that
+			 * complete), which makes readahead wait for that
 			 * ordered extent to complete while holding a lock on
 			 * that page.
 			 */
@@ -8239,11 +8239,9 @@ static int btrfs_writepages(struct address_space *mapping,
 	return extent_writepages(mapping, wbc);
 }
 
-static int
-btrfs_readpages(struct file *file, struct address_space *mapping,
-		struct list_head *pages, unsigned nr_pages)
+static void btrfs_readahead(struct readahead_control *rac)
 {
-	return extent_readpages(mapping, pages, nr_pages);
+	extent_readahead(rac);
 }
 
 static int __btrfs_releasepage(struct page *page, gfp_t gfp_flags)
@@ -10448,7 +10446,7 @@ static const struct address_space_operations btrfs_aops = {
 	.readpage	= btrfs_readpage,
 	.writepage	= btrfs_writepage,
 	.writepages	= btrfs_writepages,
-	.readpages	= btrfs_readpages,
+	.readahead	= btrfs_readahead,
 	.direct_IO	= btrfs_direct_IO,
 	.invalidatepage = btrfs_invalidatepage,
 	.releasepage	= btrfs_releasepage,
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index ddb2d1b43212..75bdfec49710 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -664,6 +664,32 @@ static inline struct page *readahead_page(struct readahead_control *rac)
 #define readahead_for_each(rac, page)					\
 	for (; (page = readahead_page(rac)); rac->nr_pages -= rac->batch_count)
 
+static inline unsigned int readahead_page_batch(struct readahead_control *rac,
+		struct page **array, unsigned int size)
+{
+	unsigned int batch = 0;
+	XA_STATE(xas, &rac->mapping->i_pages, rac->start);
+	struct page *page;
+
+	rac->batch_count = 0;
+	xas_for_each(&xas, page, rac->start + rac->nr_pages - 1) {
+		array[batch++] = page;
+		rac->batch_count += hpage_nr_pages(page);
+		rac->start += hpage_nr_pages(page);
+		if (PageHead(page))
+			xas_set(&xas, rac->start);
+
+		if (batch == size)
+			break;
+	}
+
+	return batch;
+}
+
+#define readahead_for_each_batch(rac, array, size, nr)			\
+	for (; (nr = readahead_page_batch(rac, array, size));		\
+			rac->nr_pages -= rac->batch_count)
+
 /* The byte offset into the file of this readahead block */
 static inline loff_t readahead_offset(struct readahead_control *rac)
 {
-- 
2.25.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v5 08/13] erofs: Convert uncompressed files from readpages to readahead
  2020-02-11  1:03 [PATCH v5 00/13] Change readahead API Matthew Wilcox
                   ` (6 preceding siblings ...)
  2020-02-11  1:03 ` [PATCH v5 07/13] btrfs: Convert from readpages to readahead Matthew Wilcox
@ 2020-02-11  1:03 ` " Matthew Wilcox
  2020-02-11  1:03 ` [PATCH v5 09/13] erofs: Convert compressed " Matthew Wilcox
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox @ 2020-02-11  1:03 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Matthew Wilcox (Oracle),
	linux-mm, linux-kernel, linux-btrfs, linux-erofs, linux-ext4,
	linux-f2fs-devel, cluster-devel, ocfs2-devel, linux-xfs

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Use the new readahead operation in erofs

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/erofs/data.c              | 39 +++++++++++++-----------------------
 fs/erofs/zdata.c             |  2 +-
 include/trace/events/erofs.h |  6 +++---
 3 files changed, 18 insertions(+), 29 deletions(-)

diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index fc3a8d8064f8..82ebcee9d178 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -280,47 +280,36 @@ static int erofs_raw_access_readpage(struct file *file, struct page *page)
 	return 0;
 }
 
-static int erofs_raw_access_readpages(struct file *filp,
-				      struct address_space *mapping,
-				      struct list_head *pages,
-				      unsigned int nr_pages)
+static void erofs_raw_access_readahead(struct readahead_control *rac)
 {
 	erofs_off_t last_block;
 	struct bio *bio = NULL;
-	gfp_t gfp = readahead_gfp_mask(mapping);
-	struct page *page = list_last_entry(pages, struct page, lru);
-
-	trace_erofs_readpages(mapping->host, page, nr_pages, true);
+	struct page *page;
 
-	for (; nr_pages; --nr_pages) {
-		page = list_entry(pages->prev, struct page, lru);
+	trace_erofs_readpages(rac->mapping->host, readahead_index(rac),
+			readahead_count(rac), true);
 
+	readahead_for_each(rac, page) {
 		prefetchw(&page->flags);
-		list_del(&page->lru);
 
-		if (!add_to_page_cache_lru(page, mapping, page->index, gfp)) {
-			bio = erofs_read_raw_page(bio, mapping, page,
-						  &last_block, nr_pages, true);
+		bio = erofs_read_raw_page(bio, rac->mapping, page, &last_block,
+				readahead_count(rac), true);
 
-			/* all the page errors are ignored when readahead */
-			if (IS_ERR(bio)) {
-				pr_err("%s, readahead error at page %lu of nid %llu\n",
-				       __func__, page->index,
-				       EROFS_I(mapping->host)->nid);
+		/* all the page errors are ignored when readahead */
+		if (IS_ERR(bio)) {
+			pr_err("%s, readahead error at page %lu of nid %llu\n",
+			       __func__, page->index,
+			       EROFS_I(rac->mapping->host)->nid);
 
-				bio = NULL;
-			}
+			bio = NULL;
 		}
 
-		/* pages could still be locked */
 		put_page(page);
 	}
-	DBG_BUGON(!list_empty(pages));
 
 	/* the rare case (end in gaps) */
 	if (bio)
 		submit_bio(bio);
-	return 0;
 }
 
 static int erofs_get_block(struct inode *inode, sector_t iblock,
@@ -358,7 +347,7 @@ static sector_t erofs_bmap(struct address_space *mapping, sector_t block)
 /* for uncompressed (aligned) files and raw access for other files */
 const struct address_space_operations erofs_raw_access_aops = {
 	.readpage = erofs_raw_access_readpage,
-	.readpages = erofs_raw_access_readpages,
+	.readahead = erofs_raw_access_readahead,
 	.bmap = erofs_bmap,
 };
 
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
index 80e47f07d946..17f45fcb8c5c 100644
--- a/fs/erofs/zdata.c
+++ b/fs/erofs/zdata.c
@@ -1315,7 +1315,7 @@ static int z_erofs_readpages(struct file *filp, struct address_space *mapping,
 	struct page *head = NULL;
 	LIST_HEAD(pagepool);
 
-	trace_erofs_readpages(mapping->host, lru_to_page(pages),
+	trace_erofs_readpages(mapping->host, lru_to_page(pages)->index,
 			      nr_pages, false);
 
 	f.headoffset = (erofs_off_t)lru_to_page(pages)->index << PAGE_SHIFT;
diff --git a/include/trace/events/erofs.h b/include/trace/events/erofs.h
index 27f5caa6299a..bf9806fd1306 100644
--- a/include/trace/events/erofs.h
+++ b/include/trace/events/erofs.h
@@ -113,10 +113,10 @@ TRACE_EVENT(erofs_readpage,
 
 TRACE_EVENT(erofs_readpages,
 
-	TP_PROTO(struct inode *inode, struct page *page, unsigned int nrpage,
+	TP_PROTO(struct inode *inode, pgoff_t start, unsigned int nrpage,
 		bool raw),
 
-	TP_ARGS(inode, page, nrpage, raw),
+	TP_ARGS(inode, start, nrpage, raw),
 
 	TP_STRUCT__entry(
 		__field(dev_t,		dev	)
@@ -129,7 +129,7 @@ TRACE_EVENT(erofs_readpages,
 	TP_fast_assign(
 		__entry->dev	= inode->i_sb->s_dev;
 		__entry->nid	= EROFS_I(inode)->nid;
-		__entry->start	= page->index;
+		__entry->start	= start;
 		__entry->nrpage	= nrpage;
 		__entry->raw	= raw;
 	),
-- 
2.25.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v5 09/13] erofs: Convert compressed files from readpages to readahead
  2020-02-11  1:03 [PATCH v5 00/13] Change readahead API Matthew Wilcox
                   ` (7 preceding siblings ...)
  2020-02-11  1:03 ` [PATCH v5 08/13] erofs: Convert uncompressed files " Matthew Wilcox
@ 2020-02-11  1:03 ` " Matthew Wilcox
  2020-02-11  1:03 ` [PATCH v5 10/13] ext4: Convert " Matthew Wilcox
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox @ 2020-02-11  1:03 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Matthew Wilcox (Oracle),
	linux-mm, linux-kernel, linux-btrfs, linux-erofs, linux-ext4,
	linux-f2fs-devel, cluster-devel, ocfs2-devel, linux-xfs

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Use the new readahead operation in erofs.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/erofs/zdata.c | 29 +++++++++--------------------
 1 file changed, 9 insertions(+), 20 deletions(-)

diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
index 17f45fcb8c5c..7c02015d501d 100644
--- a/fs/erofs/zdata.c
+++ b/fs/erofs/zdata.c
@@ -1303,28 +1303,23 @@ static bool should_decompress_synchronously(struct erofs_sb_info *sbi,
 	return nr <= sbi->max_sync_decompress_pages;
 }
 
-static int z_erofs_readpages(struct file *filp, struct address_space *mapping,
-			     struct list_head *pages, unsigned int nr_pages)
+static void z_erofs_readahead(struct readahead_control *rac)
 {
-	struct inode *const inode = mapping->host;
+	struct inode *const inode = rac->mapping->host;
 	struct erofs_sb_info *const sbi = EROFS_I_SB(inode);
 
-	bool sync = should_decompress_synchronously(sbi, nr_pages);
+	bool sync = should_decompress_synchronously(sbi, readahead_count(rac));
 	struct z_erofs_decompress_frontend f = DECOMPRESS_FRONTEND_INIT(inode);
-	gfp_t gfp = mapping_gfp_constraint(mapping, GFP_KERNEL);
-	struct page *head = NULL;
+	struct page *page, *head = NULL;
 	LIST_HEAD(pagepool);
 
-	trace_erofs_readpages(mapping->host, lru_to_page(pages)->index,
-			      nr_pages, false);
+	trace_erofs_readpages(inode, readahead_index(rac),
+			readahead_count(rac), false);
 
-	f.headoffset = (erofs_off_t)lru_to_page(pages)->index << PAGE_SHIFT;
-
-	for (; nr_pages; --nr_pages) {
-		struct page *page = lru_to_page(pages);
+	f.headoffset = readahead_offset(rac);
 
+	readahead_for_each(rac, page) {
 		prefetchw(&page->flags);
-		list_del(&page->lru);
 
 		/*
 		 * A pure asynchronous readahead is indicated if
@@ -1333,11 +1328,6 @@ static int z_erofs_readpages(struct file *filp, struct address_space *mapping,
 		 */
 		sync &= !(PageReadahead(page) && !head);
 
-		if (add_to_page_cache_lru(page, mapping, page->index, gfp)) {
-			list_add(&page->lru, &pagepool);
-			continue;
-		}
-
 		set_page_private(page, (unsigned long)head);
 		head = page;
 	}
@@ -1366,11 +1356,10 @@ static int z_erofs_readpages(struct file *filp, struct address_space *mapping,
 
 	/* clean up the remaining free pages */
 	put_pages_list(&pagepool);
-	return 0;
 }
 
 const struct address_space_operations z_erofs_aops = {
 	.readpage = z_erofs_readpage,
-	.readpages = z_erofs_readpages,
+	.readahead = z_erofs_readahead,
 };
 
-- 
2.25.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v5 10/13] ext4: Convert from readpages to readahead
  2020-02-11  1:03 [PATCH v5 00/13] Change readahead API Matthew Wilcox
                   ` (8 preceding siblings ...)
  2020-02-11  1:03 ` [PATCH v5 09/13] erofs: Convert compressed " Matthew Wilcox
@ 2020-02-11  1:03 ` " Matthew Wilcox
  2020-02-11  1:03 ` [PATCH v5 11/13] f2fs: " Matthew Wilcox
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox @ 2020-02-11  1:03 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Matthew Wilcox (Oracle),
	linux-mm, linux-kernel, linux-btrfs, linux-erofs, linux-ext4,
	linux-f2fs-devel, cluster-devel, ocfs2-devel, linux-xfs

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Use the new readahead operation in ext4

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/ext4.h     |  3 +--
 fs/ext4/inode.c    | 23 ++++++++++-------------
 fs/ext4/readpage.c | 22 ++++++++--------------
 3 files changed, 19 insertions(+), 29 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 9a2ee2428ecc..3af755da101d 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3276,8 +3276,7 @@ static inline void ext4_set_de_type(struct super_block *sb,
 
 /* readpages.c */
 extern int ext4_mpage_readpages(struct address_space *mapping,
-				struct list_head *pages, struct page *page,
-				unsigned nr_pages, bool is_readahead);
+		struct readahead_control *rac, struct page *page);
 extern int __init ext4_init_post_read_processing(void);
 extern void ext4_exit_post_read_processing(void);
 
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 3313168b680f..f07bacb05df5 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3218,7 +3218,7 @@ static sector_t ext4_bmap(struct address_space *mapping, sector_t block)
 static int ext4_readpage(struct file *file, struct page *page)
 {
 	int ret = -EAGAIN;
-	struct inode *inode = page->mapping->host;
+	struct inode *inode = file_inode(file);
 
 	trace_ext4_readpage(page);
 
@@ -3226,23 +3226,20 @@ static int ext4_readpage(struct file *file, struct page *page)
 		ret = ext4_readpage_inline(inode, page);
 
 	if (ret == -EAGAIN)
-		return ext4_mpage_readpages(page->mapping, NULL, page, 1,
-						false);
+		return ext4_mpage_readpages(page->mapping, NULL, page);
 
 	return ret;
 }
 
-static int
-ext4_readpages(struct file *file, struct address_space *mapping,
-		struct list_head *pages, unsigned nr_pages)
+static void ext4_readahead(struct readahead_control *rac)
 {
-	struct inode *inode = mapping->host;
+	struct inode *inode = rac->mapping->host;
 
-	/* If the file has inline data, no need to do readpages. */
+	/* If the file has inline data, no need to do readahead. */
 	if (ext4_has_inline_data(inode))
-		return 0;
+		return;
 
-	return ext4_mpage_readpages(mapping, pages, NULL, nr_pages, true);
+	ext4_mpage_readpages(rac->mapping, rac, NULL);
 }
 
 static void ext4_invalidatepage(struct page *page, unsigned int offset,
@@ -3587,7 +3584,7 @@ static int ext4_set_page_dirty(struct page *page)
 
 static const struct address_space_operations ext4_aops = {
 	.readpage		= ext4_readpage,
-	.readpages		= ext4_readpages,
+	.readahead		= ext4_readahead,
 	.writepage		= ext4_writepage,
 	.writepages		= ext4_writepages,
 	.write_begin		= ext4_write_begin,
@@ -3604,7 +3601,7 @@ static const struct address_space_operations ext4_aops = {
 
 static const struct address_space_operations ext4_journalled_aops = {
 	.readpage		= ext4_readpage,
-	.readpages		= ext4_readpages,
+	.readahead		= ext4_readahead,
 	.writepage		= ext4_writepage,
 	.writepages		= ext4_writepages,
 	.write_begin		= ext4_write_begin,
@@ -3620,7 +3617,7 @@ static const struct address_space_operations ext4_journalled_aops = {
 
 static const struct address_space_operations ext4_da_aops = {
 	.readpage		= ext4_readpage,
-	.readpages		= ext4_readpages,
+	.readahead		= ext4_readahead,
 	.writepage		= ext4_writepage,
 	.writepages		= ext4_writepages,
 	.write_begin		= ext4_da_write_begin,
diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c
index c1769afbf799..e14841ade612 100644
--- a/fs/ext4/readpage.c
+++ b/fs/ext4/readpage.c
@@ -7,8 +7,8 @@
  *
  * This was originally taken from fs/mpage.c
  *
- * The intent is the ext4_mpage_readpages() function here is intended
- * to replace mpage_readpages() in the general case, not just for
+ * The ext4_mpage_readahead() function here is intended to
+ * replace mpage_readahead() in the general case, not just for
  * encrypted files.  It has some limitations (see below), where it
  * will fall back to read_block_full_page(), but these limitations
  * should only be hit when page_size != block_size.
@@ -222,8 +222,7 @@ static inline loff_t ext4_readpage_limit(struct inode *inode)
 }
 
 int ext4_mpage_readpages(struct address_space *mapping,
-			 struct list_head *pages, struct page *page,
-			 unsigned nr_pages, bool is_readahead)
+		struct readahead_control *rac, struct page *page)
 {
 	struct bio *bio = NULL;
 	sector_t last_block_in_bio = 0;
@@ -241,6 +240,7 @@ int ext4_mpage_readpages(struct address_space *mapping,
 	int length;
 	unsigned relative_block = 0;
 	struct ext4_map_blocks map;
+	unsigned int nr_pages = rac ? readahead_count(rac) : 1;
 
 	map.m_pblk = 0;
 	map.m_lblk = 0;
@@ -251,14 +251,9 @@ int ext4_mpage_readpages(struct address_space *mapping,
 		int fully_mapped = 1;
 		unsigned first_hole = blocks_per_page;
 
-		if (pages) {
-			page = lru_to_page(pages);
-
+		if (rac) {
+			page = readahead_page(rac);
 			prefetchw(&page->flags);
-			list_del(&page->lru);
-			if (add_to_page_cache_lru(page, mapping, page->index,
-				  readahead_gfp_mask(mapping)))
-				goto next_page;
 		}
 
 		if (page_has_buffers(page))
@@ -381,7 +376,7 @@ int ext4_mpage_readpages(struct address_space *mapping,
 			bio->bi_iter.bi_sector = blocks[0] << (blkbits - 9);
 			bio->bi_end_io = mpage_end_io;
 			bio_set_op_attrs(bio, REQ_OP_READ,
-						is_readahead ? REQ_RAHEAD : 0);
+						rac ? REQ_RAHEAD : 0);
 		}
 
 		length = first_hole << blkbits;
@@ -406,10 +401,9 @@ int ext4_mpage_readpages(struct address_space *mapping,
 		else
 			unlock_page(page);
 	next_page:
-		if (pages)
+		if (rac)
 			put_page(page);
 	}
-	BUG_ON(pages && !list_empty(pages));
 	if (bio)
 		submit_bio(bio);
 	return 0;
-- 
2.25.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v5 11/13] f2fs: Convert from readpages to readahead
  2020-02-11  1:03 [PATCH v5 00/13] Change readahead API Matthew Wilcox
                   ` (9 preceding siblings ...)
  2020-02-11  1:03 ` [PATCH v5 10/13] ext4: Convert " Matthew Wilcox
@ 2020-02-11  1:03 ` " Matthew Wilcox
  2020-02-11  1:03 ` [PATCH v5 12/13] fuse: " Matthew Wilcox
  2020-02-11  1:03 ` [PATCH v5 13/13] iomap: " Matthew Wilcox
  12 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox @ 2020-02-11  1:03 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Matthew Wilcox (Oracle),
	linux-mm, linux-kernel, linux-btrfs, linux-erofs, linux-ext4,
	linux-f2fs-devel, cluster-devel, ocfs2-devel, linux-xfs

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Use the new readahead operation in f2fs

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/f2fs/data.c              | 50 +++++++++++++++----------------------
 fs/f2fs/f2fs.h              |  5 ++--
 include/trace/events/f2fs.h |  6 ++---
 3 files changed, 25 insertions(+), 36 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index b27b72107911..87964e4cb6b8 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2159,13 +2159,11 @@ int f2fs_read_multi_pages(struct compress_ctx *cc, struct bio **bio_ret,
  * use ->readpage() or do the necessary surgery to decouple ->readpages()
  * from read-ahead.
  */
-int f2fs_mpage_readpages(struct address_space *mapping,
-			struct list_head *pages, struct page *page,
-			unsigned nr_pages, bool is_readahead)
+int f2fs_mpage_readpages(struct inode *inode, struct readahead_control *rac,
+		struct page *page)
 {
 	struct bio *bio = NULL;
 	sector_t last_block_in_bio = 0;
-	struct inode *inode = mapping->host;
 	struct f2fs_map_blocks map;
 #ifdef CONFIG_F2FS_FS_COMPRESSION
 	struct compress_ctx cc = {
@@ -2179,6 +2177,7 @@ int f2fs_mpage_readpages(struct address_space *mapping,
 		.nr_cpages = 0,
 	};
 #endif
+	unsigned nr_pages = rac ? readahead_count(rac) : 1;
 	unsigned max_nr_pages = nr_pages;
 	int ret = 0;
 
@@ -2192,15 +2191,9 @@ int f2fs_mpage_readpages(struct address_space *mapping,
 	map.m_may_create = false;
 
 	for (; nr_pages; nr_pages--) {
-		if (pages) {
-			page = list_last_entry(pages, struct page, lru);
-
+		if (rac) {
+			page = readahead_page(rac);
 			prefetchw(&page->flags);
-			list_del(&page->lru);
-			if (add_to_page_cache_lru(page, mapping,
-						  page_index(page),
-						  readahead_gfp_mask(mapping)))
-				goto next_page;
 		}
 
 #ifdef CONFIG_F2FS_FS_COMPRESSION
@@ -2210,7 +2203,7 @@ int f2fs_mpage_readpages(struct address_space *mapping,
 				ret = f2fs_read_multi_pages(&cc, &bio,
 							max_nr_pages,
 							&last_block_in_bio,
-							is_readahead);
+							rac);
 				f2fs_destroy_compress_ctx(&cc);
 				if (ret)
 					goto set_error_page;
@@ -2233,7 +2226,7 @@ int f2fs_mpage_readpages(struct address_space *mapping,
 #endif
 
 		ret = f2fs_read_single_page(inode, page, max_nr_pages, &map,
-					&bio, &last_block_in_bio, is_readahead);
+					&bio, &last_block_in_bio, rac);
 		if (ret) {
 #ifdef CONFIG_F2FS_FS_COMPRESSION
 set_error_page:
@@ -2242,8 +2235,10 @@ int f2fs_mpage_readpages(struct address_space *mapping,
 			zero_user_segment(page, 0, PAGE_SIZE);
 			unlock_page(page);
 		}
+#ifdef CONFIG_F2FS_FS_COMPRESSION
 next_page:
-		if (pages)
+#endif
+		if (rac)
 			put_page(page);
 
 #ifdef CONFIG_F2FS_FS_COMPRESSION
@@ -2253,16 +2248,15 @@ int f2fs_mpage_readpages(struct address_space *mapping,
 				ret = f2fs_read_multi_pages(&cc, &bio,
 							max_nr_pages,
 							&last_block_in_bio,
-							is_readahead);
+							rac);
 				f2fs_destroy_compress_ctx(&cc);
 			}
 		}
 #endif
 	}
-	BUG_ON(pages && !list_empty(pages));
 	if (bio)
 		__submit_bio(F2FS_I_SB(inode), bio, DATA);
-	return pages ? 0 : ret;
+	return ret;
 }
 
 static int f2fs_read_data_page(struct file *file, struct page *page)
@@ -2281,28 +2275,24 @@ static int f2fs_read_data_page(struct file *file, struct page *page)
 	if (f2fs_has_inline_data(inode))
 		ret = f2fs_read_inline_data(inode, page);
 	if (ret == -EAGAIN)
-		ret = f2fs_mpage_readpages(page_file_mapping(page),
-						NULL, page, 1, false);
+		ret = f2fs_mpage_readpages(inode, NULL, page);
 	return ret;
 }
 
-static int f2fs_read_data_pages(struct file *file,
-			struct address_space *mapping,
-			struct list_head *pages, unsigned nr_pages)
+static void f2fs_readahead(struct readahead_control *rac)
 {
-	struct inode *inode = mapping->host;
-	struct page *page = list_last_entry(pages, struct page, lru);
+	struct inode *inode = rac->mapping->host;
 
-	trace_f2fs_readpages(inode, page, nr_pages);
+	trace_f2fs_readpages(inode, readahead_index(rac), readahead_count(rac));
 
 	if (!f2fs_is_compress_backend_ready(inode))
-		return 0;
+		return;
 
 	/* If the file has inline data, skip readpages */
 	if (f2fs_has_inline_data(inode))
-		return 0;
+		return;
 
-	return f2fs_mpage_readpages(mapping, pages, NULL, nr_pages, true);
+	f2fs_mpage_readpages(inode, rac, NULL);
 }
 
 int f2fs_encrypt_one_page(struct f2fs_io_info *fio)
@@ -3784,7 +3774,7 @@ static void f2fs_swap_deactivate(struct file *file)
 
 const struct address_space_operations f2fs_dblock_aops = {
 	.readpage	= f2fs_read_data_page,
-	.readpages	= f2fs_read_data_pages,
+	.readahead	= f2fs_readahead,
 	.writepage	= f2fs_write_data_page,
 	.writepages	= f2fs_write_data_pages,
 	.write_begin	= f2fs_write_begin,
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 5355be6b6755..b5e72dee8826 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -3344,9 +3344,8 @@ int f2fs_reserve_new_block(struct dnode_of_data *dn);
 int f2fs_get_block(struct dnode_of_data *dn, pgoff_t index);
 int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from);
 int f2fs_reserve_block(struct dnode_of_data *dn, pgoff_t index);
-int f2fs_mpage_readpages(struct address_space *mapping,
-			struct list_head *pages, struct page *page,
-			unsigned nr_pages, bool is_readahead);
+int f2fs_mpage_readpages(struct inode *inode, struct readahead_control *rac,
+		struct page *page);
 struct page *f2fs_get_read_data_page(struct inode *inode, pgoff_t index,
 			int op_flags, bool for_write);
 struct page *f2fs_find_data_page(struct inode *inode, pgoff_t index);
diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h
index 67a97838c2a0..d72da4a33883 100644
--- a/include/trace/events/f2fs.h
+++ b/include/trace/events/f2fs.h
@@ -1375,9 +1375,9 @@ TRACE_EVENT(f2fs_writepages,
 
 TRACE_EVENT(f2fs_readpages,
 
-	TP_PROTO(struct inode *inode, struct page *page, unsigned int nrpage),
+	TP_PROTO(struct inode *inode, pgoff_t start, unsigned int nrpage),
 
-	TP_ARGS(inode, page, nrpage),
+	TP_ARGS(inode, start, nrpage),
 
 	TP_STRUCT__entry(
 		__field(dev_t,	dev)
@@ -1389,7 +1389,7 @@ TRACE_EVENT(f2fs_readpages,
 	TP_fast_assign(
 		__entry->dev	= inode->i_sb->s_dev;
 		__entry->ino	= inode->i_ino;
-		__entry->start	= page->index;
+		__entry->start	= start;
 		__entry->nrpage	= nrpage;
 	),
 
-- 
2.25.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v5 12/13] fuse: Convert from readpages to readahead
  2020-02-11  1:03 [PATCH v5 00/13] Change readahead API Matthew Wilcox
                   ` (10 preceding siblings ...)
  2020-02-11  1:03 ` [PATCH v5 11/13] f2fs: " Matthew Wilcox
@ 2020-02-11  1:03 ` " Matthew Wilcox
  2020-02-11  1:03 ` [PATCH v5 13/13] iomap: " Matthew Wilcox
  12 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox @ 2020-02-11  1:03 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Matthew Wilcox (Oracle),
	linux-mm, linux-kernel, linux-btrfs, linux-erofs, linux-ext4,
	linux-f2fs-devel, cluster-devel, ocfs2-devel, linux-xfs

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Use the new readahead operation in fuse.  Switching away from the
read_cache_pages() helper gets rid of an implicit call to put_page(),
so we can get rid of the get_page() call in fuse_readpages_fill().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/fuse/file.c | 46 +++++++++++++++++++---------------------------
 1 file changed, 19 insertions(+), 27 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 9d67b830fb7a..f64f98708b5e 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -923,9 +923,8 @@ struct fuse_fill_data {
 	unsigned int max_pages;
 };
 
-static int fuse_readpages_fill(void *_data, struct page *page)
+static int fuse_readpages_fill(struct fuse_fill_data *data, struct page *page)
 {
-	struct fuse_fill_data *data = _data;
 	struct fuse_io_args *ia = data->ia;
 	struct fuse_args_pages *ap = &ia->ap;
 	struct inode *inode = data->inode;
@@ -941,10 +940,8 @@ static int fuse_readpages_fill(void *_data, struct page *page)
 					fc->max_pages);
 		fuse_send_readpages(ia, data->file);
 		data->ia = ia = fuse_io_alloc(NULL, data->max_pages);
-		if (!ia) {
-			unlock_page(page);
+		if (!ia)
 			return -ENOMEM;
-		}
 		ap = &ia->ap;
 	}
 
@@ -954,7 +951,6 @@ static int fuse_readpages_fill(void *_data, struct page *page)
 		return -EIO;
 	}
 
-	get_page(page);
 	ap->pages[ap->num_pages] = page;
 	ap->descs[ap->num_pages].length = PAGE_SIZE;
 	ap->num_pages++;
@@ -962,37 +958,33 @@ static int fuse_readpages_fill(void *_data, struct page *page)
 	return 0;
 }
 
-static int fuse_readpages(struct file *file, struct address_space *mapping,
-			  struct list_head *pages, unsigned nr_pages)
+static void fuse_readahead(struct readahead_control *rac)
 {
-	struct inode *inode = mapping->host;
+	struct inode *inode = rac->mapping->host;
 	struct fuse_conn *fc = get_fuse_conn(inode);
 	struct fuse_fill_data data;
-	int err;
+	struct page *page;
 
-	err = -EIO;
 	if (is_bad_inode(inode))
-		goto out;
+		return;
 
-	data.file = file;
+	data.file = rac->file;
 	data.inode = inode;
-	data.nr_pages = nr_pages;
-	data.max_pages = min_t(unsigned int, nr_pages, fc->max_pages);
-;
+	data.nr_pages = readahead_count(rac);
+	data.max_pages = min_t(unsigned int, data.nr_pages, fc->max_pages);
 	data.ia = fuse_io_alloc(NULL, data.max_pages);
-	err = -ENOMEM;
 	if (!data.ia)
-		goto out;
+		return;
 
-	err = read_cache_pages(mapping, pages, fuse_readpages_fill, &data);
-	if (!err) {
-		if (data.ia->ap.num_pages)
-			fuse_send_readpages(data.ia, file);
-		else
-			fuse_io_free(data.ia);
+	readahead_for_each(rac, page) {
+		if (fuse_readpages_fill(&data, page) != 0)
+			return;
 	}
-out:
-	return err;
+
+	if (data.ia->ap.num_pages)
+		fuse_send_readpages(data.ia, rac->file);
+	else
+		fuse_io_free(data.ia);
 }
 
 static ssize_t fuse_cache_read_iter(struct kiocb *iocb, struct iov_iter *to)
@@ -3373,10 +3365,10 @@ static const struct file_operations fuse_file_operations = {
 
 static const struct address_space_operations fuse_file_aops  = {
 	.readpage	= fuse_readpage,
+	.readahead	= fuse_readahead,
 	.writepage	= fuse_writepage,
 	.writepages	= fuse_writepages,
 	.launder_page	= fuse_launder_page,
-	.readpages	= fuse_readpages,
 	.set_page_dirty	= __set_page_dirty_nobuffers,
 	.bmap		= fuse_bmap,
 	.direct_IO	= fuse_direct_IO,
-- 
2.25.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v5 13/13] iomap: Convert from readpages to readahead
  2020-02-11  1:03 [PATCH v5 00/13] Change readahead API Matthew Wilcox
                   ` (11 preceding siblings ...)
  2020-02-11  1:03 ` [PATCH v5 12/13] fuse: " Matthew Wilcox
@ 2020-02-11  1:03 ` " Matthew Wilcox
  12 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox @ 2020-02-11  1:03 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Matthew Wilcox (Oracle),
	linux-mm, linux-kernel, linux-btrfs, linux-erofs, linux-ext4,
	linux-f2fs-devel, cluster-devel, ocfs2-devel, linux-xfs

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Use the new readahead operation in iomap.  Convert XFS and ZoneFS to
use it.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 101 ++++++++++++-----------------------------
 fs/iomap/trace.h       |   2 +-
 fs/xfs/xfs_aops.c      |  13 ++----
 fs/zonefs/super.c      |   7 ++-
 include/linux/iomap.h  |   3 +-
 5 files changed, 39 insertions(+), 87 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index cb3511eb152a..e40eb45230fa 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -214,9 +214,8 @@ iomap_read_end_io(struct bio *bio)
 struct iomap_readpage_ctx {
 	struct page		*cur_page;
 	bool			cur_page_in_bio;
-	bool			is_readahead;
 	struct bio		*bio;
-	struct list_head	*pages;
+	struct readahead_control *rac;
 };
 
 static void
@@ -307,11 +306,11 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 		if (ctx->bio)
 			submit_bio(ctx->bio);
 
-		if (ctx->is_readahead) /* same as readahead_gfp_mask */
+		if (ctx->rac) /* same as readahead_gfp_mask */
 			gfp |= __GFP_NORETRY | __GFP_NOWARN;
 		ctx->bio = bio_alloc(gfp, min(BIO_MAX_PAGES, nr_vecs));
 		ctx->bio->bi_opf = REQ_OP_READ;
-		if (ctx->is_readahead)
+		if (ctx->rac)
 			ctx->bio->bi_opf |= REQ_RAHEAD;
 		ctx->bio->bi_iter.bi_sector = sector;
 		bio_set_dev(ctx->bio, iomap->bdev);
@@ -367,104 +366,62 @@ iomap_readpage(struct page *page, const struct iomap_ops *ops)
 }
 EXPORT_SYMBOL_GPL(iomap_readpage);
 
-static struct page *
-iomap_next_page(struct inode *inode, struct list_head *pages, loff_t pos,
-		loff_t length, loff_t *done)
-{
-	while (!list_empty(pages)) {
-		struct page *page = lru_to_page(pages);
-
-		if (page_offset(page) >= (u64)pos + length)
-			break;
-
-		list_del(&page->lru);
-		if (!add_to_page_cache_lru(page, inode->i_mapping, page->index,
-				GFP_NOFS))
-			return page;
-
-		/*
-		 * If we already have a page in the page cache at index we are
-		 * done.  Upper layers don't care if it is uptodate after the
-		 * readpages call itself as every page gets checked again once
-		 * actually needed.
-		 */
-		*done += PAGE_SIZE;
-		put_page(page);
-	}
-
-	return NULL;
-}
-
 static loff_t
-iomap_readpages_actor(struct inode *inode, loff_t pos, loff_t length,
+iomap_readahead_actor(struct inode *inode, loff_t pos, loff_t length,
 		void *data, struct iomap *iomap, struct iomap *srcmap)
 {
 	struct iomap_readpage_ctx *ctx = data;
-	loff_t done, ret;
+	loff_t ret, done = 0;
 
-	for (done = 0; done < length; done += ret) {
-		if (ctx->cur_page && offset_in_page(pos + done) == 0) {
-			if (!ctx->cur_page_in_bio)
-				unlock_page(ctx->cur_page);
-			put_page(ctx->cur_page);
-			ctx->cur_page = NULL;
-		}
+	while (done < length) {
 		if (!ctx->cur_page) {
-			ctx->cur_page = iomap_next_page(inode, ctx->pages,
-					pos, length, &done);
-			if (!ctx->cur_page)
-				break;
+			ctx->cur_page = readahead_page(ctx->rac);
 			ctx->cur_page_in_bio = false;
 		}
 		ret = iomap_readpage_actor(inode, pos + done, length - done,
 				ctx, iomap, srcmap);
+		if (WARN_ON(ret == 0))
+			break;
+		done += ret;
+		if (offset_in_page(pos + done) == 0) {
+			ctx->rac->nr_pages -= ctx->rac->batch_count;
+			if (!ctx->cur_page_in_bio)
+				unlock_page(ctx->cur_page);
+			put_page(ctx->cur_page);
+			ctx->cur_page = NULL;
+		}
 	}
 
 	return done;
 }
 
-int
-iomap_readpages(struct address_space *mapping, struct list_head *pages,
-		unsigned nr_pages, const struct iomap_ops *ops)
+void iomap_readahead(struct readahead_control *rac, const struct iomap_ops *ops)
 {
+	struct inode *inode = rac->mapping->host;
 	struct iomap_readpage_ctx ctx = {
-		.pages		= pages,
-		.is_readahead	= true,
+		.rac	= rac,
 	};
-	loff_t pos = page_offset(list_entry(pages->prev, struct page, lru));
-	loff_t last = page_offset(list_entry(pages->next, struct page, lru));
-	loff_t length = last - pos + PAGE_SIZE, ret = 0;
+	loff_t pos = readahead_offset(rac);
+	loff_t length = readahead_length(rac);
 
-	trace_iomap_readpages(mapping->host, nr_pages);
+	trace_iomap_readahead(inode, readahead_count(rac));
 
 	while (length > 0) {
-		ret = iomap_apply(mapping->host, pos, length, 0, ops,
-				&ctx, iomap_readpages_actor);
+		loff_t ret = iomap_apply(inode, pos, length, 0, ops,
+				&ctx, iomap_readahead_actor);
 		if (ret <= 0) {
 			WARN_ON_ONCE(ret == 0);
-			goto done;
+			break;
 		}
 		pos += ret;
 		length -= ret;
 	}
-	ret = 0;
-done:
+
 	if (ctx.bio)
 		submit_bio(ctx.bio);
-	if (ctx.cur_page) {
-		if (!ctx.cur_page_in_bio)
-			unlock_page(ctx.cur_page);
-		put_page(ctx.cur_page);
-	}
-
-	/*
-	 * Check that we didn't lose a page due to the arcance calling
-	 * conventions..
-	 */
-	WARN_ON_ONCE(!ret && !list_empty(ctx.pages));
-	return ret;
+	BUG_ON(ctx.cur_page);
 }
-EXPORT_SYMBOL_GPL(iomap_readpages);
+EXPORT_SYMBOL_GPL(iomap_readahead);
 
 /*
  * iomap_is_partially_uptodate checks whether blocks within a page are
diff --git a/fs/iomap/trace.h b/fs/iomap/trace.h
index 6dc227b8c47e..d6ba705f938a 100644
--- a/fs/iomap/trace.h
+++ b/fs/iomap/trace.h
@@ -39,7 +39,7 @@ DEFINE_EVENT(iomap_readpage_class, name,	\
 	TP_PROTO(struct inode *inode, int nr_pages), \
 	TP_ARGS(inode, nr_pages))
 DEFINE_READPAGE_EVENT(iomap_readpage);
-DEFINE_READPAGE_EVENT(iomap_readpages);
+DEFINE_READPAGE_EVENT(iomap_readahead);
 
 DECLARE_EVENT_CLASS(iomap_page_class,
 	TP_PROTO(struct inode *inode, struct page *page, unsigned long off,
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 3a688eb5c5ae..0897dd71c622 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -621,14 +621,11 @@ xfs_vm_readpage(
 	return iomap_readpage(page, &xfs_read_iomap_ops);
 }
 
-STATIC int
-xfs_vm_readpages(
-	struct file		*unused,
-	struct address_space	*mapping,
-	struct list_head	*pages,
-	unsigned		nr_pages)
+STATIC void
+xfs_vm_readahead(
+	struct readahead_control	*rac)
 {
-	return iomap_readpages(mapping, pages, nr_pages, &xfs_read_iomap_ops);
+	iomap_readahead(rac, &xfs_read_iomap_ops);
 }
 
 static int
@@ -644,7 +641,7 @@ xfs_iomap_swapfile_activate(
 
 const struct address_space_operations xfs_address_space_operations = {
 	.readpage		= xfs_vm_readpage,
-	.readpages		= xfs_vm_readpages,
+	.readahead		= xfs_vm_readahead,
 	.writepage		= xfs_vm_writepage,
 	.writepages		= xfs_vm_writepages,
 	.set_page_dirty		= iomap_set_page_dirty,
diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
index 8bc6ef82d693..8327a01d3bac 100644
--- a/fs/zonefs/super.c
+++ b/fs/zonefs/super.c
@@ -78,10 +78,9 @@ static int zonefs_readpage(struct file *unused, struct page *page)
 	return iomap_readpage(page, &zonefs_iomap_ops);
 }
 
-static int zonefs_readpages(struct file *unused, struct address_space *mapping,
-			    struct list_head *pages, unsigned int nr_pages)
+static void zonefs_readahead(struct readahead_control *rac)
 {
-	return iomap_readpages(mapping, pages, nr_pages, &zonefs_iomap_ops);
+	iomap_readahead(rac, &zonefs_iomap_ops);
 }
 
 /*
@@ -128,7 +127,7 @@ static int zonefs_writepages(struct address_space *mapping,
 
 static const struct address_space_operations zonefs_file_aops = {
 	.readpage		= zonefs_readpage,
-	.readpages		= zonefs_readpages,
+	.readahead		= zonefs_readahead,
 	.writepage		= zonefs_writepage,
 	.writepages		= zonefs_writepages,
 	.set_page_dirty		= iomap_set_page_dirty,
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 8b09463dae0d..bc20bd04c2a2 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -155,8 +155,7 @@ loff_t iomap_apply(struct inode *inode, loff_t pos, loff_t length,
 ssize_t iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *from,
 		const struct iomap_ops *ops);
 int iomap_readpage(struct page *page, const struct iomap_ops *ops);
-int iomap_readpages(struct address_space *mapping, struct list_head *pages,
-		unsigned nr_pages, const struct iomap_ops *ops);
+void iomap_readahead(struct readahead_control *, const struct iomap_ops *ops);
 int iomap_set_page_dirty(struct page *page);
 int iomap_is_partially_uptodate(struct page *page, unsigned long from,
 		unsigned long count);
-- 
2.25.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 04/13] mm: Add readahead address space operation
  2020-02-11  1:03 ` [PATCH v5 04/13] mm: Add readahead address space operation Matthew Wilcox
@ 2020-02-11  4:52   ` Dave Chinner
  2020-02-11 12:54     ` Matthew Wilcox
  2020-02-12 18:18   ` Christoph Hellwig
  2020-02-14  5:36   ` John Hubbard
  2 siblings, 1 reply; 31+ messages in thread
From: Dave Chinner @ 2020-02-11  4:52 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-fsdevel, linux-mm, linux-kernel, linux-btrfs, linux-erofs,
	linux-ext4, linux-f2fs-devel, cluster-devel, ocfs2-devel,
	linux-xfs

On Mon, Feb 10, 2020 at 05:03:39PM -0800, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> This replaces ->readpages with a saner interface:
>  - Return void instead of an ignored error code.
>  - Pages are already in the page cache when ->readahead is called.
>  - Implementation looks up the pages in the page cache instead of
>    having them passed in a linked list.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

....

>  
> +/*
> + * Readahead is of a block of consecutive pages.
> + */
> +struct readahead_control {
> +	struct file *file;
> +	struct address_space *mapping;
> +/* private: use the readahead_* accessors instead */
> +	pgoff_t start;
> +	unsigned int nr_pages;
> +	unsigned int batch_count;
> +};
> +
> +static inline struct page *readahead_page(struct readahead_control *rac)
> +{
> +	struct page *page;
> +
> +	if (!rac->nr_pages)
> +		return NULL;
> +
> +	page = xa_load(&rac->mapping->i_pages, rac->start);
> +	VM_BUG_ON_PAGE(!PageLocked(page), page);
> +	rac->batch_count = hpage_nr_pages(page);
> +	rac->start += rac->batch_count;

There's no mention of large page support in the patch description
and I don't recall this sort of large page batching in previous
iterations.

This seems like new functionality to me, not directly related to
the initial ->readahead API change? What have I missed?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 01/13] mm: Fix the return type of __do_page_cache_readahead
  2020-02-11  1:03 ` [PATCH v5 01/13] mm: Fix the return type of __do_page_cache_readahead Matthew Wilcox
@ 2020-02-11  8:19   ` Johannes Thumshirn
  2020-02-11 12:34     ` Matthew Wilcox
  2020-02-12 18:13   ` Christoph Hellwig
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 31+ messages in thread
From: Johannes Thumshirn @ 2020-02-11  8:19 UTC (permalink / raw)
  To: Matthew Wilcox, linux-fsdevel
  Cc: linux-mm, linux-kernel, linux-btrfs, linux-erofs, linux-ext4,
	linux-f2fs-devel, cluster-devel, ocfs2-devel, linux-xfs

On 11/02/2020 02:05, Matthew Wilcox wrote:
> even though I'm pretty sure we're not going to readahead more than 2^32
> pages ever.

And 640K is more memory than anyone will ever need on a computer *scnr*


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 01/13] mm: Fix the return type of __do_page_cache_readahead
  2020-02-11  8:19   ` Johannes Thumshirn
@ 2020-02-11 12:34     ` Matthew Wilcox
  0 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox @ 2020-02-11 12:34 UTC (permalink / raw)
  To: Johannes Thumshirn
  Cc: linux-fsdevel, linux-mm, linux-kernel, linux-btrfs, linux-erofs,
	linux-ext4, linux-f2fs-devel, cluster-devel, ocfs2-devel,
	linux-xfs

On Tue, Feb 11, 2020 at 08:19:14AM +0000, Johannes Thumshirn wrote:
> On 11/02/2020 02:05, Matthew Wilcox wrote:
> > even though I'm pretty sure we're not going to readahead more than 2^32
> > pages ever.
> 
> And 640K is more memory than anyone will ever need on a computer *scnr*

Sure, but bandwidth just isn't increasing quickly enough to have
this make sense.  2^32 pages even on our smallest page size machines
is 16GB.  Right now, we cap readahead at just 256kB.  If we did try to
readahead 16GB, we'd be occupying a PCIe gen4 x4 drive for two seconds,
just satisfying this one readahead.  PCIe has historically doubled in
bandwidth every three years or so, so to get this down to something
reasonable like a hundredth of a second, we're looking at PCIe gen12 in
twenty years or so.  And I bet we still won't do it (also, I doubt PCIe
will continue doubling bandwidth every three years).

And Linus has forbidden individual IOs over 2GB anyway, so not happening
until he's forced to see the error of his ways ;-)

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 04/13] mm: Add readahead address space operation
  2020-02-11  4:52   ` Dave Chinner
@ 2020-02-11 12:54     ` Matthew Wilcox
  2020-02-11 20:08       ` Dave Chinner
  0 siblings, 1 reply; 31+ messages in thread
From: Matthew Wilcox @ 2020-02-11 12:54 UTC (permalink / raw)
  To: Dave Chinner
  Cc: linux-fsdevel, linux-mm, linux-kernel, linux-btrfs, linux-erofs,
	linux-ext4, linux-f2fs-devel, cluster-devel, ocfs2-devel,
	linux-xfs

On Tue, Feb 11, 2020 at 03:52:30PM +1100, Dave Chinner wrote:
> > +struct readahead_control {
> > +	struct file *file;
> > +	struct address_space *mapping;
> > +/* private: use the readahead_* accessors instead */
> > +	pgoff_t start;
> > +	unsigned int nr_pages;
> > +	unsigned int batch_count;
> > +};
> > +
> > +static inline struct page *readahead_page(struct readahead_control *rac)
> > +{
> > +	struct page *page;
> > +
> > +	if (!rac->nr_pages)
> > +		return NULL;
> > +
> > +	page = xa_load(&rac->mapping->i_pages, rac->start);
> > +	VM_BUG_ON_PAGE(!PageLocked(page), page);
> > +	rac->batch_count = hpage_nr_pages(page);
> > +	rac->start += rac->batch_count;
> 
> There's no mention of large page support in the patch description
> and I don't recall this sort of large page batching in previous
> iterations.
> 
> This seems like new functionality to me, not directly related to
> the initial ->readahead API change? What have I missed?

I had a crisis of confidence when I was working on this -- the loop
originally looked like this:

#define readahead_for_each(rac, page)                                   \
        for (; (page = readahead_page(rac)); rac->nr_pages--)

and then I started thinking about what I'd need to do to support large
pages, and that turned into

#define readahead_for_each(rac, page)                                   \
        for (; (page = readahead_page(rac));				\
		rac->nr_pages -= hpage_nr_pages(page))

but I realised that was potentially a use-after-free because 'page' has
certainly had put_page() called on it by then.  I had a brief period
where I looked at moving put_page() away from being the filesystem's
responsibility and into the iterator, but that would introduce more
changes into the patchset, as well as causing problems for filesystems
that want to break out of the loop.

By this point, I was also looking at the readahead_for_each_batch()
iterator that btrfs uses, and so we have the batch count anyway, and we
might as well use it to store the number of subpages of the large page.
And so it became easier to just put the whole ball of wax into the initial
patch set, rather than introduce the iterator now and then fix it up in
the patch set that I'm basing on this.

So yes, there's a certain amount of excess functionality in this patch
set ... I can remove it for the next release.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 04/13] mm: Add readahead address space operation
  2020-02-11 12:54     ` Matthew Wilcox
@ 2020-02-11 20:08       ` Dave Chinner
  0 siblings, 0 replies; 31+ messages in thread
From: Dave Chinner @ 2020-02-11 20:08 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-fsdevel, linux-mm, linux-kernel, linux-btrfs, linux-erofs,
	linux-ext4, linux-f2fs-devel, cluster-devel, ocfs2-devel,
	linux-xfs

On Tue, Feb 11, 2020 at 04:54:13AM -0800, Matthew Wilcox wrote:
> On Tue, Feb 11, 2020 at 03:52:30PM +1100, Dave Chinner wrote:
> > > +struct readahead_control {
> > > +	struct file *file;
> > > +	struct address_space *mapping;
> > > +/* private: use the readahead_* accessors instead */
> > > +	pgoff_t start;
> > > +	unsigned int nr_pages;
> > > +	unsigned int batch_count;
> > > +};
> > > +
> > > +static inline struct page *readahead_page(struct readahead_control *rac)
> > > +{
> > > +	struct page *page;
> > > +
> > > +	if (!rac->nr_pages)
> > > +		return NULL;
> > > +
> > > +	page = xa_load(&rac->mapping->i_pages, rac->start);
> > > +	VM_BUG_ON_PAGE(!PageLocked(page), page);
> > > +	rac->batch_count = hpage_nr_pages(page);
> > > +	rac->start += rac->batch_count;
> > 
> > There's no mention of large page support in the patch description
> > and I don't recall this sort of large page batching in previous
> > iterations.
> > 
> > This seems like new functionality to me, not directly related to
> > the initial ->readahead API change? What have I missed?
> 
> I had a crisis of confidence when I was working on this -- the loop
> originally looked like this:
> 
> #define readahead_for_each(rac, page)                                   \
>         for (; (page = readahead_page(rac)); rac->nr_pages--)
> 
> and then I started thinking about what I'd need to do to support large
> pages, and that turned into
> 
> #define readahead_for_each(rac, page)                                   \
>         for (; (page = readahead_page(rac));				\
> 		rac->nr_pages -= hpage_nr_pages(page))
> 
> but I realised that was potentially a use-after-free because 'page' has
> certainly had put_page() called on it by then.  I had a brief period
> where I looked at moving put_page() away from being the filesystem's
> responsibility and into the iterator, but that would introduce more
> changes into the patchset, as well as causing problems for filesystems
> that want to break out of the loop.
> 
> By this point, I was also looking at the readahead_for_each_batch()
> iterator that btrfs uses, and so we have the batch count anyway, and we
> might as well use it to store the number of subpages of the large page.
> And so it became easier to just put the whole ball of wax into the initial
> patch set, rather than introduce the iterator now and then fix it up in
> the patch set that I'm basing on this.
> 
> So yes, there's a certain amount of excess functionality in this patch
> set ... I can remove it for the next release.

I'd say "Just document it" as that was the main reason I noticed it.
Or perhaps add the batching function as a stand-alone patch so it's
clear that the batch interface solves two problems at once - large
pages and the btrfs page batching implementation...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 01/13] mm: Fix the return type of __do_page_cache_readahead
  2020-02-11  1:03 ` [PATCH v5 01/13] mm: Fix the return type of __do_page_cache_readahead Matthew Wilcox
  2020-02-11  8:19   ` Johannes Thumshirn
@ 2020-02-12 18:13   ` Christoph Hellwig
  2020-02-14  3:19   ` John Hubbard
  2020-02-14 19:50   ` Matthew Wilcox
  3 siblings, 0 replies; 31+ messages in thread
From: Christoph Hellwig @ 2020-02-12 18:13 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-fsdevel, linux-mm, linux-kernel, linux-btrfs, linux-erofs,
	linux-ext4, linux-f2fs-devel, cluster-devel, ocfs2-devel,
	linux-xfs

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 02/13] mm: Ignore return value of ->readpages
  2020-02-11  1:03 ` [PATCH v5 02/13] mm: Ignore return value of ->readpages Matthew Wilcox
@ 2020-02-12 18:13   ` Christoph Hellwig
  0 siblings, 0 replies; 31+ messages in thread
From: Christoph Hellwig @ 2020-02-12 18:13 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-fsdevel, linux-mm, linux-kernel, linux-btrfs, linux-erofs,
	linux-ext4, linux-f2fs-devel, cluster-devel, ocfs2-devel,
	linux-xfs

On Mon, Feb 10, 2020 at 05:03:37PM -0800, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> We used to assign the return value to a variable, which we then ignored.
> Remove the pretence of caring.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 04/13] mm: Add readahead address space operation
  2020-02-11  1:03 ` [PATCH v5 04/13] mm: Add readahead address space operation Matthew Wilcox
  2020-02-11  4:52   ` Dave Chinner
@ 2020-02-12 18:18   ` Christoph Hellwig
  2020-02-14  5:36   ` John Hubbard
  2 siblings, 0 replies; 31+ messages in thread
From: Christoph Hellwig @ 2020-02-12 18:18 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-fsdevel, linux-mm, linux-kernel, linux-btrfs, linux-erofs,
	linux-ext4, linux-f2fs-devel, cluster-devel, ocfs2-devel,
	linux-xfs

On Mon, Feb 10, 2020 at 05:03:39PM -0800, Matthew Wilcox wrote:
> +struct readahead_control {
> +	struct file *file;
> +	struct address_space *mapping;
> +/* private: use the readahead_* accessors instead */
> +	pgoff_t start;
> +	unsigned int nr_pages;
> +	unsigned int batch_count;

We often use __ prefixes for the private fields to make that a little
more clear.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 06/13] fs: Convert mpage_readpages to mpage_readahead
  2020-02-11  1:03 ` [PATCH v5 06/13] fs: Convert mpage_readpages to mpage_readahead Matthew Wilcox
@ 2020-02-13 22:09   ` Junxiao Bi
  0 siblings, 0 replies; 31+ messages in thread
From: Junxiao Bi @ 2020-02-13 22:09 UTC (permalink / raw)
  To: Matthew Wilcox, linux-fsdevel
  Cc: linux-mm, linux-kernel, linux-btrfs, linux-erofs, linux-ext4,
	linux-f2fs-devel, cluster-devel, ocfs2-devel, linux-xfs


On 2/10/20 5:03 PM, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
>
> Implement the new readahead aop and convert all callers (block_dev,
> exfat, ext2, fat, gfs2, hpfs, isofs, jfs, nilfs2, ocfs2, omfs, qnx6,
> reiserfs & udf).  The callers are all trivial except for GFS2 & OCFS2.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>   drivers/staging/exfat/exfat_super.c |  7 +++---
>   fs/block_dev.c                      |  7 +++---
>   fs/ext2/inode.c                     | 10 +++-----
>   fs/fat/inode.c                      |  7 +++---
>   fs/gfs2/aops.c                      | 23 ++++++-----------
>   fs/hpfs/file.c                      |  7 +++---
>   fs/iomap/buffered-io.c              |  2 +-
>   fs/isofs/inode.c                    |  7 +++---
>   fs/jfs/inode.c                      |  7 +++---
>   fs/mpage.c                          | 38 +++++++++--------------------
>   fs/nilfs2/inode.c                   | 15 +++---------
>   fs/ocfs2/aops.c                     | 34 ++++++++++----------------
>   fs/omfs/file.c                      |  7 +++---
>   fs/qnx6/inode.c                     |  7 +++---
>   fs/reiserfs/inode.c                 |  8 +++---
>   fs/udf/inode.c                      |  7 +++---
>   include/linux/mpage.h               |  4 +--
>   mm/migrate.c                        |  2 +-
>   18 files changed, 74 insertions(+), 125 deletions(-)
>
> diff --git a/drivers/staging/exfat/exfat_super.c b/drivers/staging/exfat/exfat_super.c
> index b81d2a87b82e..96aad9b16d31 100644
> --- a/drivers/staging/exfat/exfat_super.c
> +++ b/drivers/staging/exfat/exfat_super.c
> @@ -3002,10 +3002,9 @@ static int exfat_readpage(struct file *file, struct page *page)
>   	return  mpage_readpage(page, exfat_get_block);
>   }
>   
> -static int exfat_readpages(struct file *file, struct address_space *mapping,
> -			   struct list_head *pages, unsigned int nr_pages)
> +static void exfat_readahead(struct readahead_control *rac)
>   {
> -	return  mpage_readpages(mapping, pages, nr_pages, exfat_get_block);
> +	mpage_readahead(rac, exfat_get_block);
>   }
>   
>   static int exfat_writepage(struct page *page, struct writeback_control *wbc)
> @@ -3104,7 +3103,7 @@ static sector_t _exfat_bmap(struct address_space *mapping, sector_t block)
>   
>   static const struct address_space_operations exfat_aops = {
>   	.readpage    = exfat_readpage,
> -	.readpages   = exfat_readpages,
> +	.readahead   = exfat_readahead,
>   	.writepage   = exfat_writepage,
>   	.writepages  = exfat_writepages,
>   	.write_begin = exfat_write_begin,
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 69bf2fb6f7cd..2fd9c7bd61f6 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -614,10 +614,9 @@ static int blkdev_readpage(struct file * file, struct page * page)
>   	return block_read_full_page(page, blkdev_get_block);
>   }
>   
> -static int blkdev_readpages(struct file *file, struct address_space *mapping,
> -			struct list_head *pages, unsigned nr_pages)
> +static void blkdev_readahead(struct readahead_control *rac)
>   {
> -	return mpage_readpages(mapping, pages, nr_pages, blkdev_get_block);
> +	mpage_readahead(rac, blkdev_get_block);
>   }
>   
>   static int blkdev_write_begin(struct file *file, struct address_space *mapping,
> @@ -2062,7 +2061,7 @@ static int blkdev_writepages(struct address_space *mapping,
>   
>   static const struct address_space_operations def_blk_aops = {
>   	.readpage	= blkdev_readpage,
> -	.readpages	= blkdev_readpages,
> +	.readahead	= blkdev_readahead,
>   	.writepage	= blkdev_writepage,
>   	.write_begin	= blkdev_write_begin,
>   	.write_end	= blkdev_write_end,
> diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
> index 119667e65890..2a65387f9b12 100644
> --- a/fs/ext2/inode.c
> +++ b/fs/ext2/inode.c
> @@ -877,11 +877,9 @@ static int ext2_readpage(struct file *file, struct page *page)
>   	return mpage_readpage(page, ext2_get_block);
>   }
>   
> -static int
> -ext2_readpages(struct file *file, struct address_space *mapping,
> -		struct list_head *pages, unsigned nr_pages)
> +static void ext2_readahead(struct readahead_control *rac)
>   {
> -	return mpage_readpages(mapping, pages, nr_pages, ext2_get_block);
> +	mpage_readahead(rac, ext2_get_block);
>   }
>   
>   static int
> @@ -966,7 +964,7 @@ ext2_dax_writepages(struct address_space *mapping, struct writeback_control *wbc
>   
>   const struct address_space_operations ext2_aops = {
>   	.readpage		= ext2_readpage,
> -	.readpages		= ext2_readpages,
> +	.readahead		= ext2_readahead,
>   	.writepage		= ext2_writepage,
>   	.write_begin		= ext2_write_begin,
>   	.write_end		= ext2_write_end,
> @@ -980,7 +978,7 @@ const struct address_space_operations ext2_aops = {
>   
>   const struct address_space_operations ext2_nobh_aops = {
>   	.readpage		= ext2_readpage,
> -	.readpages		= ext2_readpages,
> +	.readahead		= ext2_readahead,
>   	.writepage		= ext2_nobh_writepage,
>   	.write_begin		= ext2_nobh_write_begin,
>   	.write_end		= nobh_write_end,
> diff --git a/fs/fat/inode.c b/fs/fat/inode.c
> index 594b05ae16c9..3496f5fc3e6d 100644
> --- a/fs/fat/inode.c
> +++ b/fs/fat/inode.c
> @@ -210,10 +210,9 @@ static int fat_readpage(struct file *file, struct page *page)
>   	return mpage_readpage(page, fat_get_block);
>   }
>   
> -static int fat_readpages(struct file *file, struct address_space *mapping,
> -			 struct list_head *pages, unsigned nr_pages)
> +static void fat_readahead(struct readahead_control *rac)
>   {
> -	return mpage_readpages(mapping, pages, nr_pages, fat_get_block);
> +	mpage_readahead(rac, fat_get_block);
>   }
>   
>   static void fat_write_failed(struct address_space *mapping, loff_t to)
> @@ -344,7 +343,7 @@ int fat_block_truncate_page(struct inode *inode, loff_t from)
>   
>   static const struct address_space_operations fat_aops = {
>   	.readpage	= fat_readpage,
> -	.readpages	= fat_readpages,
> +	.readahead	= fat_readahead,
>   	.writepage	= fat_writepage,
>   	.writepages	= fat_writepages,
>   	.write_begin	= fat_write_begin,
> diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
> index ba83b49ce18c..5e63c13c12c1 100644
> --- a/fs/gfs2/aops.c
> +++ b/fs/gfs2/aops.c
> @@ -577,7 +577,7 @@ int gfs2_internal_read(struct gfs2_inode *ip, char *buf, loff_t *pos,
>   }
>   
>   /**
> - * gfs2_readpages - Read a bunch of pages at once
> + * gfs2_readahead - Read a bunch of pages at once
>    * @file: The file to read from
>    * @mapping: Address space info
>    * @pages: List of pages to read
> @@ -590,31 +590,24 @@ int gfs2_internal_read(struct gfs2_inode *ip, char *buf, loff_t *pos,
>    *    obviously not something we'd want to do on too regular a basis.
>    *    Any I/O we ignore at this time will be done via readpage later.
>    * 2. We don't handle stuffed files here we let readpage do the honours.
> - * 3. mpage_readpages() does most of the heavy lifting in the common case.
> + * 3. mpage_readahead() does most of the heavy lifting in the common case.
>    * 4. gfs2_block_map() is relied upon to set BH_Boundary in the right places.
>    */
>   
> -static int gfs2_readpages(struct file *file, struct address_space *mapping,
> -			  struct list_head *pages, unsigned nr_pages)
> +static void gfs2_readahead(struct readahead_control *rac)
>   {
> -	struct inode *inode = mapping->host;
> +	struct inode *inode = rac->mapping->host;
>   	struct gfs2_inode *ip = GFS2_I(inode);
> -	struct gfs2_sbd *sdp = GFS2_SB(inode);
>   	struct gfs2_holder gh;
> -	int ret;
>   
>   	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
> -	ret = gfs2_glock_nq(&gh);
> -	if (unlikely(ret))
> +	if (gfs2_glock_nq(&gh))
>   		goto out_uninit;
>   	if (!gfs2_is_stuffed(ip))
> -		ret = mpage_readpages(mapping, pages, nr_pages, gfs2_block_map);
> +		mpage_readahead(rac, gfs2_block_map);
>   	gfs2_glock_dq(&gh);
>   out_uninit:
>   	gfs2_holder_uninit(&gh);
> -	if (unlikely(gfs2_withdrawn(sdp)))
> -		ret = -EIO;
> -	return ret;
>   }
>   
>   /**
> @@ -828,7 +821,7 @@ static const struct address_space_operations gfs2_aops = {
>   	.writepage = gfs2_writepage,
>   	.writepages = gfs2_writepages,
>   	.readpage = gfs2_readpage,
> -	.readpages = gfs2_readpages,
> +	.readahead = gfs2_readahead,
>   	.bmap = gfs2_bmap,
>   	.invalidatepage = gfs2_invalidatepage,
>   	.releasepage = gfs2_releasepage,
> @@ -842,7 +835,7 @@ static const struct address_space_operations gfs2_jdata_aops = {
>   	.writepage = gfs2_jdata_writepage,
>   	.writepages = gfs2_jdata_writepages,
>   	.readpage = gfs2_readpage,
> -	.readpages = gfs2_readpages,
> +	.readahead = gfs2_readahead,
>   	.set_page_dirty = jdata_set_page_dirty,
>   	.bmap = gfs2_bmap,
>   	.invalidatepage = gfs2_invalidatepage,
> diff --git a/fs/hpfs/file.c b/fs/hpfs/file.c
> index b36abf9cb345..2de0d3492d15 100644
> --- a/fs/hpfs/file.c
> +++ b/fs/hpfs/file.c
> @@ -125,10 +125,9 @@ static int hpfs_writepage(struct page *page, struct writeback_control *wbc)
>   	return block_write_full_page(page, hpfs_get_block, wbc);
>   }
>   
> -static int hpfs_readpages(struct file *file, struct address_space *mapping,
> -			  struct list_head *pages, unsigned nr_pages)
> +static void hpfs_readahead(struct readahead_control *rac)
>   {
> -	return mpage_readpages(mapping, pages, nr_pages, hpfs_get_block);
> +	mpage_readahead(rac, hpfs_get_block);
>   }
>   
>   static int hpfs_writepages(struct address_space *mapping,
> @@ -198,7 +197,7 @@ static int hpfs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
>   const struct address_space_operations hpfs_aops = {
>   	.readpage = hpfs_readpage,
>   	.writepage = hpfs_writepage,
> -	.readpages = hpfs_readpages,
> +	.readahead = hpfs_readahead,
>   	.writepages = hpfs_writepages,
>   	.write_begin = hpfs_write_begin,
>   	.write_end = hpfs_write_end,
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 7c84c4c027c4..cb3511eb152a 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -359,7 +359,7 @@ iomap_readpage(struct page *page, const struct iomap_ops *ops)
>   	}
>   
>   	/*
> -	 * Just like mpage_readpages and block_read_full_page we always
> +	 * Just like mpage_readahead and block_read_full_page we always
>   	 * return 0 and just mark the page as PageError on errors.  This
>   	 * should be cleaned up all through the stack eventually.
>   	 */
> diff --git a/fs/isofs/inode.c b/fs/isofs/inode.c
> index 62c0462dc89f..95b1f377ad09 100644
> --- a/fs/isofs/inode.c
> +++ b/fs/isofs/inode.c
> @@ -1185,10 +1185,9 @@ static int isofs_readpage(struct file *file, struct page *page)
>   	return mpage_readpage(page, isofs_get_block);
>   }
>   
> -static int isofs_readpages(struct file *file, struct address_space *mapping,
> -			struct list_head *pages, unsigned nr_pages)
> +static void isofs_readahead(struct readahead_control *rac)
>   {
> -	return mpage_readpages(mapping, pages, nr_pages, isofs_get_block);
> +	mpage_readahead(rac, isofs_get_block);
>   }
>   
>   static sector_t _isofs_bmap(struct address_space *mapping, sector_t block)
> @@ -1198,7 +1197,7 @@ static sector_t _isofs_bmap(struct address_space *mapping, sector_t block)
>   
>   static const struct address_space_operations isofs_aops = {
>   	.readpage = isofs_readpage,
> -	.readpages = isofs_readpages,
> +	.readahead = isofs_readahead,
>   	.bmap = _isofs_bmap
>   };
>   
> diff --git a/fs/jfs/inode.c b/fs/jfs/inode.c
> index 9486afcdac76..6f65bfa9f18d 100644
> --- a/fs/jfs/inode.c
> +++ b/fs/jfs/inode.c
> @@ -296,10 +296,9 @@ static int jfs_readpage(struct file *file, struct page *page)
>   	return mpage_readpage(page, jfs_get_block);
>   }
>   
> -static int jfs_readpages(struct file *file, struct address_space *mapping,
> -		struct list_head *pages, unsigned nr_pages)
> +static void jfs_readahead(struct readahead_control *rac)
>   {
> -	return mpage_readpages(mapping, pages, nr_pages, jfs_get_block);
> +	mpage_readahead(rac, jfs_get_block);
>   }
>   
>   static void jfs_write_failed(struct address_space *mapping, loff_t to)
> @@ -358,7 +357,7 @@ static ssize_t jfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>   
>   const struct address_space_operations jfs_aops = {
>   	.readpage	= jfs_readpage,
> -	.readpages	= jfs_readpages,
> +	.readahead	= jfs_readahead,
>   	.writepage	= jfs_writepage,
>   	.writepages	= jfs_writepages,
>   	.write_begin	= jfs_write_begin,
> diff --git a/fs/mpage.c b/fs/mpage.c
> index ccba3c4c4479..166a95ec1341 100644
> --- a/fs/mpage.c
> +++ b/fs/mpage.c
> @@ -91,7 +91,7 @@ mpage_alloc(struct block_device *bdev,
>   }
>   
>   /*
> - * support function for mpage_readpages.  The fs supplied get_block might
> + * support function for mpage_readahead.  The fs supplied get_block might
>    * return an up to date buffer.  This is used to map that buffer into
>    * the page, which allows readpage to avoid triggering a duplicate call
>    * to get_block.
> @@ -338,13 +338,10 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
>   }
>   
>   /**
> - * mpage_readpages - populate an address space with some pages & start reads against them
> + * mpage_readahead - start reads against pages
>    * @mapping: the address_space
> - * @pages: The address of a list_head which contains the target pages.  These
> - *   pages have their ->index populated and are otherwise uninitialised.
> - *   The page at @pages->prev has the lowest file offset, and reads should be
> - *   issued in @pages->prev to @pages->next order.
> - * @nr_pages: The number of pages at *@pages
> + * @start: The number of the first page to read.
> + * @nr_pages: The number of consecutive pages to read.
>    * @get_block: The filesystem's block mapper function.
>    *
>    * This function walks the pages and the blocks within each page, building and
> @@ -381,36 +378,25 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
>    *
>    * This all causes the disk requests to be issued in the correct order.
>    */
> -int
> -mpage_readpages(struct address_space *mapping, struct list_head *pages,
> -				unsigned nr_pages, get_block_t get_block)
> +void mpage_readahead(struct readahead_control *rac, get_block_t get_block)
>   {
> +	struct page *page;
>   	struct mpage_readpage_args args = {
>   		.get_block = get_block,
>   		.is_readahead = true,
>   	};
> -	unsigned page_idx;
> -
> -	for (page_idx = 0; page_idx < nr_pages; page_idx++) {
> -		struct page *page = lru_to_page(pages);
>   
> +	readahead_for_each(rac, page) {
>   		prefetchw(&page->flags);
> -		list_del(&page->lru);
> -		if (!add_to_page_cache_lru(page, mapping,
> -					page->index,
> -					readahead_gfp_mask(mapping))) {
> -			args.page = page;
> -			args.nr_pages = nr_pages - page_idx;
> -			args.bio = do_mpage_readpage(&args);
> -		}
> +		args.page = page;
> +		args.nr_pages = readahead_count(rac);
> +		args.bio = do_mpage_readpage(&args);
>   		put_page(page);
>   	}
> -	BUG_ON(!list_empty(pages));
>   	if (args.bio)
>   		mpage_bio_submit(REQ_OP_READ, REQ_RAHEAD, args.bio);
> -	return 0;
>   }
> -EXPORT_SYMBOL(mpage_readpages);
> +EXPORT_SYMBOL(mpage_readahead);
>   
>   /*
>    * This isn't called much at all
> @@ -563,7 +549,7 @@ static int __mpage_writepage(struct page *page, struct writeback_control *wbc,
>   		 * Page has buffers, but they are all unmapped. The page was
>   		 * created by pagein or read over a hole which was handled by
>   		 * block_read_full_page().  If this address_space is also
> -		 * using mpage_readpages then this can rarely happen.
> +		 * using mpage_readahead then this can rarely happen.
>   		 */
>   		goto confused;
>   	}
> diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c
> index 671085512e0f..ceeb3b441844 100644
> --- a/fs/nilfs2/inode.c
> +++ b/fs/nilfs2/inode.c
> @@ -145,18 +145,9 @@ static int nilfs_readpage(struct file *file, struct page *page)
>   	return mpage_readpage(page, nilfs_get_block);
>   }
>   
> -/**
> - * nilfs_readpages() - implement readpages() method of nilfs_aops {}
> - * address_space_operations.
> - * @file - file struct of the file to be read
> - * @mapping - address_space struct used for reading multiple pages
> - * @pages - the pages to be read
> - * @nr_pages - number of pages to be read
> - */
> -static int nilfs_readpages(struct file *file, struct address_space *mapping,
> -			   struct list_head *pages, unsigned int nr_pages)
> +static void nilfs_readahead(struct readahead_control *rac)
>   {
> -	return mpage_readpages(mapping, pages, nr_pages, nilfs_get_block);
> +	mpage_readahead(rac, nilfs_get_block);
>   }
>   
>   static int nilfs_writepages(struct address_space *mapping,
> @@ -308,7 +299,7 @@ const struct address_space_operations nilfs_aops = {
>   	.readpage		= nilfs_readpage,
>   	.writepages		= nilfs_writepages,
>   	.set_page_dirty		= nilfs_set_page_dirty,
> -	.readpages		= nilfs_readpages,
> +	.readahead		= nilfs_readahead,
>   	.write_begin		= nilfs_write_begin,
>   	.write_end		= nilfs_write_end,
>   	/* .releasepage		= nilfs_releasepage, */
> diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
> index 3a67a6518ddf..e8137efaafec 100644
> --- a/fs/ocfs2/aops.c
> +++ b/fs/ocfs2/aops.c
> @@ -350,14 +350,11 @@ static int ocfs2_readpage(struct file *file, struct page *page)
>    * grow out to a tree. If need be, detecting boundary extents could
>    * trivially be added in a future version of ocfs2_get_block().
>    */
> -static int ocfs2_readpages(struct file *filp, struct address_space *mapping,
> -			   struct list_head *pages, unsigned nr_pages)
> +static void ocfs2_readahead(struct readahead_control *rac)
>   {
> -	int ret, err = -EIO;
> -	struct inode *inode = mapping->host;
> +	int ret;
> +	struct inode *inode = rac->mapping->host;
>   	struct ocfs2_inode_info *oi = OCFS2_I(inode);
> -	loff_t start;
> -	struct page *last;
>   
>   	/*
>   	 * Use the nonblocking flag for the dlm code to avoid page
> @@ -365,36 +362,31 @@ static int ocfs2_readpages(struct file *filp, struct address_space *mapping,
>   	 */
>   	ret = ocfs2_inode_lock_full(inode, NULL, 0, OCFS2_LOCK_NONBLOCK);
>   	if (ret)
> -		return err;
> +		return;
>   
> -	if (down_read_trylock(&oi->ip_alloc_sem) == 0) {
> -		ocfs2_inode_unlock(inode, 0);
> -		return err;
> -	}
> +	if (down_read_trylock(&oi->ip_alloc_sem) == 0)
> +		goto out_unlock;
>   
>   	/*
>   	 * Don't bother with inline-data. There isn't anything
>   	 * to read-ahead in that case anyway...
>   	 */
>   	if (oi->ip_dyn_features & OCFS2_INLINE_DATA_FL)
> -		goto out_unlock;
> +		goto out_up;
>   
>   	/*
>   	 * Check whether a remote node truncated this file - we just
>   	 * drop out in that case as it's not worth handling here.
>   	 */
> -	last = lru_to_page(pages);
> -	start = (loff_t)last->index << PAGE_SHIFT;
> -	if (start >= i_size_read(inode))
> -		goto out_unlock;
> +	if (readahead_offset(rac) >= i_size_read(inode))
> +		goto out_up;
>   
> -	err = mpage_readpages(mapping, pages, nr_pages, ocfs2_get_block);
> +	mpage_readahead(rac, ocfs2_get_block);
>   
> -out_unlock:
> +out_up:
>   	up_read(&oi->ip_alloc_sem);
> +out_unlock:
>   	ocfs2_inode_unlock(inode, 0);
> -
> -	return err;
>   }
>   
>   /* Note: Because we don't support holes, our allocation has
> @@ -2474,7 +2466,7 @@ static ssize_t ocfs2_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>   
>   const struct address_space_operations ocfs2_aops = {
>   	.readpage		= ocfs2_readpage,
> -	.readpages		= ocfs2_readpages,
> +	.readahead		= ocfs2_readahead,
>   	.writepage		= ocfs2_writepage,
>   	.write_begin		= ocfs2_write_begin,
>   	.write_end		= ocfs2_write_end,

Looks good on ocfs2 part.

Thanks,

Junxiao.

> diff --git a/fs/omfs/file.c b/fs/omfs/file.c
> index d640b9388238..d7b5f09d298c 100644
> --- a/fs/omfs/file.c
> +++ b/fs/omfs/file.c
> @@ -289,10 +289,9 @@ static int omfs_readpage(struct file *file, struct page *page)
>   	return block_read_full_page(page, omfs_get_block);
>   }
>   
> -static int omfs_readpages(struct file *file, struct address_space *mapping,
> -		struct list_head *pages, unsigned nr_pages)
> +static void omfs_readahead(struct readahead_control *rac)
>   {
> -	return mpage_readpages(mapping, pages, nr_pages, omfs_get_block);
> +	mpage_readahead(rac, omfs_get_block);
>   }
>   
>   static int omfs_writepage(struct page *page, struct writeback_control *wbc)
> @@ -373,7 +372,7 @@ const struct inode_operations omfs_file_inops = {
>   
>   const struct address_space_operations omfs_aops = {
>   	.readpage = omfs_readpage,
> -	.readpages = omfs_readpages,
> +	.readahead = omfs_readahead,
>   	.writepage = omfs_writepage,
>   	.writepages = omfs_writepages,
>   	.write_begin = omfs_write_begin,
> diff --git a/fs/qnx6/inode.c b/fs/qnx6/inode.c
> index 345db56c98fd..755293c8c71a 100644
> --- a/fs/qnx6/inode.c
> +++ b/fs/qnx6/inode.c
> @@ -99,10 +99,9 @@ static int qnx6_readpage(struct file *file, struct page *page)
>   	return mpage_readpage(page, qnx6_get_block);
>   }
>   
> -static int qnx6_readpages(struct file *file, struct address_space *mapping,
> -		   struct list_head *pages, unsigned nr_pages)
> +static void qnx6_readahead(struct readahead_control *rac)
>   {
> -	return mpage_readpages(mapping, pages, nr_pages, qnx6_get_block);
> +	mpage_readahead(rac, qnx6_get_block);
>   }
>   
>   /*
> @@ -499,7 +498,7 @@ static sector_t qnx6_bmap(struct address_space *mapping, sector_t block)
>   }
>   static const struct address_space_operations qnx6_aops = {
>   	.readpage	= qnx6_readpage,
> -	.readpages	= qnx6_readpages,
> +	.readahead	= qnx6_readahead,
>   	.bmap		= qnx6_bmap
>   };
>   
> diff --git a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c
> index 6419e6dacc39..0031070b3692 100644
> --- a/fs/reiserfs/inode.c
> +++ b/fs/reiserfs/inode.c
> @@ -1160,11 +1160,9 @@ int reiserfs_get_block(struct inode *inode, sector_t block,
>   	return retval;
>   }
>   
> -static int
> -reiserfs_readpages(struct file *file, struct address_space *mapping,
> -		   struct list_head *pages, unsigned nr_pages)
> +static void reiserfs_readahead(struct readahead_control *rac)
>   {
> -	return mpage_readpages(mapping, pages, nr_pages, reiserfs_get_block);
> +	mpage_readahead(rac, reiserfs_get_block);
>   }
>   
>   /*
> @@ -3434,7 +3432,7 @@ int reiserfs_setattr(struct dentry *dentry, struct iattr *attr)
>   const struct address_space_operations reiserfs_address_space_operations = {
>   	.writepage = reiserfs_writepage,
>   	.readpage = reiserfs_readpage,
> -	.readpages = reiserfs_readpages,
> +	.readahead = reiserfs_readahead,
>   	.releasepage = reiserfs_releasepage,
>   	.invalidatepage = reiserfs_invalidatepage,
>   	.write_begin = reiserfs_write_begin,
> diff --git a/fs/udf/inode.c b/fs/udf/inode.c
> index e875bc5668ee..adaba8e8b326 100644
> --- a/fs/udf/inode.c
> +++ b/fs/udf/inode.c
> @@ -195,10 +195,9 @@ static int udf_readpage(struct file *file, struct page *page)
>   	return mpage_readpage(page, udf_get_block);
>   }
>   
> -static int udf_readpages(struct file *file, struct address_space *mapping,
> -			struct list_head *pages, unsigned nr_pages)
> +static void udf_readahead(struct readahead_control *rac)
>   {
> -	return mpage_readpages(mapping, pages, nr_pages, udf_get_block);
> +	mpage_readahead(rac, udf_get_block);
>   }
>   
>   static int udf_write_begin(struct file *file, struct address_space *mapping,
> @@ -234,7 +233,7 @@ static sector_t udf_bmap(struct address_space *mapping, sector_t block)
>   
>   const struct address_space_operations udf_aops = {
>   	.readpage	= udf_readpage,
> -	.readpages	= udf_readpages,
> +	.readahead	= udf_readahead,
>   	.writepage	= udf_writepage,
>   	.writepages	= udf_writepages,
>   	.write_begin	= udf_write_begin,
> diff --git a/include/linux/mpage.h b/include/linux/mpage.h
> index 001f1fcf9836..f4f5e90a6844 100644
> --- a/include/linux/mpage.h
> +++ b/include/linux/mpage.h
> @@ -13,9 +13,9 @@
>   #ifdef CONFIG_BLOCK
>   
>   struct writeback_control;
> +struct readahead_control;
>   
> -int mpage_readpages(struct address_space *mapping, struct list_head *pages,
> -				unsigned nr_pages, get_block_t get_block);
> +void mpage_readahead(struct readahead_control *, get_block_t get_block);
>   int mpage_readpage(struct page *page, get_block_t get_block);
>   int mpage_writepages(struct address_space *mapping,
>   		struct writeback_control *wbc, get_block_t get_block);
> diff --git a/mm/migrate.c b/mm/migrate.c
> index b1092876e537..a32122095702 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1020,7 +1020,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>   		 * to the LRU. Later, when the IO completes the pages are
>   		 * marked uptodate and unlocked. However, the queueing
>   		 * could be merging multiple pages for one bio (e.g.
> -		 * mpage_readpages). If an allocation happens for the
> +		 * mpage_readahead). If an allocation happens for the
>   		 * second or third page, the process can end up locking
>   		 * the same page twice and deadlocking. Rather than
>   		 * trying to be clever about what pages can be locked,

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 01/13] mm: Fix the return type of __do_page_cache_readahead
  2020-02-11  1:03 ` [PATCH v5 01/13] mm: Fix the return type of __do_page_cache_readahead Matthew Wilcox
  2020-02-11  8:19   ` Johannes Thumshirn
  2020-02-12 18:13   ` Christoph Hellwig
@ 2020-02-14  3:19   ` John Hubbard
  2020-02-14  4:21     ` Matthew Wilcox
  2020-02-14 19:50   ` Matthew Wilcox
  3 siblings, 1 reply; 31+ messages in thread
From: John Hubbard @ 2020-02-14  3:19 UTC (permalink / raw)
  To: Matthew Wilcox, linux-fsdevel
  Cc: linux-mm, linux-kernel, linux-btrfs, linux-erofs, linux-ext4,
	linux-f2fs-devel, cluster-devel, ocfs2-devel, linux-xfs

On 2/10/20 5:03 PM, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> ra_submit() which is a wrapper around __do_page_cache_readahead() already
> returns an unsigned long, and the 'nr_to_read' parameter is an unsigned
> long, so fix __do_page_cache_readahead() to return an unsigned long,
> even though I'm pretty sure we're not going to readahead more than 2^32
> pages ever.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  mm/internal.h  | 2 +-
>  mm/readahead.c | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/internal.h b/mm/internal.h
> index 3cf20ab3ca01..41b93c4b3ab7 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -49,7 +49,7 @@ void unmap_page_range(struct mmu_gather *tlb,
>  			     unsigned long addr, unsigned long end,
>  			     struct zap_details *details);
>  
> -extern unsigned int __do_page_cache_readahead(struct address_space *mapping,
> +extern unsigned long __do_page_cache_readahead(struct address_space *mapping,
>  		struct file *filp, pgoff_t offset, unsigned long nr_to_read,
>  		unsigned long lookahead_size);
>  
> diff --git a/mm/readahead.c b/mm/readahead.c
> index 2fe72cd29b47..6bf73ef33b7e 100644
> --- a/mm/readahead.c
> +++ b/mm/readahead.c
> @@ -152,7 +152,7 @@ static int read_pages(struct address_space *mapping, struct file *filp,
>   *
>   * Returns the number of pages requested, or the maximum amount of I/O allowed.
>   */
> -unsigned int __do_page_cache_readahead(struct address_space *mapping,
> +unsigned long __do_page_cache_readahead(struct address_space *mapping,
>  		struct file *filp, pgoff_t offset, unsigned long nr_to_read,
>  		unsigned long lookahead_size)
>  {
> @@ -161,7 +161,7 @@ unsigned int __do_page_cache_readahead(struct address_space *mapping,
>  	unsigned long end_index;	/* The last page we want to read */
>  	LIST_HEAD(page_pool);
>  	int page_idx;


What about page_idx, too? It should also have the same data type as nr_pages, as long as
we're trying to be consistent on this point.

Just want to ensure we're ready to handle those 2^33+ page readaheads... :)


thanks,
-- 
John Hubbard
NVIDIA
> -	unsigned int nr_pages = 0;
> +	unsigned long nr_pages = 0;
>  	loff_t isize = i_size_read(inode);
>  	gfp_t gfp_mask = readahead_gfp_mask(mapping);
>  
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 03/13] mm: Put readahead pages in cache earlier
  2020-02-11  1:03 ` [PATCH v5 03/13] mm: Put readahead pages in cache earlier Matthew Wilcox
@ 2020-02-14  3:36   ` John Hubbard
  2020-02-15  1:15     ` Matthew Wilcox
  0 siblings, 1 reply; 31+ messages in thread
From: John Hubbard @ 2020-02-14  3:36 UTC (permalink / raw)
  To: Matthew Wilcox, linux-fsdevel
  Cc: linux-mm, linux-kernel, linux-btrfs, linux-erofs, linux-ext4,
	linux-f2fs-devel, cluster-devel, ocfs2-devel, linux-xfs

On 2/10/20 5:03 PM, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> At allocation time, put the pages in the cache unless we're using
> ->readpages.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  mm/readahead.c | 66 ++++++++++++++++++++++++++++++++------------------
>  1 file changed, 42 insertions(+), 24 deletions(-)
> 
> diff --git a/mm/readahead.c b/mm/readahead.c
> index fc77d13af556..96c6ca68a174 100644
> --- a/mm/readahead.c
> +++ b/mm/readahead.c
> @@ -114,10 +114,10 @@ int read_cache_pages(struct address_space *mapping, struct list_head *pages,
>  EXPORT_SYMBOL(read_cache_pages);
>  
>  static void read_pages(struct address_space *mapping, struct file *filp,
> -		struct list_head *pages, unsigned int nr_pages, gfp_t gfp)
> +		struct list_head *pages, pgoff_t start,
> +		unsigned int nr_pages)
>  {
>  	struct blk_plug plug;
> -	unsigned page_idx;
>  
>  	blk_start_plug(&plug);
>  
> @@ -125,18 +125,17 @@ static void read_pages(struct address_space *mapping, struct file *filp,
>  		mapping->a_ops->readpages(filp, mapping, pages, nr_pages);
>  		/* Clean up the remaining pages */
>  		put_pages_list(pages);
> -		goto out;
> -	}
> +	} else {
> +		struct page *page;
> +		unsigned long index;
>  
> -	for (page_idx = 0; page_idx < nr_pages; page_idx++) {
> -		struct page *page = lru_to_page(pages);
> -		list_del(&page->lru);
> -		if (!add_to_page_cache_lru(page, mapping, page->index, gfp))
> +		xa_for_each_range(&mapping->i_pages, index, page, start,
> +				start + nr_pages - 1) {
>  			mapping->a_ops->readpage(filp, page);
> -		put_page(page);
> +			put_page(page);
> +		}
>  	}
>  
> -out:
>  	blk_finish_plug(&plug);
>  }
>  
> @@ -149,17 +148,18 @@ static void read_pages(struct address_space *mapping, struct file *filp,
>   * Returns the number of pages requested, or the maximum amount of I/O allowed.
>   */
>  unsigned long __do_page_cache_readahead(struct address_space *mapping,
> -		struct file *filp, pgoff_t offset, unsigned long nr_to_read,
> +		struct file *filp, pgoff_t start, unsigned long nr_to_read,
>  		unsigned long lookahead_size)
>  {
>  	struct inode *inode = mapping->host;
> -	struct page *page;
>  	unsigned long end_index;	/* The last page we want to read */
>  	LIST_HEAD(page_pool);
>  	int page_idx;
> +	pgoff_t page_offset = start;
>  	unsigned long nr_pages = 0;
>  	loff_t isize = i_size_read(inode);
>  	gfp_t gfp_mask = readahead_gfp_mask(mapping);
> +	bool use_list = mapping->a_ops->readpages;
>  
>  	if (isize == 0)
>  		goto out;
> @@ -170,7 +170,7 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
>  	 * Preallocate as many pages as we will need.
>  	 */
>  	for (page_idx = 0; page_idx < nr_to_read; page_idx++) {
> -		pgoff_t page_offset = offset + page_idx;
> +		struct page *page;

I see two distinct things happening in this patch, and I think they want to each be
in their own patch:

1) A significant refactoring of the page loop, and

2) Changing the place where the page is added to the page cache. (Only this one is 
   mentioned in the commit description.)

We'll be more likely to spot any errors if these are teased apart.


thanks,
-- 
John Hubbard
NVIDIA

>  
>  		if (page_offset > end_index)
>  			break;
> @@ -178,25 +178,43 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
>  		page = xa_load(&mapping->i_pages, page_offset);
>  		if (page && !xa_is_value(page)) {
>  			/*
> -			 * Page already present?  Kick off the current batch of
> -			 * contiguous pages before continuing with the next
> -			 * batch.
> +			 * Page already present?  Kick off the current batch
> +			 * of contiguous pages before continuing with the
> +			 * next batch.
> +			 * It's possible this page is the page we should
> +			 * be marking with PageReadahead.  However, we
> +			 * don't have a stable ref to this page so it might
> +			 * be reallocated to another user before we can set
> +			 * the bit.  There's probably another page in the
> +			 * cache marked with PageReadahead from the other
> +			 * process which accessed this file.
>  			 */
> -			if (nr_pages)
> -				read_pages(mapping, filp, &page_pool, nr_pages,
> -						gfp_mask);
> -			nr_pages = 0;
> -			continue;
> +			goto skip;
>  		}
>  
>  		page = __page_cache_alloc(gfp_mask);
>  		if (!page)
>  			break;
> -		page->index = page_offset;
> -		list_add(&page->lru, &page_pool);
> +		if (use_list) {
> +			page->index = page_offset;
> +			list_add(&page->lru, &page_pool);
> +		} else if (add_to_page_cache_lru(page, mapping, page_offset,
> +					gfp_mask) < 0) {
> +			put_page(page);
> +			goto skip;
> +		}
> +
>  		if (page_idx == nr_to_read - lookahead_size)
>  			SetPageReadahead(page);
>  		nr_pages++;
> +		page_offset++;
> +		continue;
> +skip:
> +		if (nr_pages)
> +			read_pages(mapping, filp, &page_pool, start, nr_pages);
> +		nr_pages = 0;
> +		page_offset++;
> +		start = page_offset;
>  	}
>  
>  	/*
> @@ -205,7 +223,7 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
>  	 * will then handle the error.
>  	 */
>  	if (nr_pages)
> -		read_pages(mapping, filp, &page_pool, nr_pages, gfp_mask);
> +		read_pages(mapping, filp, &page_pool, start, nr_pages);
>  	BUG_ON(!list_empty(&page_pool));
>  out:
>  	return nr_pages;
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 01/13] mm: Fix the return type of __do_page_cache_readahead
  2020-02-14  3:19   ` John Hubbard
@ 2020-02-14  4:21     ` Matthew Wilcox
  2020-02-14  4:33       ` John Hubbard
  0 siblings, 1 reply; 31+ messages in thread
From: Matthew Wilcox @ 2020-02-14  4:21 UTC (permalink / raw)
  To: John Hubbard
  Cc: linux-fsdevel, linux-mm, linux-kernel, linux-btrfs, linux-erofs,
	linux-ext4, linux-f2fs-devel, cluster-devel, ocfs2-devel,
	linux-xfs

On Thu, Feb 13, 2020 at 07:19:53PM -0800, John Hubbard wrote:
> On 2/10/20 5:03 PM, Matthew Wilcox wrote:
> > @@ -161,7 +161,7 @@ unsigned int __do_page_cache_readahead(struct address_space *mapping,
> >  	unsigned long end_index;	/* The last page we want to read */
> >  	LIST_HEAD(page_pool);
> >  	int page_idx;
> 
> 
> What about page_idx, too? It should also have the same data type as nr_pages, as long as
> we're trying to be consistent on this point.
> 
> Just want to ensure we're ready to handle those 2^33+ page readaheads... :)

Nah, this is just a type used internally to the function.  Getting the
API right for the callers is the important part.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 01/13] mm: Fix the return type of __do_page_cache_readahead
  2020-02-14  4:21     ` Matthew Wilcox
@ 2020-02-14  4:33       ` John Hubbard
  0 siblings, 0 replies; 31+ messages in thread
From: John Hubbard @ 2020-02-14  4:33 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-fsdevel, linux-mm, linux-kernel, linux-btrfs, linux-erofs,
	linux-ext4, linux-f2fs-devel, cluster-devel, ocfs2-devel,
	linux-xfs

On 2/13/20 8:21 PM, Matthew Wilcox wrote:
> On Thu, Feb 13, 2020 at 07:19:53PM -0800, John Hubbard wrote:
>> On 2/10/20 5:03 PM, Matthew Wilcox wrote:
>>> @@ -161,7 +161,7 @@ unsigned int __do_page_cache_readahead(struct address_space *mapping,
>>>  	unsigned long end_index;	/* The last page we want to read */
>>>  	LIST_HEAD(page_pool);
>>>  	int page_idx;
>>
>>
>> What about page_idx, too? It should also have the same data type as nr_pages, as long as
>> we're trying to be consistent on this point.
>>
>> Just want to ensure we're ready to handle those 2^33+ page readaheads... :)
> 
> Nah, this is just a type used internally to the function.  Getting the
> API right for the callers is the important part.
> 

Agreed that the real point of this is to match up the API, but why not finish the job
by going all the way through? It's certainly not something we need to lose sleep over,
but it does seem like you don't want to have code like this:

        for (page_idx = 0; page_idx < nr_to_read; page_idx++) {

...with the ability, technically, to overflow page_idx due to it being an int,
while nr_to_read is an unsigned long. (And the new sanitizers and checkers are
apt to complain about it, btw.)

(Apologies if there is some kernel coding idiom that I still haven't learned, about this 
sort of thing.)

thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 04/13] mm: Add readahead address space operation
  2020-02-11  1:03 ` [PATCH v5 04/13] mm: Add readahead address space operation Matthew Wilcox
  2020-02-11  4:52   ` Dave Chinner
  2020-02-12 18:18   ` Christoph Hellwig
@ 2020-02-14  5:36   ` John Hubbard
  2020-02-15  1:15     ` Matthew Wilcox
  2 siblings, 1 reply; 31+ messages in thread
From: John Hubbard @ 2020-02-14  5:36 UTC (permalink / raw)
  To: Matthew Wilcox, linux-fsdevel
  Cc: linux-mm, linux-kernel, linux-btrfs, linux-erofs, linux-ext4,
	linux-f2fs-devel, cluster-devel, ocfs2-devel, linux-xfs

On 2/10/20 5:03 PM, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> This replaces ->readpages with a saner interface:
>  - Return void instead of an ignored error code.
>  - Pages are already in the page cache when ->readahead is called.
>  - Implementation looks up the pages in the page cache instead of
>    having them passed in a linked list.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  Documentation/filesystems/locking.rst |  6 ++-
>  Documentation/filesystems/vfs.rst     | 13 +++++++
>  include/linux/fs.h                    |  2 +
>  include/linux/pagemap.h               | 54 +++++++++++++++++++++++++++
>  mm/readahead.c                        | 48 ++++++++++++++----------
>  5 files changed, 102 insertions(+), 21 deletions(-)
> 

A minor question below, but either way you can add:

    Reviewed-by: John Hubbard <jhubbard@nvidia.com>



> diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
> index 5057e4d9dcd1..0ebc4491025a 100644
> --- a/Documentation/filesystems/locking.rst
> +++ b/Documentation/filesystems/locking.rst
> @@ -239,6 +239,7 @@ prototypes::
>  	int (*readpage)(struct file *, struct page *);
>  	int (*writepages)(struct address_space *, struct writeback_control *);
>  	int (*set_page_dirty)(struct page *page);
> +	void (*readahead)(struct readahead_control *);
>  	int (*readpages)(struct file *filp, struct address_space *mapping,
>  			struct list_head *pages, unsigned nr_pages);
>  	int (*write_begin)(struct file *, struct address_space *mapping,
> @@ -271,7 +272,8 @@ writepage:		yes, unlocks (see below)
>  readpage:		yes, unlocks
>  writepages:
>  set_page_dirty		no
> -readpages:
> +readahead:		yes, unlocks
> +readpages:		no
>  write_begin:		locks the page		 exclusive
>  write_end:		yes, unlocks		 exclusive
>  bmap:
> @@ -295,6 +297,8 @@ the request handler (/dev/loop).
>  ->readpage() unlocks the page, either synchronously or via I/O
>  completion.
>  
> +->readahead() unlocks the pages like ->readpage().
> +
>  ->readpages() populates the pagecache with the passed pages and starts
>  I/O against them.  They come unlocked upon I/O completion.
>  
> diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
> index 7d4d09dd5e6d..cabee16b7406 100644
> --- a/Documentation/filesystems/vfs.rst
> +++ b/Documentation/filesystems/vfs.rst
> @@ -706,6 +706,7 @@ cache in your filesystem.  The following members are defined:
>  		int (*readpage)(struct file *, struct page *);
>  		int (*writepages)(struct address_space *, struct writeback_control *);
>  		int (*set_page_dirty)(struct page *page);
> +		void (*readahead)(struct readahead_control *);
>  		int (*readpages)(struct file *filp, struct address_space *mapping,
>  				 struct list_head *pages, unsigned nr_pages);
>  		int (*write_begin)(struct file *, struct address_space *mapping,
> @@ -781,12 +782,24 @@ cache in your filesystem.  The following members are defined:
>  	If defined, it should set the PageDirty flag, and the
>  	PAGECACHE_TAG_DIRTY tag in the radix tree.
>  
> +``readahead``
> +	Called by the VM to read pages associated with the address_space
> +	object.  The pages are consecutive in the page cache and are
> +	locked.  The implementation should decrement the page refcount
> +	after starting I/O on each page.  Usually the page will be
> +	unlocked by the I/O completion handler.  If the function does
> +	not attempt I/O on some pages, the caller will decrement the page
> +	refcount and unlock the pages for you.	Set PageUptodate if the
> +	I/O completes successfully.  Setting PageError on any page will
> +	be ignored; simply unlock the page if an I/O error occurs.
> +
>  ``readpages``
>  	called by the VM to read pages associated with the address_space
>  	object.  This is essentially just a vector version of readpage.
>  	Instead of just one page, several pages are requested.
>  	readpages is only used for read-ahead, so read errors are
>  	ignored.  If anything goes wrong, feel free to give up.
> +        This interface is deprecated; implement readahead instead.
>  
>  ``write_begin``
>  	Called by the generic buffered write code to ask the filesystem
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 3cd4fe6b845e..d4e2d2964346 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -292,6 +292,7 @@ enum positive_aop_returns {
>  struct page;
>  struct address_space;
>  struct writeback_control;
> +struct readahead_control;
>  
>  /*
>   * Write life time hint values.
> @@ -375,6 +376,7 @@ struct address_space_operations {
>  	 */
>  	int (*readpages)(struct file *filp, struct address_space *mapping,
>  			struct list_head *pages, unsigned nr_pages);
> +	void (*readahead)(struct readahead_control *);
>  
>  	int (*write_begin)(struct file *, struct address_space *mapping,
>  				loff_t pos, unsigned len, unsigned flags,
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index ccb14b6a16b5..13efafaf7e1f 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -630,6 +630,60 @@ static inline int add_to_page_cache(struct page *page,
>  	return error;
>  }
>  
> +/*
> + * Readahead is of a block of consecutive pages.
> + */
> +struct readahead_control {
> +	struct file *file;
> +	struct address_space *mapping;
> +/* private: use the readahead_* accessors instead */
> +	pgoff_t start;
> +	unsigned int nr_pages;
> +	unsigned int batch_count;
> +};
> +
> +static inline struct page *readahead_page(struct readahead_control *rac)
> +{
> +	struct page *page;
> +
> +	if (!rac->nr_pages)
> +		return NULL;
> +
> +	page = xa_load(&rac->mapping->i_pages, rac->start);


Is it worth asserting that the page was found:

	VM_BUG_ON_PAGE(!page || xa_is_value(page), page);

? Or is that overkill here?


> +	VM_BUG_ON_PAGE(!PageLocked(page), page);
> +	rac->batch_count = hpage_nr_pages(page);
> +	rac->start += rac->batch_count;


The above was surprising, until I saw the other thread with Dave and you.
I was reviewing this patchset in order to have a chance at understanding the 
follow-on patchset ("Large pages in the page cache"), and it seems like that
feature has a solid head start here. :)  


thanks,
-- 
John Hubbard
NVIDIA

> +
> +	return page;
> +}
> +
> +#define readahead_for_each(rac, page)					\
> +	for (; (page = readahead_page(rac)); rac->nr_pages -= rac->batch_count)
> +
> +/* The byte offset into the file of this readahead block */
> +static inline loff_t readahead_offset(struct readahead_control *rac)
> +{
> +	return (loff_t)rac->start * PAGE_SIZE;
> +}
> +
> +/* The number of bytes in this readahead block */
> +static inline loff_t readahead_length(struct readahead_control *rac)
> +{
> +	return (loff_t)rac->nr_pages * PAGE_SIZE;
> +}
> +
> +/* The index of the first page in this readahead block */
> +static inline unsigned int readahead_index(struct readahead_control *rac)
> +{
> +	return rac->start;
> +}
> +
> +/* The number of pages in this readahead block */
> +static inline unsigned int readahead_count(struct readahead_control *rac)
> +{
> +	return rac->nr_pages;
> +}
> +
>  static inline unsigned long dir_pages(struct inode *inode)
>  {
>  	return (unsigned long)(inode->i_size + PAGE_SIZE - 1) >>
> diff --git a/mm/readahead.c b/mm/readahead.c
> index 96c6ca68a174..933b32e0c90a 100644
> --- a/mm/readahead.c
> +++ b/mm/readahead.c
> @@ -113,25 +113,30 @@ int read_cache_pages(struct address_space *mapping, struct list_head *pages,
>  
>  EXPORT_SYMBOL(read_cache_pages);
>  
> -static void read_pages(struct address_space *mapping, struct file *filp,
> -		struct list_head *pages, pgoff_t start,
> -		unsigned int nr_pages)
> +static void read_pages(struct readahead_control *rac, struct list_head *pages)
>  {
> +	struct page *page;
>  	struct blk_plug plug;
> +	const struct address_space_operations *aops = rac->mapping->a_ops;
> +
> +	if (rac->nr_pages == 0)
> +		return;
>  
>  	blk_start_plug(&plug);
>  
> -	if (mapping->a_ops->readpages) {
> -		mapping->a_ops->readpages(filp, mapping, pages, nr_pages);
> +	if (aops->readahead) {
> +		aops->readahead(rac);
> +		readahead_for_each(rac, page) {
> +			unlock_page(page);
> +			put_page(page);
> +		}
> +	} else if (aops->readpages) {
> +		aops->readpages(rac->file, rac->mapping, pages, rac->nr_pages);
>  		/* Clean up the remaining pages */
>  		put_pages_list(pages);
>  	} else {
> -		struct page *page;
> -		unsigned long index;
> -
> -		xa_for_each_range(&mapping->i_pages, index, page, start,
> -				start + nr_pages - 1) {
> -			mapping->a_ops->readpage(filp, page);
> +		readahead_for_each(rac, page) {
> +			aops->readpage(rac->file, page);
>  			put_page(page);
>  		}
>  	}
> @@ -156,10 +161,15 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
>  	LIST_HEAD(page_pool);
>  	int page_idx;
>  	pgoff_t page_offset = start;
> -	unsigned long nr_pages = 0;
>  	loff_t isize = i_size_read(inode);
>  	gfp_t gfp_mask = readahead_gfp_mask(mapping);
>  	bool use_list = mapping->a_ops->readpages;
> +	struct readahead_control rac = {
> +		.mapping = mapping,
> +		.file = filp,
> +		.start = start,
> +		.nr_pages = 0,
> +	};
>  
>  	if (isize == 0)
>  		goto out;
> @@ -206,15 +216,14 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
>  
>  		if (page_idx == nr_to_read - lookahead_size)
>  			SetPageReadahead(page);
> -		nr_pages++;
> +		rac.nr_pages++;
>  		page_offset++;
>  		continue;
>  skip:
> -		if (nr_pages)
> -			read_pages(mapping, filp, &page_pool, start, nr_pages);
> -		nr_pages = 0;
> +		read_pages(&rac, &page_pool);
> +		rac.nr_pages = 0;
>  		page_offset++;
> -		start = page_offset;
> +		rac.start = page_offset;
>  	}
>  
>  	/*
> @@ -222,11 +231,10 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
>  	 * uptodate then the caller will launch readpage again, and
>  	 * will then handle the error.
>  	 */
> -	if (nr_pages)
> -		read_pages(mapping, filp, &page_pool, start, nr_pages);
> +	read_pages(&rac, &page_pool);
>  	BUG_ON(!list_empty(&page_pool));
>  out:
> -	return nr_pages;
> +	return rac.nr_pages;
>  }
>  
>  /*
> 




^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 01/13] mm: Fix the return type of __do_page_cache_readahead
  2020-02-11  1:03 ` [PATCH v5 01/13] mm: Fix the return type of __do_page_cache_readahead Matthew Wilcox
                     ` (2 preceding siblings ...)
  2020-02-14  3:19   ` John Hubbard
@ 2020-02-14 19:50   ` Matthew Wilcox
  3 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox @ 2020-02-14 19:50 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-mm, linux-kernel, linux-btrfs, linux-erofs, linux-ext4,
	linux-f2fs-devel, cluster-devel, ocfs2-devel, linux-xfs

On Mon, Feb 10, 2020 at 05:03:36PM -0800, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> ra_submit() which is a wrapper around __do_page_cache_readahead() already
> returns an unsigned long, and the 'nr_to_read' parameter is an unsigned
> long, so fix __do_page_cache_readahead() to return an unsigned long,
> even though I'm pretty sure we're not going to readahead more than 2^32
> pages ever.

I was going through this and realised it's completely pointless -- the
returned value from ra_submit() and __do_page_cache_readahead() is
eventually ignored through all paths.  So I'm replacing this patch with
one that makes everything return void.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 04/13] mm: Add readahead address space operation
  2020-02-14  5:36   ` John Hubbard
@ 2020-02-15  1:15     ` Matthew Wilcox
  0 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox @ 2020-02-15  1:15 UTC (permalink / raw)
  To: John Hubbard
  Cc: linux-fsdevel, linux-mm, linux-kernel, linux-btrfs, linux-erofs,
	linux-ext4, linux-f2fs-devel, cluster-devel, ocfs2-devel,
	linux-xfs

On Thu, Feb 13, 2020 at 09:36:25PM -0800, John Hubbard wrote:
> > +static inline struct page *readahead_page(struct readahead_control *rac)
> > +{
> > +	struct page *page;
> > +
> > +	if (!rac->nr_pages)
> > +		return NULL;
> > +
> > +	page = xa_load(&rac->mapping->i_pages, rac->start);
> 
> 
> Is it worth asserting that the page was found:
> 
> 	VM_BUG_ON_PAGE(!page || xa_is_value(page), page);
> 
> ? Or is that overkill here?

It shouldn't be possible since they were just added in a locked state.
If it did happen, it'll be caught by the assert below -- dereferencing
a NULL pointer or a shadow entry is not going to go well.

> > +	VM_BUG_ON_PAGE(!PageLocked(page), page);
> > +	rac->batch_count = hpage_nr_pages(page);
> > +	rac->start += rac->batch_count;
> 
> The above was surprising, until I saw the other thread with Dave and you.
> I was reviewing this patchset in order to have a chance at understanding the 
> follow-on patchset ("Large pages in the page cache"), and it seems like that
> feature has a solid head start here. :)  

Right, I'll document that.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 03/13] mm: Put readahead pages in cache earlier
  2020-02-14  3:36   ` John Hubbard
@ 2020-02-15  1:15     ` Matthew Wilcox
  0 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox @ 2020-02-15  1:15 UTC (permalink / raw)
  To: John Hubbard
  Cc: linux-fsdevel, linux-mm, linux-kernel, linux-btrfs, linux-erofs,
	linux-ext4, linux-f2fs-devel, cluster-devel, ocfs2-devel,
	linux-xfs

On Thu, Feb 13, 2020 at 07:36:38PM -0800, John Hubbard wrote:
> I see two distinct things happening in this patch, and I think they want to each be
> in their own patch:
> 
> 1) A significant refactoring of the page loop, and
> 
> 2) Changing the place where the page is added to the page cache. (Only this one is 
>    mentioned in the commit description.)
> 
> We'll be more likely to spot any errors if these are teased apart.

Thanks.  I ended up splitting this patch into three, each hopefully
easier to understand.

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, back to index

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-11  1:03 [PATCH v5 00/13] Change readahead API Matthew Wilcox
2020-02-11  1:03 ` [PATCH v5 01/13] mm: Fix the return type of __do_page_cache_readahead Matthew Wilcox
2020-02-11  8:19   ` Johannes Thumshirn
2020-02-11 12:34     ` Matthew Wilcox
2020-02-12 18:13   ` Christoph Hellwig
2020-02-14  3:19   ` John Hubbard
2020-02-14  4:21     ` Matthew Wilcox
2020-02-14  4:33       ` John Hubbard
2020-02-14 19:50   ` Matthew Wilcox
2020-02-11  1:03 ` [PATCH v5 02/13] mm: Ignore return value of ->readpages Matthew Wilcox
2020-02-12 18:13   ` Christoph Hellwig
2020-02-11  1:03 ` [PATCH v5 03/13] mm: Put readahead pages in cache earlier Matthew Wilcox
2020-02-14  3:36   ` John Hubbard
2020-02-15  1:15     ` Matthew Wilcox
2020-02-11  1:03 ` [PATCH v5 04/13] mm: Add readahead address space operation Matthew Wilcox
2020-02-11  4:52   ` Dave Chinner
2020-02-11 12:54     ` Matthew Wilcox
2020-02-11 20:08       ` Dave Chinner
2020-02-12 18:18   ` Christoph Hellwig
2020-02-14  5:36   ` John Hubbard
2020-02-15  1:15     ` Matthew Wilcox
2020-02-11  1:03 ` [PATCH v5 05/13] mm: Add page_cache_readahead_limit Matthew Wilcox
2020-02-11  1:03 ` [PATCH v5 06/13] fs: Convert mpage_readpages to mpage_readahead Matthew Wilcox
2020-02-13 22:09   ` Junxiao Bi
2020-02-11  1:03 ` [PATCH v5 07/13] btrfs: Convert from readpages to readahead Matthew Wilcox
2020-02-11  1:03 ` [PATCH v5 08/13] erofs: Convert uncompressed files " Matthew Wilcox
2020-02-11  1:03 ` [PATCH v5 09/13] erofs: Convert compressed " Matthew Wilcox
2020-02-11  1:03 ` [PATCH v5 10/13] ext4: Convert " Matthew Wilcox
2020-02-11  1:03 ` [PATCH v5 11/13] f2fs: " Matthew Wilcox
2020-02-11  1:03 ` [PATCH v5 12/13] fuse: " Matthew Wilcox
2020-02-11  1:03 ` [PATCH v5 13/13] iomap: " Matthew Wilcox

Linux-BTRFS Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-btrfs/0 linux-btrfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-btrfs linux-btrfs/ https://lore.kernel.org/linux-btrfs \
		linux-btrfs@vger.kernel.org
	public-inbox-index linux-btrfs

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-btrfs


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git