Linux-f2fs-devel Archive on lore.kernel.org
* [f2fs-dev] [PATCH v4 00/12] Change readahead API
@ 2020-02-01 15:12 Matthew Wilcox
  2020-02-01 15:12 ` [f2fs-dev] [PATCH v4 03/12] readahead: Put pages in cache earlier Matthew Wilcox
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Matthew Wilcox @ 2020-02-01 15:12 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: cluster-devel, linux-kernel, Matthew Wilcox (Oracle),
	linux-f2fs-devel, linux-xfs, linux-mm, ocfs2-devel, linux-ext4,
	linux-erofs, linux-btrfs

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

I would particularly value feedback on this from the gfs2 and ocfs2
maintainers.  They have non-trivial changes, and a review on patch 5
would be greatly appreciated.

This series adds a readahead address_space operation to eventually
replace the readpages operation.  The key difference is that
pages are added to the page cache as they are allocated (and
then looked up by the filesystem) instead of passing them on a
list to the readpages operation and having the filesystem add
them to the page cache.  It's a net reduction in code for each
implementation, more efficient than walking a list, and solves
the direct-write vs buffered-read problem reported by yu kuai at
https://lore.kernel.org/linux-fsdevel/20200116063601.39201-1-yukuai3@huawei.com/
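
The hand-off described above can be sketched in user space.  This is
only a toy model, not kernel code: toy_page, toy_mapping,
toy_add_to_cache() and toy_readahead() are hypothetical names, and a
fixed array stands in for mapping->i_pages.  The caller inserts each
page into the cache as soon as it is allocated, and the filesystem
then finds the pages by index instead of popping them off a list.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SLOTS 16	/* hypothetical size of the toy cache */

struct toy_page {
	unsigned long index;
	int uptodate;
};

/* Stand-in for struct address_space; slots[] plays the role of i_pages. */
struct toy_mapping {
	struct toy_page *slots[CACHE_SLOTS];
};

/* Caller side: insert the page at allocation time; fails if the slot is taken. */
static int toy_add_to_cache(struct toy_mapping *m, struct toy_page *page,
		unsigned long index)
{
	if (index >= CACHE_SLOTS || m->slots[index])
		return -1;
	page->index = index;
	m->slots[index] = page;
	return 0;
}

/* Filesystem side: look each page up by index; no list is passed in. */
static unsigned toy_readahead(struct toy_mapping *m, unsigned long start,
		unsigned nr_pages)
{
	unsigned long i;

	for (i = start; i < start + nr_pages; i++) {
		struct toy_page *page = i < CACHE_SLOTS ? m->slots[i] : NULL;

		/* The caller promised every page in the range is cached. */
		assert(page != NULL);
		page->uptodate = 1;	/* pretend the read completed */
	}
	return 0;	/* no pages were left unread */
}
```

Because the page is already visible in the cache before the filesystem
is called, a second attempt to insert at the same index fails in the
caller rather than inside each filesystem's read path.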

v4:
 - Rebase on current Linus (a62aa6f7f50a ("Merge tag 'gfs2-for-5.6'"))
 - Add comment to __do_page_cache_readahead() acknowledging we don't
   care _that_ much about setting PageReadahead.
 - Fix the return value check of add_to_page_cache_lru().
 - Add a missing call to put_page() in __do_page_cache_readahead() if
   we fail to insert the page.
 - Improve the documentation of ->readahead (including indentation
   problem identified by Randy).
 - Fix off by one error in read_pages() (Dave Chinner).
 - Fix nr_pages manipulation in btrfs (Dave Chinner).
 - Remove bogus refcount fix in erofs (Gao Xiang, Dave Chinner).
 - Update ext4 patch for Merkle tree readahead.
 - Update f2fs patch for Merkle tree readahead.
 - Reinstate next_page label in f2fs_readpages() now it's used by the
   compression code.
 - Reinstate call to fuse_wait_on_page_writeback (Miklos Szeredi).
 - Remove a double-unlock in the error path in fuse.
 - Remove an odd fly-speck in fuse_readpages().
 - Make nr_pages loop in fuse_readpages less convoluted (Dave Chinner).

Matthew Wilcox (Oracle) (12):
  mm: Fix the return type of __do_page_cache_readahead
  readahead: Ignore return value of ->readpages
  readahead: Put pages in cache earlier
  mm: Add readahead address space operation
  fs: Convert mpage_readpages to mpage_readahead
  btrfs: Convert from readpages to readahead
  erofs: Convert uncompressed files from readpages to readahead
  erofs: Convert compressed files from readpages to readahead
  ext4: Convert from readpages to readahead
  f2fs: Convert from readpages to readahead
  fuse: Convert from readpages to readahead
  iomap: Convert from readpages to readahead

 Documentation/filesystems/locking.rst |  7 ++-
 Documentation/filesystems/vfs.rst     | 14 +++++
 drivers/staging/exfat/exfat_super.c   |  9 +--
 fs/block_dev.c                        |  9 +--
 fs/btrfs/extent_io.c                  | 19 +++---
 fs/btrfs/extent_io.h                  |  2 +-
 fs/btrfs/inode.c                      | 18 +++---
 fs/erofs/data.c                       | 33 ++++------
 fs/erofs/zdata.c                      | 21 +++----
 fs/ext2/inode.c                       | 12 ++--
 fs/ext4/ext4.h                        |  5 +-
 fs/ext4/inode.c                       | 24 ++++----
 fs/ext4/readpage.c                    | 20 +++---
 fs/ext4/verity.c                      | 16 +++--
 fs/f2fs/data.c                        | 35 +++++------
 fs/f2fs/f2fs.h                        |  5 +-
 fs/f2fs/verity.c                      | 16 +++--
 fs/fat/inode.c                        |  8 +--
 fs/fuse/file.c                        | 37 +++++------
 fs/gfs2/aops.c                        | 20 +++---
 fs/hpfs/file.c                        |  8 +--
 fs/iomap/buffered-io.c                | 74 +++++-----------------
 fs/iomap/trace.h                      |  2 +-
 fs/isofs/inode.c                      |  9 +--
 fs/jfs/inode.c                        |  8 +--
 fs/mpage.c                            | 38 ++++--------
 fs/nilfs2/inode.c                     | 13 ++--
 fs/ocfs2/aops.c                       | 32 +++++-----
 fs/omfs/file.c                        |  8 +--
 fs/qnx6/inode.c                       |  8 +--
 fs/reiserfs/inode.c                   | 10 +--
 fs/udf/inode.c                        |  8 +--
 fs/xfs/xfs_aops.c                     | 10 +--
 include/linux/fs.h                    |  2 +
 include/linux/iomap.h                 |  2 +-
 include/linux/mpage.h                 |  2 +-
 include/linux/pagemap.h               | 12 ++++
 include/trace/events/erofs.h          |  6 +-
 include/trace/events/f2fs.h           |  6 +-
 mm/internal.h                         |  2 +-
 mm/migrate.c                          |  2 +-
 mm/readahead.c                        | 89 ++++++++++++++++++---------
 42 files changed, 332 insertions(+), 349 deletions(-)

-- 
2.24.1



_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


* [f2fs-dev] [PATCH v4 03/12] readahead: Put pages in cache earlier
  2020-02-01 15:12 [f2fs-dev] [PATCH v4 00/12] Change readahead API Matthew Wilcox
@ 2020-02-01 15:12 ` Matthew Wilcox
  2020-02-01 15:12 ` [f2fs-dev] [PATCH v4 04/12] mm: Add readahead address space operation Matthew Wilcox
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Matthew Wilcox @ 2020-02-01 15:12 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: cluster-devel, linux-kernel, Matthew Wilcox (Oracle),
	linux-f2fs-devel, linux-xfs, linux-mm, ocfs2-devel, linux-ext4,
	linux-erofs, linux-btrfs

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

At allocation time, put the pages in the cache unless we're using
->readpages.
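
With the pages already in the cache, read_pages() is told which pages
to read by a start index and a count, so each call has to describe one
contiguous run of newly inserted pages.  A minimal user-space sketch of
that batching follows.  It is an illustration only: toy_batch and
toy_plan_batches() are hypothetical names, a bool array stands in for
the page cache lookup, and the sketch records the start of each run
explicitly rather than deriving it from the loop variable as the diff
below does.

```c
#include <assert.h>
#include <stdbool.h>

/* One contiguous run of pages to hand to the filesystem. */
struct toy_batch {
	unsigned long start;
	unsigned nr_pages;
};

/*
 * Walk [offset, offset + nr_to_read) and split it into runs of pages
 * that are not yet present.  present[] must cover the whole range.
 * Returns the number of runs written to out[].
 */
static unsigned toy_plan_batches(const bool *present, unsigned long offset,
		unsigned nr_to_read, struct toy_batch *out, unsigned max_out)
{
	unsigned long batch_start = offset;
	unsigned nr_batches = 0;
	unsigned nr_pages = 0;
	unsigned idx;

	for (idx = 0; idx < nr_to_read; idx++) {
		unsigned long page_offset = offset + idx;

		if (present[page_offset]) {
			/* Page already present: close the current run. */
			if (nr_pages) {
				assert(nr_batches < max_out);
				out[nr_batches].start = batch_start;
				out[nr_batches].nr_pages = nr_pages;
				nr_batches++;
			}
			nr_pages = 0;
			continue;
		}
		if (nr_pages == 0)
			batch_start = page_offset;	/* a new run begins here */
		nr_pages++;
	}

	/* Flush the final run, if any. */
	if (nr_pages) {
		assert(nr_batches < max_out);
		out[nr_batches].start = batch_start;
		out[nr_batches].nr_pages = nr_pages;
		nr_batches++;
	}
	return nr_batches;
}
```

For a request of 8 pages starting at index 10 with index 13 already
cached, this yields two runs: 3 pages from index 10 and 4 pages from
index 14.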

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: linux-btrfs@vger.kernel.org
Cc: linux-erofs@lists.ozlabs.org
Cc: linux-ext4@vger.kernel.org
Cc: linux-f2fs-devel@lists.sourceforge.net
Cc: linux-xfs@vger.kernel.org
Cc: cluster-devel@redhat.com
Cc: ocfs2-devel@oss.oracle.com
---
 mm/readahead.c | 64 ++++++++++++++++++++++++++++++++++----------------
 1 file changed, 44 insertions(+), 20 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index fc77d13af556..7daef0038b14 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -114,10 +114,10 @@ int read_cache_pages(struct address_space *mapping, struct list_head *pages,
 EXPORT_SYMBOL(read_cache_pages);
 
 static void read_pages(struct address_space *mapping, struct file *filp,
-		struct list_head *pages, unsigned int nr_pages, gfp_t gfp)
+		struct list_head *pages, pgoff_t start,
+		unsigned int nr_pages)
 {
 	struct blk_plug plug;
-	unsigned page_idx;
 
 	blk_start_plug(&plug);
 
@@ -125,18 +125,17 @@ static void read_pages(struct address_space *mapping, struct file *filp,
 		mapping->a_ops->readpages(filp, mapping, pages, nr_pages);
 		/* Clean up the remaining pages */
 		put_pages_list(pages);
-		goto out;
-	}
+	} else {
+		struct page *page;
+		unsigned long index;
 
-	for (page_idx = 0; page_idx < nr_pages; page_idx++) {
-		struct page *page = lru_to_page(pages);
-		list_del(&page->lru);
-		if (!add_to_page_cache_lru(page, mapping, page->index, gfp))
+		xa_for_each_range(&mapping->i_pages, index, page, start,
+				start + nr_pages - 1) {
 			mapping->a_ops->readpage(filp, page);
-		put_page(page);
+			put_page(page);
+		}
 	}
 
-out:
 	blk_finish_plug(&plug);
 }
 
@@ -153,13 +152,14 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
 		unsigned long lookahead_size)
 {
 	struct inode *inode = mapping->host;
-	struct page *page;
 	unsigned long end_index;	/* The last page we want to read */
 	LIST_HEAD(page_pool);
 	int page_idx;
+	pgoff_t page_offset;
 	unsigned long nr_pages = 0;
 	loff_t isize = i_size_read(inode);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
+	bool use_list = mapping->a_ops->readpages;
 
 	if (isize == 0)
 		goto out;
@@ -170,21 +170,32 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
 	 * Preallocate as many pages as we will need.
 	 */
 	for (page_idx = 0; page_idx < nr_to_read; page_idx++) {
-		pgoff_t page_offset = offset + page_idx;
+		struct page *page;
 
+		page_offset = offset + page_idx;
 		if (page_offset > end_index)
 			break;
 
 		page = xa_load(&mapping->i_pages, page_offset);
 		if (page && !xa_is_value(page)) {
 			/*
-			 * Page already present?  Kick off the current batch of
-			 * contiguous pages before continuing with the next
-			 * batch.
+			 * Page already present?  Kick off the current batch
+			 * of contiguous pages before continuing with the
+			 * next batch.
 			 */
 			if (nr_pages)
-				read_pages(mapping, filp, &page_pool, nr_pages,
-						gfp_mask);
+				read_pages(mapping, filp, &page_pool,
+						page_offset - nr_pages,
+						nr_pages);
+			/*
+			 * It's possible this page is the page we should
+			 * be marking with PageReadahead.  However, we
+			 * don't have a stable ref to this page so it might
+			 * be reallocated to another user before we can set
+			 * the bit.  There's probably another page in the
+			 * cache marked with PageReadahead from the other
+			 * process which accessed this file.
+			 */
 			nr_pages = 0;
 			continue;
 		}
@@ -192,8 +203,20 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
 		page = __page_cache_alloc(gfp_mask);
 		if (!page)
 			break;
-		page->index = page_offset;
-		list_add(&page->lru, &page_pool);
+		if (use_list) {
+			page->index = page_offset;
+			list_add(&page->lru, &page_pool);
+		} else if (add_to_page_cache_lru(page, mapping, page_offset,
+					gfp_mask) < 0) {
+			if (nr_pages)
+				read_pages(mapping, filp, &page_pool,
+						page_offset - nr_pages,
+						nr_pages);
+			put_page(page);
+			nr_pages = 0;
+			continue;
+		}
+
 		if (page_idx == nr_to_read - lookahead_size)
 			SetPageReadahead(page);
 		nr_pages++;
@@ -205,7 +228,8 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
 	 * will then handle the error.
 	 */
 	if (nr_pages)
-		read_pages(mapping, filp, &page_pool, nr_pages, gfp_mask);
+		read_pages(mapping, filp, &page_pool, page_offset - nr_pages,
+				nr_pages);
 	BUG_ON(!list_empty(&page_pool));
 out:
 	return nr_pages;
-- 
2.24.1




* [f2fs-dev] [PATCH v4 04/12] mm: Add readahead address space operation
  2020-02-01 15:12 [f2fs-dev] [PATCH v4 00/12] Change readahead API Matthew Wilcox
  2020-02-01 15:12 ` [f2fs-dev] [PATCH v4 03/12] readahead: Put pages in cache earlier Matthew Wilcox
@ 2020-02-01 15:12 ` Matthew Wilcox
  2020-02-01 15:12 ` [f2fs-dev] [PATCH v4 10/12] f2fs: Convert from readpages to readahead Matthew Wilcox
  2020-02-04 15:32 ` [f2fs-dev] [PATCH v4 00/12] Change readahead API David Sterba
  3 siblings, 0 replies; 6+ messages in thread
From: Matthew Wilcox @ 2020-02-01 15:12 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: cluster-devel, linux-kernel, Matthew Wilcox (Oracle),
	linux-f2fs-devel, linux-xfs, linux-mm, ocfs2-devel, linux-ext4,
	linux-erofs, linux-btrfs

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

This replaces ->readpages with a saner interface:
 - Return the number of pages not read instead of an ignored error code.
 - Pages are already in the page cache when ->readahead is called.
 - Implementation looks up the pages in the page cache instead of
   having them passed in a linked list.
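
The return-value contract listed above can be modelled in user space.
The sketch below is an assumption-laden toy, not the kernel
implementation: toy_page, toy_cache, toy_fs_readahead() and the max_io
parameter are hypothetical, and each page is assumed to start locked
with two references (one for the cache, one for the readahead caller).
It shows the caller unlocking and releasing exactly the trailing pages
the filesystem reported as not read.

```c
#include <assert.h>

#define NR_TOY_PAGES 8	/* hypothetical size of the toy cache */

struct toy_page {
	int locked;
	int refcount;
};

static struct toy_page toy_cache[NR_TOY_PAGES];

/* Stand-in for readahead_page(): look up by index; the page must be locked. */
static struct toy_page *toy_readahead_page(unsigned long index)
{
	assert(index < NR_TOY_PAGES);
	assert(toy_cache[index].locked);
	return &toy_cache[index];
}

/* Caller-side setup: pages are in the cache and locked before ->readahead. */
static void toy_prepare(unsigned long start, unsigned nr_pages)
{
	unsigned i;

	for (i = 0; i < nr_pages; i++) {
		toy_cache[start + i].locked = 1;
		toy_cache[start + i].refcount = 2;	/* cache + readahead caller */
	}
}

/* Hypothetical filesystem that attempts I/O on at most max_io pages. */
static unsigned toy_fs_readahead(unsigned long start, unsigned nr_pages,
		unsigned max_io)
{
	unsigned i;

	for (i = 0; i < nr_pages && i < max_io; i++) {
		struct toy_page *page = toy_readahead_page(start + i);

		page->refcount--;	/* drop the reference after attempting I/O */
		page->locked = 0;	/* normally done by the completion handler */
	}
	return nr_pages - i;	/* number of pages not read */
}

/* Caller: clean up the trailing pages the filesystem did not touch. */
static void toy_read_pages(unsigned long start, unsigned nr_pages,
		unsigned max_io)
{
	unsigned left = toy_fs_readahead(start, nr_pages, max_io);

	while (left) {
		struct toy_page *page = toy_readahead_page(start + nr_pages - left);

		page->locked = 0;
		page->refcount--;
		left--;
	}
}
```

Calling toy_prepare(1, 5) followed by toy_read_pages(1, 5, 3) leaves
all five pages unlocked with a single remaining reference, whether the
filesystem or the caller handled them.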

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: linux-btrfs@vger.kernel.org
Cc: linux-erofs@lists.ozlabs.org
Cc: linux-ext4@vger.kernel.org
Cc: linux-f2fs-devel@lists.sourceforge.net
Cc: linux-xfs@vger.kernel.org
Cc: cluster-devel@redhat.com
Cc: ocfs2-devel@oss.oracle.com
---
 Documentation/filesystems/locking.rst |  7 ++++++-
 Documentation/filesystems/vfs.rst     | 14 ++++++++++++++
 include/linux/fs.h                    |  2 ++
 include/linux/pagemap.h               | 12 ++++++++++++
 mm/readahead.c                        | 13 ++++++++++++-
 5 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index 5057e4d9dcd1..3d10729caf44 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -239,6 +239,8 @@ prototypes::
 	int (*readpage)(struct file *, struct page *);
 	int (*writepages)(struct address_space *, struct writeback_control *);
 	int (*set_page_dirty)(struct page *page);
+	unsigned (*readahead)(struct file *, struct address_space *,
+				 pgoff_t start, unsigned nr_pages);
 	int (*readpages)(struct file *filp, struct address_space *mapping,
 			struct list_head *pages, unsigned nr_pages);
 	int (*write_begin)(struct file *, struct address_space *mapping,
@@ -271,7 +273,8 @@ writepage:		yes, unlocks (see below)
 readpage:		yes, unlocks
 writepages:
 set_page_dirty		no
-readpages:
+readahead:		yes, unlocks
+readpages:		no
 write_begin:		locks the page		 exclusive
 write_end:		yes, unlocks		 exclusive
 bmap:
@@ -295,6 +298,8 @@ the request handler (/dev/loop).
 ->readpage() unlocks the page, either synchronously or via I/O
 completion.
 
+->readahead() unlocks the pages like ->readpage().
+
 ->readpages() populates the pagecache with the passed pages and starts
 I/O against them.  They come unlocked upon I/O completion.
 
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index 7d4d09dd5e6d..c2bc345f2169 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -706,6 +706,8 @@ cache in your filesystem.  The following members are defined:
 		int (*readpage)(struct file *, struct page *);
 		int (*writepages)(struct address_space *, struct writeback_control *);
 		int (*set_page_dirty)(struct page *page);
+		unsigned (*readahead)(struct file *filp, struct address_space *mapping,
+				 pgoff_t start, unsigned nr_pages);
 		int (*readpages)(struct file *filp, struct address_space *mapping,
 				 struct list_head *pages, unsigned nr_pages);
 		int (*write_begin)(struct file *, struct address_space *mapping,
@@ -781,6 +783,18 @@ cache in your filesystem.  The following members are defined:
 	If defined, it should set the PageDirty flag, and the
 	PAGECACHE_TAG_DIRTY tag in the radix tree.
 
+``readahead``
+	Called by the VM to read pages associated with the address_space
+	object.  The pages are consecutive in the page cache and
+	are locked.  The implementation should decrement the page
+	refcount after attempting I/O on each page.  Usually the
+	page will be unlocked by the I/O completion handler.  If the
+	function does not attempt I/O on some pages, return the number
+	of pages which were not read so the caller can unlock the pages
+	for you.  Set PageUptodate if the I/O completes successfully.
+	Setting PageError on any page will be ignored; simply unlock
+	the page if an I/O error occurs.
+
 ``readpages``
 	called by the VM to read pages associated with the address_space
 	object.  This is essentially just a vector version of readpage.
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 41584f50af0d..3bfc142e7d10 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -375,6 +375,8 @@ struct address_space_operations {
 	 */
 	int (*readpages)(struct file *filp, struct address_space *mapping,
 			struct list_head *pages, unsigned nr_pages);
+	unsigned (*readahead)(struct file *, struct address_space *,
+			pgoff_t start, unsigned nr_pages);
 
 	int (*write_begin)(struct file *, struct address_space *mapping,
 				loff_t pos, unsigned len, unsigned flags,
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index ccb14b6a16b5..a2cf007826f2 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -630,6 +630,18 @@ static inline int add_to_page_cache(struct page *page,
 	return error;
 }
 
+/*
+ * Only call this from a ->readahead implementation.
+ */
+static inline
+struct page *readahead_page(struct address_space *mapping, pgoff_t index)
+{
+	struct page *page = xa_load(&mapping->i_pages, index);
+	VM_BUG_ON_PAGE(!PageLocked(page), page);
+
+	return page;
+}
+
 static inline unsigned long dir_pages(struct inode *inode)
 {
 	return (unsigned long)(inode->i_size + PAGE_SIZE - 1) >>
diff --git a/mm/readahead.c b/mm/readahead.c
index 7daef0038b14..b2ed0baf3a5d 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -121,7 +121,18 @@ static void read_pages(struct address_space *mapping, struct file *filp,
 
 	blk_start_plug(&plug);
 
-	if (mapping->a_ops->readpages) {
+	if (mapping->a_ops->readahead) {
+		unsigned left = mapping->a_ops->readahead(filp, mapping,
+				start, nr_pages);
+
+		while (left) {
+			struct page *page = readahead_page(mapping,
+					start + nr_pages - left);
+			unlock_page(page);
+			put_page(page);
+			left--;
+		}
+	} else if (mapping->a_ops->readpages) {
 		mapping->a_ops->readpages(filp, mapping, pages, nr_pages);
 		/* Clean up the remaining pages */
 		put_pages_list(pages);
-- 
2.24.1




* [f2fs-dev] [PATCH v4 10/12] f2fs: Convert from readpages to readahead
  2020-02-01 15:12 [f2fs-dev] [PATCH v4 00/12] Change readahead API Matthew Wilcox
  2020-02-01 15:12 ` [f2fs-dev] [PATCH v4 03/12] readahead: Put pages in cache earlier Matthew Wilcox
  2020-02-01 15:12 ` [f2fs-dev] [PATCH v4 04/12] mm: Add readahead address space operation Matthew Wilcox
@ 2020-02-01 15:12 ` Matthew Wilcox
  2020-02-04 15:32 ` [f2fs-dev] [PATCH v4 00/12] Change readahead API David Sterba
  3 siblings, 0 replies; 6+ messages in thread
From: Matthew Wilcox @ 2020-02-01 15:12 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-mm, linux-kernel, Matthew Wilcox (Oracle), linux-f2fs-devel

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Use the new readahead operation in f2fs.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: linux-f2fs-devel@lists.sourceforge.net
---
 fs/f2fs/data.c              | 35 ++++++++++++++---------------------
 fs/f2fs/f2fs.h              |  5 ++---
 fs/f2fs/verity.c            | 16 +++++++++++-----
 include/trace/events/f2fs.h |  6 +++---
 4 files changed, 30 insertions(+), 32 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 8bd9afa81c54..80803f8b1b40 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2159,9 +2159,8 @@ int f2fs_read_multi_pages(struct compress_ctx *cc, struct bio **bio_ret,
  * use ->readpage() or do the necessary surgery to decouple ->readpages()
  * from read-ahead.
  */
-int f2fs_mpage_readpages(struct address_space *mapping,
-			struct list_head *pages, struct page *page,
-			unsigned nr_pages, bool is_readahead)
+int f2fs_mpage_readpages(struct address_space *mapping, pgoff_t start,
+		struct page *page, unsigned nr_pages, bool is_readahead)
 {
 	struct bio *bio = NULL;
 	sector_t last_block_in_bio = 0;
@@ -2192,15 +2191,10 @@ int f2fs_mpage_readpages(struct address_space *mapping,
 	map.m_may_create = false;
 
 	for (; nr_pages; nr_pages--) {
-		if (pages) {
-			page = list_last_entry(pages, struct page, lru);
+		if (is_readahead) {
+			page = readahead_page(mapping, start++);
 
 			prefetchw(&page->flags);
-			list_del(&page->lru);
-			if (add_to_page_cache_lru(page, mapping,
-						  page_index(page),
-						  readahead_gfp_mask(mapping)))
-				goto next_page;
 		}
 
 #ifdef CONFIG_F2FS_FS_COMPRESSION
@@ -2243,7 +2237,7 @@ int f2fs_mpage_readpages(struct address_space *mapping,
 			unlock_page(page);
 		}
 next_page:
-		if (pages)
+		if (is_readahead)
 			put_page(page);
 
 #ifdef CONFIG_F2FS_FS_COMPRESSION
@@ -2259,10 +2253,9 @@ int f2fs_mpage_readpages(struct address_space *mapping,
 		}
 #endif
 	}
-	BUG_ON(pages && !list_empty(pages));
 	if (bio)
 		__submit_bio(F2FS_I_SB(inode), bio, DATA);
-	return pages ? 0 : ret;
+	return ret;
 }
 
 static int f2fs_read_data_page(struct file *file, struct page *page)
@@ -2282,27 +2275,27 @@ static int f2fs_read_data_page(struct file *file, struct page *page)
 		ret = f2fs_read_inline_data(inode, page);
 	if (ret == -EAGAIN)
 		ret = f2fs_mpage_readpages(page_file_mapping(page),
-						NULL, page, 1, false);
+						0, page, 1, false);
 	return ret;
 }
 
-static int f2fs_read_data_pages(struct file *file,
+static unsigned f2fs_readahead(struct file *file,
 			struct address_space *mapping,
-			struct list_head *pages, unsigned nr_pages)
+			pgoff_t start, unsigned nr_pages)
 {
 	struct inode *inode = mapping->host;
-	struct page *page = list_last_entry(pages, struct page, lru);
 
-	trace_f2fs_readpages(inode, page, nr_pages);
+	trace_f2fs_readpages(inode, start, nr_pages);
 
 	if (!f2fs_is_compress_backend_ready(inode))
 		return 0;
 
 	/* If the file has inline data, skip readpages */
 	if (f2fs_has_inline_data(inode))
-		return 0;
+		return nr_pages;
 
-	return f2fs_mpage_readpages(mapping, pages, NULL, nr_pages, true);
+	f2fs_mpage_readpages(mapping, start, NULL, nr_pages, true);
+	return 0;
 }
 
 int f2fs_encrypt_one_page(struct f2fs_io_info *fio)
@@ -3778,7 +3771,7 @@ static void f2fs_swap_deactivate(struct file *file)
 
 const struct address_space_operations f2fs_dblock_aops = {
 	.readpage	= f2fs_read_data_page,
-	.readpages	= f2fs_read_data_pages,
+	.readahead	= f2fs_readahead,
 	.writepage	= f2fs_write_data_page,
 	.writepages	= f2fs_write_data_pages,
 	.write_begin	= f2fs_write_begin,
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 5355be6b6755..db00907f90f1 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -3344,9 +3344,8 @@ int f2fs_reserve_new_block(struct dnode_of_data *dn);
 int f2fs_get_block(struct dnode_of_data *dn, pgoff_t index);
 int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from);
 int f2fs_reserve_block(struct dnode_of_data *dn, pgoff_t index);
-int f2fs_mpage_readpages(struct address_space *mapping,
-			struct list_head *pages, struct page *page,
-			unsigned nr_pages, bool is_readahead);
+int f2fs_mpage_readpages(struct address_space *mapping, pgoff_t start,
+		struct page *page, unsigned nr_pages, bool is_readahead);
 struct page *f2fs_get_read_data_page(struct inode *inode, pgoff_t index,
 			int op_flags, bool for_write);
 struct page *f2fs_find_data_page(struct inode *inode, pgoff_t index);
diff --git a/fs/f2fs/verity.c b/fs/f2fs/verity.c
index d7d430a6f130..71e92b9b3aa6 100644
--- a/fs/f2fs/verity.c
+++ b/fs/f2fs/verity.c
@@ -231,7 +231,6 @@ static int f2fs_get_verity_descriptor(struct inode *inode, void *buf,
 static void f2fs_merkle_tree_readahead(struct address_space *mapping,
 				       pgoff_t start_index, unsigned long count)
 {
-	LIST_HEAD(pages);
 	unsigned int nr_pages = 0;
 	struct page *page;
 	pgoff_t index;
@@ -240,16 +239,23 @@ static void f2fs_merkle_tree_readahead(struct address_space *mapping,
 	for (index = start_index; index < start_index + count; index++) {
 		page = xa_load(&mapping->i_pages, index);
 		if (!page || xa_is_value(page)) {
-			page = __page_cache_alloc(readahead_gfp_mask(mapping));
+			gfp_t gfp = readahead_gfp_mask(mapping);
+			page = __page_cache_alloc(gfp);
 			if (!page)
 				break;
-			page->index = index;
-			list_add(&page->lru, &pages);
+			if (add_to_page_cache_lru(page, mapping, index, gfp)) {
+				put_page(page);
+				break;
+			}
 			nr_pages++;
 		}
 	}
+
+	if (!nr_pages)
+		return;
+
 	blk_start_plug(&plug);
-	f2fs_mpage_readpages(mapping, &pages, NULL, nr_pages, true);
+	f2fs_mpage_readpages(mapping, start_index, NULL, nr_pages, true);
 	blk_finish_plug(&plug);
 }
 
diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h
index 67a97838c2a0..d72da4a33883 100644
--- a/include/trace/events/f2fs.h
+++ b/include/trace/events/f2fs.h
@@ -1375,9 +1375,9 @@ TRACE_EVENT(f2fs_writepages,
 
 TRACE_EVENT(f2fs_readpages,
 
-	TP_PROTO(struct inode *inode, struct page *page, unsigned int nrpage),
+	TP_PROTO(struct inode *inode, pgoff_t start, unsigned int nrpage),
 
-	TP_ARGS(inode, page, nrpage),
+	TP_ARGS(inode, start, nrpage),
 
 	TP_STRUCT__entry(
 		__field(dev_t,	dev)
@@ -1389,7 +1389,7 @@ TRACE_EVENT(f2fs_readpages,
 	TP_fast_assign(
 		__entry->dev	= inode->i_sb->s_dev;
 		__entry->ino	= inode->i_ino;
-		__entry->start	= page->index;
+		__entry->start	= start;
 		__entry->nrpage	= nrpage;
 	),
 
-- 
2.24.1




* Re: [f2fs-dev] [PATCH v4 00/12] Change readahead API
  2020-02-01 15:12 [f2fs-dev] [PATCH v4 00/12] Change readahead API Matthew Wilcox
                   ` (2 preceding siblings ...)
  2020-02-01 15:12 ` [f2fs-dev] [PATCH v4 10/12] f2fs: Convert from readpages to readahead Matthew Wilcox
@ 2020-02-04 15:32 ` David Sterba
  2020-02-04 17:16   ` Matthew Wilcox
  3 siblings, 1 reply; 6+ messages in thread
From: David Sterba @ 2020-02-04 15:32 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: cluster-devel, linux-kernel, linux-f2fs-devel, linux-xfs,
	linux-mm, ocfs2-devel, linux-fsdevel, linux-ext4, linux-erofs,
	linux-btrfs

On Sat, Feb 01, 2020 at 07:12:28AM -0800, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> I would particularly value feedback on this from the gfs2 and ocfs2
> maintainers.  They have non-trivial changes, and a review on patch 5
> would be greatly appreciated.
> 
> This series adds a readahead address_space operation to eventually
> replace the readpages operation.  The key difference is that
> pages are added to the page cache as they are allocated (and
> then looked up by the filesystem) instead of passing them on a
> list to the readpages operation and having the filesystem add
> them to the page cache.  It's a net reduction in code for each
> implementation, more efficient than walking a list, and solves
> the direct-write vs buffered-read problem reported by yu kuai at
> https://lore.kernel.org/linux-fsdevel/20200116063601.39201-1-yukuai3@huawei.com/
> 
> v4:
>  - Rebase on current Linus (a62aa6f7f50a ("Merge tag 'gfs2-for-5.6'"))

I've tried to test the patchset but haven't got very far: it crashes at boot
right after VFS mounts the root. The patches are from the mailing list, applied on
current master, but I saw the same crash with the git branch in your
repo (probably v1).

(gdb) l *(ext4_mpage_readpages+0x1da/0xc20)
0xffffffff813753f0 is in ext4_mpage_readpages (fs/ext4/readpage.c:226).
221             return i_size_read(inode);
222     }
223
224     int ext4_mpage_readpages(struct address_space *mapping, pgoff_t start,
225                     struct page *page, unsigned nr_pages, bool is_readahead)
226     {
227             struct bio *bio = NULL;
228             sector_t last_block_in_bio = 0;
229
230             struct inode *inode = mapping->host;

[    8.008531] BUG: kernel NULL pointer dereference, address: 0000000000000000
[    8.011482] #PF: supervisor read access in kernel mode
[    8.014121] #PF: error_code(0x0000) - not-present page
[    8.016767] PGD 0 P4D 0
[    8.018352] Oops: 0000 [#1] SMP
[    8.019716] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.5.0-default+ #955
[    8.021746] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
[    8.025244] RIP: 0010:ext4_mpage_readpages+0x1da/0xc20
[    8.026817] Code: 7c 24 4e 00 0f 85 23 04 00 00 44 29 74 24 3c 83 6c 24 48 01 0f 84 4d 04 00 00 80 7c 24 4e 00 0f 85 fc 05 00 00 48 8b 4c 24 18 <48> 8b 01 f6 c4 20 75 89 4c 8b 69 20 b9 0c 00 00 00 2b 4c 24 38 83
[    8.031957] RSP: 0000:ffffb34f40013988 EFLAGS: 00010292
[    8.033691] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[    8.035533] RDX: 0000000000000001 RSI: ffffffff960934c0 RDI: ffffffff9681a080
[    8.036900] RBP: 0000000000000001 R08: ffffb34f40013a68 R09: 0000000000000000
[    8.038461] R10: 0000000000000038 R11: 0000000000000000 R12: 0000000000000004
[    8.040698] R13: ffff9668ba4e18e0 R14: 0000000000000001 R15: 0000000000000000
[    8.042805] FS:  0000000000000000(0000) GS:ffff9668bda00000(0000) knlGS:0000000000000000
[    8.045396] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    8.047233] CR2: 0000000000000000 CR3: 000000002e011001 CR4: 0000000000160ee0
[    8.049337] Call Trace:
[    8.050435]  ? __lock_acquire+0xee0/0x1320
[    8.051833]  ? release_pages+0x310/0x380
[    8.053265]  ? mark_held_locks+0x50/0x80
[    8.054468]  ext4_readahead+0x3b/0x50
[    8.055877]  read_pages+0x65/0x1a0
[    8.057167]  ? put_pages_list+0x90/0x90
[    8.058689]  __do_page_cache_readahead+0x24b/0x2a0
[    8.060394]  generic_file_buffered_read+0x7cf/0x9f0
[    8.062137]  ? sched_clock+0x5/0x10
[    8.063451]  ? up_read+0x18/0x240
[    8.064774]  ? ext4_xattr_get+0x97/0x2c0
[    8.066178]  new_sync_read+0x111/0x1a0
[    8.067423]  vfs_read+0xc5/0x180
[    8.068572]  kernel_read+0x2c/0x40
[    8.069788]  prepare_binprm+0x171/0x1b0
[    8.071311]  load_script+0x1c1/0x250
[    8.072643]  search_binary_handler+0x5f/0x210
[    8.074135]  exec_binprm+0xd7/0x290
[    8.075463]  __do_execve_file.isra.0+0x570/0x800
[    8.077400]  ? rest_init+0x2f1/0x2f5
[    8.078979]  do_execve+0x21/0x30
[    8.080420]  kernel_init+0xa4/0x11b
[    8.081856]  ? rest_init+0x2f5/0x2f5
[    8.083173]  ret_from_fork+0x24/0x30
[    8.084695] Modules linked in:
[    8.086055] CR2: 0000000000000000
[    8.087572] ---[ end trace 0890c371a706b34a ]---
[    8.089417] RIP: 0010:ext4_mpage_readpages+0x1da/0xc20
[    8.116836] BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:38
[    8.119626] in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0
[    8.122392] INFO: lockdep is turned off.
[    8.123694] irq event stamp: 18341344
[    8.124735] hardirqs last  enabled at (18341343): [<ffffffff95230c42>] free_unref_page_list+0x232/0x270
[    8.127918] hardirqs last disabled at (18341344): [<ffffffff95002b4b>] trace_hardirqs_off_thunk+0x1a/0x1c
[    8.131145] softirqs last  enabled at (18341250): [<ffffffff95a00358>] __do_softirq+0x358/0x52b
[    8.143060] softirqs last disabled at (18341243): [<ffffffff9508ae3d>] irq_exit+0x9d/0xb0
[    8.145603] CPU: 2 PID: 1 Comm: swapper/0 Tainted: G      D           5.5.0-default+ #955
[    8.148474] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
[    8.152440] Call Trace:
[    8.153747]  dump_stack+0x71/0xa0
[    8.155238]  ___might_sleep.cold+0xa6/0xf9
[    8.156903]  exit_signals+0x31/0x310
[    8.158431]  ? __do_execve_file.isra.0+0x570/0x800
[    8.160179]  do_exit+0xa8/0xd60
[    8.161632]  ? rest_init+0x2f1/0x2f5
[    8.163204]  rewind_stack_do_exit+0x17/0x20
[    8.164931] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[    8.167575] Kernel Offset: 0x14000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)



* Re: [f2fs-dev] [PATCH v4 00/12] Change readahead API
  2020-02-04 15:32 ` [f2fs-dev] [PATCH v4 00/12] Change readahead API David Sterba
@ 2020-02-04 17:16   ` Matthew Wilcox
  0 siblings, 0 replies; 6+ messages in thread
From: Matthew Wilcox @ 2020-02-04 17:16 UTC (permalink / raw)
  To: dsterba, linux-fsdevel, linux-mm, linux-kernel, linux-btrfs,
	linux-erofs, linux-ext4, linux-f2fs-devel, linux-xfs,
	cluster-devel, ocfs2-devel

On Tue, Feb 04, 2020 at 04:32:27PM +0100, David Sterba wrote:
> On Sat, Feb 01, 2020 at 07:12:28AM -0800, Matthew Wilcox wrote:
> > From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> > 
> > I would particularly value feedback on this from the gfs2 and ocfs2
> > maintainers.  They have non-trivial changes, and a review on patch 5
> > would be greatly appreciated.
> > 
> > This series adds a readahead address_space operation to eventually
> > replace the readpages operation.  The key difference is that
> > pages are added to the page cache as they are allocated (and
> > then looked up by the filesystem) instead of passing them on a
> > list to the readpages operation and having the filesystem add
> > them to the page cache.  It's a net reduction in code for each
> > implementation, more efficient than walking a list, and solves
> > the direct-write vs buffered-read problem reported by yu kuai at
> > https://lore.kernel.org/linux-fsdevel/20200116063601.39201-1-yukuai3@huawei.com/
> > 
> > v4:
> >  - Rebase on current Linus (a62aa6f7f50a ("Merge tag 'gfs2-for-5.6'"))
> 
> I've tried to test the patchset but haven't got very far: it crashes at boot
> right after VFS mounts the root. The patches are from the mailing list, applied on
> current master, but I saw the same crash with the git branch in your
> repo (probably v1).

Yeah, I wasn't able to test at the time due to what turned out to be
the hpet bug in Linus' tree.  Now that's fixed, I've found & fixed a
couple more bugs.  There'll be a v5 once I fix the remaining problem
(looks like a missing page unlock somewhere).



